Covariance Matrix#
Historical context#
Generally, the covariance matrix is a matrix that describes the relationship between the elements of a multivariate normal distribution. The covariance matrix is a square matrix of size [n x n] where n is the number of elements in the multivariate normal distribution. The covariance matrix is a symmetric matrix and is defined as follows:
The covariance matrix for the material balance sequence is of particular importance for several statistical tests, including SITMUF and GEMUF. It’s impossible to know the exact matrix in practice, but it can be estimated by knowing expected sensor performance.
Recall that the muf sequence is defined as follows:
With
It’s generally assumed that since the error models are normally distributed, individual muf values (i.e., mbp\(_i\)) and the muf sequence (i.e., muf) will also be normally distributed. Consequently, the muf sequence can be thought of as a multivariate normal distribution such that:
The covariance matrix contains the covariance between different material balances in the sequence. For example, consider the entry \(\sigma_{2n}^2\) of the covariance matrix below. This term is the variance between material balance \(n\) and \(2\).
The covariance matrix itself is often calculated using relative standard deviations, similar to the calculation for \(\sigma\)muf. In fact, the diagonal terms (i.e., \(\Sigma_{1,1}, \Sigma_{2,2}, ... \)) are the variance of the material balance (the covariance of material balance \(i\) with itself is the variance). There’s two key expressions; one for the covariance diagonals (i.e., \(\sigma_{x,x}\)) and one for the covariance off diagonals (i.e., \(\sigma_{x,x^{\prime}}\)). These are calculated by using various variance and covariance rules and propagating the terms.
Recall the expression that was derived from \(\sigma\)muf; which is used for the diagonals for the covariance matrix:
The off-diagonal is calculated in a similar manner, but has more terms. The off-diagonal is the covariance between material balance \(i\) and \(j\).
Following the same rules used to derive \(\sigma\)muf leads to the expression for the covariance off diagonal:
Where
Implementation#
The covariance matrix is a \(NxN\) matrix at the N-th material balance period. Since MAPIT adopts the “NRTA” style calculation, all \(NxN\) entries must be updated at each balance period which simulates the arrival of new information. The MAPIT SITMUF calculation is not well vectorized as the \(NxN\) must be resized and calculated at each balance. The calculation starts by looping over balance periods and each entry in the covariance matrix at balance \(P\):
The covariance matrix is a \(NxN\) matrix at the N-th material balance period. Since MAPIT adopts the “NRTA” style calculation, all \(NxN\) entries must be updated at each balance period which simulates the arrival of new information. The MAPIT SITMUF calculation is not well vectorized as the \(NxN\) must be resized and calculated at each balance. The calculation starts by looping over balance periods and each entry in the covariance matrix at balance \(P\):
AuxFunctions.py
389    for currentMB in range(1, int(totalMBPs)):
390        for j in range(0, currentMB):
The variables for the different times have a different meaning than in the expressions that were defined above. This is for legacy purposes and to improve alignment with the papers. The following table describes the mapping between the derived expressions and associated code:
Model Component  | 
Code Expression  | 
|---|---|
Balance \(i\)  | 
  | 
Balance \(j\)  | 
  | 
For simplicity, the diagonal and off-diagonal terms are broken into multiple components.
Diagonal terms#
Term 1 
AuxFunctions.py
Term 1 
AuxFunctions.py
422                for k in range(len(inputAppliedError)):
423
424                    logicalInterval = np.logical_and(processedInputTimes[k] >= IPrevious_time,processedInputTimes[k] <= I_time).reshape((-1,))  #select the indices for the relevant time
425
426                    if inputTypes[k] == 'continuous':
427                        term1 += trapSum(logicalInterval,processedInputTimes[k],inputAppliedError[k]) **2 * (ErrorMatrix[k, 0]**2 + ErrorMatrix[k, 1]**2)
428                    elif inputTypes[k] == 'discrete':
429                        term1 += (inputAppliedError[k][:, logicalInterval].sum(axis=1)**2 * (ErrorMatrix[k, 0]**2 + ErrorMatrix[k, 1]**2 ))
430                    else:
431                        raise Exception("inputTypes[j] is not 'continuous' or 'discrete'")                      
Term 2 
AuxFunctions.py
Term 2 
AuxFunctions.py
438                for k in range(len(outputAppliedError)):
439
440                    logicalInterval = np.logical_and(processedOutputTimes[k] >= IPrevious_time,processedOutputTimes[k] <= I_time).reshape((-1,))
441                    locMatrixRow = k + len(inputAppliedError) + len(inventoryAppliedError)                      
442
443                    if outputTypes[k] == 'continuous':
444                        term2 += trapSum(logicalInterval,processedOutputTimes[k],outputAppliedError[k])**2 * (ErrorMatrix[locMatrixRow, 0]**2 + ErrorMatrix[locMatrixRow, 1]**2)
445                    elif outputTypes[k] == 'discrete':
446                        term2 += (outputAppliedError[k][:, logicalInterval].sum(axis=1)**2 * (ErrorMatrix[locMatrixRow, 0]**2 + ErrorMatrix[locMatrixRow, 1]**2 ))
447                    else:
448                        raise Exception("outputTypes[j] is not 'continuous' or 'discrete'")
Term 3 
AuxFunctions.py
Term 3 
AuxFunctions.py
453                for k in range(len(inventoryAppliedError)):
454                    locMatrixRow = k+len(inputAppliedError)
455
456                    startIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrevious_time).argmin()
457                    endIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - I_time).argmin()
458
459                    term3 += inventoryAppliedError[k][:,endIdx]**2 * (ErrorMatrix[locMatrixRow, 0]**2 + ErrorMatrix[locMatrixRow, 1]**2)
460
461
462                if j != 0:
463                    for k in range(len(inventoryAppliedError)):
464                        locMatrixRow = k + len(inputAppliedError)
465                        startIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) -IPrevious_time).argmin()
466                        endIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - I_time).argmin()
467
468                        term4 += inventoryAppliedError[k][:,startIdx]**2 * (ErrorMatrix[locMatrixRow, 0]**2 + ErrorMatrix[locMatrixRow, 1]**2)
469                        term5 += inventoryAppliedError[k][:,startIdx] * inventoryAppliedError[k][:,endIdx] * ErrorMatrix[locMatrixRow, 1]**2
470
471                covmatrix[:,j,j] = term1 + term2 + term3 + term4 - 2 * term5
Term 4
AuxFunctions.py
Term 4 
AuxFunctions.py
453                for k in range(len(inventoryAppliedError)):
454                    locMatrixRow = k+len(inputAppliedError)
455
456                    startIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrevious_time).argmin()
457                    endIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - I_time).argmin()
458
459                    term3 += inventoryAppliedError[k][:,endIdx]**2 * (ErrorMatrix[locMatrixRow, 0]**2 + ErrorMatrix[locMatrixRow, 1]**2)
460
461
462                if j != 0:
463                    for k in range(len(inventoryAppliedError)):
464                        locMatrixRow = k + len(inputAppliedError)
465                        startIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) -IPrevious_time).argmin()
466                        endIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - I_time).argmin()
467
468                        term4 += inventoryAppliedError[k][:,startIdx]**2 * (ErrorMatrix[locMatrixRow, 0]**2 + ErrorMatrix[locMatrixRow, 1]**2)
469                        term5 += inventoryAppliedError[k][:,startIdx] * inventoryAppliedError[k][:,endIdx] * ErrorMatrix[locMatrixRow, 1]**2
470
471                covmatrix[:,j,j] = term1 + term2 + term3 + term4 - 2 * term5
Term 5 
AuxFunctions.py
Term 5 
 
AuxFunctions.py
453                for k in range(len(inventoryAppliedError)):
454                    locMatrixRow = k+len(inputAppliedError)
455
456                    startIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrevious_time).argmin()
457                    endIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - I_time).argmin()
458
459                    term3 += inventoryAppliedError[k][:,endIdx]**2 * (ErrorMatrix[locMatrixRow, 0]**2 + ErrorMatrix[locMatrixRow, 1]**2)
460
461
462                if j != 0:
463                    for k in range(len(inventoryAppliedError)):
464                        locMatrixRow = k + len(inputAppliedError)
465                        startIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) -IPrevious_time).argmin()
466                        endIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - I_time).argmin()
467
468                        term4 += inventoryAppliedError[k][:,startIdx]**2 * (ErrorMatrix[locMatrixRow, 0]**2 + ErrorMatrix[locMatrixRow, 1]**2)
469                        term5 += inventoryAppliedError[k][:,startIdx] * inventoryAppliedError[k][:,endIdx] * ErrorMatrix[locMatrixRow, 1]**2
470
471                covmatrix[:,j,j] = term1 + term2 + term3 + term4 - 2 * term5
Off-diagonal terms#
Term 1 
AuxFunctions.py
Term 1 
AuxFunctions.py
500                for k in range(len(inputAppliedError)):
501                    logicalInterval = np.logical_and(processedInputTimes[k] >= IPrevious_time, processedInputTimes[k] <= I_time).reshape((-1,))  #select the indices for the relevant time
502                    logicalInterval2 = np.logical_and(processedInputTimes[k] >= IPrimePrevious_time, processedInputTimes[k] <= IPrime_time).reshape((-1,))  #select the indices for the relevant time
503                    
504                    if inputTypes[k] == 'continuous':
505                        A = trapSum(logicalInterval, processedInputTimes[k],inputAppliedError[k])
506                        B = trapSum(logicalInterval2, processedInputTimes[k],inputAppliedError[k])
507                    elif inputTypes[k] == 'discrete': 
508                        A = inputAppliedError[k][:, logicalInterval].sum(axis=1)
509                        B = inputAppliedError[k][:, logicalInterval2].sum(axis=1)
510                    else:
511                        raise Exception("inputTypes[j] is not 'continuous' or 'discrete'")                   
512
513                    C = ErrorMatrix[k, 1]**2
514                    term1 += (A*B*C)
Term 2 
AuxFunctions.py
Term 2 
AuxFunctions.py
519                for k in range(len(outputAppliedError)):
520                    logicalInterval = np.logical_and(processedOutputTimes[k] >= IPrevious_time,processedOutputTimes[k] <= I_time).reshape((-1,))  #select the indices for the relevant time
521                    logicalInterval2 = np.logical_and(processedOutputTimes[k] >= IPrimePrevious_time,processedOutputTimes[k] <= IPrime_time).reshape((-1,))  #select the indices for the relevant time
522                    locMatrixRow = k + len(inputAppliedError) + len(inventoryAppliedError)
523
524                    
525
526                    if outputTypes[k] == 'continuous':
527                        A = trapSum(logicalInterval, processedOutputTimes[k],outputAppliedError[k])
528                        B = trapSum(logicalInterval2, processedOutputTimes[k],outputAppliedError[k])
529                    elif outputTypes[k] == 'discrete': 
530                        A = outputAppliedError[k][:, logicalInterval].sum(axis=1)
531                        B = outputAppliedError[k][:, logicalInterval2].sum(axis=1)
532                    else:
533                        raise Exception("outputTypes[j] is not 'continuous' or 'discrete'")  
534                    
535                    C = ErrorMatrix[locMatrixRow, 1]**2
536                    term2 += (A*B*C)
Term 3 
AuxFunctions.py
Term 3 
AuxFunctions.py
541                for k in range(len(inventoryAppliedError)):
542                    startIdx =  np.abs(processedInventoryTimes[k].reshape((-1,)) -  I_time).argmin() #I
543                    endIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) -  IPrime_time).argmin() 
544                    startIdx2 = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrevious_time).argmin() 
545                    endIdx2 = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrimePrevious_time).argmin() 
546                    locMatrixRow = k + len(inputAppliedError)
547
548                    term3a = inventoryAppliedError[k][:, startIdx] * inventoryAppliedError[k][:, endIdx]
549                    term3b = inventoryAppliedError[k][:, startIdx2] * inventoryAppliedError[k][:, endIdx2]
550                    term3c = ErrorMatrix[locMatrixRow, 1]**2
551                    term3 += (term3a+term3b)*term3c
552
553                    term4a = inventoryAppliedError[k][:, startIdx] * inventoryAppliedError[k][:, endIdx2]
554                    term4b = ErrorMatrix[locMatrixRow, 1]**2
555                    if IPrime-1 == I:
556                        term4c = ErrorMatrix[locMatrixRow, 0]**2
557                    else:
558                        term4c = np.zeros((iterations,))
559
560
561                    term4 += term4a*(term4b+term4c)
562
563                    term5a = inventoryAppliedError[k][:, startIdx2] * inventoryAppliedError[k][:, endIdx]
564                    term5b = ErrorMatrix[locMatrixRow, 1]**2
565
566                    if I - 1 == IPrime:
567                        term5c = ErrorMatrix[locMatrixRow, 0]**2
568                    else:
569                        term5c = np.zeros((iterations,))
570
571                    term5 += term5a*(term5b+term5c)
572                    
573                    
574
575                covmatrix[:,j,currentMB-1] = term1+term2+term3-term4-term5
576                covmatrix[:,currentMB-1,j] = term1+term2+term3-term4-term5
Term 4
AuxFunctions.py
Term 4 
AuxFunctions.py
541                for k in range(len(inventoryAppliedError)):
542                    startIdx =  np.abs(processedInventoryTimes[k].reshape((-1,)) -  I_time).argmin() #I
543                    endIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) -  IPrime_time).argmin() 
544                    startIdx2 = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrevious_time).argmin() 
545                    endIdx2 = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrimePrevious_time).argmin() 
546                    locMatrixRow = k + len(inputAppliedError)
547
548                    term3a = inventoryAppliedError[k][:, startIdx] * inventoryAppliedError[k][:, endIdx]
549                    term3b = inventoryAppliedError[k][:, startIdx2] * inventoryAppliedError[k][:, endIdx2]
550                    term3c = ErrorMatrix[locMatrixRow, 1]**2
551                    term3 += (term3a+term3b)*term3c
552
553                    term4a = inventoryAppliedError[k][:, startIdx] * inventoryAppliedError[k][:, endIdx2]
554                    term4b = ErrorMatrix[locMatrixRow, 1]**2
555                    if IPrime-1 == I:
556                        term4c = ErrorMatrix[locMatrixRow, 0]**2
557                    else:
558                        term4c = np.zeros((iterations,))
559
560
561                    term4 += term4a*(term4b+term4c)
562
563                    term5a = inventoryAppliedError[k][:, startIdx2] * inventoryAppliedError[k][:, endIdx]
564                    term5b = ErrorMatrix[locMatrixRow, 1]**2
565
566                    if I - 1 == IPrime:
567                        term5c = ErrorMatrix[locMatrixRow, 0]**2
568                    else:
569                        term5c = np.zeros((iterations,))
570
571                    term5 += term5a*(term5b+term5c)
572                    
573                    
574
575                covmatrix[:,j,currentMB-1] = term1+term2+term3-term4-term5
576                covmatrix[:,currentMB-1,j] = term1+term2+term3-term4-term5
Term 5 
AuxFunctions.py
Term 5 
 
AuxFunctions.py
541                for k in range(len(inventoryAppliedError)):
542                    startIdx =  np.abs(processedInventoryTimes[k].reshape((-1,)) -  I_time).argmin() #I
543                    endIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) -  IPrime_time).argmin() 
544                    startIdx2 = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrevious_time).argmin() 
545                    endIdx2 = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrimePrevious_time).argmin() 
546                    locMatrixRow = k + len(inputAppliedError)
547
548                    term3a = inventoryAppliedError[k][:, startIdx] * inventoryAppliedError[k][:, endIdx]
549                    term3b = inventoryAppliedError[k][:, startIdx2] * inventoryAppliedError[k][:, endIdx2]
550                    term3c = ErrorMatrix[locMatrixRow, 1]**2
551                    term3 += (term3a+term3b)*term3c
552
553                    term4a = inventoryAppliedError[k][:, startIdx] * inventoryAppliedError[k][:, endIdx2]
554                    term4b = ErrorMatrix[locMatrixRow, 1]**2
555                    if IPrime-1 == I:
556                        term4c = ErrorMatrix[locMatrixRow, 0]**2
557                    else:
558                        term4c = np.zeros((iterations,))
559
560
561                    term4 += term4a*(term4b+term4c)
562
563                    term5a = inventoryAppliedError[k][:, startIdx2] * inventoryAppliedError[k][:, endIdx]
564                    term5b = ErrorMatrix[locMatrixRow, 1]**2
565
566                    if I - 1 == IPrime:
567                        term5c = ErrorMatrix[locMatrixRow, 0]**2
568                    else:
569                        term5c = np.zeros((iterations,))
570
571                    term5 += term5a*(term5b+term5c)
572                    
573                    
574
575                covmatrix[:,j,currentMB-1] = term1+term2+term3-term4-term5
576                covmatrix[:,currentMB-1,j] = term1+term2+term3-term4-term5