Covariance Matrix#

Historical context#

Generally, the covariance matrix is a matrix that describes the relationship between the elements of a multivariate normal distribution. The covariance matrix is a square matrix of size [n x n] where n is the number of elements in the multivariate normal distribution. The covariance matrix is a symmetric matrix and is defined as follows:

Σ=[σ11σ12σ13σ1nσ21σ22σ23σ2nσ31σ32σ33σ3nσn1σn2σn3σnn]

The covariance matrix for the material balance sequence is of particular importance for several statistical tests, including SITMUF and GEMUF. It’s impossible to know the exact matrix in practice, but it can be estimated by knowing expected sensor performance.

Recall that the muf sequence is defined as follows:

muf={muf0,muf1,...mufn}

With

mufi=ll0t=MBPi1MBPiIt,lll1t=MBPi1MBPiOt,lll2(Ci,lCi1,l)

It’s generally assumed that since the error models are normally distributed, individual muf values (i.e., mbpi) and the muf sequence (i.e., muf) will also be normally distributed. Consequently, the muf sequence can be thought of as a multivariate normal distribution such that:

mufN(μ,Σ)

The covariance matrix contains the covariance between different material balances in the sequence. For example, consider the entry σ2n2 of the covariance matrix below. This term is the variance between material balance n and 2.

(1)#Σ=[σ112σ122σ1n2σ212σ222σ2n2σn12σn22σnn2]=[Σi1σi1σi1Tσi,i]

The covariance matrix itself is often calculated using relative standard deviations, similar to the calculation for σmuf. In fact, the diagonal terms (i.e., Σ1,1,Σ2,2,...) are the variance of the material balance (the covariance of material balance i with itself is the variance). There’s two key expressions; one for the covariance diagonals (i.e., σx,x) and one for the covariance off diagonals (i.e., σx,x). These are calculated by using various variance and covariance rules and propagating the terms.

Recall the expression that was derived from σmuf; which is used for the diagonals for the covariance matrix:

σi,i2ll0((t=MBPi1MBPiIl,t)2((δR,l)2+(δS,l)2))+ll2((Ci,l)2((δR,l)2+(δS,l)2))+ll2((Ci1,l)2((δR,l)2+(δS,l)2))+ll1((t=MBPi1MBPiOl,t)2((δR,l)2+(δS,l)2))

The off-diagonal is calculated in a similar manner, but has more terms. The off-diagonal is the covariance between material balance i and j.

σi,j2cov(ll0t=MBPi1MBPiIt,lll1t=MBPi1MBPiOt,lll2(Ci,lCi1,l),ll0t=MBPj1MBPjIt,lll1t=MBPj1MBPjOt,lll2(Cj,lCj1,l))

Following the same rules used to derive σmuf leads to the expression for the covariance off diagonal:

σi,j2ll2((Ci,lCj,l+Ci1,lCj1,l)(δS,l)2)ll2((Ci,lCj1,l)((δS,l)2+P(j1==i)(δR,l)2))ll2((Ci1,lCj,l)((δS,l)2+P(i1==j)(δR,l)2))+ll0((t=MBPi1MBPjIt,l)(t=MBPj1MBPjIt,l)(δS,l)2)+ll1((t=MBPi1MBPjOt,l)(t=MBPj1MBPjOt,l)(δS,l)2)

Where

[P]{0P is false1P is true

Implementation#

The covariance matrix is a NxN matrix at the N-th material balance period. Since MAPIT adopts the “NRTA” style calculation, all NxN entries must be updated at each balance period which simulates the arrival of new information. The MAPIT SITMUF calculation is not well vectorized as the NxN must be resized and calculated at each balance. The calculation starts by looping over balance periods and each entry in the covariance matrix at balance P:

The covariance matrix is a NxN matrix at the N-th material balance period. Since MAPIT adopts the “NRTA” style calculation, all NxN entries must be updated at each balance period which simulates the arrival of new information. The MAPIT SITMUF calculation is not well vectorized as the NxN must be resized and calculated at each balance. The calculation starts by looping over balance periods and each entry in the covariance matrix at balance P:

AuxFunctions.py

365    for currentMB in range(1, int(totalMBPs)):
366        for j in range(0, currentMB):

The variables for the different times have a different meaning than in the expressions that were defined above. This is for legacy purposes and to improve alignment with the papers. The following table describes the mapping between the derived expressions and associated code:

Model Component

Code Expression

Balance i

I

Balance j

IPrime

For simplicity, the diagonal and off-diagonal terms are broken into multiple components.

Diagonal terms#

σi,i2ll0((t=MBPi1MBPiIl,t)2((δR,l)2+(δS,l)2))+ll1((t=MBPi1MBPiOl,t)2((δR,l)2+(δS,l)2))+ll2((Ci,l)2((δR,l)2+(δS,l)2))+ll2((Ci1,l)2((δR,l)2+(δS,l)2))ll2(2Ci1,lCi,l(δS,l)2)

Term 1
AuxFunctions.py

398                for k in range(len(inputAppliedError)):
399
400                    logicalInterval = np.logical_and(processedInputTimes[k] >= IPrevious_time,processedInputTimes[k] <= I_time).reshape((-1,))  #select the indices for the relevant time
401
402                    if inputTypes[k] == 'continuous':
403                        term1 += trapSum(logicalInterval,processedInputTimes[k],inputAppliedError[k]) **2 * (ErrorMatrix[k, 0]**2 + ErrorMatrix[k, 1]**2)
404                    elif inputTypes[k] == 'discrete':
405                        term1 += (inputAppliedError[k][:, logicalInterval].sum(axis=1)**2 * (ErrorMatrix[k, 0]**2 + ErrorMatrix[k, 1]**2 ))
406                    else:
407                        raise Exception("inputTypes[j] is not 'continuous' or 'discrete'")                      

Term 2
AuxFunctions.py

414                for k in range(len(outputAppliedError)):
415
416                    logicalInterval = np.logical_and(processedOutputTimes[k] >= IPrevious_time,processedOutputTimes[k] <= I_time).reshape((-1,))
417                    locMatrixRow = k + len(inputAppliedError) + len(inventoryAppliedError)                      
418
419                    if outputTypes[k] == 'continuous':
420                        term2 += trapSum(logicalInterval,processedOutputTimes[k],outputAppliedError[k])**2 * (ErrorMatrix[locMatrixRow, 0]**2 + ErrorMatrix[locMatrixRow, 1]**2)
421                    elif outputTypes[k] == 'discrete':
422                        term2 += (outputAppliedError[k][:, logicalInterval].sum(axis=1)**2 * (ErrorMatrix[locMatrixRow, 0]**2 + ErrorMatrix[locMatrixRow, 1]**2 ))
423                    else:
424                        raise Exception("outputTypes[j] is not 'continuous' or 'discrete'")

Term 3
AuxFunctions.py

429                for k in range(len(inventoryAppliedError)):
430                    locMatrixRow = k+len(inputAppliedError)
431
432                    startIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrevious_time).argmin()
433                    endIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - I_time).argmin()
434
435                    term3 += inventoryAppliedError[k][:,endIdx]**2 * (ErrorMatrix[locMatrixRow, 0]**2 + ErrorMatrix[locMatrixRow, 1]**2)
436
437
438                if j != 0:
439                    for k in range(len(inventoryAppliedError)):
440                        locMatrixRow = k + len(inputAppliedError)
441                        startIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) -IPrevious_time).argmin()
442                        endIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - I_time).argmin()
443
444                        term4 += inventoryAppliedError[k][:,startIdx]**2 * (ErrorMatrix[locMatrixRow, 0]**2 + ErrorMatrix[locMatrixRow, 1]**2)
445                        term5 += inventoryAppliedError[k][:,startIdx] * inventoryAppliedError[k][:,endIdx] * ErrorMatrix[locMatrixRow, 1]**2
446
447                covmatrix[:,j,j] = term1 + term2 + term3 + term4 - 2 * term5

Term 4
AuxFunctions.py

429                for k in range(len(inventoryAppliedError)):
430                    locMatrixRow = k+len(inputAppliedError)
431
432                    startIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrevious_time).argmin()
433                    endIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - I_time).argmin()
434
435                    term3 += inventoryAppliedError[k][:,endIdx]**2 * (ErrorMatrix[locMatrixRow, 0]**2 + ErrorMatrix[locMatrixRow, 1]**2)
436
437
438                if j != 0:
439                    for k in range(len(inventoryAppliedError)):
440                        locMatrixRow = k + len(inputAppliedError)
441                        startIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) -IPrevious_time).argmin()
442                        endIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - I_time).argmin()
443
444                        term4 += inventoryAppliedError[k][:,startIdx]**2 * (ErrorMatrix[locMatrixRow, 0]**2 + ErrorMatrix[locMatrixRow, 1]**2)
445                        term5 += inventoryAppliedError[k][:,startIdx] * inventoryAppliedError[k][:,endIdx] * ErrorMatrix[locMatrixRow, 1]**2
446
447                covmatrix[:,j,j] = term1 + term2 + term3 + term4 - 2 * term5

Term 5
AuxFunctions.py

The factor 2 is included later when the terms are added and assigned to the covariance matrix.#
429                for k in range(len(inventoryAppliedError)):
430                    locMatrixRow = k+len(inputAppliedError)
431
432                    startIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrevious_time).argmin()
433                    endIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - I_time).argmin()
434
435                    term3 += inventoryAppliedError[k][:,endIdx]**2 * (ErrorMatrix[locMatrixRow, 0]**2 + ErrorMatrix[locMatrixRow, 1]**2)
436
437
438                if j != 0:
439                    for k in range(len(inventoryAppliedError)):
440                        locMatrixRow = k + len(inputAppliedError)
441                        startIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) -IPrevious_time).argmin()
442                        endIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - I_time).argmin()
443
444                        term4 += inventoryAppliedError[k][:,startIdx]**2 * (ErrorMatrix[locMatrixRow, 0]**2 + ErrorMatrix[locMatrixRow, 1]**2)
445                        term5 += inventoryAppliedError[k][:,startIdx] * inventoryAppliedError[k][:,endIdx] * ErrorMatrix[locMatrixRow, 1]**2
446
447                covmatrix[:,j,j] = term1 + term2 + term3 + term4 - 2 * term5

Off-diagonal terms#

σi,j2ll0((t=MBPi1MBPjIt,l)(t=MBPj1MBPjIt,l)(δS,l)2)+ll1((t=MBPi1MBPjOt,l)(t=MBPj1MBPjOt,l)(δS,l)2)+ll2((Ci,lCj,l+Ci1,lCj1,l)(δS,l)2)ll2((Ci,lCj1,l)((δS,l)2+P(j1==i)(δR,l)2))ll2((Ci1,lCj,l)((δS,l)2+P(i1==j)(δR,l)2))

Term 1
AuxFunctions.py

476                for k in range(len(inputAppliedError)):
477                    logicalInterval = np.logical_and(processedInputTimes[k] >= IPrevious_time, processedInputTimes[k] <= I_time).reshape((-1,))  #select the indices for the relevant time
478                    logicalInterval2 = np.logical_and(processedInputTimes[k] >= IPrimePrevious_time, processedInputTimes[k] <= IPrime_time).reshape((-1,))  #select the indices for the relevant time
479                    
480                    if inputTypes[k] == 'continuous':
481                        A = trapSum(logicalInterval, processedInputTimes[k],inputAppliedError[k])
482                        B = trapSum(logicalInterval2, processedInputTimes[k],inputAppliedError[k])
483                    elif inputTypes[k] == 'discrete': 
484                        A = inputAppliedError[k][:, logicalInterval].sum(axis=1)
485                        B = inputAppliedError[k][:, logicalInterval2].sum(axis=1)
486                    else:
487                        raise Exception("inputTypes[j] is not 'continuous' or 'discrete'")                   
488
489                    C = ErrorMatrix[k, 1]**2
490                    term1 += (A*B*C)

Term 2
AuxFunctions.py

495                for k in range(len(outputAppliedError)):
496                    logicalInterval = np.logical_and(processedOutputTimes[k] >= IPrevious_time,processedOutputTimes[k] <= I_time).reshape((-1,))  #select the indices for the relevant time
497                    logicalInterval2 = np.logical_and(processedOutputTimes[k] >= IPrimePrevious_time,processedOutputTimes[k] <= IPrime_time).reshape((-1,))  #select the indices for the relevant time
498                    locMatrixRow = k + len(inputAppliedError) + len(inventoryAppliedError)
499
500                    
501
502                    if outputTypes[k] == 'continuous':
503                        A = trapSum(logicalInterval, processedOutputTimes[k],outputAppliedError[k])
504                        B = trapSum(logicalInterval2, processedOutputTimes[k],outputAppliedError[k])
505                    elif outputTypes[k] == 'discrete': 
506                        A = outputAppliedError[k][:, logicalInterval].sum(axis=1)
507                        B = outputAppliedError[k][:, logicalInterval2].sum(axis=1)
508                    else:
509                        raise Exception("outputTypes[j] is not 'continuous' or 'discrete'")  
510                    
511                    C = ErrorMatrix[locMatrixRow, 1]**2
512                    term2 += (A*B*C)

Term 3
AuxFunctions.py

517                for k in range(len(inventoryAppliedError)):
518                    startIdx =  np.abs(processedInventoryTimes[k].reshape((-1,)) -  I_time).argmin() #I
519                    endIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) -  IPrime_time).argmin() 
520                    startIdx2 = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrevious_time).argmin() 
521                    endIdx2 = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrimePrevious_time).argmin() 
522                    locMatrixRow = k + len(inputAppliedError)
523
524                    term3a = inventoryAppliedError[k][:, startIdx] * inventoryAppliedError[k][:, endIdx]
525                    term3b = inventoryAppliedError[k][:, startIdx2] * inventoryAppliedError[k][:, endIdx2]
526                    term3c = ErrorMatrix[locMatrixRow, 1]**2
527                    term3 += (term3a+term3b)*term3c
528
529                    term4a = inventoryAppliedError[k][:, startIdx] * inventoryAppliedError[k][:, endIdx2]
530                    term4b = ErrorMatrix[locMatrixRow, 1]**2
531                    if IPrime-1 == I:
532                        term4c = ErrorMatrix[locMatrixRow, 0]**2
533                    else:
534                        term4c = np.zeros((iterations,))
535
536
537                    term4 += term4a*(term4b+term4c)
538
539                    term5a = inventoryAppliedError[k][:, startIdx2] * inventoryAppliedError[k][:, endIdx]
540                    term5b = ErrorMatrix[locMatrixRow, 1]**2
541
542                    if I - 1 == IPrime:
543                        term5c = ErrorMatrix[locMatrixRow, 0]**2
544                    else:
545                        term5c = np.zeros((iterations,))
546
547                    term5 += term5a*(term5b+term5c)
548                    
549                    
550
551                covmatrix[:,j,currentMB-1] = term1+term2+term3-term4-term5
552                covmatrix[:,currentMB-1,j] = term1+term2+term3-term4-term5

Term 4
AuxFunctions.py

517                for k in range(len(inventoryAppliedError)):
518                    startIdx =  np.abs(processedInventoryTimes[k].reshape((-1,)) -  I_time).argmin() #I
519                    endIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) -  IPrime_time).argmin() 
520                    startIdx2 = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrevious_time).argmin() 
521                    endIdx2 = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrimePrevious_time).argmin() 
522                    locMatrixRow = k + len(inputAppliedError)
523
524                    term3a = inventoryAppliedError[k][:, startIdx] * inventoryAppliedError[k][:, endIdx]
525                    term3b = inventoryAppliedError[k][:, startIdx2] * inventoryAppliedError[k][:, endIdx2]
526                    term3c = ErrorMatrix[locMatrixRow, 1]**2
527                    term3 += (term3a+term3b)*term3c
528
529                    term4a = inventoryAppliedError[k][:, startIdx] * inventoryAppliedError[k][:, endIdx2]
530                    term4b = ErrorMatrix[locMatrixRow, 1]**2
531                    if IPrime-1 == I:
532                        term4c = ErrorMatrix[locMatrixRow, 0]**2
533                    else:
534                        term4c = np.zeros((iterations,))
535
536
537                    term4 += term4a*(term4b+term4c)
538
539                    term5a = inventoryAppliedError[k][:, startIdx2] * inventoryAppliedError[k][:, endIdx]
540                    term5b = ErrorMatrix[locMatrixRow, 1]**2
541
542                    if I - 1 == IPrime:
543                        term5c = ErrorMatrix[locMatrixRow, 0]**2
544                    else:
545                        term5c = np.zeros((iterations,))
546
547                    term5 += term5a*(term5b+term5c)
548                    
549                    
550
551                covmatrix[:,j,currentMB-1] = term1+term2+term3-term4-term5
552                covmatrix[:,currentMB-1,j] = term1+term2+term3-term4-term5

Term 5
AuxFunctions.py

517                for k in range(len(inventoryAppliedError)):
518                    startIdx =  np.abs(processedInventoryTimes[k].reshape((-1,)) -  I_time).argmin() #I
519                    endIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) -  IPrime_time).argmin() 
520                    startIdx2 = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrevious_time).argmin() 
521                    endIdx2 = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrimePrevious_time).argmin() 
522                    locMatrixRow = k + len(inputAppliedError)
523
524                    term3a = inventoryAppliedError[k][:, startIdx] * inventoryAppliedError[k][:, endIdx]
525                    term3b = inventoryAppliedError[k][:, startIdx2] * inventoryAppliedError[k][:, endIdx2]
526                    term3c = ErrorMatrix[locMatrixRow, 1]**2
527                    term3 += (term3a+term3b)*term3c
528
529                    term4a = inventoryAppliedError[k][:, startIdx] * inventoryAppliedError[k][:, endIdx2]
530                    term4b = ErrorMatrix[locMatrixRow, 1]**2
531                    if IPrime-1 == I:
532                        term4c = ErrorMatrix[locMatrixRow, 0]**2
533                    else:
534                        term4c = np.zeros((iterations,))
535
536
537                    term4 += term4a*(term4b+term4c)
538
539                    term5a = inventoryAppliedError[k][:, startIdx2] * inventoryAppliedError[k][:, endIdx]
540                    term5b = ErrorMatrix[locMatrixRow, 1]**2
541
542                    if I - 1 == IPrime:
543                        term5c = ErrorMatrix[locMatrixRow, 0]**2
544                    else:
545                        term5c = np.zeros((iterations,))
546
547                    term5 += term5a*(term5b+term5c)
548                    
549                    
550
551                covmatrix[:,j,currentMB-1] = term1+term2+term3-term4-term5
552                covmatrix[:,currentMB-1,j] = term1+term2+term3-term4-term5

Further reading#