Covariance Matrix#
Historical context#
Generally, the covariance matrix is a matrix that describes the relationship between the elements of a multivariate normal distribution. The covariance matrix is a square matrix of size [n x n] where n is the number of elements in the multivariate normal distribution. The covariance matrix is a symmetric matrix and is defined as follows:
The covariance matrix for the material balance sequence is of particular importance for several statistical tests, including SITMUF and GEMUF. It’s impossible to know the exact matrix in practice, but it can be estimated by knowing expected sensor performance.
Recall that the muf sequence is defined as follows:
With
It’s generally assumed that since the error models are normally distributed, individual muf values (i.e., mbp\(_i\)) and the muf sequence (i.e., muf) will also be normally distributed. Consequently, the muf sequence can be thought of as a multivariate normal distribution such that:
The covariance matrix contains the covariance between different material balances in the sequence. For example, consider the entry \(\sigma_{2n}^2\) of the covariance matrix below. This term is the variance between material balance \(n\) and \(2\).
The covariance matrix itself is often calculated using relative standard deviations, similar to the calculation for \(\sigma\)muf. In fact, the diagonal terms (i.e., \(\Sigma_{1,1}, \Sigma_{2,2}, ... \)) are the variance of the material balance (the covariance of material balance \(i\) with itself is the variance). There’s two key expressions; one for the covariance diagonals (i.e., \(\sigma_{x,x}\)) and one for the covariance off diagonals (i.e., \(\sigma_{x,x^{\prime}}\)). These are calculated by using various variance and covariance rules and propagating the terms.
Recall the expression that was derived from \(\sigma\)muf; which is used for the diagonals for the covariance matrix:
The off-diagonal is calculated in a similar manner, but has more terms. The off-diagonal is the covariance between material balance \(i\) and \(j\).
Following the same rules used to derive \(\sigma\)muf leads to the expression for the covariance off diagonal:
Where
Implementation#
The covariance matrix is a \(NxN\) matrix at the N-th material balance period. Since MAPIT adopts the “NRTA” style calculation, all \(NxN\) entries must be updated at each balance period which simulates the arrival of new information. The MAPIT SITMUF calculation is not well vectorized as the \(NxN\) must be resized and calculated at each balance. The calculation starts by looping over balance periods and each entry in the covariance matrix at balance \(P\):
The covariance matrix is a \(NxN\) matrix at the N-th material balance period. Since MAPIT adopts the “NRTA” style calculation, all \(NxN\) entries must be updated at each balance period which simulates the arrival of new information. The MAPIT SITMUF calculation is not well vectorized as the \(NxN\) must be resized and calculated at each balance. The calculation starts by looping over balance periods and each entry in the covariance matrix at balance \(P\):
AuxFunctions.py
389 for currentMB in range(1, int(totalMBPs)):
390 for j in range(0, currentMB):
The variables for the different times have a different meaning than in the expressions that were defined above. This is for legacy purposes and to improve alignment with the papers. The following table describes the mapping between the derived expressions and associated code:
Model Component |
Code Expression |
|---|---|
Balance \(i\) |
|
Balance \(j\) |
|
For simplicity, the diagonal and off-diagonal terms are broken into multiple components.
Diagonal terms#
Term 1
AuxFunctions.py
Term 1
AuxFunctions.py
422 for k in range(len(inputAppliedError)):
423
424 logicalInterval = np.logical_and(processedInputTimes[k] >= IPrevious_time,processedInputTimes[k] <= I_time).reshape((-1,)) #select the indices for the relevant time
425
426 if inputTypes[k] == 'continuous':
427 term1 += trapSum(logicalInterval,processedInputTimes[k],inputAppliedError[k]) **2 * (ErrorMatrix[k, 0]**2 + ErrorMatrix[k, 1]**2)
428 elif inputTypes[k] == 'discrete':
429 term1 += (inputAppliedError[k][:, logicalInterval].sum(axis=1)**2 * (ErrorMatrix[k, 0]**2 + ErrorMatrix[k, 1]**2 ))
430 else:
431 raise Exception("inputTypes[j] is not 'continuous' or 'discrete'")
Term 2
AuxFunctions.py
Term 2
AuxFunctions.py
438 for k in range(len(outputAppliedError)):
439
440 logicalInterval = np.logical_and(processedOutputTimes[k] >= IPrevious_time,processedOutputTimes[k] <= I_time).reshape((-1,))
441 locMatrixRow = k + len(inputAppliedError) + len(inventoryAppliedError)
442
443 if outputTypes[k] == 'continuous':
444 term2 += trapSum(logicalInterval,processedOutputTimes[k],outputAppliedError[k])**2 * (ErrorMatrix[locMatrixRow, 0]**2 + ErrorMatrix[locMatrixRow, 1]**2)
445 elif outputTypes[k] == 'discrete':
446 term2 += (outputAppliedError[k][:, logicalInterval].sum(axis=1)**2 * (ErrorMatrix[locMatrixRow, 0]**2 + ErrorMatrix[locMatrixRow, 1]**2 ))
447 else:
448 raise Exception("outputTypes[j] is not 'continuous' or 'discrete'")
Term 3
AuxFunctions.py
Term 3
AuxFunctions.py
453 for k in range(len(inventoryAppliedError)):
454 locMatrixRow = k+len(inputAppliedError)
455
456 startIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrevious_time).argmin()
457 endIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - I_time).argmin()
458
459 term3 += inventoryAppliedError[k][:,endIdx]**2 * (ErrorMatrix[locMatrixRow, 0]**2 + ErrorMatrix[locMatrixRow, 1]**2)
460
461
462 if j != 0:
463 for k in range(len(inventoryAppliedError)):
464 locMatrixRow = k + len(inputAppliedError)
465 startIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) -IPrevious_time).argmin()
466 endIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - I_time).argmin()
467
468 term4 += inventoryAppliedError[k][:,startIdx]**2 * (ErrorMatrix[locMatrixRow, 0]**2 + ErrorMatrix[locMatrixRow, 1]**2)
469 term5 += inventoryAppliedError[k][:,startIdx] * inventoryAppliedError[k][:,endIdx] * ErrorMatrix[locMatrixRow, 1]**2
470
471 covmatrix[:,j,j] = term1 + term2 + term3 + term4 - 2 * term5
Term 4
AuxFunctions.py
Term 4
AuxFunctions.py
453 for k in range(len(inventoryAppliedError)):
454 locMatrixRow = k+len(inputAppliedError)
455
456 startIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrevious_time).argmin()
457 endIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - I_time).argmin()
458
459 term3 += inventoryAppliedError[k][:,endIdx]**2 * (ErrorMatrix[locMatrixRow, 0]**2 + ErrorMatrix[locMatrixRow, 1]**2)
460
461
462 if j != 0:
463 for k in range(len(inventoryAppliedError)):
464 locMatrixRow = k + len(inputAppliedError)
465 startIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) -IPrevious_time).argmin()
466 endIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - I_time).argmin()
467
468 term4 += inventoryAppliedError[k][:,startIdx]**2 * (ErrorMatrix[locMatrixRow, 0]**2 + ErrorMatrix[locMatrixRow, 1]**2)
469 term5 += inventoryAppliedError[k][:,startIdx] * inventoryAppliedError[k][:,endIdx] * ErrorMatrix[locMatrixRow, 1]**2
470
471 covmatrix[:,j,j] = term1 + term2 + term3 + term4 - 2 * term5
Term 5
AuxFunctions.py
Term 5
AuxFunctions.py
453 for k in range(len(inventoryAppliedError)):
454 locMatrixRow = k+len(inputAppliedError)
455
456 startIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrevious_time).argmin()
457 endIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - I_time).argmin()
458
459 term3 += inventoryAppliedError[k][:,endIdx]**2 * (ErrorMatrix[locMatrixRow, 0]**2 + ErrorMatrix[locMatrixRow, 1]**2)
460
461
462 if j != 0:
463 for k in range(len(inventoryAppliedError)):
464 locMatrixRow = k + len(inputAppliedError)
465 startIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) -IPrevious_time).argmin()
466 endIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - I_time).argmin()
467
468 term4 += inventoryAppliedError[k][:,startIdx]**2 * (ErrorMatrix[locMatrixRow, 0]**2 + ErrorMatrix[locMatrixRow, 1]**2)
469 term5 += inventoryAppliedError[k][:,startIdx] * inventoryAppliedError[k][:,endIdx] * ErrorMatrix[locMatrixRow, 1]**2
470
471 covmatrix[:,j,j] = term1 + term2 + term3 + term4 - 2 * term5
Off-diagonal terms#
Term 1
AuxFunctions.py
Term 1
AuxFunctions.py
500 for k in range(len(inputAppliedError)):
501 logicalInterval = np.logical_and(processedInputTimes[k] >= IPrevious_time, processedInputTimes[k] <= I_time).reshape((-1,)) #select the indices for the relevant time
502 logicalInterval2 = np.logical_and(processedInputTimes[k] >= IPrimePrevious_time, processedInputTimes[k] <= IPrime_time).reshape((-1,)) #select the indices for the relevant time
503
504 if inputTypes[k] == 'continuous':
505 A = trapSum(logicalInterval, processedInputTimes[k],inputAppliedError[k])
506 B = trapSum(logicalInterval2, processedInputTimes[k],inputAppliedError[k])
507 elif inputTypes[k] == 'discrete':
508 A = inputAppliedError[k][:, logicalInterval].sum(axis=1)
509 B = inputAppliedError[k][:, logicalInterval2].sum(axis=1)
510 else:
511 raise Exception("inputTypes[j] is not 'continuous' or 'discrete'")
512
513 C = ErrorMatrix[k, 1]**2
514 term1 += (A*B*C)
Term 2
AuxFunctions.py
Term 2
AuxFunctions.py
519 for k in range(len(outputAppliedError)):
520 logicalInterval = np.logical_and(processedOutputTimes[k] >= IPrevious_time,processedOutputTimes[k] <= I_time).reshape((-1,)) #select the indices for the relevant time
521 logicalInterval2 = np.logical_and(processedOutputTimes[k] >= IPrimePrevious_time,processedOutputTimes[k] <= IPrime_time).reshape((-1,)) #select the indices for the relevant time
522 locMatrixRow = k + len(inputAppliedError) + len(inventoryAppliedError)
523
524
525
526 if outputTypes[k] == 'continuous':
527 A = trapSum(logicalInterval, processedOutputTimes[k],outputAppliedError[k])
528 B = trapSum(logicalInterval2, processedOutputTimes[k],outputAppliedError[k])
529 elif outputTypes[k] == 'discrete':
530 A = outputAppliedError[k][:, logicalInterval].sum(axis=1)
531 B = outputAppliedError[k][:, logicalInterval2].sum(axis=1)
532 else:
533 raise Exception("outputTypes[j] is not 'continuous' or 'discrete'")
534
535 C = ErrorMatrix[locMatrixRow, 1]**2
536 term2 += (A*B*C)
Term 3
AuxFunctions.py
Term 3
AuxFunctions.py
541 for k in range(len(inventoryAppliedError)):
542 startIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - I_time).argmin() #I
543 endIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrime_time).argmin()
544 startIdx2 = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrevious_time).argmin()
545 endIdx2 = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrimePrevious_time).argmin()
546 locMatrixRow = k + len(inputAppliedError)
547
548 term3a = inventoryAppliedError[k][:, startIdx] * inventoryAppliedError[k][:, endIdx]
549 term3b = inventoryAppliedError[k][:, startIdx2] * inventoryAppliedError[k][:, endIdx2]
550 term3c = ErrorMatrix[locMatrixRow, 1]**2
551 term3 += (term3a+term3b)*term3c
552
553 term4a = inventoryAppliedError[k][:, startIdx] * inventoryAppliedError[k][:, endIdx2]
554 term4b = ErrorMatrix[locMatrixRow, 1]**2
555 if IPrime-1 == I:
556 term4c = ErrorMatrix[locMatrixRow, 0]**2
557 else:
558 term4c = np.zeros((iterations,))
559
560
561 term4 += term4a*(term4b+term4c)
562
563 term5a = inventoryAppliedError[k][:, startIdx2] * inventoryAppliedError[k][:, endIdx]
564 term5b = ErrorMatrix[locMatrixRow, 1]**2
565
566 if I - 1 == IPrime:
567 term5c = ErrorMatrix[locMatrixRow, 0]**2
568 else:
569 term5c = np.zeros((iterations,))
570
571 term5 += term5a*(term5b+term5c)
572
573
574
575 covmatrix[:,j,currentMB-1] = term1+term2+term3-term4-term5
576 covmatrix[:,currentMB-1,j] = term1+term2+term3-term4-term5
Term 4
AuxFunctions.py
Term 4
AuxFunctions.py
541 for k in range(len(inventoryAppliedError)):
542 startIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - I_time).argmin() #I
543 endIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrime_time).argmin()
544 startIdx2 = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrevious_time).argmin()
545 endIdx2 = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrimePrevious_time).argmin()
546 locMatrixRow = k + len(inputAppliedError)
547
548 term3a = inventoryAppliedError[k][:, startIdx] * inventoryAppliedError[k][:, endIdx]
549 term3b = inventoryAppliedError[k][:, startIdx2] * inventoryAppliedError[k][:, endIdx2]
550 term3c = ErrorMatrix[locMatrixRow, 1]**2
551 term3 += (term3a+term3b)*term3c
552
553 term4a = inventoryAppliedError[k][:, startIdx] * inventoryAppliedError[k][:, endIdx2]
554 term4b = ErrorMatrix[locMatrixRow, 1]**2
555 if IPrime-1 == I:
556 term4c = ErrorMatrix[locMatrixRow, 0]**2
557 else:
558 term4c = np.zeros((iterations,))
559
560
561 term4 += term4a*(term4b+term4c)
562
563 term5a = inventoryAppliedError[k][:, startIdx2] * inventoryAppliedError[k][:, endIdx]
564 term5b = ErrorMatrix[locMatrixRow, 1]**2
565
566 if I - 1 == IPrime:
567 term5c = ErrorMatrix[locMatrixRow, 0]**2
568 else:
569 term5c = np.zeros((iterations,))
570
571 term5 += term5a*(term5b+term5c)
572
573
574
575 covmatrix[:,j,currentMB-1] = term1+term2+term3-term4-term5
576 covmatrix[:,currentMB-1,j] = term1+term2+term3-term4-term5
Term 5
AuxFunctions.py
Term 5
AuxFunctions.py
541 for k in range(len(inventoryAppliedError)):
542 startIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - I_time).argmin() #I
543 endIdx = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrime_time).argmin()
544 startIdx2 = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrevious_time).argmin()
545 endIdx2 = np.abs(processedInventoryTimes[k].reshape((-1,)) - IPrimePrevious_time).argmin()
546 locMatrixRow = k + len(inputAppliedError)
547
548 term3a = inventoryAppliedError[k][:, startIdx] * inventoryAppliedError[k][:, endIdx]
549 term3b = inventoryAppliedError[k][:, startIdx2] * inventoryAppliedError[k][:, endIdx2]
550 term3c = ErrorMatrix[locMatrixRow, 1]**2
551 term3 += (term3a+term3b)*term3c
552
553 term4a = inventoryAppliedError[k][:, startIdx] * inventoryAppliedError[k][:, endIdx2]
554 term4b = ErrorMatrix[locMatrixRow, 1]**2
555 if IPrime-1 == I:
556 term4c = ErrorMatrix[locMatrixRow, 0]**2
557 else:
558 term4c = np.zeros((iterations,))
559
560
561 term4 += term4a*(term4b+term4c)
562
563 term5a = inventoryAppliedError[k][:, startIdx2] * inventoryAppliedError[k][:, endIdx]
564 term5b = ErrorMatrix[locMatrixRow, 1]**2
565
566 if I - 1 == IPrime:
567 term5c = ErrorMatrix[locMatrixRow, 0]**2
568 else:
569 term5c = np.zeros((iterations,))
570
571 term5 += term5a*(term5b+term5c)
572
573
574
575 covmatrix[:,j,currentMB-1] = term1+term2+term3-term4-term5
576 covmatrix[:,currentMB-1,j] = term1+term2+term3-term4-term5