StatsTests
- MAPIT.core.StatsTests.CUMUF(MUF, GUIObject=None, doTQDM=True, ispar=False)
This function performs the cumulative MUF (CUMUF) test. CUMUF at a given time is the sum of all MUF values up to and including that time.
\(\text{CUMUF}_t = \sum_{i=0}^{t} \text{MUF}_i\)
- Parameters:
MUF (ndarray) – MUF sequence with shape \([n,j]\) where \(n\) is the number of iterations and \(j\) is the temporal dimension. Expects a continuous valued MUF sequence that is similar in format to what is returned by core.StatsTests.MUF.
GUIObject (object, default=None) – An optional object that carries GUI related references when the API is used inside the MAPIT GUI.
doTQDM (bool, default=True) – Controls the use of TQDM progress bar for command line or notebook operation.
- Returns:
CUMUF sequence with identical shape to the input MUF.
- Return type:
ndarray
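The cumulative sum in the formula above can be illustrated with plain NumPy. This is a minimal sketch that does not call MAPIT; the array values are made up, and the layout (iterations along rows, time along columns) follows the MUF parameter description.

```python
import numpy as np

# Hypothetical MUF array: 2 iterations (rows) by 4 time steps (columns).
muf = np.array([[1.0, -0.5, 0.25, 2.0],
                [0.0,  1.0, -1.0, 0.5]])

# Running sum along the temporal axis; the shape matches the input.
cumuf = np.cumsum(muf, axis=1)
```

Each entry of `cumuf` holds the sum of the MUF values in the same row up to and including that column.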
- MAPIT.core.StatsTests.GEMUF_V1(MUF, covmatrix, MBP, GUIObject=None, doTQDM=True, ispar=False)
Function that calculates the V1 version of GEMUF. Here, only the current value of MUF is used to estimate the loss vector for the GEMUF test statistic.
- Parameters:
MUF (ndarray) – The previously calculated MUF array. Should have shape [M, T] where M is the number of iterations (realizations) calculated and T is the total number of timesteps.
covmatrix (ndarray) – The covariance matrix of the data. A symmetric [M, N, N] matrix where M is the number of iterations (realizations) and N is the total number of balance periods calculated.
MBP (float) – The material balance period.
GUIObject (object, default=None) – An optional object that carries GUI related references when the API is used inside the MAPIT GUI.
doTQDM (bool, default=True) – Whether to use a tqdm progress bar for the calculation. Defaults to True.
ispar (bool, default=False) – Whether to run the calculation in parallel. Defaults to False.
- Returns:
GEMUF-V1 sequence with shape [M, T] where M is number of iterations and T is the total number of timesteps.
- Return type:
ndarray
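This page does not state the GEMUF formula. As a rough illustration only, the sketch below assumes the statistic is the quadratic form \(\text{MUF}^T \Sigma^{-1} \text{MUF}\) evaluated over the balance periods seen so far, with the MUF sequence itself standing in for the loss vector. The helper name `quadratic_form_statistic` and all values are hypothetical, and MAPIT's implementation may differ.

```python
import numpy as np

def quadratic_form_statistic(muf, cov):
    """Return muf[:i] @ inv(cov[:i, :i]) @ muf[:i] for i = 1..n (illustrative only)."""
    n = muf.shape[0]
    out = np.empty(n)
    for i in range(1, n + 1):
        m = muf[:i]
        # Solve cov[:i, :i] x = m instead of forming the inverse explicitly.
        out[i - 1] = m @ np.linalg.solve(cov[:i, :i], m)
    return out

muf = np.array([1.0, -2.0, 0.5])   # one realization, three balance periods
cov = np.diag([4.0, 4.0, 1.0])     # hypothetical covariance matrix
stat = quadratic_form_statistic(muf, cov)
```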
- MAPIT.core.StatsTests.GEMUF_V5B3(MUF, covmatrix, MBP, GUIObject=None, doTQDM=True, ispar=False)
Function that calculates the V5B3 version of GEMUF. A weighted window of MUF values is used to estimate the loss vector when calculating the test statistic. Note that the V5B3 version is valid only for certain parts of the sequence. For example, at balance periods 1 and 2, V5B3 cannot be calculated because there are not two past values to weight. Similarly, V5B3 cannot be calculated for the last two balance periods. Rather than modifying this from Seifert’s original paper, we represent those values as zero.
Important
The first two and last two material balance periods have undefined GEMUF values when using V5B3. We represent these as zero, but they are not truly zero!
- Parameters:
MUF (ndarray) – The previously calculated MUF array. Should have shape [M, T] where M is the number of iterations (realizations) calculated and T is the total number of timesteps.
covmatrix (ndarray) – The covariance matrix of the data. A symmetric [M, N, N] matrix where M is the number of iterations (realizations) and N is the total number of balance periods calculated.
MBP (float) – The material balance period.
GUIObject (object, default=None) – An optional object that carries GUI related references when the API is used inside the MAPIT GUI.
doTQDM (bool, default=True) – Whether to use a tqdm progress bar for the calculation. Defaults to True.
ispar (bool, default=False) – Whether to run the calculation in parallel. Defaults to False.
- Returns:
GEMUF-V5B3 sequence with shape [M, T] where M is number of iterations and T is the total number of timesteps.
- Return type:
ndarray
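Because the undefined periods are stored as zero, downstream statistics can be skewed if those entries are treated as real values. One possible way to guard against this is to mask them before analysis. The sketch below assumes the array has one column per balance period; the helper `mask_undefined_periods` is hypothetical and not part of MAPIT.

```python
import numpy as np

def mask_undefined_periods(gemuf, n_edge=2):
    """Return a float copy with the first and last n_edge columns set to NaN."""
    masked = np.array(gemuf, dtype=float)  # copy so the input is untouched
    masked[:, :n_edge] = np.nan
    masked[:, -n_edge:] = np.nan
    return masked

# Hypothetical result: 2 iterations by 6 balance periods, zero-padded at the edges.
gemuf = np.array([[0.0, 0.0, 1.2, 0.8, 0.0, 0.0],
                  [0.0, 0.0, 0.4, 2.1, 0.0, 0.0]])
masked = mask_undefined_periods(gemuf)

# NaN-aware reductions now ignore the undefined periods.
mean_defined = np.nanmean(masked, axis=1)
```

If the returned sequence is indexed by timestep rather than by balance period, the number of edge columns to mask would need to be scaled accordingly.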
- MAPIT.core.StatsTests.MUF(inputAppliedError, processedInputTimes, inventoryAppliedError, processedInventoryTimes, outputAppliedError, processedOutputTimes, MBP, inputTypes, outputTypes, GUIObject=None, doTQDM=True, ispar=False)
Function to calculate Material Unaccounted For (MUF), which is sometimes also called ID (inventory difference). Specifically calculates the material balance sequence given some input time series.
\(\text{MUF}_t = I_t - O_t - (C_t - C_{t-1})\)
\(I_t\) is input at time \(t\)
\(O_t\) is output at time \(t\)
\(C_t\) is inventory at time \(t\) (\(C\), for container, is used instead of \(I\) so that inventory and input are not both written as \(I\) with different subscripts)
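The balance equation can be worked through with a few made-up numbers. This is a minimal sketch using per-balance-period totals for a single realization; it does not call MAPIT, and the variable names are illustrative.

```python
import numpy as np

# Hypothetical per-balance-period totals for a single realization.
inputs = np.array([10.0, 10.0, 10.0])        # I_1, I_2, I_3
outputs = np.array([9.0, 9.5, 9.0])          # O_1, O_2, O_3
inventory = np.array([5.0, 6.0, 6.5, 7.0])   # C_0, C_1, C_2, C_3

# MUF_t = I_t - O_t - (C_t - C_{t-1})
muf = inputs - outputs - np.diff(inventory)
```

In the first two periods the inventory change exactly accounts for the difference between input and output, so MUF is zero; in the third period 0.5 units are unaccounted for.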
Important
The lengths and shapes of appliedErrors and processedTimes should be the same. For example:
    assert len(inputAppliedError) == len(processedInputTimes)
    assert inputAppliedError[0].shape == processedInputTimes[0].shape
See the Input guide for more information.
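The consistency requirement above can be checked for every location before calling the function. The following is a minimal sketch that assumes only NumPy; `check_paired_lists` is a hypothetical helper, not part of MAPIT.

```python
import numpy as np

def check_paired_lists(applied_error, processed_times):
    """Raise ValueError if the two lists differ in length or per-location shape."""
    if len(applied_error) != len(processed_times):
        raise ValueError("lists must have one entry per location")
    for k, (values, times) in enumerate(zip(applied_error, processed_times)):
        if values.shape != times.shape:
            raise ValueError(
                f"location {k}: shape mismatch {values.shape} vs {times.shape}"
            )

# Two hypothetical input locations sampled at different rates, each shaped [m, 1].
times = [np.arange(0, 10, 1.0).reshape(-1, 1),
         np.arange(0, 10, 2.0).reshape(-1, 1)]
values = [np.ones_like(t) for t in times]

check_paired_lists(values, times)  # passes silently
```

The same check applies to the inventory and output pairs.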
- Parameters:
inputAppliedError (list of ndarrays) – A list of ndarrays that has length equal to the total number of input locations. Each array should be \([m,1]\) in shape where \(m\) is the number of samples. This array should reflect observed quantities (as opposed to ground truths). Inputs are assumed to be flows in units of \(\frac{1}{s}\) and will be integrated.
processedInputTimes (list of ndarrays) – A list of ndarrays that has length equal to the total number of input locations. Each array should be \([m,1]\) in shape where \(m\) is the number of samples. len(processedInputTimes) and the shape of each list entry (ndarray) should be the same as for inputAppliedError. Each entry in the ndarray should correspond to a timestamp indicating when the value was taken.
inventoryAppliedError (list of ndarrays) – A list of ndarrays that has length equal to the total number of inventory locations. Each array should be \([m,1]\) in shape where \(m\) is the number of samples. This array should reflect observed quantities. Inventories are assumed to be in units of mass and will not be integrated.
processedInventoryTimes (list of ndarrays) – A list of ndarrays that has length equal to the total number of inventory locations. Each array should be \([m,1]\) in shape where \(m\) is the number of samples. len(processedInventoryTimes) and the shape of each list entry (ndarray) should be the same as for inventoryAppliedError. Each entry in the ndarray should correspond to a timestamp indicating when the value was taken.
outputAppliedError (list of ndarrays) – A list of ndarrays that has length equal to the total number of output locations. Each array should be \([m,1]\) in shape where \(m\) is the number of samples. This array should reflect observed quantities. Outputs are assumed to be flows in units of \(\frac{1}{s}\) and will be integrated.
processedOutputTimes (list of ndarrays) – A list of ndarrays that has length equal to the total number of output locations. Each array should be \([m,1]\) in shape where \(m\) is the number of samples. len(processedOutputTimes) and shape of each list entry (ndarray) should be the same as for outputAppliedError. Each entry in the ndarray should correspond to a timestamp indicating when the value was taken.
MBP (float) – Defines the material balance period.
inputTypes (list of strings) – Defines the type of input. This should be a list of strings that is the same length as the number of input locations. The strings should be one of the following: ‘discrete’ or ‘continuous’.
outputTypes (list of strings) – Defines the type of output. This should be a list of strings that is the same length as the number of output locations. The strings should be one of the following: ‘discrete’ or ‘continuous’.
GUIObject (object, default=None) – An optional object that carries GUI related references when the API is used inside the MAPIT GUI.
GUIParams (object, default=None) – An optional object that carries GUI related parameters when the API is used inside the MAPIT GUI.
doTQDM (bool, default=True) – Controls the use of TQDM progress bar for command line or notebook operation.
- Returns:
MUF sequence with shape \([n,j]\), where \(n\) is the length, equal to the maximum time covered by the material balances that can be constructed from the user-provided MBP and the number of samples in the input data, and \(j\) is the number of iterations given as input. The term \(n\) is calculated from the minimum of the final timestamps of the provided time arrays.
For example:
    import numpy as np
    time1 = np.arange(401)  # time1[-1] == 400
    time2 = np.arange(301)  # time2[-1] == 300
    time3 = np.arange(801)  # time3[-1] == 800
    n = np.floor(np.min((time1[-1], time2[-1], time3[-1])))  # 300.0
- Return type:
ndarray
- MAPIT.core.StatsTests.PageTrendTest(inQty, MBP, MBPs, K=0.5, GUIObject=None, doTQDM=True)
Function for calculating Page’s trend test, which is commonly applied to the SITMUF sequence. Formally tests the null hypothesis that there is no trend against the alternative hypothesis that there is a trend.
- Parameters:
inQty (ndarray) – An ndarray with shape \([m,n]\) where \(m\) is the number of iterations and \(n\) is the total number of timesteps.
MBP (float) – A float expressing the material balance period.
MBPs (float) – The total number of material balance periods present in inQty.
K (float, default = 0.5) – Parameter in the trend test.
GUIObject (object, default=None) – An optional object that carries GUI related references when the API is used inside the MAPIT GUI.
GUIParams (object, default=None) – An optional object that carries GUI related parameters when the API is used inside the MAPIT GUI.
doTQDM (bool, default=True) – Controls the use of TQDM progress bar for command line or notebook operation.
- Returns:
The results of the trend test which has shape \([m,n]\).
- Return type:
ndarray
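This page does not give the recursion behind the test. The sketch below shows the textbook one-sided form of Page's test, \(S_t = \max(0, S_{t-1} + x_t - K)\), as an assumption about what is being computed. The function `page_cusum` is hypothetical, ignores the MBP and MBPs arguments, and may not match MAPIT's implementation.

```python
import numpy as np

def page_cusum(x, k=0.5):
    """One-sided CUSUM along axis 1: S_t = max(0, S_{t-1} + x_t - k)."""
    x = np.asarray(x, dtype=float)
    s = np.zeros_like(x)
    prev = np.zeros(x.shape[0])  # S_0 = 0 for every iteration
    for t in range(x.shape[1]):
        prev = np.maximum(0.0, prev + x[:, t] - k)
        s[:, t] = prev
    return s

# One hypothetical iteration with five time steps.
x = np.array([[0.2, 1.5, 1.0, -2.0, 0.6]])
s = page_cusum(x, k=0.5)
```

The statistic grows while values stay above `k` and resets to zero once the running sum goes negative, which is why a sustained positive trend stands out.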
- MAPIT.core.StatsTests.SEMUF(inputAppliedError, processedInputTimes, inventoryAppliedError, processedInventoryTimes, outputAppliedError, processedOutputTimes, MBP, inputTypes, outputTypes, ErrorMatrix, GUIObject=None, doTQDM=True, ispar=False)
Function for calculating the standard error of the material balance sequence (often called SEID or Standard Error of Inventory Difference; \(\sigma _\text{ID}\)). This is accomplished by assuming the error incurred at each location (specified in the ErrorMatrix) rather than estimating it empirically, which is difficult in practice. The equation used here is suitable for most traditional bulk facilities such as enrichment or reprocessing where input and output flows are independent. This function should not be used for facility types where there are more complex statistical dependencies between input, inventory, and output terms (e.g., molten salt reactors). See guide XX for more information.
- Parameters:
inputAppliedError (list of ndarrays) – A list of ndarrays that has length equal to the total number of input locations. Each array should be \([m,1]\) in shape where \(m\) is the number of samples. This array should reflect observed quantities (as opposed to ground truths). Inputs are assumed to be flows in units of \(\frac{1}{s}\) and will be integrated.
processedInputTimes (list of ndarrays) – A list of ndarrays that has length equal to the total number of input locations. Each array should be \([m,1]\) in shape where \(m\) is the number of samples. len(processedInputTimes) and the shape of each list entry (ndarray) should be the same as for inputAppliedError. Each entry in the ndarray should correspond to a timestamp indicating when the value was taken.
inventoryAppliedError (list of ndarrays) – A list of ndarrays that has length equal to the total number of inventory locations. Each array should be \([m,1]\) in shape where \(m\) is the number of samples. This array should reflect observed quantities. Inventories are assumed to be in units of mass and will not be integrated.
processedInventoryTimes (list of ndarrays) – A list of ndarrays that has length equal to the total number of inventory locations. Each array should be \([m,1]\) in shape where \(m\) is the number of samples. len(processedInventoryTimes) and the shape of each list entry (ndarray) should be the same as for inventoryAppliedError. Each entry in the ndarray should correspond to a timestamp indicating when the value was taken.
outputAppliedError (list of ndarrays) – A list of ndarrays that has length equal to the total number of output locations. Each array should be \([m,1]\) in shape where \(m\) is the number of samples. This array should reflect observed quantities. Outputs are assumed to be flows in units of \(\frac{1}{s}\) and will be integrated.
processedOutputTimes (list of ndarrays) – A list of ndarrays that has length equal to the total number of output locations. Each array should be \([m,1]\) in shape where \(m\) is the number of samples. len(processedOutputTimes) and shape of each list entry (ndarray) should be the same as for outputAppliedError. Each entry in the ndarray should correspond to a timestamp indicating when the value was taken.
MBP (float) – Defines the material balance period.
inputTypes (list of strings) – Defines the type of input. This should be a list of strings that is the same length as the number of input locations. The strings should be one of the following: ‘discrete’ or ‘continuous’.
outputTypes (list of strings) – Defines the type of output. This should be a list of strings that is the same length as the number of output locations. The strings should be one of the following: ‘discrete’ or ‘continuous’.
ErrorMatrix (ndarray) – An ndarray shaped \([M,2]\) where \(M\) is the total number of locations across inputs, inventories, and outputs stacked together (in that order) and 2 refers to the relative random and systematic errors. For example, with 2 inputs, 2 inventories, and 2 outputs, ErrorMatrix[3,1] would be the relative systematic error of inventory 2. See guide XX for more information.
GUIObject (object, default=None) – An optional object that carries GUI related references when the API is used inside the MAPIT GUI.
GUIParams (object, default=None) – An optional object that carries GUI related parameters when the API is used inside the MAPIT GUI.
doTQDM (bool, default=True) – Controls the use of TQDM progress bar for command line or notebook operation.
- Returns:
tuple containing:
SEID (ndarray): sequence with shape \([n,j,1]\) where \(n\) is the number of material balances and \(j\) is the number of iterations given as input. The term \(n\) is calculated by finding the minimum of each of the provided input times.
SEMUFContribR (ndarray): the random contribution to the overall SEMUF with shape \([j,l,n]\) where \(j\) is the number of iterations given as input, \(l\) is the total number of locations stacked in the order [inputs, inventories, outputs] and \(n\) is the number of material balances.
SEMUFContribS (ndarray): the systematic contribution to the overall SEMUF with shape \([j,l,n]\) where \(j\) is the number of iterations given as input, \(l\) is the total number of locations stacked in the order [inputs, inventories, outputs] and \(n\) is the number of material balances.
ObservedValues (ndarray): the observed values used to calculate SEMUF with shape \([j,l,n]\) where \(j\) is the number of iterations given as input, \(l\) is the total number of locations stacked in the order [inputs, inventories, outputs] and \(n\) is the number of material balances.
- Return type:
(tuple)
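The ErrorMatrix layout described in the parameters can be assembled by stacking per-location error pairs. This is a minimal sketch of the 2 input, 2 inventory, 2 output example; the error values are made up, and the intermediate variable names are illustrative.

```python
import numpy as np

n_inputs, n_inventories, n_outputs = 2, 2, 2

# Columns: [relative random error, relative systematic error]; values are hypothetical.
input_errors = np.array([[0.01, 0.005], [0.01, 0.005]])
inventory_errors = np.array([[0.02, 0.010], [0.03, 0.015]])
output_errors = np.array([[0.01, 0.005], [0.01, 0.005]])

# Stack in the documented order: inputs, then inventories, then outputs.
ErrorMatrix = np.vstack([input_errors, inventory_errors, output_errors])

# Row 3 is the second inventory (rows 0-1 are inputs, rows 2-3 are inventories).
inventory2_systematic = ErrorMatrix[n_inputs + 1, 1]
```

Keeping the three groups in separate arrays until the final stack makes it harder to misplace a row when the number of locations changes.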
- MAPIT.core.StatsTests.SITMUF(MUF, covmatrix, MBP, GUIObject=None, doTQDM=True, ispar=False)
Function to calculate the standardized independent transformed MUF (SITMUF).
- Parameters:
MUF (ndarray) – The previously calculated MUF array. Should have shape [M, T] where M is the number of iterations (realizations) calculated and T is the total number of timesteps.
covmatrix (ndarray) – The covariance matrix of the data. A symmetric [M, N, N] matrix where M is the number of iterations (realizations) and N is the total number of balance periods calculated.
MBP (float) – The material balance period.
GUIObject (object, default=None) – Optional object used by MAPIT GUI to warn users in the case of a Cholesky decomposition error. Defaults to None.
doTQDM (bool, default=True) – Whether to use a tqdm progress bar for the calculation. Defaults to True.
ispar (bool, default=False) – Whether to run the calculation in parallel. Defaults to False.
- Returns:
SITMUF sequence with shape [M, T] where M is number of iterations and T is the total number of timesteps.
- Return type:
ndarray
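The GUIObject description mentions a Cholesky decomposition, but this page does not give the transformation itself. The sketch below shows the textbook approach as an assumption: factor the covariance as \(\Sigma = L L^T\) and solve \(L y = \text{MUF}\), so that \(y\) has uncorrelated, unit-variance entries. The helper `standardize_with_cholesky` is hypothetical, handles a single realization, and may differ from MAPIT's implementation.

```python
import numpy as np

def standardize_with_cholesky(muf, cov):
    """Return y solving L y = muf, where cov = L L^T and L is lower triangular."""
    # np.linalg.cholesky raises LinAlgError if cov is not positive definite.
    chol = np.linalg.cholesky(cov)
    return np.linalg.solve(chol, muf)

# Hypothetical covariance for two balance periods and one MUF realization.
cov = np.array([[4.0, 2.0],
                [2.0, 5.0]])
muf = np.array([2.0, 3.0])
sitmuf = standardize_with_cholesky(muf, cov)
```

A covariance matrix that is not positive definite makes the factorization fail, which is the kind of error the GUIObject warning is described as reporting.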