Preprocessing
- MAPIT.core.Preprocessing.SimErrors(rawData, ErrorMatrix, iterations, GUIObject=None, doTQDM=True, batchSize=10, dopar=False, bar=None, times=None, calibrationPeriod=None)
Function to add simulated measurement error. Supports variable sample rates. Assumes the traditional multiplicative measurement error model:
\(M_{i,j} = T(1+R_{i,j}+S_j)\)
Random errors: \(R_{i,j} \sim \mathcal{N}(0,{\delta_R}_j^2)\)
Systematic errors: \(S_{j} \sim \mathcal{N}(0,{\delta_S}_j^2)\)
where \(M_{i,j}\) is the measured value, \(T\) is the true value, \(i\) is the measurement time, and \(j\) is the location.
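The model above can be illustrated for a single location with plain NumPy. This is a minimal sketch rather than MAPIT's implementation: the helper name `apply_multiplicative_error` and its arguments are hypothetical, and it assumes one systematic draw shared by every sample at the location.

```python
import numpy as np

def apply_multiplicative_error(true_values, rand_rsd, sys_rsd, rng):
    """Hypothetical helper: return M = T * (1 + R + S) for one location."""
    true_values = np.asarray(true_values, dtype=float)
    # R_{i,j}: an independent draw for every measurement time i
    random_term = rng.normal(0.0, rand_rsd, size=true_values.shape)
    # S_j: a single draw shared by all measurements at this location (assumption)
    systematic_term = rng.normal(0.0, sys_rsd)
    return true_values * (1.0 + random_term + systematic_term)

rng = np.random.default_rng(0)
truth = np.full((1000, 1), 50.0)  # constant true value, 1000 samples, 1 element
measured = apply_multiplicative_error(truth, rand_rsd=0.01, sys_rsd=0.02, rng=rng)

# The mean relative error reflects the systematic draw;
# the spread around it reflects the random RSD.
relative_error = measured / truth - 1.0
print(relative_error.mean(), relative_error.std())
```

Because the systematic term is drawn once, it shifts every sample by the same relative amount, while the random term scatters samples around that shift.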
Example:
    import numpy as np
    from MAPIT.core.Preprocessing import SimErrors

    rawData = [np.random.rand(10, 1), np.random.rand(10, 1)]
    # [location1 (random, systematic), location2 (random, systematic)]
    ErrorMatrix = np.array([[0.1, 0.2], [0.3, 0.4]])
    iterations = 100
    result = SimErrors(rawData, ErrorMatrix, iterations)
    print(result[0].shape)
    >>> (100, 10)
- Parameters:
rawData (list of ndarray) – Raw data to apply errors to, list of 2D ndarrays. Each entry in the list should correspond to a different location and the shape of ndarray in the list should be [MxN] where M is the sample dimension (number of samples) and N is the elemental dimension, if applicable. If only considering one element, each ndarray in the rawData list should be [Mx1].
ErrorMatrix (ndarray) – 2D ndarray of shape [Mx2] giving the relative standard deviations to apply to rawData. Each row corresponds to one location, so M should match the number of entries in the rawData list (as in the example, where two locations use a 2x2 matrix). The second dimension (of size 2) holds the random and systematic error, respectively, such that ErrorMatrix[0,0] is the random relative standard deviation of the first location and ErrorMatrix[0,1] is its systematic relative standard deviation.
iterations (int) – Number of iterations to calculate.
GUIObject (obj, default=None) – GUI object for internal MAPIT use
doTQDM (bool, default=True) – Controls the use of TQDM progress bar for command line or notebook operation.
batchSize (int, default=10) – Batch size for parallel processing.
dopar (bool, default=False) – Controls the use of parallel processing.
times (list of ndarray, default=None) – List of ndarrays of shape [Mx1] describing the time of each sample in rawData. Required if calibrationPeriod is provided.
calibrationPeriod (list of float, default=None) – List of floats, one per location in rawData, describing the calibration period for each location. Required if times is provided.
- Returns:
List of arrays identical in shape to rawData. A list is returned so that each location can have a different sample rate.
- Return type:
list
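One way to picture how times and calibrationPeriod could interact is to redraw the systematic error whenever a new calibration period begins. The sketch below rests on that assumption and is not taken from MAPIT's source; `systematic_error_by_period` is a hypothetical helper, and the periods are assumed to start at the first sample time.

```python
import numpy as np

def systematic_error_by_period(times, calibration_period, sys_rsd, rng):
    """Hypothetical helper: one systematic draw per calibration period."""
    times = np.asarray(times, dtype=float).ravel()
    # Index of the calibration period that each sample falls into
    period_index = np.floor((times - times[0]) / calibration_period).astype(int)
    # One N(0, sys_rsd^2) draw per period (assumed recalibration behavior)
    draws = rng.normal(0.0, sys_rsd, size=period_index.max() + 1)
    # Broadcast each period's draw to the samples inside it, shape [Mx1]
    return draws[period_index].reshape(-1, 1)

rng = np.random.default_rng(1)
times = np.arange(0.0, 10.0).reshape(-1, 1)  # samples at t = 0, 1, ..., 9
sys_err = systematic_error_by_period(
    times, calibration_period=5.0, sys_rsd=0.02, rng=rng
)
print(sys_err.ravel())
```

With a period of 5.0, the first five samples share one systematic value and the last five share another, which is why both the sample times and the period length are needed together.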
- MAPIT.core.Preprocessing.calcBatchError(calibrationPeriod, ErrorMatrix, batchSize, times, loc, dim0shape)
Calculate batch error for a given location.
- Parameters:
calibrationPeriod (numpy array or None) – Calibration period for each location.
ErrorMatrix (numpy array) – Matrix containing the relative standard deviation (RSD) error values for each location.
batchSize (int) – Size of the batch.
times (numpy array) – Array of time values for each location.
loc (int) – Index of the current location.
dim0shape (int) – Shape of the first dimension of the raw data array.
- Returns:
randRSD (numpy array) – Random error array.
sysRSD (numpy array) – Systematic error array.