Developer Reference Guide
Drivers
- src.prime_run.main(setupfile)
Driver script to run MCMC for parameter inference for a multi-wave epidemic model. Currently limited to at most three infection curves.
To run this script:
python <path-to-this-directory>/prime_run.py <name-of-json-input-file>
- Parameters
- setupfile: string
json format input file with information on observation data, filtering options, MCMC options, and postprocessing options. See “setup_template.json” for a detailed example.
- src.prime_plot_data.main(setupfile)
Plot raw and filtered data for the region specified in the setupfile.
- Parameters
- setupfile: string
json file (.json) that includes the region name. The file “regionname.dat” must exist in a path accessible to this script.
- src.prime_plotKDE.main(filename)
Plots 1D and 2D marginal kernel density estimates based on MCMC samples.
- Parameters
- filename: string
json file (.json) including run setup information and postprocessing information for an MCMC run. It should specify the name of the file containing the MCMC chain
or
pickle file (.pkl) with a dictionary containing the KDE distributions. This file is generated by running this script with a json file (see above).
- src.prime_compute_info_criteria.main(setupfile)
This script postprocesses data from PRIME to compute statistical information including:
- AIC: Akaike Information Criterion
- BIC: Bayesian Information Criterion
- CRPS: Continuous Ranked Probability Score
Results are saved in “info_criteria.txt”.
- Parameters
- setupfile: string
json file (.json) including run setup information and postprocessing information for an MCMC run. It should specify the name of the file containing the MCMC chain
- src.prime_compute_distance_correlation.main(setupfile)
Computes and saves distance correlations based on samples. The distance correlation matrix is saved in “distanceCorr.txt”.
- Parameters
- setupfile: string
json file (.json) including run setup information and postprocessing information for an MCMC run. It should specify the name of the file containing the MCMC chain
Epidemiological Model
- src.prime_model.modelPred(state, params, is_cdf=False)
Evaluates the PRIME model for a set of model parameters. Specific model settings (e.g. date range and other control knobs) are specified via the "params" dictionary.
- Parameters
- state: python list or numpy array
model parameters
- params: dictionary
detailed settings for the epidemiological model
- is_cdf: boolean (optional, default False)
estimate the epidemiological curve based on the CDF of the incubation model (True) or via the formulation that employs the PDF of the incubation model (False)
- Returns
- Ncases: numpy array
daily counts for people turning symptomatic
- src.prime_infection.infection(state, params)
Compute the infection curve for multi-wave epidemics.
This function is currently used by the post-processing script to push forward the posterior into a set of infection curves that are consistent with the observed cases.
- Parameters
- state: python list or numpy array
model parameters
- params: dictionary
detailed settings for the epidemiological model
- Returns
- dates: numpy array
list of dates for which the infection rates were computed
- infections: numpy array
infection rate values corresponding to the list of dates
- src.prime_infection.infection_rate(time, qshape, qscale, inftype)
Infection rate (gamma or log-normal distribution)
- Parameters
- time: float, list, or numpy array
instances in time for the evaluation of the infection_rate model
- qshape: float
shape parameter
- qscale: float
scale parameter
- inftype: string
infection rate type (“gamma” selects the Gamma distribution; any other value selects the log-normal distribution)
- Returns
- vals: numpy array
infection rates corresponding to the time values provided as input parameters
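The two distribution choices described for infection_rate can be illustrated with a short, standard-library-only sketch. It is not the PRIME implementation. The function name infection_rate_sketch is hypothetical, and the log-normal parametrization (qshape as the log-space standard deviation, qscale as the median) is an assumption; the source only states that qshape and qscale are shape and scale parameters.

```python
import math


def infection_rate_sketch(times, qshape, qscale, inftype="gamma"):
    """Evaluate a gamma or log-normal PDF at each time value (illustrative only)."""
    vals = []
    for t in times:
        if t <= 0.0:
            # Both densities vanish for non-positive times
            vals.append(0.0)
        elif inftype == "gamma":
            # Gamma PDF with shape k = qshape and scale theta = qscale,
            # evaluated in log space for numerical stability
            log_pdf = ((qshape - 1.0) * math.log(t) - t / qscale
                       - math.lgamma(qshape) - qshape * math.log(qscale))
            vals.append(math.exp(log_pdf))
        else:
            # Log-normal PDF. ASSUMPTION: qshape is the log-space standard
            # deviation and qscale is the median, i.e. exp(mu)
            z = (math.log(t) - math.log(qscale)) / qshape
            norm = t * qshape * math.sqrt(2.0 * math.pi)
            vals.append(math.exp(-0.5 * z * z) / norm)
    return vals
```

Calling infection_rate_sketch([1.0, 2.0, 3.0], 2.0, 1.5, "gamma") returns one density value per input time; any inftype other than "gamma" falls through to the log-normal branch, mirroring the behavior documented for the inftype parameter.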
- src.prime_incubation.incubation_fcn(time, incubation_median, incubation_sigma, is_cdf=False)
Computes the incubation rate
- Parameters
- time: float, list, or numpy array
instances in time for the evaluation of the incubation rate model
- incubation_median: float
median of the incubation rate model
- incubation_sigma: float
standard deviation of the incubation rate model
- is_cdf: boolean (optional, default False)
select either the CDF of the incubation rate model (True) or its PDF (False)
- Returns
- vals: numpy array
incubation rates corresponding to the time values provided as input parameters
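The PDF/CDF switch controlled by is_cdf can be sketched as follows. This is a minimal illustration, assuming the incubation model is a log-normal distribution and that incubation_sigma is the standard deviation in log space; neither assumption is confirmed by the text above. The name incubation_sketch is hypothetical.

```python
import math


def incubation_sketch(times, incubation_median, incubation_sigma, is_cdf=False):
    """Log-normal PDF or CDF at each time value (illustrative only)."""
    # The median of a log-normal distribution is exp(mu)
    mu = math.log(incubation_median)
    vals = []
    for t in times:
        if t <= 0.0:
            # PDF and CDF are both zero for non-positive times
            vals.append(0.0)
            continue
        # ASSUMPTION: incubation_sigma is the log-space standard deviation
        z = (math.log(t) - mu) / incubation_sigma
        if is_cdf:
            vals.append(0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))
        else:
            norm = t * incubation_sigma * math.sqrt(2.0 * math.pi)
            vals.append(math.exp(-0.5 * z * z) / norm)
    return vals
```

With is_cdf=True the value at the median is one half by construction, which is a quick sanity check on the parametrization.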
Bayesian Inference
- src.prime_posterior.logpost(state, params)
Compute log-posterior density values. This function assumes the likelihood is a product of independent Gaussian distributions.
- Parameters
- state: python list or numpy array
model parameters
- params: dictionary
detailed settings for the epidemiological model
- Returns
- llik: float
natural logarithm of the likelihood density
- lpri: float
natural logarithm of the prior density
- src.prime_posterior.logpost_negb(state, params)
Compute log-posterior density values. This function assumes the likelihood is a product of negative-binomial distributions.
- Parameters
- state: python list or numpy array
model parameters
- params: dictionary
detailed settings for the epidemiological model
- Returns
- llik: float
natural logarithm of the likelihood density
- lpri: float
natural logarithm of the prior density
- src.prime_posterior.logpost_poisson(state, params)
Compute log-posterior density values. This function assumes the likelihood is a product of Poisson distributions.
- Parameters
- state: python list or numpy array
model parameters
- params: dictionary
detailed settings for the epidemiological model
- Returns
- llik: float
natural logarithm of the likelihood density
- lpri: float
natural logarithm of the prior density
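All three log-posterior functions share one contract: they take (state, params) and return the log-likelihood and the log-prior as two floats. The sketch below shows that contract for a Poisson likelihood on a deliberately simplified problem. It is not PRIME's model: the single constant rate, the uniform prior, and the dictionary keys "counts" and "bounds" are hypothetical choices made for illustration.

```python
import math


def logpost_poisson_sketch(state, params):
    """Return (log-likelihood, log-prior) for Poisson-distributed counts."""
    rate = state[0]                 # hypothetical single model parameter
    lo, hi = params["bounds"]       # hypothetical uniform-prior bounds
    if not (lo < rate < hi):
        # Outside the prior support the posterior density is zero
        return -math.inf, -math.inf
    # Sum of independent Poisson log-probabilities: y*log(r) - r - log(y!)
    llik = sum(y * math.log(rate) - rate - math.lgamma(y + 1.0)
               for y in params["counts"])
    # Log-density of a uniform prior on (lo, hi)
    lpri = -math.log(hi - lo)
    return llik, lpri
```

Returning the two terms separately, rather than their sum, lets the caller record likelihood and prior values individually.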
- src.prime_mcmc.ammcmc(opts, cini, likTpr, lpinfo)
Adaptive Metropolis Markov Chain Monte Carlo
- Parameters
- opts: dictionary of parameters
nsteps : no. of mcmc steps
nburn : no. of mcmc steps for burn-in (proposal fixed to initial covariance)
nadapt : adapt every nadapt steps after nburn
nfinal : stop adapting after nfinal steps
inicov : initial covariance
coveps : small additive factor to ensure covariance matrix is positive definite (only added to diagonal if covariance matrix is singular without it)
burnsc : factor to scale up/down proposal if acceptance rate is too high/low
gamma : factor that multiplies the proposed jump size after the burn-in phase (reduce this factor to get a higher acceptance rate; defaults to 1.0)
spllo : lower bounds for chain samples
splhi : upper bounds for chain samples
rnseed : optional seed for the random number generator (must be an integer >= 0). If not specified, the seed is not fixed and every chain will be different.
tmpchn : optional; if present, the chain state is saved every ‘ofreq’ steps to an ASCII file. The filename is randomly generated if tmpchn is set to ‘tmpchn’; otherwise it is set to the string passed through this option. If not present, chain states are not saved while the MCMC run is in progress.
- cini: starting MCMC state
- likTpr: log-posterior function; it takes two input parameters as follows
the first parameter is a 1D array containing the chain state at which the posterior will be evaluated
the second parameter contains settings the user can pass to this function; see the info for ‘lpinfo’ below
this function is expected to return log-likelihood and log-prior values (in this order)
- lpinfo: info to be passed to the log-posterior function
this object can be of any type (e.g. None, scalar, list, array, dictionary, etc.) as long as it is consistent with the settings expected inside the ‘likTpr’ function
- Returns
- mcmcRes: results dictionary
‘chain’ : chain samples (nsteps x chain dimension)
‘cmap’ : MAP estimate
‘pmap’ : MAP log posterior
‘accr’ : overall acceptance rate
‘accb’ : fraction of samples inside bounds
‘rejAll’ : overall no. of samples rejected
‘rejOut’ : no. of samples rejected due to being outside bounds
‘minfo’ : meta_info, acceptance probability, log likelihood, log prior
‘final_cov’ : the covariance matrix at the end of the run
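To show how several of the options above interact (nsteps, nburn, nadapt, nfinal, inicov, coveps, spllo, splhi, rnseed), here is a one-dimensional adaptive Metropolis sketch using only the standard library. It is not the ammcmc implementation: it handles a scalar state only, ignores burnsc, gamma, and tmpchn, and returns only a subset of the result keys. The function name adaptive_metropolis_1d is hypothetical, and the 2.4² proposal scaling is the usual adaptive Metropolis choice for one dimension, assumed here rather than taken from the source.

```python
import math
import random


def adaptive_metropolis_1d(opts, cini, lik_tpr, lpinfo):
    """One-dimensional adaptive Metropolis sampler (illustrative only)."""
    rng = random.Random(opts.get("rnseed"))  # seed None means not fixed
    prop_var = opts["inicov"]                # proposal variance (scalar)
    state = cini
    llik, lpri = lik_tpr([state], lpinfo)
    lpost = llik + lpri
    cmap, pmap = state, lpost                # running MAP estimate
    chain, n_accept = [], 0
    for step in range(opts["nsteps"]):
        cand = state + rng.gauss(0.0, math.sqrt(prop_var))
        # Candidates outside the bounds are rejected without evaluation
        if opts["spllo"] <= cand <= opts["splhi"]:
            llik_c, lpri_c = lik_tpr([cand], lpinfo)
            lpost_c = llik_c + lpri_c
            # Metropolis acceptance test carried out in log space
            if math.log(1.0 - rng.random()) < lpost_c - lpost:
                state, lpost = cand, lpost_c
                n_accept += 1
                if lpost > pmap:
                    cmap, pmap = state, lpost
        chain.append(state)
        done = step + 1
        # Adapt every nadapt steps between nburn and nfinal
        adapting = opts["nburn"] <= done <= opts["nfinal"]
        if adapting and done > 1 and done % opts["nadapt"] == 0:
            mean = sum(chain) / done
            var = sum((c - mean) ** 2 for c in chain) / (done - 1)
            # ASSUMPTION: 2.4**2 scaling for a one-dimensional chain
            prop_var = 2.4 ** 2 * var + opts["coveps"]
    return {"chain": chain, "cmap": cmap, "pmap": pmap,
            "accr": n_accept / opts["nsteps"], "final_cov": prop_var}
```

Any function that follows the likTpr contract can be passed as lik_tpr; for example, lambda s, info: (-0.5 * s[0] ** 2, 0.0) targets a standard normal density with a flat prior.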
Statistical Utilities
- src.prime_stats.computeAICandBIC(run_setup, verbose=0)
Compute Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC)
- Parameters
- run_setup: dictionary with run settings; see the Examples section in the manual
- Returns
- AIC: float
- BIC: float
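For reference, the textbook definitions of the two criteria are AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where k is the number of model parameters, n is the number of observations, and L is the maximized likelihood. The sketch below evaluates them directly. How computeAICandBIC extracts these quantities from run_setup is not described here, so the function name and its three arguments are hypothetical.

```python
import math


def aic_bic_sketch(max_log_likelihood, n_params, n_obs):
    """Return (AIC, BIC) from a maximized log-likelihood (illustrative only)."""
    # AIC penalizes each parameter by a constant 2
    aic = 2.0 * n_params - 2.0 * max_log_likelihood
    # BIC penalizes each parameter by log(number of observations)
    bic = n_params * math.log(n_obs) - 2.0 * max_log_likelihood
    return aic, bic
```

Lower values indicate a better trade-off between fit and complexity; BIC penalizes extra parameters more heavily than AIC once n exceeds about 7 observations, since ln n > 2 there.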
- src.prime_stats.computeCRPS(run_setup)
Compute Continuous Ranked Probability Score (CRPS)
- Parameters
- run_setup: dictionary with run settings; see the Examples section in the manual
- Returns
- CRPS: float
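One common sample-based estimator of the CRPS for an ensemble of predictions X and a single observation y is E|X - y| - 0.5 E|X - X'|. The sketch below implements that estimator and averages it over several observations to obtain a single float. Whether computeCRPS uses this estimator or this averaging is an assumption; both function names are hypothetical.

```python
def crps_ensemble(samples, observation):
    """Sample-based CRPS for one observation (illustrative only)."""
    n = len(samples)
    # Mean absolute error between ensemble members and the observation
    term1 = sum(abs(x - observation) for x in samples) / n
    # Half the mean absolute difference between all pairs of members
    term2 = sum(abs(xi - xj) for xi in samples for xj in samples) / (2.0 * n * n)
    return term1 - term2


def mean_crps(ensembles, observations):
    """Average the per-observation CRPS values into a single score."""
    scores = [crps_ensemble(e, y) for e, y in zip(ensembles, observations)]
    return sum(scores) / len(scores)
```

For a single-member ensemble the score reduces to the absolute error, and lower values indicate predictive distributions that are both sharper and better centered on the data.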
- src.prime_stats.distcorr(spl)
Compute distance correlation between random vectors
- Parameters
- spl: numpy array [number of samples x number of variables]
first dimension is the number of samples, second dimension is the number of random vectors
- Returns
- 2D array of distance correlations between pairs of random vectors; only entries 0 <= j < i < (number of random vectors) are populated
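The distance correlation between two scalar samples can be computed from double-centered pairwise distance matrices. The sketch below follows that standard definition and fills only the strictly lower-triangular entries, matching the layout described for the return value. It is a plain-Python illustration rather than the distcorr source, and the helper names are hypothetical.

```python
import math


def _centered_distances(values):
    """Double-centered matrix of pairwise absolute differences."""
    n = len(values)
    d = [[abs(a - b) for b in values] for a in values]
    row = [sum(r) / n for r in d]  # equals column means (d is symmetric)
    grand = sum(row) / n
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)]
            for i in range(n)]


def distance_correlation(x, y):
    """Distance correlation between two equally long scalar samples."""
    n = len(x)
    a, b = _centered_distances(x), _centered_distances(y)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2
    dvarx = sum(v * v for r in a for v in r) / n ** 2
    dvary = sum(v * v for r in b for v in r) / n ** 2
    denom = math.sqrt(dvarx * dvary)
    # A constant sample has zero distance variance; report 0 in that case
    return math.sqrt(max(dcov2, 0.0) / denom) if denom > 0.0 else 0.0


def distcorr_lower(spl):
    """Lower-triangular matrix of distance correlations; spl is a list of rows."""
    nvar = len(spl[0])
    cols = [[row[k] for row in spl] for k in range(nvar)]
    out = [[0.0] * nvar for _ in range(nvar)]
    for i in range(nvar):
        for j in range(i):  # only entries with j < i are populated
            out[i][j] = distance_correlation(cols[i], cols[j])
    return out
```

Unlike the Pearson coefficient, this measure also responds to nonlinear dependence, which is why it is useful for inspecting posterior samples. The cost is quadratic in the number of samples, so thinning long chains first is advisable.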
- src.prime_stats.getKDE(spl, nskip=0, nthin=1, npts=100, bwfac=1.0)
Compute 1D and 2D marginal PDFs via Kernel Density Estimate
- Parameters
- spl: numpy array
MCMC chain [number of samples x number of parameters]
- nskip: int
number of initial samples to skip when sampling the MCMC chain
- nthin: int
keep every ‘nthin’-th sample
- npts: int
number of grid points
- bwfac: double
bandwidth factor
- Returns
- dict: dictionary with results
‘x1D’ : list of numpy arrays with grids for the 1D PDFs
‘p1D’ : list of numpy arrays with 1D PDFs
‘x2D’ : list of numpy arrays of x-axis grids for the 2D PDFs
‘y2D’ : list of numpy arrays of y-axis grids for the 2D PDFs
‘p2D’ : list of numpy arrays containing 2D PDFs
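The roles of nskip, nthin, npts, and bwfac can be seen in a one-dimensional Gaussian KDE sketch. This is not the getKDE implementation: it handles a single parameter, returns a plain (grid, pdf) pair instead of the dictionary above, and assumes Scott's rule for the baseline bandwidth, which the text above does not specify. The name kde_1d is hypothetical.

```python
import math


def kde_1d(samples, nskip=0, nthin=1, npts=100, bwfac=1.0):
    """Gaussian kernel density estimate on a uniform grid (illustrative only)."""
    # Drop the first nskip samples, then keep every nthin-th one
    spl = samples[nskip::nthin]
    n = len(spl)
    mean = sum(spl) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in spl) / (n - 1))
    # ASSUMPTION: Scott's rule in 1D, scaled by the user-supplied bwfac
    bw = bwfac * std * n ** (-1.0 / 5.0)
    # Pad the grid so the kernel tails are captured at both ends
    lo, hi = min(spl) - 3.0 * bw, max(spl) + 3.0 * bw
    grid = [lo + (hi - lo) * k / (npts - 1) for k in range(npts)]
    norm = 1.0 / (n * bw * math.sqrt(2.0 * math.pi))
    pdf = [norm * sum(math.exp(-0.5 * ((g - s) / bw) ** 2) for s in spl)
           for g in grid]
    return grid, pdf
```

Values of bwfac above 1.0 smooth the estimate and values below 1.0 sharpen it; the estimated density should integrate to approximately one over the padded grid.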
General Utilities
- src.prime_utils.compute_error_weight(error_info, days)
Compute an array with the specified weighting for the daily cases data. The weights follow either linear or Gaussian expressions, with higher weights for recent data and lower weights for older data.
- Parameters
- error_info: list
(error_type, min_wgt, [tau]): error_type is either ‘linear’ or ‘gaussian’, min_wgt is the minimum weight, and tau is the standard deviation of the exponential term if a Gaussian formulation is chosen.
- days: int
length of the weights array
- Returns
- error_weight: numpy array
array of weights
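One way to realize the weighting described for compute_error_weight is sketched below. The exact expressions PRIME uses are not given in this reference, so both formulas are assumptions: the linear variant ramps from min_wgt at the oldest day to 1 at the most recent day, and the Gaussian variant decays from 1 toward min_wgt with the age of the data, using tau as the standard deviation. The name error_weight_sketch is hypothetical.

```python
import math


def error_weight_sketch(error_info, days):
    """Weights that grow from older to more recent days (illustrative only)."""
    error_type, min_wgt = error_info[0], error_info[1]
    weights = []
    for i in range(days):
        age = days - 1 - i  # 0 for the most recent day
        if error_type == "linear":
            # ASSUMPTION: straight line from min_wgt (oldest) to 1 (newest)
            frac = i / (days - 1) if days > 1 else 1.0
            weights.append(min_wgt + (1.0 - min_wgt) * frac)
        elif error_type == "gaussian":
            # ASSUMPTION: Gaussian decay in the age of the data point
            tau = error_info[2]
            decay = math.exp(-0.5 * (age / tau) ** 2)
            weights.append(min_wgt + (1.0 - min_wgt) * decay)
        else:
            raise ValueError("error_type must be 'linear' or 'gaussian'")
    return weights
```

In both variants the most recent day receives weight 1 and no day receives less than min_wgt, consistent with the description of higher weights for recent data.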