Developer Reference Guide

Drivers

src.prime_run.main(setupfile)

Driver script that runs MCMC parameter inference for a multi-wave epidemic model. The model currently supports at most three infection curves.

To run this script:

python <path-to-this-directory>/prime_run.py <name-of-json-input-file>

Parameters

setupfile: string: JSON input file with information on the observation data, filtering options, MCMC options, and postprocessing options. See “setup_template.json” for a detailed example.

src.prime_plot_data.main(setupfile)

Plot raw and filtered data for the region specified in the setupfile.

Parameters

setupfile: string: json file (.json) including the region name. The file “regionname.dat” must exist in a path accessible to this script.

src.prime_plotKDE.main(filename)

Plots 1D and 2D marginal kernel density estimates based on MCMC samples

Parameters

filename: string

json file (.json) including run setup information and postprocessing information for an MCMC run. It should specify the name of the file containing the MCMC chain

or

pickle file (.pkl) with a dictionary containing the KDE distributions. This file is generated by running this script with a json file (see above)

src.prime_compute_info_criteria.main(setupfile)

This script postprocesses data from PRIME to compute statistical information including:

- AIC: Akaike Information Criterion
- BIC: Bayesian Information Criterion
- CRPS: Continuous Ranked Probability Score

Results are saved in “info_criteria.txt”.

Parameters

setupfile: string: json file (.json) including run setup information and postprocessing information for an MCMC run. It should specify the name of the file containing the MCMC chain

src.prime_compute_distance_correlation.main(setupfile)

Computes and saves distance correlations based on samples. The distance correlation matrix is saved in “distanceCorr.txt”

Parameters

setupfile: string: json file (.json) including run setup information and postprocessing information for an MCMC run. It should specify the name of the file containing the MCMC chain

Epidemiological Model

src.prime_model.modelPred(state, params, is_cdf=False)

Evaluates the PRIME model for a set of model parameters; specific model settings (e.g. date range, other control knobs, etc) are specified via the "params" dictionary

Parameters

state: python list or numpy array: model parameters
params: dictionary: detailed settings for the epidemiological model
is_cdf: boolean (optional, default False): estimate the epidemiological curve based on the CDF of the incubation model (True) or via the formulation that employs the PDF of the incubation model (False)

Returns

Ncases: numpy array: daily counts for people turning symptomatic

src.prime_infection.infection(state, params)

Compute infection curve for multi-wave epidemics

This function is currently used by the post-processing script to push the posterior forward into a set of infection curves that are consistent with the observed cases.

Parameters

state: python list or numpy array: model parameters
params: dictionary: detailed settings for the epidemiological model

Returns

dates: numpy array: list of dates for which the infection rates were computed
infections: numpy array: infection rate values corresponding to the list of dates

src.prime_infection.infection_rate(time, qshape, qscale, inftype)

Infection rate (gamma or log-normal distribution)

Parameters

time: float, list, or numpy array: instances in time for the evaluation of the infection_rate model
qshape: float: shape parameter
qscale: float: scale parameter
inftype: string: infection rate type (“gamma” for Gamma distribution, otherwise the Log-normal distribution)

Returns

vals: numpy array: infection rates corresponding to the time values provided as input parameters
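The two distribution choices described above can be illustrated with a minimal, dependency-free sketch. It is not the PRIME implementation: the function name `infection_rate_sketch` is hypothetical, plain lists stand in for numpy arrays, and the log-normal branch assumes that `qshape` is the log-space standard deviation and `qscale` is the median, which the reference above does not specify.

```python
import math


def infection_rate_sketch(time, qshape, qscale, inftype="gamma"):
    """Evaluate a gamma or log-normal density at one or more time values."""
    # Accept a scalar or any iterable of times, mirroring the documented input types.
    times = [time] if isinstance(time, (int, float)) else list(time)
    vals = []
    for t in times:
        if t <= 0.0:
            # Simplification for this sketch: return zero for non-positive times.
            vals.append(0.0)
        elif inftype == "gamma":
            # Gamma PDF t^(k-1) exp(-t/theta) / (Gamma(k) theta^k), evaluated in log space.
            logp = ((qshape - 1.0) * math.log(t) - t / qscale
                    - math.lgamma(qshape) - qshape * math.log(qscale))
            vals.append(math.exp(logp))
        else:
            # ASSUMPTION: log-normal with log-space std qshape and median qscale.
            z = (math.log(t) - math.log(qscale)) / qshape
            vals.append(math.exp(-0.5 * z * z) / (t * qshape * math.sqrt(2.0 * math.pi)))
    return vals


gamma_vals = infection_rate_sketch([0.0, 2.0, 10.0], 1.0, 2.0, "gamma")
lognormal_vals = infection_rate_sketch(3.0, 0.5, 3.0, "lognormal")
```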

src.prime_incubation.incubation_fcn(time, incubation_median, incubation_sigma, is_cdf=False)

Computes the incubation rate

Parameters

time: float, list, or numpy array: instances in time for the evaluation of the incubation rate model
incubation_median: float: median of the incubation rate model
incubation_sigma: float: standard deviation of the incubation rate model
is_cdf: boolean (optional, default False): select either the CDF of the incubation rate model (True) or its PDF (False)

Returns

vals: numpy array: incubation rates corresponding to the time values provided as input parameters
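As an illustration of the PDF/CDF switch, the sketch below assumes a log-normal incubation model in which `incubation_sigma` is the standard deviation in log space. Both the distribution family and that interpretation are assumptions not stated in this reference, and the name `incubation_sketch` is hypothetical.

```python
import math


def incubation_sketch(time, incubation_median, incubation_sigma, is_cdf=False):
    """Log-normal PDF or CDF evaluated at one or more times."""
    times = [time] if isinstance(time, (int, float)) else list(time)
    mu = math.log(incubation_median)  # log-space mean equals the log of the median
    vals = []
    for t in times:
        if t <= 0.0:
            vals.append(0.0)  # no probability mass at non-positive times
            continue
        z = (math.log(t) - mu) / incubation_sigma
        if is_cdf:
            # Standard normal CDF of z, written with the error function.
            vals.append(0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))
        else:
            vals.append(math.exp(-0.5 * z * z)
                        / (t * incubation_sigma * math.sqrt(2.0 * math.pi)))
    return vals


cdf_vals = incubation_sketch([1.0, 5.0, 20.0], 5.0, 0.4, is_cdf=True)
pdf_vals = incubation_sketch([0.0, 5.0], 5.0, 0.4)
```

By construction the CDF equals 0.5 at the median, which gives a quick sanity check on the parameterization.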

Bayesian Inference

src.prime_posterior.logpost(state, params)

Compute log-posterior density values; this function assumes the likelihood is a product of independent Gaussian distributions

Parameters

state: python list or numpy array: model parameters
params: dictionary: detailed settings for the epidemiological model

Returns

llik: float: natural logarithm of the likelihood density
lpri: float: natural logarithm of the prior density
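A product of independent Gaussian densities becomes a sum in log space. The sketch below shows that sum for a single, common standard deviation `sigma`; the function name and the constant-noise assumption are illustrative only, and the noise model used by `logpost` may differ.

```python
import math


def gaussian_loglik(observed, predicted, sigma):
    """Sum of independent Gaussian log-densities with a common standard deviation."""
    llik = 0.0
    for y, m in zip(observed, predicted):
        # log N(y | m, sigma^2) = -0.5 log(2 pi sigma^2) - 0.5 ((y - m) / sigma)^2
        llik += -0.5 * math.log(2.0 * math.pi * sigma ** 2) - 0.5 * ((y - m) / sigma) ** 2
    return llik


perfect_fit = gaussian_loglik([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], 1.0)
poor_fit = gaussian_loglik([1.0, 2.0, 3.0], [3.0, 2.0, 1.0], 1.0)
```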

src.prime_posterior.logpost_negb(state, params)

Compute log-posterior density values; this function assumes the likelihood is a product of negative-binomial distributions

Parameters

state: python list or numpy array: model parameters
params: dictionary: detailed settings for the epidemiological model

Returns

llik: float: natural logarithm of the likelihood density
lpri: float: natural logarithm of the prior density
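For reference, a negative-binomial log-likelihood can be sketched as follows. The mean/dispersion parameterization (mean `mu`, dispersion `r`, variance `mu + mu**2 / r`) is an assumption chosen for illustration; this reference does not say which parameterization `logpost_negb` uses, and the function names are hypothetical.

```python
import math


def negbin_logpmf(y, mu, r):
    """Log PMF of a negative binomial with mean mu > 0 and dispersion r > 0."""
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1.0)
            + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))


def negbin_loglik(observed, predicted, r):
    """Sum of independent negative-binomial log PMFs, one per daily count."""
    return sum(negbin_logpmf(y, mu, r) for y, mu in zip(observed, predicted))


example_llik = negbin_loglik([0, 0], [3.0, 3.0], 2.0)
```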

src.prime_posterior.logpost_poisson(state, params)

Compute log-posterior density values; this function assumes the likelihood is a product of Poisson distributions

Parameters

state: python list or numpy array: model parameters
params: dictionary: detailed settings for the epidemiological model

Returns

llik: float: natural logarithm of the likelihood density
lpri: float: natural logarithm of the prior density
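The Poisson case has a particularly compact log-likelihood. The sketch below is a generic illustration with a hypothetical function name, not code taken from `logpost_poisson`; it assumes every predicted mean is strictly positive.

```python
import math


def poisson_loglik(observed, predicted):
    """Sum of independent Poisson log PMFs: y log(mu) - mu - log(y!)."""
    return sum(y * math.log(mu) - mu - math.lgamma(y + 1.0)
               for y, mu in zip(observed, predicted))


single_zero = poisson_loglik([0], [2.0])
two_days = poisson_loglik([3, 5], [4.0, 4.0])
```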

src.prime_mcmc.ammcmc(opts, cini, likTpr, lpinfo)

Adaptive Metropolis Markov Chain Monte Carlo

Parameters

opts: dictionary of parameters

nsteps : no. of mcmc steps
nburn : no. of mcmc steps for burn-in (proposal fixed to initial covariance)
nadapt : adapt every nadapt steps after nburn
nfinal : stop adapting after nfinal steps
inicov : initial covariance
coveps : small additive factor to ensure covariance matrix is positive definite (only added to diagonal if covariance matrix is singular without it)
burnsc : factor to scale up/down proposal if acceptance rate is too high/low
gamma : factor to multiply proposed jump size with in the chain past the burn-in phase (Reduce this factor to get a higher acceptance rate. Defaults to 1.0)
spllo : lower bounds for chain samples
splhi : upper bounds for chain samples
rnseed : Optional seed for the random number generator (must be an integer >= 0). If not specified, the seed is not fixed and every chain will be different.
tmpchn : Optional; if present, the chain state is saved to an ASCII file every ‘ofreq’ steps. The filename is randomly generated if tmpchn is set to ‘tmpchn’; otherwise the string passed through this option is used as the filename. If not present, chain states are not saved while the MCMC run is in progress.

cini: starting MCMC state

likTpr: log-posterior function; it takes two input parameters as follows

the first parameter is a 1D array containing the chain state at which the posterior will be evaluated
the second parameter contains settings the user can pass to this function; see below info for ‘lpinfo’
this function is expected to return log-Likelihood and log-Prior values (in this order)

lpinfo: info to be passed to the log-posterior function

this object can be of any type (e.g. None, scalar, list, array, dictionary, etc) as long as it is consistent with settings expected inside the ‘likTpr’ function

Returns

mcmcRes: results dictionary

‘chain’ : chain samples (nsteps x chain dimension)
‘cmap’ : MAP estimate
‘pmap’ : MAP log posterior
‘accr’ : overall acceptance rate
‘accb’ : fraction of samples inside bounds
‘rejAll’ : overall no. of samples rejected
‘rejOut’ : no. of samples rejected due to being outside bounds
‘minfo’ : meta_info, acceptance probability, log likelihood, log prior
‘final_cov’ : the covariance matrix at the end of the run
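The ‘likTpr’ contract described above (chain state and user info in, log-likelihood and log-prior out) can be sketched with a toy callable. To show the callable in use without depending on PRIME, the sketch pairs it with a plain random-walk Metropolis loop that uses a fixed proposal width; this is a deliberately simplified stand-in and not the adaptive scheme implemented by ammcmc. All function names and the toy data are hypothetical.

```python
import math
import random


def log_posterior(state, lpinfo):
    """Toy likTpr: return (log-likelihood, log-prior) for a Gaussian mean."""
    data, sigma = lpinfo["data"], lpinfo["sigma"]
    llik = sum(-0.5 * ((y - state[0]) / sigma) ** 2 for y in data)
    lpri = 0.0  # flat prior; the sampler enforces the bounds
    return llik, lpri


def random_walk_metropolis(nsteps, cini, likTpr, lpinfo, step, spllo, splhi, rnseed=None):
    """Fixed-proposal Metropolis sampler (no covariance adaptation)."""
    rng = random.Random(rnseed)
    current = list(cini)
    cur_lp = sum(likTpr(current, lpinfo))  # log-posterior = llik + lpri
    chain, naccept = [], 0
    for _ in range(nsteps):
        proposal = [c + rng.gauss(0.0, step) for c in current]
        inside = all(lo <= p <= hi for p, lo, hi in zip(proposal, spllo, splhi))
        if inside:
            prop_lp = sum(likTpr(proposal, lpinfo))
            # Accept with probability min(1, exp(prop_lp - cur_lp)).
            if math.log(1.0 - rng.random()) < prop_lp - cur_lp:
                current, cur_lp, naccept = proposal, prop_lp, naccept + 1
        chain.append(list(current))
    return {"chain": chain, "accr": naccept / nsteps}


info = {"data": [1.8, 2.1, 2.0, 2.2, 1.9], "sigma": 0.5}
res = random_walk_metropolis(4000, [0.0], log_posterior, info, 0.5, [-10.0], [10.0], rnseed=1)
```

Because ‘lpinfo’ is passed through untouched, the same sampler works with any callable that follows the two-argument, two-return-value convention.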

Statistical Utilities

src.prime_stats.computeAICandBIC(run_setup, verbose=0)

Compute Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC)

Parameters

run_setup: dictionary with run settings; see the Examples section in the manual

Returns

AIC: float
BIC: float
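The standard definitions of the two criteria are AIC = 2k - 2 ln(L) and BIC = k ln(n) - 2 ln(L), where L is the maximized likelihood, k the number of parameters, and n the number of observations. A minimal sketch of those formulas follows; the function name and argument names are hypothetical, and how computeAICandBIC extracts these quantities from ‘run_setup’ is not shown here.

```python
import math


def aic_bic(max_loglik, nparams, nobs):
    """Return (AIC, BIC) from a maximized log-likelihood."""
    aic = 2.0 * nparams - 2.0 * max_loglik
    bic = nparams * math.log(nobs) - 2.0 * max_loglik
    return aic, bic


aic, bic = aic_bic(-100.0, 4, 50)
```

Lower values indicate a better trade-off between fit and complexity; BIC penalizes extra parameters more heavily than AIC whenever n exceeds about 7 (ln(n) > 2).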

src.prime_stats.computeCRPS(run_setup)

Compute Continuous Ranked Probability Score (CRPS)

Parameters

run_setup: dictionary with run settings; see the Examples section in the manual

Returns

CRPS: float
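For a single observation y and an ensemble of predictive samples X, one common sample-based estimator is CRPS = E|X - y| - 0.5 E|X - X'|. The sketch below implements that estimator as an illustration; it is not necessarily the formula used inside computeCRPS, and the function name is hypothetical.

```python
def crps_ensemble(samples, observation):
    """Sample-based CRPS estimate for one observation (lower is better)."""
    n = len(samples)
    # Mean absolute error of the ensemble members against the observation.
    term1 = sum(abs(x - observation) for x in samples) / n
    # Half the mean absolute difference between all pairs of ensemble members.
    term2 = sum(abs(xi - xj) for xi in samples for xj in samples) / (2.0 * n * n)
    return term1 - term2


point_forecast = crps_ensemble([1.0, 1.0, 1.0], 3.0)
spread_forecast = crps_ensemble([0.0, 2.0], 1.0)
```

When all samples coincide, the score reduces to the absolute error of that point forecast.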

src.prime_stats.distcorr(spl)

Compute distance correlation between random vectors

Parameters

spl: numpy array [number of samples x number of variables]: first dimension is the number of samples, second dimension is the number of random vectors

Returns

2D array of distance correlations between pairs of random vectors; only entries with 0 <= j < i < (number of random vectors) are populated

References: http://en.wikipedia.org/wiki/Distance_correlation
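The sample distance correlation can be sketched from its textbook definition using double-centered pairwise distance matrices. The code below is a dependency-free illustration for scalar variables, with nested lists standing in for numpy arrays; the function names are hypothetical and this is not the PRIME source. The last function fills only the lower triangle, matching the layout described above.

```python
import math


def _centered_distances(x):
    """Double-centered matrix of pairwise absolute differences for a 1D sample."""
    n = len(x)
    d = [[abs(xi - xj) for xj in x] for xi in x]
    row = [sum(r) / n for r in d]
    grand = sum(row) / n
    # The distance matrix is symmetric, so row means double as column means.
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)] for i in range(n)]


def distance_correlation(x, y):
    """Sample distance correlation between two equally long 1D samples."""
    n = len(x)
    a, b = _centered_distances(x), _centered_distances(y)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2
    dvarx2 = sum(v * v for r in a for v in r) / n ** 2
    dvary2 = sum(v * v for r in b for v in r) / n ** 2
    denom = math.sqrt(dvarx2 * dvary2)
    return math.sqrt(max(dcov2, 0.0) / denom) if denom > 0.0 else 0.0


def distcorr_sketch(spl):
    """Lower-triangular matrix of distance correlations between columns of spl."""
    nvars = len(spl[0])
    cols = [[sample[k] for sample in spl] for k in range(nvars)]
    out = [[0.0] * nvars for _ in range(nvars)]
    for i in range(nvars):
        for j in range(i):  # only entries with j < i are populated
            out[i][j] = distance_correlation(cols[i], cols[j])
    return out


x = [0.0, 1.0, 2.0, 3.0, 4.0]
spl = [[v, 2.0 * v + 1.0, (v - 2.0) ** 2] for v in x]
corr = distcorr_sketch(spl)
```

Unlike the Pearson coefficient, distance correlation also responds to nonlinear dependence, which is why the quadratic third column in the example is still informative.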

src.prime_stats.getKDE(spl, nskip=0, nthin=1, npts=100, bwfac=1.0)

Compute 1D and 2D marginal PDFs via Kernel Density Estimate

Parameters

spl: numpy array: MCMC chain [number of samples x number of parameters]
nskip: int: number of initial samples to skip when sampling the MCMC chain
nthin: int: use every ‘nthin’ samples
npts: int: number of grid points
bwfac: double: bandwidth factor

Returns

dict: dictionary with results: ‘x1D’: list of numpy arrays with grids for the 1D PDFs; ‘p1D’: list of numpy arrays with 1D PDFs; ‘x2D’: list of numpy arrays of x-axis grids for the 2D PDFs; ‘y2D’: list of numpy arrays of y-axis grids for the 2D PDFs; ‘p2D’: list of numpy arrays containing 2D PDFs
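To make the roles of ‘nskip’, ‘nthin’, ‘npts’, and ‘bwfac’ concrete, here is a minimal 1D Gaussian KDE sketch for a single parameter. The bandwidth rule (the normal-reference rule of thumb scaled by `bwfac`) and the grid extent are assumptions made for illustration; the function name is hypothetical, and the 2D marginals that getKDE also returns are not covered.

```python
import math
import random


def kde_1d(samples, nskip=0, nthin=1, npts=100, bwfac=1.0):
    """Gaussian KDE on an evenly spaced grid; returns (grid, pdf)."""
    x = list(samples)[nskip::nthin]  # drop initial samples, then keep every nthin-th
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))
    # ASSUMPTION: normal-reference bandwidth 1.06 * std * n^(-1/5), scaled by bwfac.
    h = bwfac * 1.06 * std * n ** (-0.2)
    lo, hi = min(x) - 3.0 * h, max(x) + 3.0 * h
    grid = [lo + (hi - lo) * i / (npts - 1) for i in range(npts)]
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))
    pdf = [norm * sum(math.exp(-0.5 * ((g - v) / h) ** 2) for v in x) for g in grid]
    return grid, pdf


# Synthetic stand-in for one column of an MCMC chain.
rng = random.Random(0)
chain = [rng.gauss(0.0, 1.0) for _ in range(500)]
grid, pdf = kde_1d(chain, nskip=100, nthin=2, npts=100)
```

A larger `bwfac` produces a smoother estimate, while a smaller one follows the samples more closely.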

General Utilities

src.prime_utils.compute_error_weight(error_info, days)

Compute array with specified weighting for the daily cases data. The weights follow either linear or Gaussian expressions with higher weights for recent data and lower weights for older data

Parameters

error_info: list: (error_type, min_wgt, [tau]); error_type is either ‘linear’ or ‘gaussian’, min_wgt is the minimum weight, and tau is the standard deviation of the exponential term when the Gaussian formulation is chosen.
days: int: length of the weights array

Returns

error_weight: numpy array: array of weights
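One way to realize "higher weights for recent data" is sketched below. The exact expressions are assumptions, since this reference does not give them: the linear variant ramps from `min_wgt` at the oldest day to 1.0 at the most recent day, and the Gaussian variant decays with the age of the observation using `tau` as the standard deviation. The function name is hypothetical.

```python
import math


def error_weight_sketch(error_info, days):
    """Weights rising from about min_wgt (oldest day) to 1.0 (most recent day)."""
    error_type, min_wgt = error_info[0], error_info[1]
    weights = []
    for i in range(days):
        age = days - 1 - i  # number of days before the most recent observation
        if error_type == "linear":
            frac = 1.0 - age / (days - 1) if days > 1 else 1.0
        else:
            tau = error_info[2]  # only required for the Gaussian formulation
            frac = math.exp(-0.5 * (age / tau) ** 2)
        weights.append(min_wgt + (1.0 - min_wgt) * frac)
    return weights


linear_w = error_weight_sketch(("linear", 0.2), 5)
gauss_w = error_weight_sketch(("gaussian", 0.2, 3.0), 5)
```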

src.prime_utils.prediction_filename(run_setup)

Generate informative name for hdf5 file with prediction data

Parameters

run_setup: dictionary: detailed settings for the epidemiological model

Returns

filename: string: file name ending with a .h5 extension

src.prime_utils.runningAvg(f, nDays)

Apply nDays running average to the input f

Parameters

f: numpy array: array (with daily data for this project) to be filtered
nDays: int: window width for the running average

Returns

favg: numpy array: filtered data
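A running average over daily data can be sketched as follows. The window alignment and edge handling are assumptions: this version uses a trailing window that shrinks at the start of the series, which may differ from what runningAvg does. The function name is hypothetical.

```python
def running_avg_sketch(f, nDays):
    """Trailing nDays mean; the window shrinks where fewer than nDays values exist."""
    favg = []
    for i in range(len(f)):
        window = f[max(0, i - nDays + 1): i + 1]  # current day and up to nDays-1 prior days
        favg.append(sum(window) / len(window))
    return favg


smoothed = running_avg_sketch([1, 2, 3, 4, 5], 3)
```

With `nDays = 1` the output equals the input, which is a convenient check when wiring the filter into a pipeline.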