Approximate Control Variate Monte Carlo

This tutorial builds upon Control Variate Monte Carlo and describes how to implement and deploy approximate control variate Monte Carlo (ACVMC) sampling to compute expectations of model output from multiple low-fidelity models with unknown means.

CVMC is often not useful for practical analysis of numerical models because typically the mean of the lower-fidelity model, i.e. $\mu_\kappa$, is unknown and the cost of evaluating the lower-fidelity model is non-trivial. These two issues can be overcome by using approximate control variate Monte Carlo (ACVMC).

Let the cost per sample of the high-fidelity model be $C_\alpha$ and let the cost per sample of the low-fidelity model be $C_\kappa$. Now let us use $N$ samples to estimate $Q_{\alpha,N}$ and $Q_{\kappa,N}$, and these $N$ samples plus another $(r-1)N$ samples to estimate $\mu_\kappa$, so that

Q^{\mathrm{ACV}}_{\alpha,N,r}=Q_{\alpha,N}+\eta\left(Q_{\kappa,N}-\mu_{\kappa,N,r}\right)

and

\mu_{\kappa,N,r}=\frac{1}{rN}\sum_{i=1}^{rN}f_\kappa^{(i)}

With this sampling scheme we have

Q_{\kappa,N}-\mu_{\kappa,N,r}
=\frac{1}{N}\sum_{i=1}^{N}f_\kappa^{(i)}-\frac{1}{rN}\sum_{i=1}^{rN}f_\kappa^{(i)}
=\frac{1}{N}\sum_{i=1}^{N}f_\kappa^{(i)}-\frac{1}{rN}\sum_{i=1}^{N}f_\kappa^{(i)}-\frac{1}{rN}\sum_{i=N+1}^{rN}f_\kappa^{(i)}
=\frac{r-1}{rN}\sum_{i=1}^{N}f_\kappa^{(i)}-\frac{1}{rN}\sum_{i=N+1}^{rN}f_\kappa^{(i)}

where for ease of notation we write $r_\kappa N$ and $\lfloor r_\kappa N\rfloor$ interchangeably. Using the above expression yields

\mathbb{V}\left[Q_{\kappa,N}-\mu_{\kappa,N,r}\right]
=\mathbb{E}\left[\left(\frac{r-1}{rN}\sum_{i=1}^{N}f_\kappa^{(i)}-\frac{1}{rN}\sum_{i=N+1}^{rN}f_\kappa^{(i)}\right)^2\right]
=\frac{(r-1)^2}{r^2N^2}\sum_{i=1}^{N}\mathbb{V}\left[f_\kappa^{(i)}\right]+\frac{1}{r^2N^2}\sum_{i=N+1}^{rN}\mathbb{V}\left[f_\kappa^{(i)}\right]
=\frac{(r-1)^2}{r^2N^2}N\mathbb{V}\left[f_\kappa\right]+\frac{1}{r^2N^2}(r-1)N\mathbb{V}\left[f_\kappa\right]
=\frac{r-1}{r}\frac{\mathbb{V}\left[f_\kappa\right]}{N}

where we have used the fact that the samples in the first and second terms on the first line are not shared, so the covariance between these terms is zero. We also have

\mathrm{Cov}\left[Q_{\alpha,N},Q_{\kappa,N}-\mu_{\kappa,N,r}\right]
=\mathrm{Cov}\left[\frac{1}{N}\sum_{i=1}^{N}f_\alpha^{(i)},\ \frac{r-1}{rN}\sum_{i=1}^{N}f_\kappa^{(i)}-\frac{1}{rN}\sum_{i=N+1}^{rN}f_\kappa^{(i)}\right]

The correlation between the estimators $\frac{1}{N}\sum_{i=1}^{N}f_\alpha^{(i)}$ and $\frac{1}{rN}\sum_{i=N+1}^{rN}f_\kappa^{(i)}$ is zero because these two estimators use disjoint sets of independent samples. Thus

\mathrm{Cov}\left[Q_{\alpha,N},Q_{\kappa,N}-\mu_{\kappa,N,r}\right]
=\mathrm{Cov}\left[\frac{1}{N}\sum_{i=1}^{N}f_\alpha^{(i)},\ \frac{r-1}{rN}\sum_{i=1}^{N}f_\kappa^{(i)}\right]
=\frac{r-1}{r}\frac{\mathrm{Cov}\left[f_\alpha,f_\kappa\right]}{N}
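These two identities are easy to check numerically. The following is a minimal sketch that uses a pair of correlated Gaussian variables as stand-ins for $f_\alpha$ and $f_\kappa$ (an assumption made purely for illustration) and compares empirical estimates against the formulas above.

import numpy as np

rng = np.random.default_rng(0)
N, r, ntrials = 20, 5, 50000
model_cov = np.array([[1.0, 0.8], [0.8, 1.0]])  # assumed covariance of (f_alpha, f_kappa)
chol = np.linalg.cholesky(model_cov)
hf_means, diffs = np.empty(ntrials), np.empty(ntrials)
for k in range(ntrials):
    # draw rN correlated samples; the high-fidelity model only sees the first N
    vals = chol @ rng.standard_normal((2, r*N))
    hf_means[k] = vals[0, :N].mean()                 # Q_{alpha,N}
    diffs[k] = vals[1, :N].mean() - vals[1].mean()   # Q_{kappa,N}-mu_{kappa,N,r}
print("empirical variance  ", diffs.var(),
      " formula ", (r-1)/r*model_cov[1, 1]/N)
print("empirical covariance", np.cov(hf_means, diffs)[0, 1],
      " formula ", (r-1)/r*model_cov[0, 1]/N)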

Recall that the variance reduction of the CV estimator using the optimal $\eta$ is

\gamma=1-\frac{\mathrm{Cov}\left[Q_{\alpha,N},Q_{\kappa,N}-\mu_{\kappa,N,r}\right]^2}{\mathbb{V}\left[Q_{\kappa,N}-\mu_{\kappa,N,r}\right]\mathbb{V}\left[Q_{\alpha,N}\right]}
=1-\frac{N^{-2}\frac{(r-1)^2}{r^2}\mathrm{Cov}\left[f_\alpha,f_\kappa\right]^2}{N^{-1}\frac{r-1}{r}\mathbb{V}\left[f_\kappa\right]\,N^{-1}\mathbb{V}\left[f_\alpha\right]}
=1-\frac{r-1}{r}\mathrm{Corr}\left[f_\alpha,f_\kappa\right]^2

which is found when

\eta=-\frac{\mathrm{Cov}\left[Q_{\alpha,N},Q_{\kappa,N}-\mu_{\kappa,N,r}\right]}{\mathbb{V}\left[Q_{\kappa,N}-\mu_{\kappa,N,r}\right]}
=-\frac{N^{-1}\frac{r-1}{r}\mathrm{Cov}\left[f_\alpha,f_\kappa\right]}{N^{-1}\frac{r-1}{r}\mathbb{V}\left[f_\kappa\right]}
=-\frac{\mathrm{Cov}\left[f_\alpha,f_\kappa\right]}{\mathbb{V}\left[f_\kappa\right]}
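To get a feel for this result, here is a small illustration (with an assumed correlation of 0.9, chosen arbitrarily) of how the ACV variance reduction approaches the exact CVMC reduction $1-\mathrm{Corr}\left[f_\alpha,f_\kappa\right]^2$ as the ratio $r$ grows.

corr = 0.9  # assumed correlation between f_alpha and f_kappa
for r in [2, 10, 100, 1000]:
    gamma = 1 - (r - 1)/r*corr**2
    print(f"r={r:4d}  gamma={gamma:.4f}  (CVMC limit {1 - corr**2:.4f})")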

Let us set up the problem and compute an ACV estimate of $\mathbb{E}[f_0]$.

import numpy as np
import matplotlib.pyplot as plt

from pyapprox.benchmarks import setup_benchmark

np.random.seed(1)
shifts = [.1, .2]
benchmark = setup_benchmark(
    "tunable_model_ensemble", theta1=np.pi/2*.95, shifts=shifts)
model = benchmark.fun
exact_integral_f0 = benchmark.means[0]

Before proceeding to estimate the mean using ACVMC we must first define how to generate samples to estimate $Q_{\alpha,N}$ and $\mu_{\kappa,N,r}$. To do so clearly we must first introduce some additional notation. Let $\mathcal{Z}_0$ be the set of samples used to evaluate the high-fidelity model and let $\mathcal{Z}_\alpha=\mathcal{Z}_{\alpha,1}\cup\mathcal{Z}_{\alpha,2}$ be the samples used to evaluate the low-fidelity model. Using this notation we can rewrite the ACV estimator as

Q^{\mathrm{ACV}}_{\alpha,\mathcal{Z}}=Q_{\alpha,\mathcal{Z}_0}+\eta\left(Q_{\kappa,\mathcal{Z}_{\alpha,1}}-\mu_{\kappa,\mathcal{Z}_{\alpha,2}}\right)

where $\mathcal{Z}=\bigcup_{\alpha=0}^{M}\mathcal{Z}_\alpha$. The nature of these sample sets can be changed to produce different ACV estimators. Here we choose $\mathcal{Z}_{\alpha,1}\cap\mathcal{Z}_{\alpha,2}=\emptyset$ and $\mathcal{Z}_{\alpha,1}=\mathcal{Z}_0$. That is, we use a common set of samples to compute the covariance between all the models and a second independent set to estimate the lower-fidelity mean. The sample partitioning for $M$ models is shown in the following figure. We call this scheme the ACV IS sampling strategy, where IS indicates that the second sample sets $\mathcal{Z}_{\alpha,2}$ assigned to each model are not shared.

[Figure: ACV IS sampling strategy]

The following code generates samples according to this strategy

from pyapprox import multifidelity
model_costs = 10.**(-np.arange(3))
est = multifidelity.get_estimator(
    "acvis", benchmark.model_covariance[:2, :2], model_costs[:2],
    benchmark.variable)
est.nsamples_per_model = np.array([10, 100])
samples_per_model, partition_indices_per_model = \
    est.generate_sample_allocations()

print(partition_indices_per_model)
samples_shared = samples_per_model[0]
samples_lf_only = samples_per_model[1][:, partition_indices_per_model[1] == 1]
[array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]), array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])]
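As a quick sanity check, the low-fidelity samples with partition index 0 should coincide with the high-fidelity sample set $\mathcal{Z}_0$, since the strategy above sets $\mathcal{Z}_{\alpha,1}=\mathcal{Z}_0$. This check assumes the zero-valued indices mark the shared partition, as the output above suggests.

# the low-fidelity samples in partition 0 should equal the shared set Z_0
shared_mask = partition_indices_per_model[1] == 0
assert np.allclose(samples_per_model[0],
                   samples_per_model[1][:, shared_mask])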

Now let's plot the samples assigned to each model.

fig, ax = plt.subplots()
ax.plot(samples_shared[0, :], samples_shared[1, :], 'ro', ms=12,
        label=r'$\mathrm{Low\ and\  high\  fidelity\  models}$')
ax.plot(samples_lf_only[0, :], samples_lf_only[1, :], 'ks',
        label=r'$\mathrm{Low\  fidelity\  model\ only}$')
ax.set_xlabel(r'$z_1$')
ax.set_ylabel(r'$z_2$', rotation=0)
_ = ax.legend(loc='upper left')
[Figure: the samples assigned to each model]

The high-fidelity model is only evaluated at the red dots. Now let's use these samples to estimate the mean of $f_0$.

values_per_model = []
for ii in range(len(samples_per_model)):
    values_per_model.append(model.models[ii](samples_per_model[ii]))

acv_mean = est.estimate_from_values_per_model(
    values_per_model, partition_indices_per_model)

print('MC difference squared =', (
    values_per_model[0].mean()-exact_integral_f0)**2)
print('ACVMC difference squared =', (acv_mean-exact_integral_f0)**2)
MC difference squared = 0.1663308327274871
ACVMC difference squared = 0.010697826122695035
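To connect this number back to the estimator formula, here is a minimal sketch that re-assembles an ACV estimate directly from the definition $Q_{\alpha,N}+\eta\left(Q_{\kappa,N}-\mu_{\kappa,N,r}\right)$, using the optimal weight $\eta=-\mathrm{Cov}\left[f_0,f_1\right]/\mathbb{V}\left[f_1\right]$ taken from the benchmark covariance. This follows the derivation above and is only illustrative; the internals of the 'acvis' estimator object may differ in detail.

# optimal control variate weight from the known model covariance (see above)
cov = benchmark.model_covariance[:2, :2]
eta = -cov[0, 1]/cov[1, 1]
hf_vals = values_per_model[0].flatten()   # f_0 evaluated on Z_0
lf_vals = values_per_model[1].flatten()   # f_1 evaluated on all its samples
lf_shared = lf_vals[partition_indices_per_model[1] == 0]  # f_1 on the shared set
manual_acv_mean = hf_vals.mean() + eta*(lf_shared.mean() - lf_vals.mean())
print('Manual ACV difference squared =',
      (manual_acv_mean - exact_integral_f0)**2)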

Note that here we have arbitrarily set the number of high-fidelity samples $N$ and the ratio $r$. In practice one should choose these in one of two ways: (i) for a fixed budget, choose the free parameters to minimize the variance of the estimator; or (ii) choose the free parameters to achieve a desired MSE (variance) with the smallest computational cost. Note that the cost of computing the two-model ACV estimator is

C_{\mathrm{CV}}=NC_\alpha+r_\kappa NC_\kappa
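For the values used below ($N=10$, $r_\kappa=10$, $C_\alpha=1$, $C_\kappa=0.1$) this gives $C_{\mathrm{CV}}=10\cdot 1+100\cdot 0.1=20$. The following is a minimal sketch (not the library's optimizer) of option (i): for a fixed budget, grid over the ratio $r$ and pick the value minimizing the estimator variance $\frac{\mathbb{V}\left[f_0\right]}{N}\left(1-\frac{r-1}{r}\mathrm{Corr}\left[f_0,f_1\right]^2\right)$ with $N=C_{\mathrm{CV}}/(C_\alpha+rC_\kappa)$; the budget and costs are the values just quoted, assumed here for illustration.

cov = benchmark.model_covariance[:2, :2]
corr_sq = cov[0, 1]**2/(cov[0, 0]*cov[1, 1])
c_hf, c_lf, budget = 1.0, 0.1, 20.0           # assumed costs and total budget
ratios = np.arange(2, 201)
nhf = budget/(c_hf + ratios*c_lf)             # (non-integer) high-fidelity sample counts
variances = cov[0, 0]/nhf*(1 - (ratios - 1)/ratios*corr_sq)
best = np.argmin(variances)
print('best ratio', ratios[best], 'estimator variance', variances[best])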

Now let's compute the variance reduction for different sample sizes.

from pyapprox import interface
model_ensemble = interface.ModelEnsemble(model.models[:2])
nhf_samples = 10
ntrials = 1000
nsample_ratios = np.array([10])
nsamples_per_model = np.hstack((1, nsample_ratios))*nhf_samples
target_cost = np.dot(model_costs[:2], nsamples_per_model)
means, numerical_var, true_var = \
    multifidelity.estimate_variance(
        model_ensemble, est, target_cost, ntrials, nsample_ratios)

print("Theoretical ACV variance", true_var)
print("Achieved ACV variance", numerical_var)
Theoretical ACV variance 0.014971109874541333
Achieved ACV variance 0.01617124648499629
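The theoretical value reported above can also be reproduced, at least approximately, from the variance-reduction formula derived earlier, $\frac{\mathbb{V}\left[f_0\right]}{N}\left(1-\frac{r-1}{r}\mathrm{Corr}\left[f_0,f_1\right]^2\right)$. This sketch assumes the estimator allocates $N=$ target_cost $/(C_0+rC_1)$ high-fidelity samples, which follows from the cost expression above.

cov = benchmark.model_covariance[:2, :2]
corr_sq = cov[0, 1]**2/(cov[0, 0]*cov[1, 1])
r = nsample_ratios[0]
nhf = target_cost/(model_costs[0] + r*model_costs[1])
print("Formula ACV variance", cov[0, 0]/nhf*(1 - (r - 1)/r*corr_sq))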

Let us also plot the distribution of these estimators

fig, ax = plt.subplots()
ax.hist(means[:, 0], bins=ntrials//100, density=True, alpha=0.5,
        label=r'$Q_{0, N}$')
ax.hist(means[:, 1], bins=ntrials//100, density=True, alpha=0.5,
        label=r'$Q_{0, N, %d}^\mathrm{CV}$' % nsample_ratios[0])

nsample_ratios = np.array([100])
nsamples_per_model = np.hstack((1, nsample_ratios))*nhf_samples
target_cost = np.dot(model_costs[:2], nsamples_per_model)
means, numerical_var, true_var = \
    multifidelity.estimate_variance(
        model_ensemble, est, target_cost, ntrials, nsample_ratios)
ax.hist(means[:, 1], bins=ntrials//100, density=True, alpha=0.5,
        label=r'$Q_{0, N, %d}^\mathrm{CV}$' % nsample_ratios[0])
ax.axvline(x=0, c='k', label=r'$E[Q_0]$')
_ = ax.legend(loc='upper left')
[Figure: histograms of the MC and ACVMC estimator distributions]

For a fixed number of high-fidelity evaluations $N$, the ACVMC variance reduction converges to the CVMC variance reduction as the sample ratio grows, since $\gamma=1-\frac{r-1}{r}\mathrm{Corr}\left[f_\alpha,f_\kappa\right]^2\to 1-\mathrm{Corr}\left[f_\alpha,f_\kappa\right]^2$ as $r\to\infty$. Try changing $N$.
