Monte Carlo Quadrature
This tutorial describes how to use Monte Carlo sampling to compute the expectation of the output of a model \(f(\rv):\reals^{D}\to\reals\) parameterized by a set of variables \(\rv=[\rv_1,\ldots,\rv_D]^\top\) with joint density \(\pdf(\rv):\reals^{D}\to\reals\). Specifically, our goal is to approximate the integral

\[Q = \int_{\reals^D} f(\rv)\pdf(\rv)\,d\rv\]

using Monte Carlo quadrature applied to an approximation \(f_\alpha\) of the function \(f\), e.g. a finite element approximation to the solution of a set of governing equations, where \(\alpha\) controls the accuracy of the approximation.
Monte Carlo quadrature approximates the integral

\[Q_\alpha = \int_{\reals^D} f_\alpha(\rv)\pdf(\rv)\,d\rv\]

by drawing \(N\) random samples \(\rvset_N\) of \(\rv\) from \(\pdf\), evaluating the function at each of these samples to obtain the data pairs \(\{(\rv^{(n)},f^{(n)}_\alpha)\}_{n=1}^N\), where \(f^{(n)}_\alpha=f_\alpha(\rv^{(n)})\), and computing

\[Q_{\alpha}(\rvset_N) = N^{-1}\sum_{n=1}^N f^{(n)}_\alpha.\]
This estimate of the mean is itself a random quantity, which we call an estimator, because its value depends on the realizations of the inputs \(\rvset_N\) used to compute \(Q_{\alpha}(\rvset_N)\). Specifically, using two different sets \(\rvset_N\) will produce two different values.
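To see this concretely, here is a minimal sketch in plain NumPy, using a hypothetical toy integrand (not the benchmark introduced below): two independent sample sets yield two different estimates of the same mean.

import numpy as np

rng = np.random.default_rng(0)

def f(z):
    # hypothetical toy integrand; its exact mean over U(-1, 1)^2 is zero
    return z[0]**3 + z[1]

N = 100
# two independent realizations of the sample set Z_N
samples_a = rng.uniform(-1, 1, (2, N))
samples_b = rng.uniform(-1, 1, (2, N))
# the corresponding estimator values Q(Z_N) differ
print(f(samples_a).mean(), f(samples_b).mean())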
To demonstrate this phenomenon, we will estimate the mean of a simple algebraic function \(f_0\) which belongs to an ensemble of models

\begin{align*}
f_0(\rv) &= A_0\left(\rv_1^5\cos\theta_0 + \rv_2^5\sin\theta_0\right),\\
f_1(\rv) &= A_1\left(\rv_1^3\cos\theta_1 + \rv_2^3\sin\theta_1\right)+s_1,\\
f_2(\rv) &= A_2\left(\rv_1\cos\theta_2 + \rv_2\sin\theta_2\right)+s_2,
\end{align*}

where \(\rv_1,\rv_2\sim\mathcal{U}(-1,1)\) and all \(A\) and \(\theta\) coefficients are real. We choose to set \(A_0=\sqrt{11}\), \(A_1=\sqrt{7}\) and \(A_2=\sqrt{3}\) to obtain unit variance for each model. The parameters \(s_1,s_2\) control the bias between the models. Here we set \(s_1=1/10,s_2=1/5\). Similarly, we can change the correlation between the models in a systematic way (by varying \(\theta_1\)). We will leverage this later in the tutorial.
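As a quick sanity check, the sketch below implements these three models directly in NumPy and verifies empirically that each has approximately unit variance. The values of theta0 and theta2 are arbitrary illustrative choices (any values leave the variance unchanged), while theta1 matches the value used with the benchmark below.

import numpy as np

rng = np.random.default_rng(0)
A0, A1, A2 = np.sqrt(11), np.sqrt(7), np.sqrt(3)
s1, s2 = 1/10, 1/5
# theta0 and theta2 are illustrative choices; theta1 matches the benchmark
theta0, theta1, theta2 = np.pi/2, np.pi/2*0.95, np.pi/6

z = rng.uniform(-1, 1, (2, 100000))
f0 = A0*(z[0]**5*np.cos(theta0) + z[1]**5*np.sin(theta0))
f1 = A1*(z[0]**3*np.cos(theta1) + z[1]**3*np.sin(theta1)) + s1
f2 = A2*(z[0]*np.cos(theta2) + z[1]*np.sin(theta2)) + s2
# each sample variance should be close to 1
print(f0.var(), f1.var(), f2.var())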
First, set up the example
import numpy as np
import matplotlib.pyplot as plt
from pyapprox.benchmarks import setup_benchmark
np.random.seed(1)
shifts = [.1, .2]
benchmark = setup_benchmark(
    "tunable_model_ensemble", theta1=np.pi/2*.95, shifts=shifts)
Now define a function that computes MC estimates of the mean using different sample sets \(\rvset_N\) and plots the distribution of the MC estimator \(Q_{\alpha}(\rvset_N)\), computed from 1000 different sets, together with the exact value of the mean \(Q_{\alpha}\)
def plot_estimator_histogram(nsamples, model_id, ax):
    ntrials = 1000
    np.random.seed(1)
    means = np.empty(ntrials)
    model = benchmark.funs[model_id]
    for ii in range(ntrials):
        samples = benchmark.variable.rvs(nsamples)
        values = model(samples)
        means[ii] = values.mean()
    im = ax.hist(means, bins=ntrials//100, density=True, alpha=0.3,
                 label=r'$Q_{%d}(\mathcal{Z}_{%d})$' % (model_id, nsamples))[2]
    ax.axvline(x=benchmark.fun.get_means()[model_id], alpha=1,
               label=r'$Q_{%d}$' % model_id, c=im[0].get_facecolor())
Now let's plot the histogram of the MC estimator \(Q_{0}(\rvset_N)\) using \(N=100\) samples
nsamples = int(1e2)
model_id = 0
ax = plt.subplots(1, 1, figsize=(8, 6))[1]
plot_estimator_histogram(nsamples, model_id, ax)
_ = ax.legend()
The variability of the MC estimator as we change \(\rvset_N\) decreases as we increase \(N\). To see this, plot the estimator histogram using \(N=1000\) samples
model_id = 0
ax = plt.subplots(1, 1, figsize=(8, 6))[1]
nsamples = int(1e2)
plot_estimator_histogram(nsamples, model_id, ax)
nsamples = int(1e3)
plot_estimator_histogram(nsamples, model_id, ax)
_ = ax.legend()
Regardless of the value of \(N\), the estimator \(Q_{0}(\rvset_N)\) is an unbiased estimate of \(Q_{0}\), that is

\[\mean{Q_{0}(\rvset_N)}=Q_0.\]
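A quick empirical check of unbiasedness, reusing the benchmark defined above: averaging many realizations of \(Q_0(\rvset_{100})\) recovers the exact mean to within sampling noise.

np.random.seed(1)
ntrials, nsamples = 1000, 100
model = benchmark.funs[0]
estimates = np.array([
    model(benchmark.variable.rvs(nsamples)).mean() for _ in range(ntrials)])
# the average of many estimator realizations approaches the exact mean Q_0
print(estimates.mean(), benchmark.fun.get_means()[0])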
Unfortunately, if the computational cost of evaluating a model is high, then one may not be able to make \(N\) large using that model. Consequently, one cannot place much trust in the MC estimate of the mean, because any one realization of the estimator, computed using a single sample set, may be very far from the truth. For this reason a cheaper, less accurate model is often used so that \(N\) can be increased to reduce the variability of the estimator. The following compares the histograms of \(Q_0(\rvset_{100})\) and \(Q_1(\rvset_{1000})\), where the latter uses the model \(f_1\), which we assume is a cheap approximation of \(f_0\)
model_id = 0
ax = plt.subplots(1, 1, figsize=(8, 6))[1]
nsamples = int(1e2)
plot_estimator_histogram(nsamples, model_id, ax)
model_id = 1
nsamples = int(1e3)
plot_estimator_histogram(nsamples, model_id, ax)
_ = ax.legend()
However, using an approximate model means that the MC estimator is no longer unbiased: the histogram of \(Q_1(\rvset_{1000})\) is no longer centered on the mean of \(f_0\), \(Q_0\).
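The benchmark provides the exact mean of each model (via the get_means method already used in the plotting function), so the bias incurred by swapping \(f_0\) for \(f_1\) can be computed directly:

# exact means of each model; the difference Q_1 - Q_0 is the bias
exact_means = benchmark.fun.get_means()
print(exact_means[0], exact_means[1], exact_means[1] - exact_means[0])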
Letting \(Q\) denote the true mean we want to estimate, e.g. \(Q=Q_0\) in the example we have used so far, the mean squared error (MSE) is typically used to quantify the quality of an MC estimator. The MSE can be expressed as

\begin{align*}
\mean{\left(Q_{\alpha}(\rvset_N)-Q\right)^2}&=\mean{\left(Q_{\alpha}(\rvset_N)-\mean{Q_{\alpha}(\rvset_N)}+\mean{Q_{\alpha}(\rvset_N)}-Q\right)^2}\\
&=\mean{\left(Q_{\alpha}(\rvset_N)-\mean{Q_{\alpha}(\rvset_N)}\right)^2}+\left(\mean{Q_{\alpha}(\rvset_N)}-Q\right)^2\\
&\qquad+2\,\mean{\left(Q_{\alpha}(\rvset_N)-\mean{Q_{\alpha}(\rvset_N)}\right)}\left(\mean{Q_{\alpha}(\rvset_N)}-Q\right)\\
&=\var{Q_{\alpha}(\rvset_N)}+\left(Q_{\alpha}-Q\right)^2,
\end{align*}

where the expectation \(\mathbb{E}\) and variance \(\mathbb{V}\) are taken over different realizations of the sample set \(\rvset_N\). The third term on the second line is zero because \(\mean{\left(Q_{\alpha}(\rvset_N)-\mean{Q_{\alpha}(\rvset_N)}\right)}=0\), and we used \(\mean{Q_{\alpha}(\rvset_N)}=Q_{\alpha}\) to obtain the final equality.
Now using the well-known result that, for random variables \(X_n\),

\[\var{\sum_{n=1}^N X_n}=\sum_{n=1}^N\var{X_n}+\sum_{n=1}^N\sum_{\substack{p=1\\ p\neq n}}^N\covar{X_n}{X_p},\]

and the result that, for a scalar \(a\),

\[\var{aX}=a^2\var{X},\]

yields

\[\var{Q_{\alpha}(\rvset_N)}=\var{N^{-1}\sum_{n=1}^N f^{(n)}_\alpha}=N^{-2}\sum_{n=1}^N\var{f^{(n)}_\alpha}=N^{-1}\var{f_\alpha},\]

where \(\covar{f^{(n)}_\alpha}{f^{(p)}_\alpha}=0,\ n\neq p,\) because the samples are drawn independently.
Finally, substituting \(\var{Q_{\alpha}(\rvset_N)}\) into the expression for the MSE yields

\[\mean{\left(Q_{\alpha}(\rvset_N)-Q\right)^2}=\underbrace{N^{-1}\var{f_\alpha}}_{\text{I}}+\underbrace{\left(Q_{\alpha}-Q\right)^2}_{\text{II}}.\]
From this expression we can see that the MSE can be decomposed into two terms: a so-called stochastic error (I) and a deterministic bias (II). The first term is the variance of the Monte Carlo estimator, which comes from using a finite number of samples. The second term is due to using an approximation of \(f_0\). These two errors should be balanced; however, in the vast majority of MC analyses a single model \(f_\alpha\) is used, and the choice of \(\alpha\), e.g. mesh resolution, is made a priori without much concern for balancing bias and variance.
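To connect term (I) back to the experiments above, the following sketch compares the sample variance of 1000 realizations of \(Q_0(\rvset_{100})\) with \(N^{-1}\var{f_0}=1/100\) (recall the models were scaled to have unit variance).

np.random.seed(1)
ntrials, nsamples = 1000, 100
model = benchmark.funs[0]
estimates = np.array([
    model(benchmark.variable.rvs(nsamples)).mean() for _ in range(ntrials)])
# term (I): the estimator variance should be close to var(f_0)/N = 1/100
print(estimates.var(), 1/nsamples)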
Video

A video tutorial on Monte Carlo quadrature accompanies this example.