.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_auto_tutorials_foundations_plot_monte_carlo.py>` to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_tutorials_foundations_plot_monte_carlo.py:


Monte Carlo Quadrature
======================
This tutorial describes how to use Monte Carlo sampling to compute the expectation of the output of a model. The subsequent tutorials on multi-fidelity Monte Carlo methods assume the reader is familiar with the material presented here.

We can approximate the integral :math:`Q_\alpha` using Monte Carlo quadrature by drawing :math:`N` random samples of :math:`\rv` from :math:`\pdf`, evaluating the function at each of these samples to obtain the data pairs :math:`\{(\rv^{(n)},f^{(n)}_\alpha)\}_{n=1}^N`, where :math:`f^{(n)}_\alpha=f_\alpha(\rv^{(n)})`, and computing

.. math::

   Q_{\alpha,N}=N^{-1}\sum_{n=1}^N f^{(n)}_\alpha

The mean squared error (MSE) of this estimator can be expressed as

.. math::

   \mean{\left(Q_{\alpha,N}-\mean{Q}\right)^2}&=\mean{\left(Q_{\alpha,N}-\mean{Q_{\alpha,N}}+\mean{Q_{\alpha,N}}-\mean{Q}\right)^2}\\
   &=\mean{\left(Q_{\alpha,N}-\mean{Q_{\alpha,N}}\right)^2}+\mean{\left(\mean{Q_{\alpha,N}}-\mean{Q}\right)^2}\\
   &\qquad\qquad+\mean{2\left(Q_{\alpha,N}-\mean{Q_{\alpha,N}}\right)\left(\mean{Q_{\alpha,N}}-\mean{Q}\right)}\\
   &=\var{Q_{\alpha,N}}+\left(\mean{Q_{\alpha,N}}-\mean{Q}\right)^2

Here we used that :math:`Q_{\alpha,N}-\mean{Q_{\alpha,N}}` has zero mean while :math:`\mean{Q_{\alpha,N}}-\mean{Q}` is deterministic, so the cross term (the last expectation in the expansion) is zero. Now using

.. math::

   \var{Q_{\alpha,N}}=\var{N^{-1}\sum_{n=1}^N f^{(n)}_\alpha}=N^{-2}\sum_{n=1}^N \var{f^{(n)}_\alpha}=N^{-1}\var{Q_\alpha}

yields

.. math::

   \mean{\left(Q_{\alpha,N}-\mean{Q}\right)^2}=\underbrace{N^{-1}\var{Q_\alpha}}_{I}+\underbrace{\left(\mean{Q_{\alpha}}-\mean{Q}\right)^2}_{II}

From this expression we can see that the MSE can be decomposed into two terms: a so-called stochastic error (I) and a deterministic bias (II). The first term is the variance of the Monte Carlo estimator, which comes from using a finite number of samples. The second term is due to using an approximation of :math:`f`. These two errors should be balanced; however, in the vast majority of MC analyses a single model :math:`f_\alpha` is used and the choice of :math:`\alpha`, e.g. the mesh resolution, is made a priori without much concern for balancing the bias and variance. Given a fixed :math:`\alpha`, the modeler's only recourse for reducing the MSE is to reduce the variance of the estimator.

In the following we plot the variance of the MC estimate of a simple algebraic function :math:`f_1` which belongs to an ensemble of models

.. math::

   f_0(\rv) &= A_0 \left(\rv_1^5\cos\theta_0 +\rv_2^5\sin\theta_0\right), \\
   f_1(\rv) &= A_1 \left(\rv_1^3\cos\theta_1 +\rv_2^3\sin\theta_1\right)+s_1,\\
   f_2(\rv) &= A_2 \left(\rv_1 \cos\theta_2 +\rv_2 \sin\theta_2\right)+s_2

where :math:`\rv_1,\rv_2\sim\mathcal{U}(-1,1)` and all :math:`A` and :math:`\theta` coefficients are real. We set :math:`A_0=\sqrt{11}`, :math:`A_1=\sqrt{7}` and :math:`A_2=\sqrt{3}` so that each model has unit variance. The parameters :math:`s_1,s_2` control the bias between the models; here we set :math:`s_1=1/10` and :math:`s_2=1/5`. Similarly, we can change the correlation between the models in a systematic way by varying :math:`\theta_1`. We will leverage this later in the tutorial.

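The claim that these values of :math:`A_0`, :math:`A_1` and :math:`A_2` give unit variance can be checked directly: for :math:`\rv\sim\mathcal{U}(-1,1)` we have :math:`\mean{\rv^{2k}}=1/(2k+1)`, so :math:`\var{f_i}=A_i^2/(2k_i+1)=1`, where :math:`k_i=5,3,1` is the polynomial degree of each model, independently of :math:`\theta_i` and :math:`s_i`. The following minimal sketch verifies this with plain NumPy; the helper names and the illustrative angle used here are specific to this check and are not used elsewhere in the tutorial.

.. code-block:: default


    import numpy as np

    # Parameters of the model ensemble defined above (illustrative check only)
    A = [np.sqrt(11), np.sqrt(7), np.sqrt(3)]   # A_0, A_1, A_2
    shifts_check = [0., 1/10, 1/5]              # s_0=0, s_1, s_2
    powers = [5, 3, 1]                          # polynomial degree of each model
    theta = np.pi/2*0.95                        # illustrative angle; the variance does not depend on it

    np.random.seed(2)
    zz = np.random.uniform(-1, 1, (2, 100000))  # samples of (z_1, z_2)
    for ii in range(3):
        vals = A[ii]*(zz[0]**powers[ii]*np.cos(theta) +
                      zz[1]**powers[ii]*np.sin(theta))+shifts_check[ii]
        exact_var = A[ii]**2/(2*powers[ii]+1)   # A_i^2*E[z^(2k_i)] with E[z^(2k)]=1/(2k+1)
        print(f'f_{ii}: sample variance = {vals.var():.3f}, exact variance = {exact_var:.3f}')

Because :math:`\cos^2\theta_i+\sin^2\theta_i=1`, the variance is independent of the angles, which is what allows the correlation between the models to be varied without changing their variances.
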
Let's set up the problem.


.. code-block:: default


    import pyapprox as pya
    import numpy as np
    import matplotlib.pyplot as plt
    from pyapprox.tests.test_control_variate_monte_carlo import TunableModelEnsemble
    from scipy.stats import uniform

    np.random.seed(1)
    # uniform(loc,scale) in scipy.stats is the uniform variable on [loc,loc+scale],
    # so uniform(-1,2) is U(-1,1)
    univariate_variables = [uniform(-1,2),uniform(-1,2)]
    variable = pya.IndependentMultivariateRandomVariable(univariate_variables)
    print(variable)
    shifts = [.1,.2]
    model = TunableModelEnsemble(np.pi/2*.95,shifts=shifts)


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    I.I.D. Variable
    Number of variables: 2
    Unique variables and global id:
        uniform(loc=-1,scale=2): z0, z1


Now let us compute the mean of :math:`f_1` using Monte Carlo


.. code-block:: default


    nsamples = int(1e3)
    samples = pya.generate_independent_random_samples(
        variable,nsamples)
    values = model.m1(samples)
    pya.print_statistics(samples,values)


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

                    z0           z1           y0
    count  1000.000000  1000.000000  1000.000000
    mean      0.001209     0.036225     0.159514
    std       0.577001     0.586788     1.026603
    min      -0.999771    -0.998471    -2.522199
    25%      -0.496131    -0.491685    -0.212117
    50%       0.015002     0.067357     0.105057
    75%       0.501343     0.560416     0.588887
    max       0.994646     0.997041     2.854412


We can compute the exact mean of :math:`f_1` using Sympy and then the squared error of the MC estimate of the mean


.. code-block:: default


    import sympy as sp
    z1,z2 = sp.Symbol('z1'),sp.Symbol('z2')
    ranges = [-1,1,-1,1]
    # The cubic terms are odd and integrate to zero over [-1,1]^2, so only the
    # shift term, weighted by the joint density 1/4 of (z1,z2), contributes to the mean
    integrand_f1 = model.A1*(sp.cos(model.theta1)*z1**3+sp.sin(model.theta1)*z2**3)+shifts[0]*0.25
    exact_integral_f1 = float(
        sp.integrate(integrand_f1,(z1,ranges[0],ranges[1]),(z2,ranges[2],ranges[3])))

    print('MC difference squared =',(values.mean()-exact_integral_f1)**2)


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    MC difference squared = 0.0035419268772882277


.. _estimator-histogram:

Now let us compute the estimator for many different sample sets of sizes :math:`N=100` and :math:`N=1000` and plot the distribution of the MC estimator :math:`Q_{1,N}`


.. code-block:: default


    ntrials = 1000
    means = np.empty((ntrials,2))
    for ii in range(ntrials):
        samples = pya.generate_independent_random_samples(
            variable,nsamples)
        values = model.m1(samples)
        # estimators built from the first 100 samples and from all 1000 samples
        means[ii] = values[:100].mean(),values.mean()

    fig,ax = plt.subplots()
    textstr = '\n'.join([r'$\mathbb{E}[Q_{1,100}]=\mathrm{%.2e}$'%means[:,0].mean(),
                         r'$\mathbb{V}[Q_{1,100}]=\mathrm{%.2e}$'%means[:,0].var(),
                         r'$\mathbb{E}[Q_{1,1000}]=\mathrm{%.2e}$'%means[:,1].mean(),
                         r'$\mathbb{V}[Q_{1,1000}]=\mathrm{%.2e}$'%means[:,1].var()])
    ax.hist(means[:,0],bins=ntrials//100,density=True)
    ax.hist(means[:,1],bins=ntrials//100,density=True,alpha=0.5)
    ax.axvline(x=shifts[0],c='r',label=r'$\mathbb{E}[Q_1]$')
    ax.axvline(x=0,c='k',label=r'$\mathbb{E}[Q_0]$')
    props = {'boxstyle':'round','facecolor':'white','alpha':1}
    ax.text(0.65,0.8,textstr,transform=ax.transAxes,bbox=props)
    ax.set_xlabel(r'$\mathbb{E}[Q_N]$')
    ax.set_ylabel(r'$\mathbb{P}(\mathbb{E}[Q_N])$')
    _ = ax.legend(loc='upper left')


.. image:: /auto_tutorials/foundations/images/sphx_glr_plot_monte_carlo_001.png
    :alt: plot monte carlo
    :class: sphx-glr-single-img


The numerical results match our theory. Specifically, the estimator is unbiased, i.e. its mean matches :math:`\mean{Q_1}=s_1`, and the variance of the estimator is :math:`\var{Q_{1,N}}=\var{Q_{1}}/N=1/N`. The variance of the estimator can be driven to zero by increasing the number of samples :math:`N`. However, once the variance becomes smaller than the squared bias, i.e. :math:`\left(\mean{Q_{\alpha}}-\mean{Q}\right)^2>\var{Q_{\alpha}}/N`, the MSE will not decrease further and any additional samples used to reduce the variance are wasted.

Let our true model be :math:`f_0` above.

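With :math:`f_0` as the true model, :math:`\mean{Q_0}=0` by symmetry, so the squared bias of :math:`Q_{1,N}` is :math:`s_1^2=0.01` and the MSE plateaus at this value as :math:`N` grows. The following minimal sketch checks this empirically; it reuses the ``model`` and ``variable`` objects defined above, and the helper names (``ncheck_trials``, ``estimates``) are introduced only for this illustration.

.. code-block:: default


    # Illustrative sketch: empirical MSE of Q_{1,N} with respect to E[Q_0]=0.
    # The MSE tracks the variance term 1/N until it plateaus near the squared bias s_1^2=0.01.
    exact_mean_f0 = 0.  # E[Q_0]=0 by symmetry of f_0
    ncheck_trials = 200
    for N in [10, 100, 1000, 10000]:
        estimates = np.empty(ncheck_trials)
        for ii in range(ncheck_trials):
            samples = pya.generate_independent_random_samples(variable, N)
            estimates[ii] = model.m1(samples).mean()
        mse = np.mean((estimates-exact_mean_f0)**2)
        print(f'N={N:5d}: empirical MSE = {mse:.4f}, variance term 1/N = {1/N:.4f}')
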
The following code computes the bias induced by using :math:`f_\alpha=f_1` and also plots the contours of :math:`f_0(\rv)-f_1(\rv)`.


.. code-block:: default


    # f_0 is odd in both variables, so its exact mean over [-1,1]^2 is zero
    integrand_f0 = model.A0*(sp.cos(model.theta0)*z1**5+
                             sp.sin(model.theta0)*z2**5)*0.25
    exact_integral_f0 = float(
        sp.integrate(integrand_f0,(z1,ranges[0],ranges[1]),(z2,ranges[2],ranges[3])))

    bias = (exact_integral_f0-exact_integral_f1)**2
    print('MC f1 bias =',bias)
    # means holds the estimates for both N=100 and N=1000, so this is a pooled
    # variance over the two estimators
    print('MC f1 variance =',means.var())
    print('MC f1 MSE =',bias+means.var())

    fig,ax = plt.subplots()
    X,Y,Z = pya.get_meshgrid_function_data(
        lambda z: model.m0(z)-model.m1(z),[-1,1,-1,1],50)
    cset = ax.contourf(X, Y, Z, levels=np.linspace(Z.min(),Z.max(),20))
    _ = plt.colorbar(cset,ax=ax)
    #plt.show()


.. image:: /auto_tutorials/foundations/images/sphx_glr_plot_monte_carlo_002.png
    :alt: plot monte carlo
    :class: sphx-glr-single-img


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    MC f1 bias = 0.010000000000000018
    MC f1 variance = 0.005522183703329499
    MC f1 MSE = 0.015522183703329516


As :math:`N\to\infty` the MSE will not converge to zero but only to the squared bias (:math:`s_1^2`). Try this by increasing :math:`\texttt{nsamples}`. We can produce an unbiased estimator by sampling the high-fidelity model :math:`f_0` directly. However, if the high-fidelity model is more expensive, then for a fixed computational budget this comes at the cost of a larger estimator variance. To see this, the following plots the distributions of the MC estimators based on 100 samples of :math:`f_1` and on 10 samples of :math:`f_0`. The costs of constructing these two estimators are the same if the high-fidelity model is 10 times more expensive than the low-fidelity model.


.. code-block:: default


    ntrials = 1000
    m0_means = np.empty((ntrials,1))
    for ii in range(ntrials):
        samples = pya.generate_independent_random_samples(
            variable,nsamples)
        values = model.m0(samples)
        # high-fidelity estimator built from only 10 samples
        m0_means[ii] = values[:10].mean()

    fig,ax = plt.subplots()
    textstr = '\n'.join([r'$\mathbb{E}[Q_{1,100}]=\mathrm{%.2e}$'%means[:,0].mean(),
                         r'$\mathbb{V}[Q_{1,100}]=\mathrm{%.2e}$'%means[:,0].var(),
                         r'$\mathbb{E}[Q_{0,10}]=\mathrm{%.2e}$'%m0_means[:,0].mean(),
                         r'$\mathbb{V}[Q_{0,10}]=\mathrm{%.2e}$'%m0_means[:,0].var()])
    ax.hist(means[:,0],bins=ntrials//100,density=True)
    ax.hist(m0_means[:,0],bins=ntrials//100,density=True,alpha=0.5)
    ax.axvline(x=shifts[0],c='r',label=r'$\mathbb{E}[Q_1]$')
    ax.axvline(x=0,c='k',label=r'$\mathbb{E}[Q_0]$')
    props = {'boxstyle':'round','facecolor':'white','alpha':1}
    ax.text(0.65,0.8,textstr,transform=ax.transAxes,bbox=props)
    ax.set_xlabel(r'$\mathbb{E}[Q_N]$')
    ax.set_ylabel(r'$\mathbb{P}(\mathbb{E}[Q_N])$')
    _ = ax.legend(loc='upper left')
    #plt.show()


.. image:: /auto_tutorials/foundations/images/sphx_glr_plot_monte_carlo_003.png
    :alt: plot monte carlo
    :class: sphx-glr-single-img


In a series of tutorials starting with :ref:`sphx_glr_auto_tutorials_multi_fidelity_plot_control_variate_monte_carlo.py` we show how to use both of these models to produce an unbiased estimator with small variance.


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 1.132 seconds)


.. _sphx_glr_download_auto_tutorials_foundations_plot_monte_carlo.py:


.. only:: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example


  .. container:: sphx-glr-download sphx-glr-download-python

     :download:`Download Python source code: plot_monte_carlo.py <plot_monte_carlo.py>`


  .. container:: sphx-glr-download sphx-glr-download-jupyter

     :download:`Download Jupyter notebook: plot_monte_carlo.ipynb <plot_monte_carlo.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_