Adaptive Sparse Grids: Data-Driven Refinement for Anisotropic Functions

PyApprox Tutorial Library

Implement the greedy adaptive algorithm, configure admissibility criteria, visualize index set evolution, and compare adaptive vs. isotropic sparse grids on anisotropic functions.

Learning Objectives

After completing this tutorial, you will be able to:

  • Explain why adaptive refinement outperforms isotropic grids for anisotropic functions
  • Use the step_samples() / step_values() pattern with SingleFidelityAdaptiveSparseGridFitter
  • Understand the knapsack formulation of index set selection
  • Configure admissibility criteria: MaxLevelCriteria, Max1DLevelsCriteria, CompositeCriteria
  • Visualize the adaptive refinement process: index sets, point distributions, and convergence
  • Compare adaptive vs. isotropic convergence on anisotropic and 4D benchmark functions

Prerequisites

Complete Isotropic Sparse Grids before this tutorial. You should understand the Smolyak combination technique, downward-closed index sets, and the IsotropicSparseGridFitter API.

Setup

import numpy as np
import matplotlib.pyplot as plt

from pyapprox.util.backends.numpy import NumpyBkd
from pyapprox.surrogates.sparsegrids import (
    IsotropicSparseGridFitter,
    SingleFidelityAdaptiveSparseGridFitter,
    TensorProductSubspaceFactory,
)
from pyapprox.surrogates.sparsegrids.basis_factory import (
    ClenshawCurtisLagrangeFactory,
)
from pyapprox.surrogates.affine.indices import (
    ClenshawCurtisGrowthRule,
    MaxLevelCriteria,
    Max1DLevelsCriteria,
    CompositeCriteria,
)
from pyapprox.probability.univariate.uniform import UniformMarginal
from pyapprox.interface.functions.plot.plot2d_rectangular import (
    Plotter2DRectangularDomain,
)
from pyapprox.interface.functions.fromcallable.function import (
    FunctionFromCallable,
)
from pyapprox.tutorial_figures._quadrature import (
    plot_additive_function,
    plot_index_set,
    plot_points_compare,
    plot_adaptive_vs_iso,
    plot_4d_adaptive,
)

bkd = NumpyBkd()
np.random.seed(42)

Motivation: Anisotropy

The isotropic sparse grid treats all dimensions symmetrically: level \(l\) in dimension 1 pairs with level \(l\) in dimension 2. For anisotropic functions — where variation is concentrated in a subset of dimensions — this symmetry is wasteful. An adaptive algorithm that allocates more nodes to the high-variation dimensions and fewer to the low-variation ones will reach a given error tolerance with far fewer function evaluations.

Consider \(f(z_1, z_2) = \cos(5z_1) + 0.1 z_2\). This function varies much more in \(z_1\) than in \(z_2\). The isotropic grid places equal resources in both dimensions; an adaptive grid should concentrate almost all nodes along the \(z_1\) axis.

def additive_function(samples, bkd):
    """f(z1, z2) = cos(5*z1) + 0.1*z2. Shape: (1, nsamples)."""
    z1, z2 = samples[0, :], samples[1, :]
    values = bkd.cos(5.0 * z1) + 0.1 * z2
    return bkd.reshape(values, (1, -1))

func_2d = FunctionFromCallable(
    nqoi=1, nvars=2,
    fun=lambda s: additive_function(s, bkd),
    bkd=bkd,
)

fig, ax = plt.subplots(figsize=(7, 5))
plotter = Plotter2DRectangularDomain(func_2d, [-1, 1, -1, 1])
plot_additive_function(ax, plotter)
plt.tight_layout()
plt.show()
Figure 1: The additive benchmark f(z_1,z_2) = cos(5z_1) + 0.1z_2. Variation along z_1 dominates; the adaptive algorithm should detect this and refine preferentially in z_1.

The Adaptive Algorithm

Finding the optimal downward-closed index set can be cast as a binary knapsack problem. Let \(\Delta E_\beta\) and \(\Delta W_\beta\) be the reduction in approximation error and the computational cost, respectively, of adding sub-grid \(\beta\):

\[ \max_{\delta} \sum_{\beta} \Delta E_\beta\, \delta_\beta \quad \text{subject to} \quad \sum_{\beta} \Delta W_\beta\, \delta_\beta \leq W_{\max}, \quad \delta_\beta \in \{0, 1\}. \]

Solving this exactly is intractable, so PyApprox uses a greedy adaptive procedure:

  1. Begin with \(\mathcal{I} = \{\mathbf{0}\}\).
  2. Identify active indices — candidates satisfying downward closure that are not yet in \(\mathcal{I}\).
  3. Evaluate \(f\) at all points associated with active indices.
  4. Compute an error indicator \(\Delta E_\beta\) for each active index (e.g., the variance of the subspace contribution).
  5. Add the active index with the largest indicator to \(\mathcal{I}\).
  6. Update the active set. Repeat until the budget is exhausted.
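The loop above can be sketched in plain Python. This is a toy illustration, not PyApprox's implementation: `downward_closed_candidates` and the separable `error_indicator` are hypothetical stand-ins for the library's internals, with the indicator decaying much faster in dimension 2 to mimic an anisotropic function.

```python
def downward_closed_candidates(selected):
    """Forward neighbours of the selected set whose backward
    neighbours are all already selected (downward closure)."""
    cands = set()
    for idx in selected:
        for d in range(len(idx)):
            cand = tuple(idx[k] + (k == d) for k in range(len(idx)))
            backs = [tuple(cand[k] - (k == d2) for k in range(len(cand)))
                     for d2 in range(len(cand)) if cand[d2] > 0]
            if all(b in selected for b in backs):
                cands.add(cand)
    return cands - selected

def error_indicator(beta):
    """Toy separable indicator: sub-grid contributions decay much
    faster in dimension 2 (anisotropy)."""
    return 0.5 ** beta[0] * 0.1 ** beta[1]

selected = {(0, 0)}
for _ in range(5):  # fixed refinement budget
    active = downward_closed_candidates(selected)
    if not active:
        break
    selected.add(max(active, key=error_indicator))

print(sorted(selected))
# [(0, 0), (0, 1), (1, 0), (2, 0), (3, 0), (4, 0)]
```

Note how the greedy loop refines dimension 1 four times before touching dimension 2 at all, exactly the behaviour the tutorial demonstrates on the real benchmark below.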

The Step Pattern

SingleFidelityAdaptiveSparseGridFitter uses a step pattern for incremental construction:

while True:
    new_samples = fitter.step_samples()  # Get new sample points
    if new_samples is None:
        break  # All admissible indices exhausted
    new_values = my_function(new_samples)
    fitter.step_values(new_values)        # Provide function values

Each call to step_samples() returns the points for the next candidate subspace. After evaluating the function at those points and passing the values back via step_values(), the algorithm decides which subspace to add based on the error indicator.

nvars = 2
marginal = UniformMarginal(-1.0, 1.0, bkd)
factories = [ClenshawCurtisLagrangeFactory(marginal, bkd) for _ in range(nvars)]
growth = ClenshawCurtisGrowthRule()

# Admissibility: maximum total level (set high so it does not activate)
admissibility = MaxLevelCriteria(max_level=20, pnorm=1.0, bkd=bkd)

# Create adaptive sparse grid fitter
tp_factory = TensorProductSubspaceFactory(bkd, factories, growth)
fitter = SingleFidelityAdaptiveSparseGridFitter(bkd, tp_factory, admissibility)

# Test samples for error estimation
n_test = 2000
test_samples = bkd.asarray(np.random.uniform(-1, 1, (nvars, n_test)))
true_values = additive_function(test_samples, bkd)

# Track convergence --- stop when error estimate drops below tolerance
nsamples_history = []
error_history = []
tol = 1e-12

for step in range(100):
    new_samples = fitter.step_samples()
    if new_samples is None:
        break

    new_values = additive_function(new_samples, bkd)
    fitter.step_values(new_values)

    # Track approximation error on test set
    ada_result = fitter.result()
    if ada_result.nsamples > 1:
        approx_values = ada_result.surrogate(test_samples)
        error = float(
            bkd.sqrt(bkd.sum((true_values - approx_values) ** 2))
            / np.sqrt(n_test)
        )
        nsamples_history.append(ada_result.nsamples)
        error_history.append(error)

    # Stop when the fitter's built-in error estimate is small
    if fitter.current_error() < tol:
        break

ada_result = fitter.result()
fig, ax = plt.subplots(figsize=(8, 5))
ax.semilogy(nsamples_history, error_history, "o-", color="#2C7FB8", lw=2, ms=5)
ax.set_xlabel("Number of samples", fontsize=12)
ax.set_ylabel("RMS Error", fontsize=12)
ax.set_title("Adaptive Sparse Grid Convergence", fontsize=12)
ax.grid(True, which="both", alpha=0.3)
plt.tight_layout()
plt.show()
Figure 2: Adaptive sparse grid convergence on the additive function. The error decreases rapidly as the algorithm discovers and refines the important dimension.

Visualizing the Selected Index Set

Because \(f\) is additive (\(f = g_1(z_1) + g_2(z_2)\)), no mixed terms are needed: the ideal index set consists only of axis-aligned indices \((k, 0)\) and \((0, k)\). The adaptive algorithm discovers both the additive structure (no cross-indices) and the anisotropy (more levels in \(z_1\) than \(z_2\)):

selected = ada_result.indices
sel_np = bkd.to_numpy(selected).astype(int)

fig, ax = plt.subplots(figsize=(8, 6))
plot_index_set(ax, sel_np)
plt.tight_layout()
plt.show()
Figure 3: Selected index set after adaptive refinement. The algorithm allocates far more levels along z_1 (the dominant dimension) than z_2, demonstrating automatic anisotropy detection.

Notice that the index set contains only axis-aligned indices — no cross-terms like \((2, 1)\) or \((1, 2)\) — reflecting the additive structure of \(f\). It also extends much further in \(k_1\) than in \(k_2\), reflecting the much stronger dependence on \(z_1\).
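This property is easy to check programmatically: an index set is free of cross-terms exactly when every multi-index has at most one nonzero entry. The sketch below assumes the indices are stored column-wise (one multi-index per column, as in sel_np above) and uses a small hand-written set for illustration.

```python
import numpy as np

# Hypothetical column-wise index set, mimicking the axis-aligned
# pattern the adaptive algorithm recovers for an additive function.
indices = np.array([[0, 1, 2, 3, 4, 0],
                    [0, 0, 0, 0, 0, 1]])

# No cross-terms: each multi-index has at most one nonzero entry.
axis_aligned = (indices != 0).sum(axis=0) <= 1
print(bool(axis_aligned.all()))  # True
```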

Visualizing Grid Points

# Adaptive points --- selected vs candidate
ada_selected = bkd.to_numpy(fitter.get_samples("selected"))
ada_candidate = bkd.to_numpy(fitter.get_samples("candidate"))

# Isotropic with comparable point count
iso_level = 4
iso_tp_factory = TensorProductSubspaceFactory(bkd, factories, growth)
iso_fitter = IsotropicSparseGridFitter(bkd, iso_tp_factory, iso_level)
iso_samples = bkd.to_numpy(iso_fitter.get_samples())
iso_nsamples = iso_samples.shape[1]

fig, axes = plt.subplots(1, 2, figsize=(12, 5))
plot_points_compare(
    axes, ada_selected, ada_candidate, ada_result.nsamples,
    iso_samples, iso_nsamples, iso_level,
)
plt.tight_layout()
plt.show()
Figure 4: Point distribution for adaptive (left) vs. isotropic (right) sparse grids. The adaptive grid concentrates more points along the first variable. Blue circles are points from selected subspaces; light blue squares are points from candidate subspaces awaiting prioritization.

Convergence Comparison: Adaptive vs. Isotropic

# Isotropic convergence
iso_nsamples_list, iso_errors = [], []

for level in range(7):
    iso_tp_fac = TensorProductSubspaceFactory(bkd, factories, growth)
    iso_fit = IsotropicSparseGridFitter(bkd, iso_tp_fac, level)
    iso_samp = iso_fit.get_samples()
    iso_vals = additive_function(iso_samp, bkd)
    iso_res = iso_fit.fit(iso_vals)

    approx = iso_res.surrogate(test_samples)
    error = float(
        bkd.sqrt(bkd.sum((true_values - approx) ** 2)) / np.sqrt(n_test)
    )
    iso_nsamples_list.append(iso_samp.shape[1])
    iso_errors.append(error)

fig, ax = plt.subplots(figsize=(8, 5))
plot_adaptive_vs_iso(ax, nsamples_history, error_history,
                     iso_nsamples_list, iso_errors)
plt.tight_layout()
plt.show()
Figure 5: Adaptive vs. isotropic convergence on the additive benchmark. Adaptive refinement reaches the same accuracy with fewer points by concentrating resources on the important dimension.

Admissibility Criteria

The admissibility criteria determine which indices are allowed as candidates for refinement.

  • MaxLevelCriteria(l, pnorm): admits \(\beta\) with \(\lVert\beta\rVert_p \leq l\) (total level budget)
  • Max1DLevelsCriteria(max_levels): admits \(\beta\) with \(\beta_i \leq l_i\) for all \(i\) (per-dimension cap)
  • CompositeCriteria(c1, c2): admits \(\beta\) only if both \(c_1(\beta)\) and \(c_2(\beta)\) hold (logical AND)

# Per-dimension maximum levels: more refinement in z1 than z2
max_levels = bkd.asarray([8, 4])
per_dim = Max1DLevelsCriteria(max_levels, bkd=bkd)

# Combine with total level budget
combined = CompositeCriteria(
    MaxLevelCriteria(max_level=10, pnorm=1.0, bkd=bkd),
    per_dim,
)

# Build adaptive grid with combined criteria
custom_tp_factory = TensorProductSubspaceFactory(bkd, factories, growth)
custom_fitter = SingleFidelityAdaptiveSparseGridFitter(
    bkd, custom_tp_factory, combined,
)

for _ in range(25):
    s = custom_fitter.step_samples()
    if s is None:
        break
    custom_fitter.step_values(additive_function(s, bkd))
    if custom_fitter.current_error() < 1e-12:
        break

custom_result = custom_fitter.result()

4D Adaptive Example

Adaptive refinement pays larger dividends in higher dimensions, especially when the function depends strongly on only a few inputs.

nvars4 = 4

def fun4d_additive(samples, bkd):
    """sin(3*z1) + 0.5*z2 + 0.1*z3 + 0.1*z4. Shape: (1, nsamples)."""
    z1, z2, z3, z4 = samples[0], samples[1], samples[2], samples[3]
    values = bkd.sin(3 * z1) + 0.5 * z2 + 0.1 * z3 + 0.1 * z4
    return bkd.reshape(values, (1, -1))

facs_4d = [ClenshawCurtisLagrangeFactory(marginal, bkd) for _ in range(nvars4)]

# Test samples
test_4d = bkd.asarray(np.random.uniform(-1, 1, (nvars4, 2000)))
true_4d = fun4d_additive(test_4d, bkd)

# Adaptive grid with error-based stopping
adm_4d = MaxLevelCriteria(max_level=20, pnorm=1.0, bkd=bkd)
tp_fac_4d = TensorProductSubspaceFactory(bkd, facs_4d, growth)
adaptive_4d = SingleFidelityAdaptiveSparseGridFitter(bkd, tp_fac_4d, adm_4d)

adapt_nsamples, adapt_errors = [], []
for step in range(100):
    s = adaptive_4d.step_samples()
    if s is None:
        break
    adaptive_4d.step_values(fun4d_additive(s, bkd))
    r4 = adaptive_4d.result()
    if r4.nsamples > 1:
        approx = r4.surrogate(test_4d)
        err = float(
            bkd.sqrt(bkd.sum((true_4d - approx) ** 2)) / np.sqrt(2000)
        )
        adapt_nsamples.append(r4.nsamples)
        adapt_errors.append(err)
    if adaptive_4d.current_error() < 1e-12:
        break

# Isotropic comparison
iso_nsamples_4d, iso_errors_4d = [], []
for level in range(5):
    iso_tp_fac4 = TensorProductSubspaceFactory(bkd, facs_4d, growth)
    iso_fit4 = IsotropicSparseGridFitter(bkd, iso_tp_fac4, level)
    iso_samp4 = iso_fit4.get_samples()
    iso_res4 = iso_fit4.fit(fun4d_additive(iso_samp4, bkd))
    approx = iso_res4.surrogate(test_4d)
    err = float(
        bkd.sqrt(bkd.sum((true_4d - approx) ** 2)) / np.sqrt(2000)
    )
    iso_nsamples_4d.append(iso_samp4.shape[1])
    iso_errors_4d.append(err)

fig, ax = plt.subplots(figsize=(8, 5))
plot_4d_adaptive(ax, adapt_nsamples, adapt_errors,
                 iso_nsamples_4d, iso_errors_4d)
plt.tight_layout()
plt.show()
Figure 6: 4D adaptive vs. isotropic convergence for a function with one dominant variable. The adaptive algorithm discovers the important dimension and converges much faster.

Summary

  • Adaptive sparse grids discover the structure of \(f\) from function evaluations alone: for additive functions the algorithm selects only axis-aligned indices (no cross-terms), and for anisotropic functions it allocates more levels to the important dimensions.
  • The greedy adaptive algorithm selects the index with the largest variance indicator at each step, approximating the solution to a knapsack optimization problem.
  • For strongly anisotropic functions, adaptive grids can achieve the same accuracy as isotropic grids at a fraction of the cost — the advantage grows with the degree of anisotropy.
  • Admissibility criteria (MaxLevelCriteria, Max1DLevelsCriteria, CompositeCriteria) provide flexible budget and safety constraints on the index set.
  • The step-based API (step_samples() / step_values()) allows fine-grained control; use fitter.current_error() to stop when the estimated error drops below a tolerance.
  • For UQ with sparse grids (moments, Sobol indices, marginal densities), see UQ with Sparse Grids.

Exercises

  1. Anisotropy detection. Run the adaptive algorithm on \(f(z_1, z_2, z_3) = \cos(4z_1) + 0.1 z_2 + 0.01 z_3\). After convergence, visualize the index set. Does the algorithm correctly rank dimensions by importance?

  2. Budget comparison. For the 2D additive benchmark, compare adaptive and isotropic grids at 50, 100, and 200 evaluations. At what budget does adaptive become clearly superior?

  3. Criteria impact. Build an adaptive grid with Max1DLevelsCriteria limiting \(z_2\) to level 2. Compare the resulting index set and convergence to the unconstrained case.

  4. Statistics. For the 2D benchmark, track \(\mathbb{E}[f]\) at each step of the adaptive algorithm and plot its convergence to the analytical value \(\sin(5)/5\). Compare with a Monte Carlo estimate using the same number of evaluations.
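For reference in Exercise 4, the analytical mean follows from the uniform density \(1/2\) on each \([-1, 1]\) (the odd term \(0.1 z_2\) integrates to zero):

\[ \mathbb{E}[f] = \frac{1}{4}\int_{-1}^{1}\!\int_{-1}^{1} \big(\cos(5 z_1) + 0.1 z_2\big)\, dz_1\, dz_2 = \frac{1}{2}\int_{-1}^{1} \cos(5 z_1)\, dz_1 = \frac{\sin 5}{5}. \]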

Next Steps