Pilot Studies

PyApprox Tutorial Library

The bootstrapping problem at the heart of multi-fidelity estimation: you need the model covariance to plan the experiment, but computing the covariance requires running the models — and spending budget.

Learning Objectives

After completing this tutorial, you will be able to:

  • Articulate the bootstrapping problem: why any ACV estimator requires pilot data before its optimal allocation can be computed
  • Explain the two competing costs of a pilot study: covariance estimation error and budget consumed
  • Sketch the MSE-vs-pilot-size curve and identify its characteristic U-shape when pilot cost is accounted for
  • State the practical rule of thumb for pilot sizing on the polynomial benchmark

Prerequisites

Complete any of the concept tutorials and review the API Cookbook before this tutorial. Pilot studies are the final practical ingredient needed to run the full multi-fidelity workflow on a real problem where population statistics are unknown.

The Bootstrapping Problem

Every ACV estimator we have built so far has one hidden assumption: the model covariance matrix \(\boldsymbol{\Sigma}\) is known. This assumption is used in two places:

  1. Sample allocation: allocate_samples(P) solves an optimisation problem that depends on \(\boldsymbol{\Sigma}\). Without it, you cannot determine how many high-fidelity (HF) and low-fidelity (LF) samples to take.
  2. Control variate coefficients: the optimal weights \(\mathbf{H}^*\) or \(\boldsymbol{\alpha}^*\) depend on \(\boldsymbol{\Sigma}\).

In all previous tutorials, \(\boldsymbol{\Sigma}\) was supplied via set_pilot_quantities using the population value from the benchmark. In practice, \(\boldsymbol{\Sigma}\) is unknown — if you already knew the moments of \(f_0\), you wouldn’t need to estimate them.

The solution is a pilot study: evaluate all \(M+1\) models at a small shared set of \(N_p\) samples, use the resulting data to estimate \(\boldsymbol{\Sigma}\), then use that estimate to plan and run the main estimator. This introduces a circular dependency: you need moments to plan the experiment, but computing moments requires running the experiment.
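The pilot idea is easy to sketch with plain NumPy. The three polynomial models below are illustrative stand-ins, not the benchmark's actual definitions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Three illustrative fidelity levels of one quantity of interest
# (hypothetical stand-ins, not PyApprox's benchmark models).
def f0(x): return x**5 + x**2   # high fidelity
def f1(x): return x**5          # medium fidelity
def f2(x): return x**3          # low fidelity

N_p = 20                              # pilot size
x_pilot = rng.uniform(-1, 1, N_p)     # shared pilot inputs

# Evaluate every model at the SAME pilot samples (paired evaluations).
pilot_values = np.vstack([f0(x_pilot), f1(x_pilot), f2(x_pilot)])

# Sample covariance estimate Sigma_hat, shape (3, 3).
cov_hat = np.cov(pilot_values)
print(cov_hat.shape)   # (3, 3)
```

Note that `np.cov` uses the unbiased \(1/(N_p-1)\) normalisation by default; either convention is fine for planning purposes.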

Two Competing Costs

The pilot study involves a genuine trade-off between two costs:

Cost 1: Covariance estimation error. A small pilot (\(N_p\) small) produces a noisy \(\hat{\boldsymbol{\Sigma}}\). The sample allocation and control variate coefficients derived from \(\hat{\boldsymbol{\Sigma}}\) are sub-optimal. The resulting estimator has higher variance than the oracle (population-covariance) version.

Cost 2: Budget consumed. The pilot uses real compute. If the total budget is \(P\) and the pilot costs \(P_p = N_p \sum_\alpha C_\alpha\), then only \(P - P_p\) remains for the main estimator. A large pilot leaves too little budget for the estimator itself.

Figure 1 illustrates this tension on the three-model polynomial benchmark.

Figure 1: MSE (relative to single-fidelity MC MSE) vs pilot size \(N_p\) for MFMC on the three-model polynomial benchmark at total budget \(P=100\). Left: Pilot cost is not deducted from the estimator budget — only covariance estimation error matters, so MSE decreases monotonically with \(N_p\). Right: Pilot cost is deducted — too large a pilot starves the main estimator, and MSE has a minimum at an intermediate \(N_p\).

The left panel of Figure 1 shows that ignoring pilot cost gives a monotonically decreasing MSE — more pilot samples always help with covariance estimation. The right panel (the realistic case) shows a U-shape: too few pilot samples give a bad covariance estimate; too many leave nothing for the main estimator. The minimum is the optimal pilot size.
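The right-panel behaviour can be reproduced qualitatively with a toy MSE model: one term for covariance estimation error that decays like \(1/N_p\), plus a main-estimator variance term that grows as the pilot eats the budget. The constants below are invented for illustration and are not fitted to the benchmark:

```python
import numpy as np

def toy_mse(N_p, P=100.0, c=1.11, a=5.0):
    """Toy MSE model (illustrative only): covariance-error term a/N_p
    plus a main-estimator variance term 1/(P - pilot cost)."""
    P_main = P - N_p * c       # budget left after a pilot of size N_p
    if P_main <= 0:
        return np.inf          # pilot consumed the entire budget
    return a / N_p + 1.0 / P_main

Ns = np.arange(4, 80)
mses = np.array([toy_mse(n) for n in Ns])
n_star = Ns[np.argmin(mses)]   # interior minimum: the toy N_p*
print(n_star)
```

The minimiser of this toy curve plays the role of \(N_p^*\); the real curve's minimum depends on how fast the covariance estimation error actually decays for the benchmark.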

The Full Two-Stage Workflow

The pseudocode below summarises the two stages of a multi-fidelity estimation campaign. The key detail is that the pilot cost must be subtracted from the total budget before the main estimator allocates its samples.

Stage 1 — Pilot
  1. Choose pilot size  Np  (rule of thumb: 2–5 × (M + 1))
  2. Draw Np shared input samples
  3. Evaluate every model at those samples → pilot outputs
  4. Compute sample covariance  Σ̂  from pilot outputs
  5. Record pilot cost  Pp = Np × Σ Cα

Stage 2 — Main Estimator
  6. Subtract pilot cost from total budget:  P_main = P − Pp
  7. Set pilot quantities:      stat.set_pilot_quantities(Σ̂)
  8. Allocate main samples:     estimator.allocate_samples(P_main)
  9. Generate sample sets:      estimator.generate_samples(variable)
 10. Evaluate models:           estimator.evaluate_samples(models)
 11. Compute estimate:          result = estimator(values)

Always subtract pilot cost — allocating the full budget \(P\) as if the pilot were free leads to an over-optimistic allocation and wastes the budget spent on the pilot.

See the API Cookbook → Universal Workflow for a runnable code version of these steps.

What the Pilot Provides

The pilot study provides three things:

  1. Covariance estimate \(\hat{\boldsymbol{\Sigma}}\): used to set pilot quantities via stat.set_pilot_quantities(cov_hat).
  2. Sample allocation: allocate_samples(P - P_p) is called on the remaining budget after subtracting the pilot cost.
  3. Model cost estimates: the median wall-clock time per model over the pilot samples is used as \(C_\alpha\) when physical timing (not synthetic ratios) is needed.
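Item 3 can be implemented with a simple timing loop. `median_cost` and `slow_model` below are hypothetical helpers for illustration, not part of PyApprox:

```python
import time
import numpy as np

def median_cost(model, pilot_inputs):
    """Estimate C_alpha as the median wall-clock time per pilot evaluation."""
    times = []
    for x in pilot_inputs:
        t0 = time.perf_counter()
        model(x)
        times.append(time.perf_counter() - t0)
    return float(np.median(times))

# Illustrative: a deliberately slow "model" that sleeps for ~1 ms per call.
def slow_model(x):
    time.sleep(0.001)
    return x**2

c = median_cost(slow_model, np.linspace(0, 1, 10))
print(c)
```

The median is preferred over the mean because it is robust to one-off slowdowns (cold caches, I/O hiccups) during the pilot.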

All three are needed before the main estimator can be run.

What Makes a Good Pilot Design?

Sample count \(N_p\). The minimum needed for a non-singular covariance estimate is \(N_p > M + 1\) (one more than the number of models). A reliable estimate typically requires \(N_p \geq 2(M+1)\) to \(5(M+1)\). For three models (\(M=2\)) and \(P=100\), \(N_p = 10\)–\(20\) is usually sufficient when models have moderate correlation (\(\rho \sim 0.9\)). Weaker correlations require more pilot samples because the relevant off-diagonal entries of \(\hat{\boldsymbol{\Sigma}}\) are harder to estimate.
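These counting rules fit in a few lines. `pilot_size_bounds` is a hypothetical helper that takes the number of models \(M+1\) directly:

```python
def pilot_size_bounds(num_models):
    """Rule-of-thumb pilot sizes for an ensemble of num_models = M + 1 models.

    Returns (minimum for a non-singular covariance estimate,
             recommended lower bound, recommended upper bound).
    """
    minimum = num_models + 1          # N_p > M + 1, i.e. N_p >= (M + 1) + 1
    return minimum, 2 * num_models, 5 * num_models

print(pilot_size_bounds(3))   # (4, 6, 15)
```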

Shared samples. All models must be evaluated at the same pilot sample points. This is essential: the cross-covariance estimate \(\hat{\sigma}_{\alpha\beta} = \frac{1}{N_p}\sum_n (f_\alpha^{(n)} - \bar{f}_\alpha)(f_\beta^{(n)} - \bar{f}_\beta)\) requires paired evaluations.
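A quick NumPy experiment shows why pairing matters: with shared samples the cross-covariance estimate converges to the true value, whereas unpaired evaluations estimate the covariance of two independent draws, which is zero. The two models are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, 5000)        # shared pilot inputs
f0, f1 = x**5 + x**2, x**5          # two correlated models (illustrative)

# Paired evaluations: the cross-covariance is well estimated.
paired = np.cov(f0, f1)[0, 1]

# Unpaired: evaluate the second model at DIFFERENT samples.
x_other = rng.uniform(-1, 1, 5000)
unpaired = np.cov(f0, x_other**5)[0, 1]

print(round(paired, 3), round(unpaired, 3))   # paired ~ 0.09, unpaired ~ 0
```

With unpaired samples the off-diagonal entries of \(\hat{\boldsymbol{\Sigma}}\) carry no information, so the estimator's control variate weights collapse towards zero and the multi-fidelity gain is lost.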

Independent from main study. Pilot samples are separate from the main estimator’s sample partitions. Including pilot samples in the main allocation introduces an optimism bias.

Key Takeaways

  • Every ACV estimator requires a pilot study to estimate \(\boldsymbol{\Sigma}\) before it can be planned or run
  • A small pilot introduces covariance estimation error; a large pilot consumes too much of the total budget
  • The realistic MSE-vs-pilot-size curve has a U-shape with a well-defined optimal \(N_p^*\) (Figure 1, right panel)
  • All models must be evaluated at the same pilot sample points; pilot samples are independent of the main estimator’s samples
  • As a rule of thumb, \(N_p \approx 2\text{–}5\,(M+1)\) is a good starting point for problems with strong correlations (\(\rho \geq 0.8\))

Exercises

  1. From Figure 1 (right), estimate the optimal pilot size \(N_p^*\) for this benchmark. At this \(N_p^*\), what fraction of the total budget \(P=100\) does the pilot consume?

  2. Why must all models be evaluated at the same pilot sample points? What goes wrong if each model is evaluated at different samples?

  3. Suppose the pilot covariance estimate is available but very noisy (\(N_p = 5\)). You observe that allocate_samples assigns 0 samples to one model. What is the likely cause, and how can you diagnose it?

  4. For a 10-model ensemble at budget \(P=500\), using the rule of thumb \(N_p \approx 5(M+1)\), what is the pilot budget \(P_p\) if all models have equal cost \(C=1\)? What fraction of \(P\) does this represent?

Tip

Ready to try this? See API Cookbook → Universal Workflow.

Next Steps

  • Pilot Studies Analysis — MSE decomposition, sensitivity of optimal allocation to covariance estimation error, and how the U-shaped curve shifts with budget and correlation
  • API Cookbook — End-to-end two-stage workflow in PyApprox: pilot → plan → run