Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

Sola/HdsaLib Model Discrepancy Documentation

Model Discrepancy Problem Formulation

This documentation presents the mathematics and software design used in both Sola and HdsaLib for their model_discrepancy modules. These two libraries, written in object-oriented Matlab and C++, respectively, mirror one another in code design and serve as a development platform to rapidly prototype in Sola and then scale to high performance computing problems with HdsaLib. Since the directories and filenames closely mirror one another, we will refer to directories and filenames without explicit reference to Sola or HdsaLib. Where necessary, we will qualify which library is being referred to.

We consider optimization problems of the form

minzRnzJ(S(z),z)\begin{align} & \min_{\z \in \R^{n_z}} J(S(\z),\z) \end{align}

where zRnz\z \in \R^{n_z} denotes an optimization variable, S:RnzRnuS:\R^{n_z} \to \R^{n_u} is the solution operator for a discretized differential equation with state variable uRnu\u \in \R^{n_u}, and J:Rnu×RnzRJ:\R^{n_u} \times \R^{n_z} \to \R is the objective function. Although we would like to solve (1), it is frequently not possible or practical due to the complexity of the high-fidelity model. Rather, we solve

minzRnzJ(S~(z),z)\begin{align} & \min_{\z \in \R^{n_z}} J(\tilde{S}(\z),\z) \end{align}

where S~:RnzRnu\tilde{S}:\R^{n_z} \to \R^{n_u} is the solution operator for a low-fidelity model. The goal in hyper-differential sensitivity analysis with respect to model discrepancy (HDSA-MD) is to update the optimal solution of (2) using a limited number of high-fidelity simulations, i.e. to approximate the optimal solution of (1) without requiring evaluation of S(z)S(\z) within the optimization algorithm. The first HDSA-MD developments are in Hart & van Bloemen Waanders (2023). This was subsequently expanded upon in Hart & van Bloemen Waanders (2024) and then further improved in Hart et al. (2026). We refer the reader to Hart & van Bloemen Waanders (2024) for the most detailed description of the mathematics, but note that the model discrepancy prior introduced below and used in the code was first presented in Hart et al. (2026).

To incorporate model discrepancy, consider the parameterized optimization problem

minzRnJ(z,θ):=J(S~(z)+δ(z,θ),z),\begin{align} \min_{\z \in \R^n} \hspace{1 mm} \mathcal{J}(\z,\t):=J(\tilde{S}(\z)+\d(\z,\t),\z) , \end{align}

where δ:Rnz×RnθRnu\d:\R^{n_z} \times \R^{n_\theta} \to \R^{n_u}, which we call the model discrepancy representation, will approximate the difference between the high and low-fidelity solution operators. That is, we parameterize δ\d and seek to determine a parameter vector θRnθ\t \in \R^{n_\theta} such that δ(z,θ)S(z)S~(z)\d(\z,\t) \approx S(\z)-\tilde{S}(\z).

Let z~\tilde{\z} be a local minimizer of (2), which coincides with (3) when θ=θ0:=0\t=\t_0 := \vec{0} is chosen so that δ(z,θ0)0\d(\z,\t_0) \equiv \vec{0}. Given NN high-fidelity model evaluations to form a dataset {z,d}=1N\{ \z_\ell, \vec{d}_\ell \}_{\ell=1}^N, d=S(z)S~(z)\vec{d}_\ell=S(\z_\ell)-\tilde{S}(\z_\ell), the HDSA-MD framework consists of two steps:

  1. Calibrating θ\t so that δ(z,θ)d\d(\z_\ell,\t) \approx \vec{d}_\ell, =1,2,,N\ell=1,2,\dots,N.

  2. Propagating the calibrated θ\t through the optimization problem (3) to predict how the optimal solution changes with respect to the model discrepancy.

The calibration in step (i) is done via a Bayesian approach. Consequently, we specify a prior distribution for θ\t and use the NN high-fidelity model evaluations to compute a posterior distribution for θ\t. The result of the HDSA-MD framework is (1) a posterior distribution on the model discrepancy representation, and (2) a posterior pushforward distribution on the optimization variable that captures uncertainty due to the model discrepancy. The codes in the model_discrepancy module facilitate the specification of the prior θ\t distribution, the computation of samples from the posterior θ\t distribution, and the corresponding samples from the posterior optimal solution. The θ\t distribution samples are analyzed and interpreted by pushing them through δ\d to consider samples of the model discrepancy representation.

In what follows, we outline the basic mathematics involved in these computations and highlight where the various aspects of the mathematics appear in the code.

Discrepancy Calibration

We consider the model discrepancy representation δ:Rnz×RnθRnu\d:\R^{n_z} \times \R^{n_\theta} \to \R^{n_u} to be parameterized as

δ(z,θ)=(InuInuzMz)θ\begin{align*} \d(\z,\t) = \left( \begin{array}{cc} \I_{n_u} & \I_{n_u} \otimes \z^\top \M_{\z} \end{array} \right) \t \end{align*}

where InuRnu×nu\I_{n_u} \in \R^{n_u \times n_u} is the identity matrix, and MzRn×n\M_{\z} \in \R^{n \times n} weights the z\z inner products (Mz\M_{\z} is the mass matrix if z\z arises from finite element discretization of a function). Hence, θRnθ\t \in \R^{n_\theta} has dimension nθ=nu(nz+1)n_\theta = n_u (n_z+1). If the optimization variable is high-dimensional, i.e., nz1n_z \gg 1, then computation in dimension nθn_\theta is intractable. However, the Kronecker product structure in (4) will enable computations in Rnu\R^{n_u} and Rnz\R^{n_z} rather than Rnθ\R^{n_\theta}. As a result, all vectors in Rnθ\R^{n_\theta} will be implicitly represented via a collection of vectors in Rnu\R^{n_u} and Rnz\R^{n_z}. This creates a degree of complexity in reading the code. However, this approach is crucial to ensuring computational efficiency.

To calibrate the model discrepancy representation, we assume a mean-zero Gaussian prior for θ\t whose precision matrix is

Wθ=(WuWuz~MzWuMzz~Wu(Wz+Mzz~z~Mz))Rnθ×nθ,\begin{align*} \W_{\t} = \left( \begin{array}{cc} \W_{\u} & \W_{\u} \otimes \ztilde^\top \M_{\z} \\ \W_{\u} \otimes \M_{\z} \ztilde & \W_{\u} \otimes (\W_{\z}+ \M_{\z} \ztilde \ztilde^\top \M_{\z} ) \end{array} \right) \in \R^{n_\theta \times n_\theta}, \end{align*}

that is, the covariance matrix for the Gaussian prior is Wθ1\W_{\t}^{-1}. In (5), WuRnu×nu\W_{\u} \in \R^{n_u \times n_u} and WzRnz×nz\W_{\z} \in \R^{n_z \times n_z} are precision matrices corresponding to mean-zero Gaussian distributions with covariances Wu1\W_{\u}^{-1} and Wz1\W_{\z}^{-1} defined on the state and optimization variable spaces, respectively. When u\u and/or z\z are coordinates from a spatial discretization, it is common to define Wu1\W_{\u}^{-1} and/or Wz1\W_{\z}^{-1} as scalar multiples of squared inverse Laplacian-like operators.

We assume a Gaussian noise model, and consequently, the posterior θ\t distribution is Gaussian with known mean and covariance. However, due to the dimension of θ\t, samples from the posterior cannot be computed via direct linear algebra in Rnθ\R^{n_\theta}. Rather, efficient expressions for the posterior mean and posterior samples are determined. To this end, let Z=(z1z2zN)Rnz×N\vec{Z} = \begin{pmatrix} \z_1 & \z_2 & \dots & \z_N \end{pmatrix} \in \R^{n_z \times N} be the matrix of optimization variable inputs to the high-fidelity model evaluations, and define

G=ee+(Zz~e)MzWz1Mz(Zz~e)RN×N,\begin{align*} & \G= \e \e^\top + (\vec{Z} - \ztilde \e^\top)^\top \M_{\z} \W_{\z}^{-1} \M_{\z} (\vec{Z}-\ztilde \e^\top) \in \R^{N \times N}, \end{align*}

where eRN\e \in \R^N is the vector of ones. Let (gi,μi)(\vec{g}_i,\mu_i) denote the eigenvectors and eigenvalues of G\vec{G} and define

yi=Zgi(egi)z~andsi=(egi)yiMzWz1Mzz~\begin{align*} \vec{y}_i = \vec{Z} \vec{g}_i - (\e^\top \vec{g}_i) \ztilde \qquad \text{and} \qquad s_i = (\e^\top \vec{g}_i) - \vec{y}_i^\top \M_{\z} \W_{\z}^{-1} \M_{\z} \ztilde \end{align*}

for i=1,2,,Ni=1,2,\dots,N. The posterior mean for the model discrepancy representation parameters is where

a=1z~MzWz1Mz(zz~)bi,=(zz~)MzWz1MzZgi+(egi)au=Wu1Mudui,=(αdWu+μiMu)1Muu.\begin{align*} & a_{\ell} = 1-\ztilde^\top \M_{\z} \W_{\z}^{-1} \M_{\z} (\z_\ell-\ztilde) \qquad & b_{i,\ell} = (\z_\ell-\ztilde)^\top \M_{\z}\W_{\z}^{-1} \M_{\z} \vec{Z} \vec{g}_i + (\e^\top \vec{g}_i) a_\ell \\ & \vec{u}_\ell = \W_{\u}^{-1} \M_{\u} \vec{d}_\ell & \vec{u}_{i,\ell} = \left( \alpha_{\vec{d}} \W_{\u} + \mu_i \M_{\u} \right)^{-1} \M_{\u} \u_\ell . \end{align*}

Samples from the posterior distribution are given by where with

u^iN(0,(αdWu+μiMu)1)s˘k=z~MzWz12z˘ku˘kN(0,Wu1)y˘k=Wz12z˘k,\begin{align*} & \hat{\u}_i \sim \mathcal N(\vec{0},\left( \alpha_{\vec{d}} \W_{\u} + \mu_i \M_{\u} \right)^{-1} ) \qquad & \breve{s}_k= -\ztilde^\top \M_{\z} \W_{\z}^{-\frac{1}{2}} \breve{\z}_k \\ & \breve{\u}_k \sim \mathcal N(\vec{0}, \W_{\u}^{-1} ) & \breve{\vec{y}}_k = \W_{\z}^{-\frac{1}{2}} \breve{\z}_k , \end{align*}

where N\mathcal N denotes the Gaussian distribution.

Propagating through the optimization problem

To propagate posterior θ\t samples through the optimization problem (3), we consider using post-optimality sensitivities to construct a linear approximation of the θ\t-to-optimal solution mapping. Specifically, applying the Implicit Function Theorem to the first order optimality conditions yields a mapping F:N(θ0)RnzF:\mathcal N(\t_0) \to \R^{n_z} that maps model discrepancy parameters in a neighborhood of θ0\t_0 to the minimizer corresponding to the model discrepancy representation associated with θN(θ0)\t \in \mathcal N(\t_0). Furthermore, the Jacobian of the minimizer with respect to the model discrepancy parameters θ\t is given by

θF(0)=H1BRnz×nθ\begin{align*} \nabla_{\t} F(\vec{0}) = - \H^{-1} \B \in \R^{n_z \times n_\theta} \end{align*}

where H=z,zJ(z~,θ0)Rnz×nz\H =\nabla_{\z,\z} \mathcal{J}(\tilde{\z},\t_0) \in \R^{n_z \times n_z} is the Hessian of J\mathcal{J} and B=z,θJ(z~,θ0)Rnz×nθ\B = \nabla_{\z,\t} \mathcal{J}(\tilde{\z},\t_0) \in \R^{n_z \times n_\theta} is the Jacobian of zJ\nabla_{\z} \mathcal{J} with respect to θ\t, both are evaluated at the low-fidelity solution (z~,θ0)(\tilde{\z},\t_0). We refer to θF\nabla_{\t} F as the post-optimality sensitivity operator.

Then samples from the optimal solution posterior distribution may be approximated via

z~H1B(θ+θ^+θ˘).\begin{align} \ztilde - \H^{-1} \B (\overline{\t} + \hat{\t} + \breve{\t}). \end{align}

However, this may lead to having large posterior uncertainty in the optimization solution that corresponds to directions in which the objective function has small variation. To combat this, we introduce a projector that restricts the optimization solution changes to a subspace of high objective function sensitivity. Specifically, let H=WzVΛVWz\H = \W_{\z} \vec{V} \vec{\Lambda} \vec{V}^\top \W_{\z} denote the generalized eigenvalue decomposition of the Hessian H\H in the Wz\W_{\z} weighted inner product. Given a truncation rank r<nzr < n_z, we restrict to the subspace of the rr leading generalized eigenvectors and define the projector onto this subspace P=VrVrWz\vec{P} = \vec{V}_{r} \vec{V}_{r}^\top \W_{\z}, where VrRnz×r\vec{V}_{r} \in \R^{n_z \times r} consists of the leading rr generalized eigenvectors. Then we consider projected approximate optimal solution samples of the form

z~PH1B(θ+θ^+θ˘).\begin{align} \ztilde - \vec{P} \H^{-1} \B (\overline{\t} + \hat{\t} + \breve{\t}). \end{align}

Using post-optimality linearization of the form (11) or (12) is computationally advantageous since many samples may be computed with a manageable cost. However, the error resulting from the post-optimality sensitivity linearization may be detrimental. To address this, we leverage the preconditioned pseudo-time continuation approach that was introduced in Hart et al. (2026) for model parameter perturbations and adapted for the model discrepancy context in Madhavan et al. (2026). This produces a sequence of linearizations that better approximates the optimization solution after a change in the model discrepancy representation.

Model Discrepancy Code Design

The code design in sola/src/model_discrepancy and hdsalib/src/core/model_discrepancy is divided into two subdirectories: interfaces and analysis. Note that other subdirectories in sola/src/model_discrepancy are specific to Sola for the purpose of interfacing with other components of Sola. As their names indicate, the code under interfaces defines the interfaces required for a user to conduct the HDSA-MD analysis and the code under analysis implements the algorithms to compute prior and posterior samples from the model discrepancy representation and the optimization solution.

Interfaces

Within model_discrepancy/interfaces, there are six subdirectories. In its most basic form, the interfaces consist of four abstract base classes that define the prior precision matrices Wu\W_{\u} and Wz\W_{\z} from (5), a class that defines the optimization problem interfaces such as derivatives of the objective function JJ and PDE solution operator S~\tilde{S}, and a class that defines an interface to load data corresponding to the low-fidelity optimal solution and the high-fidelity solution operator evaluations. These base classes are:

  1. MD_u_Prior_Interface

  2. MD_z_Prior_Interface

  3. MD_Opt_Prob_Interface

  4. MD_Data_Interface

The optimization problem interface and data interface classes are contained in the directory:

model_discrepancy/interfaces/problem_interfaces

The u\u and z\z prior interfaces are contained in the directories bearing their names:

model_discrepancy/interfaces/u_prior_interfaces

and

model_discrepancy/interfaces/z_prior_interfaces

Within these directories, there are several classes that form a hierarchy of class inheritances to implement the priors for various problem instances. Additionally, the code in

model_discrepancy/interfaces/transient_prior_interfaces

facilitates the specification of these prior covariance matrices for transient problems.

The files in

model_discrepancy/interfaces/hyperparameter_interfaces facilitate prior hyperparameter selection using the algorithms described in Hart et al. (2026). These interfaces are not strictly required to conduct analysis, but they are required as inputs to some of the prior interface classes.

The files in

model_discrepancy/interfaces/OUU_interfaces provide an interface extension to apply HDSA-MD to certain optimization under uncertainty problems. More details regarding this part of the code are forthcoming.

Analysis

Within model_discrepancy/analysis, there are three subdirectories. The files in

model_discrepancy/analysis/prior

implement algorithms relevant for computing samples from the prior discrepancy distribution and propagating these prior samples through the optimization problem. Similarly, the files in

model_discrepancy/analysis/posterior

implement algorithms for computing samples from the posterior discrepancy and propagating these posterior samples through the optimization problem.

Posterior sample propagation using post-optimality linearization is implemented in the class MD_Update. In Sola, there has been additional development of propagation using pseudo-time continuation. This is implemented in the class MD_Continuation_Update, along with associated classes needed to interface with codes from the pseudo-time continuation module in

sola/src/pseudo_time_continuation

These algorithms will be implemented in HdsaLib in the future.

Lastly, the model_discrepancy/analysis/auxiliary subdirectory contains classes for auxiliary analysis capabilities. This includes the computation of the Hessian decomposition used to define the projector in (12), the computation of Laplacian-like operator spectral properties used for prior hyper-parameter specification, and Bayesian optimal experimental design (OED) to select the input data {z}=1N\{ \z_\ell \}_{\ell=1}^N for the high-fidelity model evaluations. The OED codes are currently only available in Sola, but will be implemented in HdsaLib in the future.

Implementing an example

To conduct HDSA-MD analysis on a given example, the user must implement the abstract base classes MD_Opt_Prob_Interface and MD_Data_Interface in a way that is consistent with the data structures and solvers used in the optimization problem. To implement MD_u_Prior_Interface and MD_z_Prior_Interface the user must choose an appropriate derived class for each interface. This choice is based on problem-specific considerations such as whether u\u and/or z\z are functions of space, time, or both. We suggest reviewing the simple examples in the test suite to see how these interfaces are instantiated and passed to the analysis classes.

References
  1. Hart, J., & van Bloemen Waanders, B. (2023). Hyper-differential sensitivity analysis with respect to model discrepancy: Optimal solution updating. Computer Methods in Applied Mechanics and Engineering, 412, 1–18.
  2. Hart, J., & van Bloemen Waanders, B. (2024). Hyper-differential sensitivity analysis with respect to model discrepancy: Posterior optimal solution sampling. AIMS Foundations of Data Science, 99–133.
  3. Hart, J., van Bloemen Waanders, B., Li, J., Ouermi, T. A. J., & Johnson, C. R. (2026). Hyper-differential sensitivity analysis with respect to model discrepancy: Prior Distributions. International Journal for Uncertainty Quantification, 16(1), 51–77.
  4. Hart, J., Alexanderian, A., & van Bloemen Waanders., B. (2026). Preconditioned pseudo-time continuation for parameterized inverse problems. To Appear in the SIAM Journal on Scientific Computing. arXiv.2508.21155.
  5. Madhavan, M., Hart, J., & van Bloemen Waanders, B. (2026). Hyper-differential sensitivity analysis with respect to model discrepancy: Sequential optimal experimental design. Under Review. arXiv:2604.02253.