Sensitivity Analysis with Effective Quadratures

In this blog post, we explore one of the applications of Effective Quadratures—sensitivity analysis. The goal of sensitivity analysis is summarized by the question: Which of my input parameters is the most important? Given a computational model, we want to create a ranking of the various input parameters in terms of their influence on the output—how sensitive the output is to changes in a certain parameter.

So why do we want to know about the importance of variables? Consider a model for the cycle time of a spring-loaded piston from Kenett et al. [1], consisting of seven input parameters:

  1. $M$: The piston mass
  2. $S$: The piston area
  3. $V_0$: Initial gas volume
  4. $k$: Spring constant
  5. $P_0$: Atmospheric pressure
  6. $T_a$: Ambient temperature
  7. $T_0$: Filling gas temperature

[Figure: schematic of the spring-loaded piston.]

Sensitivity analysis allows us to make plots such as the one below:

The plot shows that the output is most sensitive to $S$ and $V_0$. Hence, instead of a seven-parameter model, we could simply build a two-parameter model in $S$ and $V_0$, keeping the other parameters constant. Identifying important parameters enables the simplification of models that may have hundreds of parameters down to a handful.

There are multiple methods to quantify the importance of a variable, one of which is through Sobol’ indices. The idea behind Sobol’ indices is simple: if we treat the input parameters as random variables carrying a degree of uncertainty, then running them through the model yields some uncertainty in the output, which can be quantified by the output variance. The goal is to find out how much of this output variance is caused by the variation of each individual variable, or by the interaction between multiple variables.
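To make this concrete, for independent inputs the output variance admits the standard ANOVA decomposition into contributions from individual variables and their interactions, and the first-order Sobol’ index of variable $i$ is the fraction of variance attributed to that variable alone:

$$
\operatorname{Var}[f] = \sum_{i} V_i + \sum_{i<j} V_{ij} + \cdots, \qquad S_i = \frac{V_i}{\operatorname{Var}[f]},
$$

where $V_i = \operatorname{Var}\left[\mathbb{E}[f \mid x_i]\right]$ is the variance explained by $x_i$ on its own and $V_{ij}$ captures the joint (interaction) effect of $x_i$ and $x_j$.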

In the seminal paper by I. M. Sobol’ [2], who came up with the indices, the index is expressed via integrals. In general, these integrals cannot be evaluated analytically for computational models. A simple approach to approximating these integrals is Monte Carlo. For example, if we are approximating the mean of the function $f(x)$:

$$
\mu = \int f(x) \rho(x) dx
$$

where $\rho(x)$ is the probability density function of the input, we can use the following recipe:

  1. Sample $N$ evaluation points $x_1, x_2, \dots ,x_N$ randomly in the input domain according to $\rho(x)$.
  2. Evaluate the function at these points: $f(x_1), f(x_2), \dots , f(x_N)$.
  3. Average the function evaluations:

$$
\hat{\mu} = \frac{1}{N} \sum_{i=1}^N f(x_i)
$$
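As a quick illustration of this recipe, here is a minimal NumPy sketch. The test function and the uniform input density are stand-ins chosen purely for illustration; they are not the piston model.

import numpy as np

# Stand-in model and input density (illustrative only): f(x) = x**2,
# with x uniform on [0, 1], so the true mean is 1/3.
def f(x):
    return x**2

rng = np.random.default_rng(0)
N = 10_000

# 1. Sample N points according to the input density rho(x).
x = rng.uniform(0.0, 1.0, size=N)

# 2. Evaluate the function at these points.
evals = f(x)

# 3. Average the function evaluations to estimate the mean.
mu_hat = np.mean(evals)
print(mu_hat)  # should be close to 1/3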

Then, it can be shown that as $N$ tends to infinity, the Monte Carlo estimate approaches the true value. The evaluation of a Sobol’ index is more involved, requiring two function evaluations per Monte Carlo sample, but the principle is the same. Let’s examine the convergence of this method when computing the Sobol’ index of $S$ for the piston model.

The estimate is evaluated with a varying number of function evaluations, over 30 repeated trials. As the number of function evaluations increases, the fluctuation around the true value decreases. However, even with 2500 evaluations, the spread around the mean is still rather large. Is there a better way to do this?

Instead of using the function evaluations to compute the integral directly, Sudret [3] showed that it is possible to leverage polynomial approximations for the computation of Sobol’ indices. Provided that the polynomials satisfy orthogonality, the Sobol’ indices can be read off from the coefficients of the expansion. Refer to the tutorials for more details on polynomials and the capabilities within EQ for working with them.
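For an orthonormal polynomial basis, this relationship is explicit. Writing the expansion as $f(x) \approx \sum_{\alpha} c_{\alpha} \, p_{\alpha}(x)$ with multi-indices $\alpha$, the variance is the sum of the squared non-constant coefficients, and the first-order index of variable $i$ collects the terms that depend on $x_i$ alone:

$$
\operatorname{Var}[f] \approx \sum_{\alpha \neq 0} c_{\alpha}^{2}, \qquad
S_i \approx \frac{\sum_{\alpha \in \mathcal{A}_i} c_{\alpha}^{2}}{\sum_{\alpha \neq 0} c_{\alpha}^{2}},
$$

where $\mathcal{A}_i$ is the set of multi-indices whose only non-zero entry is in position $i$.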

Using this approach, the computation of Sobol’ indices now boils down to the estimation of polynomial coefficients, i.e. finding a polynomial fit for the function. One way to do this is through least squares, where point evaluations are used as data to find the best-fit coefficients. For a given number of evaluations of the function, how well does this compare with the Monte Carlo approach?
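As a rough sketch of what this looks like in Effective Quadratures: the stand-in model, parameter ranges, orders and the exact keyword arguments below (in particular the `least-squares` method, the `sampling_args` options and `get_sobol_indices`) are illustrative assumptions based on the EQ tutorials, and may differ slightly between versions, so do check the documentation.

from equadratures import *
import numpy as np

# A simple stand-in model with two uniform inputs (illustrative only).
def model(s):
    return np.exp(s[0]) + 0.1 * s[1]

# Define the input parameters and a total-order polynomial basis.
s1 = Parameter(distribution='uniform', lower=-1., upper=1., order=3)
s2 = Parameter(distribution='uniform', lower=-1., upper=1., order=3)
basis = Basis('total-order')

# Fit the polynomial coefficients by least squares on sampled evaluations
# (keyword names here follow the EQ regression tutorial; treat as indicative).
poly = Poly([s1, s2], basis, method='least-squares',
            sampling_args={'mesh': 'monte-carlo',
                           'subsampling-algorithm': 'lhs',
                           'sampling-ratio': 1.0})
poly.set_model(model)

# First-order Sobol' indices, read off from the fitted coefficients.
print(poly.get_sobol_indices(1))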

The fluctuation around the converged value is reduced drastically, as seen in the figure above. This example demonstrates the power of polynomial approximations: function evaluations are better spent on fitting a suitable polynomial approximation than on brute-force approaches.

Can we do better? Experienced users may know that the least squares approach comes with a lower limit: we need at least as many function evaluations as there are polynomial basis terms, and the number of basis terms scales exponentially with the number of input variables. In EQ, there are various methods for circumventing this curse of dimensionality, one of which is compressed sensing. Another approach is to perform regression on a dimension-reduced subspace. In our recent paper [4], we explored the use of the latter technique to compute sensitivity indices.
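To put numbers on this for the seven-parameter piston model with cubic polynomials ($p = 3$, $d = 7$):

$$
(p+1)^d = 4^7 = 16\,384 \quad \text{(tensor product)}, \qquad \binom{p+d}{d} = \binom{10}{7} = 120 \quad \text{(total order)},
$$

so a plain least squares fit on a tensor-product basis would need at least 16,384 model evaluations, and even the leaner total-order basis needs 120.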

The utility of our tools certainly does not end here. For more information, please refer to the tutorials and more blog posts to come.

References:

[1] R. Kenett, S. Zacks, and D. Amberti. Modern Industrial Statistics: with Applications in R, MINITAB and JMP. John Wiley & Sons, 2013.

[2] I. M. Sobol’. Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates. Mathematics and Computers in Simulation, 55(1):271–280, 2001.

[3] B. Sudret. Global sensitivity analysis using polynomial chaos expansions. Reliability Engineering & System Safety, 93(7):964–979, 2008.

[4] C. Y. Wong, P. Seshadri, and G. T. Parks. Extremum Sensitivity Analysis with Least Squares Polynomials and their Ridges. arXiv:1907.08113 [math], 2020. (Preprint submitted to Reliability Engineering & System Safety.)


Looking good @Nick!

This is terrific @Nick! Building upon this, a valid query would be what to do if some, or indeed all, of the inputs were correlated. Before we embark on this, it will be useful to understand a key limitation in the existing polynomial chaos workflow (although there is nothing chaotic about using polynomials for approximation). To ease the exposition, let’s initially assume a bivariate model $f \approx f\left(s_1, s_2 \right)$, where $s_1$ and $s_2$ are the two model parameters, with marginals $\omega_1(s_1)$ and $\omega_2(s_2)$ and supports $\mathcal{D}_1$ and $\mathcal{D}_2$ respectively.

The standard polynomial approximation we seek is of the form
$$
f \left(s_1, s_2 \right) \approx \sum_{i} x_i p_i \left( s_1, s_2 \right)
$$

where $p_i$ is the $i$-th bivariate orthogonal polynomial and $x_i$ is its corresponding coefficient. The latter is given by

$$
x_i = \int_{\mathcal{D}_1} \int_{\mathcal{D}_2} f \left(s_1, s_2 \right) p_i \left( s_1, s_2 \right) \varrho \left(s_1, s_2 \right) ds_1 ds_2,
$$

where we explicitly assume that $\varrho \left(s_1, s_2 \right) = \omega_1(s_1) \omega_2(s_2)$, i.e., $s_1$ and $s_2$ are independent and thus their joint distribution is the product of the marginals.

Now, when $s_1$ and $s_2$ are correlated, one cannot express the joint distribution as a product of the marginals. To deal with this issue and still leverage polynomial approximations to compute moments and sensitivities, we require an alternate strategy. Enter the Nataf transform, implemented here in the code.

Assume that both $\omega_1(s_1)$ and $\omega_2(s_2)$ are zero-mean Gaussians with unit variance. In Figure 1, the left subfigure shows $s_1$ and $s_2$ as correlated, while in the right subfigure they are independent.
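In this jointly Gaussian setting, the map between the two pictures is simply linear (a standard fact, stated here as a sketch): if $R = LL^{T}$ is the Cholesky factorisation of the correlation matrix, then independent standard Gaussians are pushed to the correlated coordinates via

$$
\begin{bmatrix} s_1 \\ s_2 \end{bmatrix} = L \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}, \qquad u_1, u_2 \sim \mathcal{N}(0,1) \ \text{independent}.
$$

More general marginals are handled by composing this with the marginal CDF transforms (with a suitably adjusted Gaussian correlation), which is the essence of the Nataf transform discussed below.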

Now, in the standard polynomial approximation workflow, we evaluate our model $f$ at points over the domain shown in the subfigure on the right. Assuming we adopt a sparse grid, we might end up with the sampling shown in the right subfigure of Figure 2. Using the same sampling strategy on the left subfigure seems incorrect (and it is, for the reason mentioned above).

What we would intuitively expect is for our sparse grid to somehow exploit the underlying correlation structure in the joint distribution and have points that look akin to those shown in the left subfigure of Figure 3.

This transformation (or map) is referred to as the Nataf transform. Succinctly stated, we construct our polynomials over the space of independent standard Gaussian distributions (right), but evaluate our model at the transformed coordinates (left). The code below shows this in action!

from equadratures import *
import numpy as np

# A model function
def function(s):
    s1 = s[0]
    s2 = s[1]
    return s1**2 - s2**3 - 2. * s1 - 0.5 * s2

# The standard setup.
param_s1 = Parameter(distribution='Gaussian', shape_parameter_A=0., shape_parameter_B=1., order=3)
param_s2 = Parameter(distribution='Gaussian', shape_parameter_A=0., shape_parameter_B=1., order=3)
R = np.array([ [ 1.0,  0.8],
               [ 0.8,  1.0]])
basis = Basis('sparse-grid', level=3, growth_rule='exponential')
poly = Poly([param_s1, param_s2], basis, method='numerical-integration')

# Now feed this polynomial to the Correlations class along with the correlation matrix.
corr = Correlations(poly, R)
corr.set_model(function)
poly2 = corr.get_transformed_poly()

Now we can use this transformed polynomial to compute statistics.

print(poly2.get_mean_and_variance())
>> (0.9999999999999987, 35.449838615261825)

These values can be verified with numpy:

S = np.random.multivariate_normal(mean=[0., 0.], cov=R, size=1000000)
model_evals = evaluate_model(S, function)
print(np.mean(model_evals), np.var(model_evals))
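As a quick sanity check on the mean, the analytic value follows directly from the standard Gaussian moments (marginally, $s_1, s_2 \sim \mathcal{N}(0,1)$):

$$
\mathbb{E}[f] = \mathbb{E}[s_1^2] - \mathbb{E}[s_2^3] - 2\,\mathbb{E}[s_1] - 0.5\,\mathbb{E}[s_2] = 1 - 0 - 0 - 0 = 1,
$$

which agrees with the first value returned by get_mean_and_variance() above.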