In this blog post, we demonstrate the use of Effective Quadratures machinery in the recently proposed framework for probabilistic blade design and manufacturing—Blade Envelopes (BE). With BE, designs that obey certain invariance conditions in flow quantities of interest are discovered and quantified. This is an important goal for designing tolerance limits for manufactured geometries, which are used as guidelines in blade workshops to measure the airworthiness of a fresh manufactured blade. It provides a quantitative answer to the question “Should this blade be scrapped or not?”.
Polynomial approximation and dimension reduction methods are the main mathematical machinery behind this framework. In this post, we briefly explain how these methods are used in BE and the EQ code that are used to generate the numerical results in our papers [1, 2].
Methodology
At the core of the BE framework is the idea of probabilistic design. Instead of designing a single geometry, we design a distribution of geometries with mean and (spatial) tolerance covariance structures, which achieves certain performance goals. In our work, we address a common requirement in blade manufacturing—finding a distribution of blade geometries that are invariant in one of more quantities of interest, such as the pressure loss coefficient, mass flow function etc. In this context, the mean of the geometry can be regarded as the nominal geometry, and the covariance is a measure of tolerance ranges that can be allowed around the nominal geometry. However, it also encapsulates the idea that the tolerances around the blade should be correlated spatially. For example, as long as a positive displacement in the leading edge is compensated by a corresponding negative displacement near the trailing edge, maintaining a smooth curvature, performance may not be impacted severely. This implies that the geometry should be accepted.
In the figure below, a blade envelope is visually represented with a control zone and sample profiles. The former shows the mean profile along with the extent of pointwise geometry excursions allowed around the mean, informed by the tolerance covariance. However, the control zone alone is not sufficient, and must be accompanied by a representation of the spatial correlations embodied by acceptable geometries. This is achieved through the sample profiles.
(Figure reproduced from [1] under the CCBYNCND 4.0 license.)
In practice, how can we obtain such a distribution? We propose a solution based on the idea of active and inactive subspaces. In a parametric design space, certain active directions (linear combinations of design parameters) can lead to large changes in an objective function, while directions orthogonal to the active directions (inactive directions) lead to negligible objective changes. Through dimension reduction methods, one can identify these directions which carve out the active and inactive subspaces respectively. Crafting the design space to include all possible excursions from the nominal geometry, we can then identify the inactive subspace as a subspace containing the designs that are invariant in the objective of interest. Finding the limits on the corresponding geometries informs us of how the tolerance covariance should be prescribed.
Note how we had to specify a parametric design space to model the effect of manufacturing excursions. In deployment, we are faced with geometries derived from blade scanning methods, and we wish to judge geometries based only on their inherent shape, independent of a design space. However, the inactive subspace must be generated from the specification of a certain design space. How can we ensure the method is invariant to the choice of the design space?
We wish to characterise the shapes of the profiles within the inactive subspace. To do so, we draw samples from the inactive subspace and plot out the geometrical coordinates of the drawn samples. From these samples, the mean and tolerance covariance are calculated as the sample mean and covariance. It is important to note that at this stage, we may need a sizable amount of samples, e.g. a few thousand, to ensure the convergence of the mean and covariance. However, these samples are drawn without the requirement to evaluate their output performance metrics (e.g. through CFD); since they belong in the inactive subspace, invariance in the objective of interest is guaranteed. The main computational bottleneck is the construction of a database of training data to find the inactive subspace.
The final piece of the puzzle is a method to determine if a certain scanned geometry belongs to the inactive subspace, i.e. can we expect the scanned geometry to obey output invariance? For this, the Mahalanobis distance between the geometry and the blade envelope distribution can be calculated,
where \mathbf{s} is the geometric coordinates, \boldsymbol{\mu} and \mathbf{\Sigma} are the BE mean and covariance respectively.
As seen, the distance is a function of the mean and tolerance covariance that is known to us. This distance gives us a litmus test of whether the geometry can be expected to be invariant: A small distance implies strong invariance in the output objective. Thus, it can be passed as an argument to a tailored logistic function that converts the distance to a score between 0 and 1, representing how certain we are that a manufactured component can be used.
(Figure reproduced from [1] under the CCBYNCND 4.0 license.)
The whole workflow can be summarised as below.
(Figure reproduced from [1] under the CCBYNCND 4.0 license.)
Code demo
Let us demonstrate the generation and deployment workflow of blade envelopes with an example. Consider the design of the LS89 turbine cascade profile from the Von Karman Institute, with the goal to prescribe tolerance guidelines such that the stage pressure loss coefficient is maintained approximately invariant. Follow along with the interactive Google Colaboratory notebook here:
First, we import relevant modules and fetch the relevant data from GitHub:
import numpy as np
from equadratures import *
import matplotlib.pyplot as plt
from scipy.stats import linregress
!wget O master.zip https://github.com/ncywong/BE_data/archive/master.zip
!unzip o master.zip
os.chdir('BE_datamaster')
Now, we load in data used for training a polynomial response surface. The data is generated with CFD using the opensource code SU2 (https://su2code.github.io). In total, 1000 simulations were run with designs from a 20node FFD space (with 20 degrees of freedom). Out of these, 3 diverged and are removed from the data. Then, we split the data such that 797 designs are used for training and 200 for validation.
rng = np.random.default_rng(0)
X = np.loadtxt('design')
y = np.loadtxt('loss')
inf_indices= np.where(y==np.inf)
X = np.delete(X, inf_indices[0], axis=0)
y = np.delete(y, inf_indices[0], axis=0)
K, d = X.shape
indices = rng.permutation(K)
training_idx = indices[:K200]
test_idx = indices[K200:K]
X_train = X[training_idx,:]
X_test = X[test_idx,:]
y_train = y[training_idx]
y_test = y[test_idx]
Using EQ, we fit a cubic response surface on the design space (the hypercube [1, 1]^{20} with a uniform distribution) on a total order isotropic basis. The cardinality of the basis is 1771, which is larger than the number of training data points we have. Thus, we use the compressed sensing method to find a sparse solution for the polynomial coefficients.
s = Parameter(distribution='uniform', lower=1., upper=1., order=3)
myparameters = [s for _ in range(0, d)]
mybasis = Basis('totalorder', orders=[3 for _ in range(0,d)])
mypoly = Poly(parameters=myparameters, basis=mybasis, method='compressivesensing', \
sampling_args= {'mesh': 'userdefined', 'samplepoints': X_train, 'sampleoutputs': y_train}, \
solver_args={'verbose':True})
mypoly.set_model()
The noise level
here specifies the amount of mismatch between the training data and the polynomial predictions. Compared to the standard deviation of the data, it should not be too high—implying a lack of fit, but also should not be too low—a sign of overfitting. Using the specified random seed, we get a value of 0.001, which is a sensible number.
To determine if the fit is indeed reasonable, let’s check its predictions against validation data
poly_test = np.squeeze(mypoly.get_polyfit(X_test))
poly_train = np.squeeze(mypoly.get_polyfit(X_train))
R2 = linregress(poly_test, y_test)[2]**2
print(R2)
This gives us an R^2 of about 0.9, which ensures that the model is accurate enough. Just to be sure, we can plot the validation results:
fig = plt.figure()
plt.rc('axes', axisbelow=True)
ax = fig.add_subplot(111)
plt.plot(poly_train, y_train, 'o', ms=12, color='dodgerblue', alpha=0.5, label='Training', lw=0.5, markeredgecolor='k')
plt.plot(poly_test, y_test, 'o', ms=12, color='crimson', alpha=0.7, label='Testing', lw=0.5, markeredgecolor='k')
plt.xlabel('Global polynomial approximation for $Y_p$', fontsize=15)
plt.legend()
plt.ylabel('CFD values for $Y_p$', fontsize=15)
plt.xticks(fontsize=14)
plt.yticks(fontsize=14)
plt.show()
Equipped with the polynomial model, it is possible to evaluate gradients of the model. Note that it is also possible to evaluate gradients via adjoints with the CFD code, but we do not assume this capability in our work. Using the activesubspace
method in Subspaces
, the average gradient outer product is evaluated and decomposed to its eigenvectors, from which the active and inactive subspaces are found (see Part I of our paper for more details).
mysubspace = Subspaces(full_space_poly=mypoly, method='activesubspace', \
polynomial_degree=3, subspace_dimension=1)
mysubspace.polynomial_degree = 3
W = mysubspace.get_subspace()
e = mysubspace.get_eigenvalues()
fig = plt.figure()
plt.rc('axes', axisbelow=True)
ax = fig.add_subplot(111)
plt.semilogy(np.arange(0,d)+1, e, 'o', color='gold', ms=12, lw=0.5, markeredgecolor='k')
plt.ylabel('Eigenvalues (logscale)', fontsize=15)
plt.xlabel('Design parameters', fontsize=15)
plt.xticks([5,10,15,20])
plt.xticks(fontsize=14)
plt.yticks(fontsize=14)
plt.show()
Gaps in the eigenvalue spectrum indicate the presence of low dimensional structures, ensuring that the active subspace captures the majority of the output variation; and—conversely—ensuring output invariance in the inactive subspace. Below, we plot the eigenvalue spectrum to determine the appropriate dimension for the active subspace.
(Figure reproduced from [1] under the CCBYNCND 4.0 license.)
The large gap between the first and second eigenvalue indicates that a onedimensional active subspace is appropriate. Plotting the loss as a function of the projected coordinates, we see that most of the variation can be controlled by this projected coordinate. In other words, fixing this active coordinate, we achieve approximate invariance in loss.
(Figure reproduced from [1] under the CCBYNCND 4.0 license.)
Thus, we fix the active coordinate at 0 (which includes the unperturbed nominal design) and generate samples with this active coordinate. That is, we generate samples in the inactive subspace corresponding to this point in the active subspace. Executing the following command gives us the design space coordinates of 1000 invariant samples:
X_loss = mysubspace.get_samples_constraining_active_coordinates(1000, np.array([0]))
The routine used leverages the hitandrun sampling algorithm—see our paper for more details. We should expect that these samples have largely similar losses as the nominal loss. Is this the case? Let’s verify this by running CFD on some invariant designs. In the interest of time, we show the results from the paper directly here.
(Figure reproduced from [1] under the CCBYNCND 4.0 license.)
As seen, the invariant designs (yellow) have much less loss variation than random designs from the design space (blue).
Next, we need to characterise the geometries from the inactive subspace with their spatial coordinates. To do this, we need to use the geometry deformation scripts from the CFD code, which we omit in this demo. Instead, let’s load in three groups of geometries, namely

train_y
: Training geometries from the inactive subspace used to calculate the mean and covariance of the blade envelope. 
y_random
: Geometries randomly sampled from the design space, and 
y_loss
: Geometries from the inactive subspace, but distinct fromtrain_y
.
For the second and third group, we compute the average Mahalanobis distance from the distribution defined using the training geometries as follows:
from scipy.spatial.distance import mahalanobis
train_y = np.loadtxt('new_y_nzm')
N_train = 5000
y_random = np.loadtxt('y_coords_random')
y_loss = np.loadtxt('y_coords_loss')
mu = np.mean(train_y[:5000], axis=0)
S = np.cov(train_y[:5000].T)
def my_inv(A):
Z = A + 1e7 * np.eye(A.shape[0])
L = np.linalg.cholesky(Z)
inv_L = np.linalg.inv(L)
return np.dot(inv_L.T, inv_L)
S_inv = my_inv(S)
y_loss_dist = [mahalanobis(y_loss[i,:] , mu, S_inv) for i in range(y_loss.shape[0])]
y_random_dist = [mahalanobis(y_random[i,:], mu, S_inv) for i in range(y_random.shape[0])]
print(np.mean(y_loss_dist))
print(np.mean(y_random_dist))
We should find that the average distance for the lossinvariant geometries should be lower than that of random geometries, as the following figure shows.
(Figure reproduced from [1] under the CCBYNCND 4.0 license.)
Passing these distance values through a logistic function, we have the following, which gives us the confidence value for whether a component should be kept or scrapped.
(Figure reproduced from [1] under the CCBYNCND 4.0 license.)
Finally, we can verify that the process is independent of the design space used to generate the BE by testing it with geometries derived from a different design space, one with 30 FFD deformation nodes. The following is the computed distances for these geometries, colour coded with their loss values.
(Figure reproduced from [1] under the CCBYNCND 4.0 license.)
It is clear that the Mahalanobis distance is effective at distinguishing between geometries with similar loss to nominal and those without, and can do so without extra CFD solves.
Next steps
In this post, we gave an overview of the BE considering invariance in a single objective. What if we wish to impose further material, mechanical or aerodynamic constraints? In Part II of our paper, we address this issue by considering the intersection of inactive subspaces. In addition, we consider vectorvalued objectives such as the surface isentropic Mach number distribution and their invariance, and connect these ideas to inverse design. Instead of finding a single geometry given flow profile constraints, we find a distribution of geometries that approximately satisfy looser flow profile specifications.
Apart from being a quantitative way of computing tolerance guidelines, BE is also a versatile tool for visualising the design of tolerance bounds. Indeed, the control zone and sample profiles mentioned earlier are amenable to various visualisation methods. In our work, we have used Blender to render the BE in 3D. Other technologies such as Augmented Reality and 3D printing can also be deployed to bring the BE to reality on the desktop. Please look forward to a further post on these exciting technologies!
[1]: Chun Yui Wong, Pranay Seshadri, Ashley Scillitoe, Andrew B. Duncan, Geoffrey Parks. Blade Envelopes Part I: Concept and Methodology. [2011.11636] Blade Envelopes Part I: Concept and Methodology .
[2]: Chun Yui Wong, Pranay Seshadri, Ashley Scillitoe, Bryn Noel Ubald, Andrew B. Duncan, Geoffrey Parks. Blade Envelopes Part II: Multiple Objectives and Inverse Design. [2012.15579] Blade Envelopes Part II: Multiple Objectives and Inverse Design .