# Analytic weight functions

Presently in Effective Quadratures, we use the following syntax for generating recurrence coefficients with respect to a custom distribution:

s = Parameter(distribution='custom', data=data, order=3)

It would be very useful to prescribe analytic weight functions.

1 Like

The main bottleneck is the computation of the recurrence coefficients for a custom analytical weight function (and its verification!). So for instance if we wanted to estimate the first 10 Gauss quadrature points between [0.00001, ~23] (inspired by this example) with respect to the weight function

w(z) = \frac{\text{exp} \left( -z \right)}{\sqrt{z}}

then one could try

RECURRENCE_PDF_SAMPLES = 50000

# Weight function!
def weight_function(z):
return np.exp(-z)/np.sqrt(z)

support = [0.00001, -np.log(1e-10)]
def get_recurrence_coefficients(weight, support, order):
x = np.linspace(support[0], support[1], RECURRENCE_PDF_SAMPLES)
w = weight(x)
# Allocate memory for recurrence coefficients
order = int(order)+1
w = w / np.sum(w)
ab = np.zeros((order+1,2))

# Negate "zero" components
nonzero_indices = []
for i in range(0, len(x)):
if w[i] != 0:
nonzero_indices.append(i)

ncap = len(nonzero_indices)
x = x[nonzero_indices] # only keep entries at the non-zero indices!
w = w[nonzero_indices]
s = np.sum(w)

temp = w/s
ab[0,0] = np.dot(x, temp.T)
ab[0,1] = s

if order == 1:
return ab
p1 = np.zeros((1, ncap))
p2 = np.ones((1, ncap))

for j in range(0, order):
p0 = p1
p1 = p2
p2 = ( x - ab[j,0] ) * p1 - ab[j,1] * p0
p2_squared = p2**2
s1 = np.dot(w, p2_squared.T)
inner = w * p2_squared
s2 = np.dot(x, inner.T)
ab[j+1,0] = s2/s1
ab[j+1,1] = s1/s
s = s1
return ab
print(get_recurrence_coefficients(weight_function, support, 10))

Note that this is a modification of the get_recurrence_coefficients function from the existing custom.py class (under distributions). The above code yields:

[[ 0.46986159  1.        ]
[ 2.486252    0.48402306]
[ 4.49016711  2.97612722]
[ 6.49043934  7.46954769]
[ 8.46137656 13.93607251]
[10.22121463 21.98940049]
[11.33779003 29.59349632]
[11.67113704 33.52379102]
[11.64458236 34.0604234 ]
[11.58506832 33.69644955]
[11.55380415 33.43563167]
[11.5391853  33.31329096]]

Then one can feed these into the standard quadrature workflow via something like:

param = Parameter(distribution='Uniform', lower=-1., upper=1., order=10)

where p and w would be the quadrature points and weights.

The above thread has been motivated by a few requests from mechanical / aerospace engineers; their query is whether one can incorporate custom probability distributions for which the analytical forms are known, i.e., the triangular distribution?

Below, I detail one way to go about implementing this:

param = Parameter(distribution='analytical', weight_function=pdf, order=3)

where the input pdf is an instance of the Weight class, given below:

import numpy as np

ORDER_LIMIT = 1000
RECURRENCE_PDF_SAMPLES = 50000

class Weight(object):
"""
The class offers a template to input bespoke weight (probability density) functions.

:param lambda weight_function: A function call.
:param list support: Lower and upper bounds of the weight respectively.
:param bool pdf: If set to True, then the weight_function is assumed to be normalised to integrate to unity. Otherwise,
the integration constant is computed and used to normalise weight_function.
:param float mean: User-defined mean for distribution. When provided, the code does not compute the mean of the weight_function over its support.
:param float variance: User-defined variance for distribution. When provided, the code does not compute the variance of the weight_function over its support.

**Sample constructor initialisations**::

import numpy as np

pdf = Weight(lambda x: exp(-x)/ np.sqrt(x), [0.00001, -np.log(1e-10)], pdf=False)
"""
def __init__(self, weight_function, support, pdf=False, mean=None, variance=None):
self.weight_function = weight_function
self.pdf = pdf
self.support = support
self.lower = self.support[0]
self.upper = self.support[1]
self._verify_probability_density()
self.mean = mean
self.variance = variance
self.data = self.weight_function(np.linspace(self.lower, self.upper, RECURRENCE_PDF_SAMPLES))
if self.mean is None:
self._set_mean()
if self.variance is None:
self._set_variance()

def _verify_probability_density(self):
if (np.abs(integral - 1.0) >= 1e-2) or (self.pdf is False):
self.integration_constant = 1.0/integral
elif (np.abs(integral - 1.0) < 1e-2) or (self.pdf is True):
self.integration_constant = 1.0

param = Parameter(distribution='uniform',lower=self.lower, upper=self.upper,order=order)
basis = Basis('univariate')
poly = Poly(method='numerical-integration',parameters=param,basis=basis)
points, weights = poly.get_points_and_weights()
return points, weights * (self.upper - self.lower)

def _set_mean(self):
# Modified integrand for estimating the mean
mean_integrand = lambda x: x * self.weight_function(x) * self.integration_constant

# Keep increasing the order till we reach ORDER_LIMIT
integral_before = 10.0
integral = float(np.dot(wts, evaluate_model(pts, integrand)))
integral_before = integral
raise(RuntimeError, 'Even with '+str(ORDER_LIMIT+1)+' points, an error in the mean of '+str(1e-4)+'cannot be obtained.')

def _set_variance(self):
# Modified integrand for estimating the mean
variance_integrand = lambda x: (x  - self.mean)**2 * self.weight_function(x) * self.integration_constant

The plan moving forward is to roll-out this functionality into the next version of the code, permitting:

pdf = Weight(lambda z: np.exp(-z)/np.sqrt(z) , [0.000001, -np.log(1e-10)], pdf=False)

Stay tuned!

Here is a quick demo of the new weight class in action, where we use a triangular distribution.

import numpy as np
import matplotlib.pyplot as plt

# A triangular distribution
a = 3.
b = 6.
c = 4.
mean = (a + b + c)/3.
var = (a**2 + b**2 + c**2 - a*b - a*c - b*c)/18.
pdf = Weight(lambda x : 2*(x-a)/((b-a)*(c-a)) if (a <= x < c) \
else( 2/(b-a) if (x == c) \
else( 2*(b-x)/((b-a)*(b-c)))), \
support=[a, b], pdf=True)

s = Parameter(distribution='analytical', weight_function=pdf, order=2)
s_samples = s.get_samples(500000)

fig = plt.figure()
plt.plot(pdf.x_range_for_pdf, pdf.data, 'r-')
n, bins, patches = plt.hist(s_samples, 60, normed=1, facecolor='navy', alpha=0.75, edgecolor='w')
plt.xlabel('$s$', fontsize=1)
plt.ylabel('PDF (Histogram)', fontsize=13)
plt.grid()
plt.show()

basis = Basis('univariate')
poly = Poly(s, basis, method='numerical-integration')

# Model!
def model(input):
return input**2

poly.set_model(model)
feval = evaluate_model(s_samples, model)
mean2, variance2 = poly.get_mean_and_variance()
print(mean2, variance2)
print(np.mean(feval), np.var(feval) )

The output plot is shown below:

Computed statistics and comparison with Monte Carlo yields:

19.16666666733338 30.70555551782622
19.162835376783196 30.687223820864958

The latest version of the code has been pushed! Note that this commit does not have the custom data-driven option in Parameter; after much thought it has been decided to eliminate that functionality. Older users of Effective Quadrature may be familiar with the custom distribution option where the user provided a data input, which was then fed into a kernel density estimation (KDE) routine. One issue with the KDE approach is that it used Scott’s rule for bandwidth selection, which may not be ideal for a user’s data set. The fact that the code hid this from the user prompted us to re-visit this approach. If a user now wishes to feed in a custom data-driven distribution, then they can do this via:

from scipy import stats
pdf = Weight( stats.gaussian_kde(data, bw_method='silverman') , support=[-3, 3.2])
s = Parameter(distribution='analytical', weight_function=pdf, order=2)
1 Like