Google Summer of Code 2022 Projects

Hi all, it’s that time of year again! :slight_smile: GSoC 2022 is almost upon us. Here is a working list of project ideas:

1. Accelerated task-centric wrappers
2. Computing univariate polynomial roots
3. Bayesian polynomial regression

This year the project length can vary: either a shorter 175-hour project (Medium project), as last year, or a 350-hour project (Large project) spread over up to 22 weeks, so there is quite a bit of flexibility. For more details regarding the changes in GSoC 2022, please have a look here.


Project 1 title: Accelerated task-centric wrappers

Project description:

equadratures has numerous approaches for uncertainty quantification, optimisation, and more broadly regression. For novice users, the number of lines of code required, and the syntax involved, may be a deterrent. Thus, as part of this project, we would like to add four key functions that significantly reduce the syntax required for uncertainty quantification, regression, dimension reduction, and optimisation:

  1. eq.fit(X, y)
  2. eq.uq(parameters, function, correlation)
  3. eq.dr(X, y)
  4. eq.optimise(function, range)

For instance, eq.fit(X, y) will perform a series of checks on the data before fitting a model. This will include checking the parameter distributions and subsequently scaling the inputs; identifying whether some parameters are dependent, and whether there exists a more appropriate orthogonal representation of the data (via PCA); fitting models with both least-squares and compressed-sensing methods; and verifying their accuracy by splitting the provided data into test and train sets. Expected steps / outputs for eq.fit(X, y) may include:

checking input data & parameter distributions...

fitting baseline polynomial model...

verifying accuracy...

running orthogonal decomposition...

fitting secondary polynomial model...

verifying accuracy...

success!

generating report...
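
To make the first few steps concrete, here is a minimal sketch in plain numpy. It is not the proposed eq.fit implementation: the function name fit_sketch, its arguments, and the restriction to a single input variable are all assumptions made for illustration. It scales the input, splits the data, fits a least-squares Legendre model, and reports R^2 on the held-out set.

```python
import numpy as np
from numpy.polynomial import legendre


def fit_sketch(x, y, degree=3, test_fraction=0.25, seed=0):
    """Hypothetical stand-in for eq.fit, restricted to one input variable."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Checking input data: scale the input to [-1, 1], the Legendre support.
    lo, hi = x.min(), x.max()
    z = 2.0 * (x - lo) / (hi - lo) - 1.0

    # Split the samples into train and test sets.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(x.size)
    n_test = max(1, int(test_fraction * x.size))
    test, train = idx[:n_test], idx[n_test:]

    # Fitting baseline polynomial model by least squares.
    A_train = legendre.legvander(z[train], degree)
    coeffs, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)

    # Verifying accuracy: R^2 on the held-out samples.
    y_pred = legendre.legvander(z[test], degree) @ coeffs
    ss_res = np.sum((y[test] - y_pred) ** 2)
    ss_tot = np.sum((y[test] - y[test].mean()) ** 2)
    return coeffs, 1.0 - ss_res / ss_tot


# Illustrative data: an exact cubic, so the degree-3 fit should be near perfect.
x = np.linspace(0.0, 4.0, 200)
y = 1.0 + 0.5 * x - 0.25 * x**3
coeffs, r2 = fit_sketch(x, y)
print(f"test R^2 = {r2:.6f}")
```

A real wrapper would also need the multivariate case, the dependence check, the PCA step, and the compressed-sensing fit, none of which are attempted here.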

Similar steps can be designed for the other functions above. For instance, eq.uq(parameters, function, correlation) will provide basic polynomial chaos functionality with output moments and Sobol’ indices.
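
As a rough illustration of the quantities eq.uq would return, the sketch below computes the mean, variance, and first-order Sobol' indices directly from polynomial chaos coefficients. It assumes an orthonormal basis, so that the variance is the sum of squared non-constant coefficients. The function name moments_and_sobol and the example coefficients are hypothetical and are not part of equadratures.

```python
import numpy as np


def moments_and_sobol(multi_indices, coeffs):
    """Mean, variance and first-order Sobol' indices of an expansion.

    Assumes the basis is orthonormal with respect to the input measure.
    """
    multi_indices = np.asarray(multi_indices)
    coeffs = np.asarray(coeffs, dtype=float)

    # The constant term (all-zero multi-index) carries the mean.
    is_constant = np.all(multi_indices == 0, axis=1)
    mean = coeffs[is_constant].sum()

    # Every other term contributes its squared coefficient to the variance.
    variance = np.sum(coeffs[~is_constant] ** 2)

    # First-order index of input j: terms that depend on input j alone.
    dims = multi_indices.shape[1]
    first_order = np.empty(dims)
    for j in range(dims):
        others = np.delete(multi_indices, j, axis=1)
        only_j = (multi_indices[:, j] > 0) & np.all(others == 0, axis=1)
        first_order[j] = np.sum(coeffs[only_j] ** 2) / variance
    return mean, variance, first_order


# Two inputs, total order one plus a single interaction term (made-up values).
multi_indices = [[0, 0], [1, 0], [0, 1], [1, 1]]
coeffs = [2.0, 1.0, 0.5, 0.25]
mean, variance, sobol = moments_and_sobol(multi_indices, coeffs)
print(mean, variance, sobol)
```

The first-order indices do not sum to one here because the interaction term [1, 1] accounts for the remaining variance.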

Expected outcomes:

  • The development of an eq.py file that has the above four functions coded. As part of this deliverable, the GSoC applicant will have a working understanding of existing workflows for the above tasks.
  • The ability to output a PDF report summarising the main steps involved in each task.
  • The development of a test_eq.py file that will systematically test the above code.

Project mentors:

  • Pranay Seshadri
  • Bryn Noel Ubald
  • Andrew Duncan

Skills required:

  • Working Python background and familiarity with numpy and scipy functionality.
  • GSoC contributor should have taken a college-level linear algebra course and probability course.

Project rating:

  • Medium

Project size:

  • 350 hours

Useful references:


Project 2 title: Computing univariate polynomial roots

Project description:

It is well known that the roots of a univariate orthogonal polynomial expansion can be computed by solving an eigenvalue problem. This requires expressing the polynomial expansion in terms of a colleague matrix, which is built from the three-term recurrence relation of the basis. In this project, the student / GSoC contributor will add such a utility to the base distributions in equadratures. Furthermore, they will demonstrate how this functionality may be used within an optimisation context, i.e., via derivative polynomials.
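
The idea can be tried out with numpy alone before touching equadratures. The sketch below uses a Legendre basis, where numpy's legcompanion builds the matrix (numpy calls it a scaled companion matrix) whose eigenvalues are the roots of the expansion. The coefficients are chosen purely for illustration, and nothing here reflects how the equadratures utility would be structured.

```python
import numpy as np
from numpy.polynomial import legendre

# Legendre coefficients of p(x) = (x + 0.5)(x - 0.2)(x - 0.7), for illustration.
c = legendre.legfromroots([-0.5, 0.2, 0.7])

# Roots of the expansion are the eigenvalues of the companion-type matrix.
C = legendre.legcompanion(c)
roots = np.sort(np.linalg.eigvals(C).real)
print("roots:", roots)

# Optimisation context: stationary points of p are the roots of p'.
dc = legendre.legder(c)
stationary = np.sort(np.linalg.eigvals(legendre.legcompanion(dc)).real)
print("stationary points:", stationary)

# Classify each stationary point with the sign of the second derivative.
curvature = legendre.legval(stationary, legendre.legder(dc))
print("second derivative at stationary points:", curvature)
```

Negative curvature marks a local maximum and positive curvature a local minimum, which is the sense in which root finding on derivative polynomials gives an analytical route to optimisation.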

Expected outcomes:

  • The development of a function in distributions/template.py, the other distribution files, and parentparameter.py that returns the roots of a univariate Poly instance.
  • The development of a test_roots.py file that will systematically test the above code.
  • A demonstration / tutorial of this capability towards an analytical optimisation problem.

Project mentors:

  • Pranay Seshadri
  • Bryn Noel Ubald
  • Tiziano Ghisu

Skills required:

  • Working Python background and familiarity with numpy and scipy functionality.
  • GSoC contributor should have taken a college-level linear algebra course.

Project rating:

  • Medium

Project size:

  • 175 hours

Useful references:


Project 3 title: Bayesian polynomial regression

Project description:

This project is deliberately designed to be more open-ended than the aforementioned projects. Our core development team has been building various prototypes focused on Bayesian polynomial regression. Essentially, one uses bits of equadratures code within a probabilistic package, e.g., numpyro or pymc3, for either

  • building multiple similar (or coregional) polynomial models. This may be used to build models for phenomena or data sets that share some intrinsic similarity yet differ in their data, or
  • constructing a hierarchical polynomial model where the coefficients are effectively constrained, e.g., sparsity-promoting models.

The primary ingredient used from probabilistic packages is the Markov chain Monte Carlo sampler.
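
To show the role the sampler plays, here is a deliberately simple sketch that swaps the numpyro / pymc3 samplers for a hand-written random-walk Metropolis sampler in plain numpy. The data, the known noise level, the Gaussian prior, and the step size are all assumptions chosen for illustration. It is not one of the core team's prototypes.

```python
import numpy as np
from numpy.polynomial import legendre

rng = np.random.default_rng(0)

# Synthetic data from a known quadratic Legendre expansion plus Gaussian noise.
true_coeffs = np.array([1.0, -2.0, 0.5])
x = rng.uniform(-1.0, 1.0, size=60)
A = legendre.legvander(x, 2)  # design matrix with columns P_0, P_1, P_2
noise_std = 0.1               # assumed known in this sketch
y = A @ true_coeffs + noise_std * rng.normal(size=x.size)


def log_posterior(c, prior_std=10.0):
    """Gaussian likelihood plus an independent zero-mean Gaussian prior."""
    resid = y - A @ c
    log_lik = -0.5 * np.sum(resid**2) / noise_std**2
    log_prior = -0.5 * np.sum(c**2) / prior_std**2
    return log_lik + log_prior


def metropolis(n_samples, step=0.02):
    """Random-walk Metropolis over the polynomial coefficients."""
    c = np.zeros(3)
    lp = log_posterior(c)
    samples = np.empty((n_samples, 3))
    accepted = 0
    for i in range(n_samples):
        proposal = c + step * rng.normal(size=3)
        lp_new = log_posterior(proposal)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.uniform()) < lp_new - lp:
            c, lp = proposal, lp_new
            accepted += 1
        samples[i] = c
    return samples, accepted / n_samples


samples, acceptance = metropolis(20000)
posterior_mean = samples[5000:].mean(axis=0)  # discard burn-in
print("acceptance rate:", acceptance)
print("posterior mean:", posterior_mean)
```

A probabilistic package replaces the hand-written loop with a far more efficient sampler and makes it straightforward to add hierarchical or sparsity-promoting priors on the coefficients.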

Expected outcomes:

  • The development of templates (building upon some of our existing work) that combine equadratures building blocks with probabilistic elements.
  • Test functions in a test_polybayes.py file that appropriately test the above utilities for data-driven problems.
  • One to two tutorials that discuss how the above templates can be used for addressing a broader class of problems.

Project mentors:

  • Pranay Seshadri
  • Bryn Noel Ubald
  • Andrew Duncan
  • Chun Yui Wong

Skills required:

  • Working Python background and familiarity with numpy and scipy functionality, plus some prior experience with either pymc3 or numpyro.
  • GSoC contributor should have taken a college-level linear algebra course.
  • GSoC contributor should be familiar with the basics of Bayesian inference and polynomial regression.

Project rating:

  • Hard

Project size:

  • 350 hours