Hi all, it’s that time of year again! GSoC 2022 is almost upon us. Here is a working list of project ideas:
1. Accelerated task-centric wrappers
2. Computing univariate polynomial roots
3. Bayesian polynomial regression
This year a project can be either a shorter 175-hour project (Medium project), as last year, or a 350-hour project (Large project) spread over up to 22 weeks, so there is quite a bit of flexibility. For more details regarding the changes in GSoC 2022, please have a look here.
Project 1 title: Accelerated task-centric wrappers
Project description:
equadratures has numerous approaches for uncertainty quantification, optimisation, and more broadly regression. For novice users, the number of lines of code required, and the syntax involved, may be a deterrent. Thus, as part of this project, we would like to add four key functions that significantly reduce the syntax required for uncertainty quantification, regression, dimension reduction, and optimisation:
eq.fit(X, y)
eq.uq(parameters, function, correlation)
eq.dr(X, y)
eq.optimise(function, range)
For instance, eq.fit(X, y) will perform a series of checks on the data before fitting a model. This will include checking the parameter distributions and subsequently scaling the inputs; identifying whether some parameters are dependent, and whether there exists a more appropriately orthogonal representation of the data (via PCA); fitting both least-squares and compressed-sensing methods, and verifying their accuracy by splitting the provided data into test and train sets.
Expected steps / outputs for eq.fit(X, y) may include:
checking input data & parameter distributions...
fitting baseline polynomial model...
verifying accuracy...
running orthogonal decomposition...
fitting secondary polynomial model...
verifying accuracy...
success!
generating report...
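As a rough illustration of the steps above, here is a minimal sketch restricted to a single input variable and plain numpy. The name fit_sketch, its arguments, and the choice of a Legendre basis are assumptions made for illustration only; this is not the equadratures API, and the PCA and compressed-sensing stages are left out.

```python
import numpy as np
from numpy.polynomial import legendre


def fit_sketch(x, y, degree=3, test_fraction=0.25, seed=0):
    """Hypothetical helper: scale, split, least-squares fit, verify."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    if x.size != y.size:
        raise ValueError("x and y must have the same number of samples")

    # checking input data: scale the inputs to [-1, 1]
    lo, hi = x.min(), x.max()
    if hi == lo:
        raise ValueError("x must contain at least two distinct values")
    z = 2.0 * (x - lo) / (hi - lo) - 1.0

    # split the provided data into train and test sets
    rng = np.random.default_rng(seed)
    idx = rng.permutation(x.size)
    n_test = max(1, int(round(test_fraction * x.size)))
    test, train = idx[:n_test], idx[n_test:]

    # fitting baseline polynomial model (least squares, Legendre basis)
    A_train = legendre.legvander(z[train], degree)
    coeffs, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)

    # verifying accuracy: R^2 on the held-out test set
    pred = legendre.legval(z[test], coeffs)
    ss_res = np.sum((y[test] - pred) ** 2)
    ss_tot = np.sum((y[test] - y[test].mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return coeffs, r2


x = np.linspace(0.0, 4.0, 50)
y = x**3 - 2.0 * x
coeffs, r2 = fit_sketch(x, y)
```

Because the data here come from an exact cubic, the held-out R^2 should be essentially 1; with noisy data the same number gives a quick check on whether the chosen degree is adequate.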
Similar steps can be designed for the other functions above. For instance, eq.uq(parameters, function, correlation) will provide basic polynomial chaos functionality with output moments and Sobol’ indices.
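To make the polynomial chaos idea concrete, the following is a minimal sketch for one input that is uniformly distributed on [-1, 1]. The function name uq_sketch and its signature are hypothetical and not part of equadratures. It projects the model onto orthonormal Legendre polynomials with Gauss-Legendre quadrature and reads the mean and variance off the coefficients; Sobol’ indices only become meaningful with several inputs, so they are omitted.

```python
import numpy as np
from numpy.polynomial import legendre


def uq_sketch(function, order=5):
    """Hypothetical helper: mean and variance of function(x), x ~ U[-1, 1]."""
    # Gauss-Legendre rule; the weights sum to 2, so halve them to
    # integrate against the uniform probability density 1/2.
    nodes, weights = legendre.leggauss(order + 1)
    weights = weights / 2.0
    values = np.array([function(x) for x in nodes])

    # Orthonormal Legendre basis: p_k(x) = sqrt(2k + 1) * P_k(x).
    norms = np.sqrt(2.0 * np.arange(order + 1) + 1.0)
    basis = legendre.legvander(nodes, order) * norms

    # Projection coefficients c_k = E[f(x) p_k(x)].
    coeffs = basis.T @ (weights * values)

    mean = coeffs[0]                      # zeroth coefficient
    variance = np.sum(coeffs[1:] ** 2)    # sum of squares of the rest
    return mean, variance


mean, variance = uq_sketch(lambda x: x**2)
```

For f(x) = x^2 the exact values are a mean of 1/3 and a variance of 4/45, which the sketch recovers because the quadrature rule is exact for the polynomials involved.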
Expected outcomes:
- The development of an eq.py file that has the above four functions coded. As part of this deliverable, the GSoC applicant will have a working understanding of existing workflows for the above tasks.
- The ability to output a PDF report summarising the main steps involved in each task.
- The development of a test_eq.py file that will systematically test the above code.
Project mentors:
- Pranay Seshadri
- Bryn Noel Ubald
- Andrew Duncan
Skills required:
- A working Python background, with familiarity with numpy and scipy functionality.
- The GSoC contributor should have taken college-level courses in linear algebra and probability.
Project rating:
- Medium
Project size:
- 350 hours
Useful references:
Project 2 title: Computing univariate polynomial roots
Project description:
It is well known that the roots of a univariate orthogonal polynomial expansion can be computed by solving an eigenvalue problem. This requires expressing the polynomial expansion in terms of a colleague matrix, which is built from the three-term recurrence relation of the underlying orthogonal polynomials. In this project, the student / GSoC contributor will add such a utility to the base distributions in equadratures. Furthermore, they will demonstrate how this functionality may be used within an optimisation context, i.e., via derivative polynomials.
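For the Chebyshev case, the idea can be sketched as follows. This is an illustrative sketch in plain numpy, not equadratures code: the helper name chebyshev_roots is hypothetical, and the matrix follows the standard colleague-matrix construction described in Chapter 18 of Trefethen (2013). The last lines show the optimisation use: stationary points of the polynomial are the roots of its derivative.

```python
import numpy as np
from numpy.polynomial import chebyshev


def chebyshev_roots(a):
    """Hypothetical helper: roots of sum_k a[k] * T_k(x) via eigenvalues."""
    a = np.asarray(a, dtype=float)
    n = a.size - 1                       # polynomial degree
    if n < 1 or a[-1] == 0.0:
        raise ValueError("need degree >= 1 and a nonzero leading coefficient")
    if n == 1:
        return np.array([-a[0] / a[1]])  # a0 + a1 * x = 0

    # Rows encode the recurrence x*T_0 = T_1 and
    # x*T_k = (T_{k-1} + T_{k+1}) / 2 for k >= 1.
    C = np.zeros((n, n))
    C[0, 1] = 1.0
    for k in range(1, n):
        C[k, k - 1] = 0.5
        if k + 1 < n:
            C[k, k + 1] = 0.5

    # In the last row, T_n is eliminated using p(x) = 0.
    C[-1, :] -= a[:-1] / (2.0 * a[-1])
    return np.linalg.eigvals(C)


# p(x) = T_3(x) - T_1(x) = 4x^3 - 4x
a = np.array([0.0, -1.0, 0.0, 1.0])
roots = chebyshev_roots(a)

# Stationary points of p are the roots of p'(x) = 12x^2 - 4.
stationary = chebyshev_roots(chebyshev.chebder(a))
```

Here the roots are -1, 0 and 1, and the stationary points are ±1/sqrt(3). For other orthogonal families the same pattern applies, with the rows of the matrix filled from that family's recurrence coefficients.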
Expected outcomes:
- The development of a function in distributions/template.py, the other distribution files, and the parent parameter.py that returns the roots of a univariate Poly instance.
- The development of a test_roots.py file that will systematically test the above code.
- A demonstration / tutorial of this capability towards an analytical optimisation problem.
Project mentors:
- Pranay Seshadri
- Bryn Noel Ubald
- Tiziano Ghisu
Required skills:
- A working Python background, with familiarity with numpy and scipy functionality.
- The GSoC contributor should have taken a college-level linear algebra course.
Project rating:
- Medium
Project size:
- 175 hours
Useful references:
- Chapter 18 in Trefethen (2013), “Approximation theory and approximation practice”, SIAM.
- Higham, N. blog post on “What is a companion matrix?”
Project 3 title: Bayesian polynomial regression
Project description:
This project is deliberately designed to be more open-ended than the aforementioned projects. Our core development team has been building various prototypes focused on Bayesian polynomial regression. Essentially, one uses bits of equadratures code within a probabilistic package, e.g., numpyro or pymc3, for either
- building multiple similar (or coregional) polynomial models. This may be used to build models for phenomena that have some intrinsic similarity but are observed through different data sets, or
- constructing a hierarchical polynomial model where the coefficients are effectively constrained, e.g., sparsity-promoting models.
The primary ingredient used from probabilistic packages is the Markov chain Monte Carlo sampler.
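As a point of reference for what such a sampler computes, the simplest Bayesian polynomial regression has a closed-form answer. The sketch below is a swapped-in illustration, not the prototypes mentioned above and not MCMC: it assumes a Legendre basis, a zero-mean Gaussian prior on the coefficients, and a known noise level, in which case the posterior is Gaussian and needs no sampling. The name gaussian_posterior and all parameter values are hypothetical. Hierarchical or sparsity-promoting priors break this conjugacy, which is where a package such as numpyro or pymc3 comes in.

```python
import numpy as np
from numpy.polynomial import legendre


def gaussian_posterior(x, y, degree, noise_std, prior_precision):
    """Hypothetical helper: posterior mean and covariance of the coefficients.

    Model: y = Phi(x) @ w + noise, noise ~ N(0, noise_std^2),
    prior w ~ N(0, I / prior_precision).
    """
    Phi = legendre.legvander(np.asarray(x, dtype=float), degree)
    y = np.asarray(y, dtype=float)
    precision = Phi.T @ Phi / noise_std**2 + prior_precision * np.eye(degree + 1)
    cov = np.linalg.inv(precision)
    mean = cov @ (Phi.T @ y) / noise_std**2
    return mean, cov


# Synthetic data from a known quadratic (coefficients in the Legendre basis).
rng = np.random.default_rng(0)
true_coeffs = np.array([1.0, 2.0, -0.5])
x = rng.uniform(-1.0, 1.0, size=200)
y = legendre.legval(x, true_coeffs) + 0.05 * rng.normal(size=x.size)

mean, cov = gaussian_posterior(x, y, degree=2, noise_std=0.05, prior_precision=1e-2)
std = np.sqrt(np.diag(cov))  # posterior standard deviation per coefficient
```

The posterior mean should land close to the true coefficients, and the standard deviations give the per-coefficient uncertainty that a sampler would otherwise estimate from its draws.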
Expected outcomes:
- The development of templates (building upon some of our existing work) that combine equadratures building blocks with probabilistic elements.
- Test functions in a test_polybayes.py file that appropriately test the above utilities for data-driven problems.
- One to two tutorials that discuss how the above templates can be used for addressing a broader class of problems.
Project mentors:
- Pranay Seshadri
- Bryn Noel Ubald
- Andrew Duncan
- Chun Yui Wong
Skills required:
- A working Python background, with familiarity with numpy and scipy functionality, and some prior experience with either pymc3 or numpyro.
- The GSoC contributor should have taken a college-level linear algebra course.
- The GSoC contributor should be familiar with the basics of Bayesian inference and polynomial regression.
Project rating:
- Hard
Project size:
- 350 hours