 # Piecewise Polynomials with Ridge Detection

Hello,

I have a regression problem for which I would like to try the Piecewise Polynomials with Ridge Detection capability in the code. I was wondering if you could give me a hand in this task as I don’t have much experience with this topic.

I have 3 quantities of interest that are described by 35 parameters. I would like to create analytical models for each of these such that I can employ them with a search algorithm for optimisation. For this task, I have sampled 284 candidates for which I know the true value of the functions.

How should I proceed for creating the polynomials? Should I be normalising my data in some way prior to fitting them? What splitting criterion is recommended?

I’m able to share my dataset if anyone’s interested in having a look, just let me know.


Hi @psesh. You can find it here:

It is a (284 x 38) matrix, where each row after the header is a sample. The first 35 columns contain the parameter values, and the last three, the values for the quantities of interest.
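In case it helps anyone replicating this, here is a minimal sketch of how the columns split into inputs and outputs. The synthetic frame below just stands in for the real file, and the `v1`…`v35` / `qoi1`–`qoi3` column names are assumptions:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the shared file: 284 samples, 35 parameters
# followed by the three quantities of interest (column names assumed).
rng = np.random.default_rng(0)
columns = ['v%d' % i for i in range(1, 36)] + ['qoi1', 'qoi2', 'qoi3']
data = pd.DataFrame(rng.standard_normal((284, 38)), columns=columns)

# Split by position, per the layout described above.
X = data.iloc[:, :35].values   # inputs, shape (284, 35)
Y = data.iloc[:, 35:].values   # outputs, shape (284, 3)
```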

Cheers!

Hi @DiegoLopez. I’ve had a first attempt at this. Given the few datapoints you have, I’ve only been able to apply linear regression trees. My original thought process was that once we have polynomials defined over each node, we can then compute their active subspace. This is possible because regression trees output a series of `Poly` instances, which can then be fed into the `Subspace` class.
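As a toy illustration of that second step (plain NumPy, not the equadratures `Subspace` API): for a function that varies only along a direction `w`, the leading eigenvector of the average outer product of model gradients recovers `w`.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy function f(x) = (w . x)^2: it varies only along w, so every
# gradient 2 (w . x) w points along w.
d = 5
w = np.ones(d) / np.sqrt(d)
X = rng.standard_normal((200, d))
grads = np.outer(2.0 * (X @ w), w)

# Average outer product of gradients; its leading eigenvector spans
# the (here one-dimensional) active subspace.
C = grads.T @ grads / len(grads)
eigvals, eigvecs = np.linalg.eigh(C)    # eigenvalues in ascending order
active_dir = eigvecs[:, -1]

alignment = abs(active_dir @ w)          # ~1.0: recovered direction is w
```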

For the regression trees, see the code below:

```python
import equadratures as eq
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Load the shared dataset (substitute the actual path/filename).
data = pd.read_csv('data.csv')

N, d = data.shape                                          # (284, 38)
input_var_names = ['v' + str(i) for i in range(1, d - 2)]  # v1 ... v35

X = data.iloc[:, :d - 3].values   # the 35 parameter columns
y = data['qoi1'].values           # first quantity of interest

# Linear polynomials per leaf, splitting on the loss gradient.
tree_1 = eq.PolyTree(splitting_criterion='loss_gradient', order=1)
tree_1.fit(X, y)
tree_1.get_splits()
```

This should return:

```
[[-0.6428257804365949, 18],
 [-0.2806524228882106, 23],
 [0.5000000092592589, 17],
 [0.3320158013032749, 12]]
```

We can then pull out the polynomial fitted over each leaf node:

```python
individual_polys = tree_1.get_polys()
individual_polys
```

This should return a list of polynomials:

```
[<equadratures.poly.Poly at 0x1329a0250>,
 ...]
```
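As an aside, assuming each entry returned by `get_splits()` is a `[threshold, feature_index]` pair (which is my reading of the output above), the indices can be mapped back to variable names for readability:

```python
# Hypothetical interpretation of get_splits(): [threshold, feature_index].
splits = [[-0.6428257804365949, 18],
          [-0.2806524228882106, 23],
          [0.5000000092592589, 17],
          [0.3320158013032749, 12]]

input_var_names = ['v' + str(i) for i in range(1, 36)]

# Map each split's feature index back to its variable name.
readable = ['%s < %.3f' % (input_var_names[idx], threshold)
            for threshold, idx in splits]
```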

In terms of output, here is what each polynomial yields:

```python
for little_poly in individual_polys:
    little_poly.plot_model_vs_data()
    fig, ax = little_poly.plot_sobol(order=1, show=False)
    ax.set_xticklabels(input_var_names, fontsize=8)
```

One can also use the online GraphViz tool to view the tree structure:

So what do we do next? My suggestion would be to see if you can generate more data / simulations, and also see if there is a strong physical rationale for the split locations. Hope this helps.

Hi @psesh. Thanks for this. I wanted to give your code a try myself, but I ran into an issue when constructing the `tree_1` object:

```
TypeError: __init__() got an unexpected keyword argument 'splitting_criterion'
```

I’m using equadratures version 9.0.0, which is the latest one I could get by doing

```
pip3 install equadratures --upgrade
```

I was able to clone the latest version from GitHub, but this produced the following error:

```
Exception: invalid splitting_criterion
```

It seems the only options available for `splitting_criterion` in this version are `model_aware` and `model_agnostic`.

Do you have any suggestions as to how I might get the `loss_gradient` option working for me?

A comment on the number of samples: each evaluation of qoi1, qoi2 and qoi3 is very expensive, so unfortunately I cannot afford more runs than I currently have. I just need to build the best-performing model possible with this number of runs and make the best of it.

Hi @DiegoLopez, sorry for the confusion here. The loss-gradient splitting criterion is currently in the `develop` branch; we plan to merge it into `master` (and update the pip release) this coming Friday.

In the meantime you could also try `model_aware` and `model_agnostic`, but you will be missing some of the plotting functionality @psesh used above. FYI, `model_aware` fits polynomials at each candidate split location (and then chooses the split which minimises the loss), so it can be quite slow when there are a large number of dimensions. `model_agnostic` instead uses the CART tree-induction algorithm to build the tree, and then fits polynomials to the leaves of the resulting tree. It is very fast, but the resulting model isn’t always as accurate.
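To illustrate the difference, here is a toy, pure-NumPy sketch of the model-agnostic idea (not the equadratures implementation): a variance-reduction split is found first, with no polynomial fitting in the loop, and polynomials are fitted to the leaves only afterwards.

```python
import numpy as np

# Toy 1-D data: two different linear pieces with a jump at x = 0.
x = np.linspace(-1.0, 1.0, 300)
y = np.where(x < 0.0, 1.0 - 2.0 * x, 10.0 + 3.0 * x)

# Model-agnostic (CART-style) split search: pick the threshold that
# minimises the summed within-leaf variance.  No polynomial is fitted
# while searching, which is what makes this criterion fast.
def split_cost(t):
    left, right = y[x < t], y[x >= t]
    return left.var() * len(left) + right.var() * len(right)

candidates = np.linspace(-0.9, 0.9, 181)
best_t = min(candidates, key=split_cost)   # lands at the kink, ~0.0

# Only once the tree structure is fixed are polynomials fitted to the
# leaves (here a degree-1 least-squares fit per side).
coef_left = np.polyfit(x[x < best_t], y[x < best_t], 1)     # ~[-2, 1]
coef_right = np.polyfit(x[x >= best_t], y[x >= best_t], 1)  # ~[3, 10]
```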

Alternatively, if you’d like to install the latest version from `develop`, you can install this for now with:

```
pip install git+https://github.com/Effective-Quadratures/equadratures.git@develop
```

Or you could use git to maintain a local version:

```
git clone https://github.com/Effective-Quadratures/equadratures.git
cd equadratures
git checkout --track origin/develop
```

and then install your local version via pip:

```
pip install -e <location_of_local_equadratures_repo>
```