Solver subclasses

Hi all,

I’ve gone ahead and rewritten the solvers within solver.py so that each solver is now a subclass of the base Solver() class. This is in the feature_solverclasses branch at the moment, but I will merge it into develop shortly.

Previously, the Solver() class looked like the snippet below, with the actual solver stored as a lambda function in Solver().solver. This was then stored in Poly().solver, and the polynomial coefficients were obtained with x=Poly().solver(A,b).

class Solver(object):
    """
    Returns solver functions for solving Ax=b.

    :param string method: The method used for solving the linear system. Options include: ``compressed-sensing``, ``least-squares``, ``minimum-norm``, ``numerical-integration``, ``least-squares-with-gradients``, ``least-absolute-residual``, ``huber`` and ``elastic-net``.
    :param dict solver_args: Optional arguments centered around the specific solver.
        :param numpy.ndarray noise-level: The noise level to be used in the basis pursuit de-noising solver.
        :param bool verbose: Default value of this input is set to ``False``; when ``True``, a string is printed to the screen detailing the solver convergence and the condition number of the matrix.
    """
    def __init__(self, method, solver_args):
        self.method = method
        self.solver_args = solver_args
        self.param1 = None
        self.verbose = False
        if self.solver_args is not None:
            if 'param1' in self.solver_args: self.param1 = solver_args.get('param1')
            if 'verbose' in self.solver_args: self.verbose = solver_args.get('verbose')

        if self.method.lower() == 'least-squares':
            self.solver = lambda A, b: least_squares(A, b, self.verbose)

In the new approach, parameters and methods shared across all the solvers are defined in the base Solver() class, and each solver is then a subclass of this class, e.g.

# Least squares solver subclass.
################################
class least_squares(Solver):
    """
    Ordinary least squares solver.

    :param dict solver_args: Optional arguments for the solver.
        :param bool verbose: Default value of this input is set to ``False``; when ``True``, the solver prints information to screen.
    """
    def __init__(self, solver_args={}):
        # Init base class
        super().__init__('least-squares', solver_args)

    def solve(self, A, b):
        # From numpy 1.14 onwards, rcond=None is passed to opt into the new default behaviour
        if np.__version__ < '1.14':
            alpha = np.linalg.lstsq(A, b)
        else:
            alpha = np.linalg.lstsq(A, b, rcond=None)
        if self.verbose is True:
            print('The condition number of the matrix is ' + str(np.linalg.cond(A)) + '.')
        self.coefficients = alpha[0]

Now the entire solver subclass is stored within Poly().solver, and the coefficients are obtained with x=Poly().solver.get_coefficients(A, b). This approach has a number of advantages:

  • Cleaner code, e.g. it avoids polluting the namespace when multiple solvers have similar parameters or methods.
  • Each solver can have its own methods, e.g. I’m working on a plotting method to plot the regularisation path of the elastic-net solver.
  • For more complex iterative solvers, we could do things like warm-starting with the solver solution from a previously fitted Poly() (see the sketch after this list).
  • Can play around with the solvers themselves (see below!).
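
To illustrate the warm-starting point, here is a minimal sketch of what such a subclass could look like. It is entirely hypothetical (the class name, the max_iter argument and the warm-start logic are not in the branch); it just shows a toy gradient-descent solver that remembers its last solution so a subsequent solve() starts from it.

import numpy as np
from equadratures.solver import Solver

class warm_start_descent(Solver):
    """
    Hypothetical sketch: a toy gradient-descent solver that warm-starts
    from the solution of its previous solve() call.
    """
    def __init__(self, solver_args={}):
        super().__init__('warm-start-descent', solver_args)
        self.max_iter = 100
        if 'max_iter' in solver_args: self.max_iter = solver_args.get('max_iter')
        self.x0 = None  # solution from the previous fit, if any

    def solve(self, A, b):
        b = np.asarray(b).reshape(-1, 1)
        x = self.x0 if self.x0 is not None else np.zeros((A.shape[1], 1))
        step = 1.0 / np.linalg.norm(A, 2)**2  # step size small enough for convergence
        for _ in range(self.max_iter):
            x = x + step * A.T @ (b - A @ x)  # gradient step on 0.5*||Ax - b||^2
        self.coefficients = x
        self.x0 = x  # keep the solution around to warm-start the next solve()

How Poly() would actually hand the previous solver object to a new fit is a separate wiring question, but the subclass structure at least makes it possible.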

Creating Polynomials

In terms of the operation of Poly(), nothing has changed on the outside. For example, to define a polynomial with the elastic-net solver:

mypoly = eq.Poly(parameters=myparam, basis=mybasis, method='elastic-net', 
            sampling_args={'mesh':'user-defined', 
            'sample-points':X, 'sample-outputs':y},
            solver_args={'verbose':False,'path':True})  

Creating Solver objects

If you want to experiment with the solvers on your own \mathbf{A}\mathbf{x}=\mathbf{b} system, you can now define and use the solver objects independently. For example:

import equadratures.solver as solver 
# select_solver will return the requested solver subclass
mysolver = solver.Solver.select_solver('least-squares',solver_args={'verbose':True})
# Or you can select the solver subclass directly
#mysolver = solver.least_squares(solver_args={'verbose':True})

# You can manually run solve and then get_coefficients
#mysolver.solve(A,b)
#x = mysolver.get_coefficients()

# Or providing get_coefficients() with A and b will automatically run solve()
x = mysolver.get_coefficients(A,b)

Custom solver class

I’ve also added a custom-solver subclass, where the user can provide their own solver function within solver_args. The only requirements on the custom solver function are that it must accept A and b, and return the coefficients. Additional arguments can also be included in solver_args, and they will be piped into the custom function as **kwargs.

import numpy as np

# As an example, just copy the standard least-squares solver
def lsq(A, b, verbose=False):
    alpha = np.linalg.lstsq(A, b, rcond=None)
    if verbose is True:
        print('The condition number of the matrix is ' + str(np.linalg.cond(A)) + '.')
    return alpha[0]

mypoly = eq.Poly(parameters=myparam, basis=mybasis, method='custom-solver',
            sampling_args={'mesh':'user-defined', 'sample-points':X, 'sample-outputs':y},
            solver_args={'solve':lsq, 'verbose':True})

This could be useful for prototyping new solvers!


Fantastic stuff! Below is some code for the Kaczmarz method for solving a linear system iteratively. This works best if the linear system is square.

import numpy as np

def kaczmarz(A, b, verbose=False):
    m, n = A.shape
    x = np.random.rand(n, 1) # initial guess
    MAXITER = 50
    for iters in range(0, MAXITER):
        for i in range(0, m):
            # Project x onto the hyperplane defined by the i-th row of A
            a_row = A[i, :].reshape(1, n)
            term_3 = float(1.0 / (np.dot(a_row, a_row.T)))
            term_1 = float(b[i] - float(np.dot(a_row, x)))
            x = x + (term_1 * term_3 * a_row.T)
    return x
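
For anyone wanting to poke at it, here’s a quick test on a small, made-up square system (sizes and names purely illustrative):

import numpy as np

np.random.seed(0)
n = 20
A = np.random.rand(n, n) + n * np.eye(n)  # diagonally dominant, so well conditioned
x_true = np.random.rand(n, 1)
b = (A @ x_true).flatten()

x_est = kaczmarz(A, b)
print(np.linalg.norm(x_est - x_true))  # error should be small after the 50 sweeps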

Beyond playing around with MAXITER (I think you can get by with even 10-20 sweeps), I think the next step would be to figure out how to solve the system without requiring A in memory. Note that in the above, we only access one row of A at a time. However, it doesn’t make sense to regenerate all the rows of A MAXITER times either, as that would be considerably slower. I shall revisit this soon.
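
As a very rough sketch of what “not requiring A in memory” could look like: the sweep is unchanged, but instead of a stored matrix we pass a callback that builds the i-th row on demand (e.g. by evaluating the basis at the i-th sample point). The get_row argument here is purely an assumption for illustration.

import numpy as np

def kaczmarz_streaming(get_row, b, n, max_iter=50):
    # Same cyclic Kaczmarz sweep as above, but rows of A are generated on demand
    m = len(b)
    x = np.random.rand(n, 1)
    for _ in range(max_iter):
        for i in range(m):
            a_row = np.asarray(get_row(i)).reshape(1, n)
            residual = float(b[i]) - float(a_row @ x)
            x = x + (residual / float(a_row @ a_row.T)) * a_row.T
    return x

# Sanity check against the dense version, with get_row just slicing a stored A:
# x = kaczmarz_streaming(lambda i: A[i, :], b, A.shape[1])

Of course this just moves the cost into re-building each row max_iter times, which is exactly the slowness mentioned above.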


Very cool to see a nice sample problem to kick off our new solver prototyping functionality! :smiley:

Re the bit about not storing all of A in memory, it would be interesting to look at what sort of problems (if any) we might actually be hitting memory limits with. I think the memory footprint of a numpy array (assuming dtype=float64) is simply m*n*8/1e9 GB, so with n=10000 we’d be looking at m=40e3 to get us into problematic memory territory. I’ve never actually monitored memory usage when trying to fit a high-order, very high-dimensional Poly(), so I’m not sure whether the reason that gets slow is A being dumped into swap, or just numpy taking a long time to do its business!
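
A quick sanity check of that estimate (sizes purely illustrative), using numpy’s own nbytes:

import numpy as np

m, n = 40_000, 10_000
print(m * n * 8 / 1e9, 'GB')   # back-of-envelope estimate: 3.2 GB

A = np.zeros((4_000, 1_000))   # a smaller array, same formula
print(A.nbytes / 1e9, 'GB')    # 0.032 GB, matching 4e3 * 1e3 * 8 / 1e9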

I suppose all of this assumes that we could use cython or the like to speed up the iterative solver enough to make it feasible for larger problems. Interesting stuff in any case!
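
As a sketch of the “or the like” (just an illustration, nothing tested in the library): numba’s @njit will happily compile this kind of plain row loop, which might be enough before reaching for cython.

import numpy as np
from numba import njit

@njit
def kaczmarz_jit(A, b, max_iter=50):
    # Same cyclic Kaczmarz sweep as above, compiled by numba
    m, n = A.shape
    x = np.zeros(n)
    for _ in range(max_iter):
        for i in range(m):
            a_row = A[i, :]
            residual = b[i] - np.dot(a_row, x)
            x = x + (residual / np.dot(a_row, a_row)) * a_row
    return x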
