The GenericOpt scikit
Constrained and unconstrained optimization
Framework for generic optimizers
This new framework provides the usual building blocks of optimizers, which can be combined efficiently and changed easily (a sketch of how the modules fit together follows the list below).
- The optimizer module provides the current state to each submodule and calls them in the correct order
- The criterion module exposes the usual stopping criteria
- The step module provides different steps (descent directions), each with its own specific arguments
- The line search module searches for a minimum along the line defined by the previously computed step
- The helpers module provides a collection of usual optimization functions, such as a quadratic fit function
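As an illustration of how these modules cooperate, here is a minimal, self-contained sketch of the architecture. The functions and names below are placeholders written for this example; they are not the scikit's actual classes:

    import numpy as np

    def gradient_step(function, point, state):
        # "step" module: compute a descent direction and record it in the state
        direction = -function.gradient(point)
        state['direction'] = direction
        return direction

    def backtracking_line_search(function, point, direction, state):
        # "line search" module: shrink alpha until the cost decreases
        alpha = 1.
        while function(point + alpha * direction) >= function(point) and alpha > 1e-10:
            alpha *= .5
        state['alpha_step'] = alpha
        return point + alpha * direction

    def gradient_criterion(state, tol=1e-6):
        # "criterion" module: stop when the recorded direction is small
        return np.linalg.norm(state.get('direction', np.inf)) < tol

    def optimize(function, x0):
        # "optimizer" module: owns the state and calls each piece in order
        state = {'iteration': 0}
        point = np.asarray(x0, dtype=float)
        while not gradient_criterion(state):
            direction = gradient_step(function, point, state)
            point = backtracking_line_search(function, point, direction, state)
            state['iteration'] += 1
        return point

    class Parabola:
        # toy function following the protocol described in "What is a function?" below
        def __call__(self, x):
            return float(np.sum((x - 1.) ** 2))
        def gradient(self, x):
            return 2. * (x - 1.)

    print(optimize(Parabola(), [0., 0.]))  # converges close to [1., 1.]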
Idioms used in this framework:
- Object-oriented design, so that each module is responsible for its own actions (the function provides the cost, but also the gradient if needed, ...)
- Separation principle: the sub-modules use the optimizer state to communicate with one another, which allows them to be reused when needed
The separation principle is implemented with a state dictionary that is passed around between the different modules. Each time a critical value is computed (gradient, step, step modification, ...), it should be added to the state so that the other modules can use it if they need it.
This state dictionary can be passed as keyword arguments to a record object, registered with the optimizer through the formal argument record. This object is then called after each iteration, so that the internals of the optimizer can be saved; a sketch of such a record object follows.
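A record object can therefore be any callable accepting the state as keyword arguments. A minimal sketch (the History name and the commented registration line are illustrative; only the record argument itself is described above):

    class History:
        # accumulates a snapshot of the optimizer state at each iteration
        def __init__(self):
            self.iterations = []

        def __call__(self, **state):
            self.iterations.append(dict(state))

    history = History()
    # hypothetical registration, using the formal argument named above:
    # optimizer = SomeOptimizer(..., record=history)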
What is a function?
A function is an instance that possesses the following methods (a complete example follows the list):
- at least __call__(self, x), which returns the cost at point x
- possibly gradient(self, x), which returns the gradient at point x
- possibly hessian(self, x), which returns the Hessian at point x
- possibly hessianvect(self, x, v), which returns hessian(self, x) * v (computed in an optimized way)
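For instance, a quadratic cost f(x) = 1/2 x^T A x - b^T x can implement the whole protocol. The class itself is written for illustration and is not part of the scikit:

    import numpy as np

    class QuadraticCost:
        def __init__(self, A, b):
            self.A = np.asarray(A, dtype=float)
            self.b = np.asarray(b, dtype=float)

        def __call__(self, x):
            # cost at point x
            return .5 * np.dot(x, np.dot(self.A, x)) - np.dot(self.b, x)

        def gradient(self, x):
            # gradient at point x: A x - b
            return np.dot(self.A, x) - self.b

        def hessian(self, x):
            # Hessian at point x (constant for a quadratic)
            return self.A

        def hessianvect(self, x, v):
            # hessian(self, x) * v without forming the full matrix product
            return np.dot(self.A, v)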
A fitting function is somewhat different (a toy sketch follows the list):
- __call__(self, x, params) calls f(x) for some parameters and returns an array of values of the cost function. Its shape must be (len(x), d), where d is the number of values returned by a single call (if f is 2D and x has 10 points, the returned array has shape (10, 2))
- gradient(self, x, params) returns the gradient of f(x); its shape must be (x.shape, d, x.shape)
- hessian(self, x, params) returns the hessian of f(x); its shape must be (x.shape, d, x.shape, x.shape)
- hessianvect(self, x, v, params) returns the hessian of f(x) times a vector v; its shape must be (x.shape, d, x.shape)
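Here is a toy sketch of this interface for a scalar model (d = 1) evaluated at scalar points, where the shapes above are easy to check. Reading the derivatives as taken with respect to the evaluation points is an assumption here, as is the model itself; hessian and hessianvect would follow the same pattern:

    import numpy as np

    class SineModel:
        # f(x) = sin(a * x) with params = [a] (illustrative model, not part of the scikit)
        def __call__(self, x, params):
            a, = params
            x = np.asarray(x, dtype=float)
            # one value per point: shape (len(x), 1)
            return np.sin(a * x).reshape(-1, 1)

        def gradient(self, x, params):
            a, = params
            x = np.asarray(x, dtype=float)
            n = len(x)
            # d f(x_i) / d x_j is non-zero only when i == j: shape (n, 1, n)
            g = np.zeros((n, 1, n))
            g[np.arange(n), 0, np.arange(n)] = a * np.cos(a * x)
            return g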
This is a list of things to add to the framework:
- Parameter-based criteria
- Gradient-based criterion
- Composite criterion combining several other criteria
- Modify the current criteria to make them more modular, so that they do not include the iteration limit check?
- Criteria depending on a sliding window
- Cubic interpolation (the last exact line search still missing from the framework, IIRC)
- Goldstein conditions
- Barzilai & Borwein
- Second-order Armijo rule
- Second-order Wolfe-Powell rule
- Decorator to start the search from a factor times the last step:
  - from a fixed factor times the last step size
  - with an adaptive factor computed from the last and current gradients
- Line searches from a fixed number of initial alpha steps (based on polynomials?)
- Goldfeld et al.
- Conjugate gradient:
  - Fletcher-Reeves
  - PRP (Polak-Ribiere-Polyak) modification
- Inexact Newton's method
- Partial step (a decorator that returns a random part of a direction)
- Quadratic sum for fitting data
- Other sums for fitting data
- Finite Differences class
- Need to define hessianvect once it is needed
- Trust-region methods investigation
- Constrained optimization
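As a reminder for the Barzilai & Borwein item above: the step it refers to rescales the gradient direction with alpha_k = (s^T s) / (s^T y), where s = x_k - x_{k-1} and y = gradient_k - gradient_{k-1}. A minimal sketch of the formula, not an implementation for the framework:

    import numpy as np

    def barzilai_borwein_alpha(x, x_prev, grad, grad_prev):
        # first Barzilai & Borwein step size: (s^T s) / (s^T y)
        s = x - x_prev
        y = grad - grad_prev
        return np.dot(s, s) / np.dot(s, y)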