Algorithms

Generic optimization algorithms.

biogeme.algorithms module

Optimization algorithms.

author

Michel Bierlaire

date

Sun Apr 5 16:48:54 2020

biogeme.algorithms.bfgs(H, d, y)[source]

Update the BFGS matrix. Formula (13.12) of Bierlaire (2015) where the method proposed by Powell (1977) is applied

Parameters
  • H (numpy.array (2D)) – current approximation of the inverse of the Hessian

  • d (numpy.array (1D)) – difference between two consecutive iterates.

  • y (numpy.array (1D)) – difference between two consecutive gradients.

Returns

updated approximation of the inverse of the Hessian.

Return type

numpy.array (2D)

biogeme.algorithms.bfgsLineSearch(fct, x0, initBfgs=None, eps=6.06273418136464e-06, maxiter=1000)[source]

BFGS method with inexact line search (Wolfe conditions)

Parameters
  • fct (optimization.functionToMinimize) – object to calculate the objective function and its derivatives.

  • x0 (numpy.array) – starting point

  • initBfgs (numpy.array) – matrix used to initialize BFGS. If None, the identity matrix is used. Default: None.

  • eps (float) – the algorithm stops when this precision is reached. Default: \(\varepsilon^{\frac{1}{3}}\)

  • maxiter (int) – the algorithm stops if this number of iterations is reached. Default: 1000

Returns

tuple x, messages, where

  • x is the solution found,

  • messages is a dictionary reporting various aspects related to the run of the algorithm.

Return type

numpy.array, dict(str:object)

Raises

biogeme.exceptions.biogemeError – if the dimensions of the matrix initBfgs do not match the length of x0.

biogeme.algorithms.bfgsTrustRegion(fct, x0, initBfgs=None, delta0=1.0, eps=6.06273418136464e-06, dl=False, maxiter=1000, eta1=0.01, eta2=0.9)[source]

BFGS method with trust region

Parameters
  • fct (optimization.functionToMinimize) – object to calculate the objective function and its derivatives.

  • x0 (numpy.array) – starting point

  • initBfgs (numpy.array) – matrix used to initialize BFGS. If None, the identity matrix is used. Default: None.

  • delta0 (float) – initial radius of the trust region. Default: 100.

  • eps (float) – the algorithm stops when this precision is reached. Default: \(\varepsilon^{\frac{1}{3}}\)

  • dl (bool) – If True, the Dogleg method is used to solve the trut region subproblem. If False, the truncated conjugate gradient is used. Default: False.

  • maxiter (int) – the algorithm stops if this number of iterations is reached. Default: 1000.

  • eta1 (float) – threshold for failed iterations. Default: 0.01.

  • eta2 (float) – threshold for very successful iterations. Default 0.9.

Returns

tuple x, messages, where

  • x is the solution found,

  • messages is a dictionary reporting various aspects related to the run of the algorithm.

Return type

numpy.array, dict(str:object)

Raises

biogeme.exceptions.biogemeError – if the dimensions of the matrix initBfgs do not match the length of x0.

class biogeme.algorithms.bioBounds(b)[source]

Bases: object

This class is designed for the management of simple bound constraints

__init__(b)[source]
Parameters

b (list(tuple)) – list of tuples (ell,u) containing the lower and upper bounds for each free parameter.

Raises

biogeme.exceptions.biogemeError – if the bounds are incompatible

activity(x, epsilon=2.220446049250313e-16)[source]

Determines the activity status of each variable.

Parameters
  • x (numpy.array) – point for which the activity must be determined.

  • epsilon (float) – a bound is considered active if the distance to it is less rhan epsilon.

Returns

a vector, same length as x, where each entry reports the activity of the corresponding variable:

  • 0 if no bound is active

  • -1 if the lower bound is active

  • 1 if the upper bound is active

Return type

numpy.array

Raises
bounds

list of tuples (ell,u) containing the lower and upper bounds for each free parameter.

breakpoints(x, d)[source]

Projects the direction d, starting from x, on the intersection of the bound constraints

Parameters
  • x (numpy.array) – current point

  • d (numpy.array) – search direction

Returns

list of tuple (index, value), where index is the index of the variable, and value the value of the corresponding breakpoint.

Return type

list(tuple(int,float))

Raises
feasible(x)[source]

Check if point verifies the bound constraints

Parameters

x (numpy.array) – point to project

Returns

True if x is feasible, False otherwise.

Return type

bool

Raises

biogeme.exceptions.biogemeError – if the dimensions are inconsistent

generalizedCauchyPoint(xk, gk, H, direction)[source]

Implementation of Step 2 of the Specific Algorithm by Conn et al. (1988).

The quadratic model is defined as

\[m(x) = f(x_k) + (x - x_k)^T g_k + \frac{1}{2} (x-x_k)^T H (x-x_k).\]
Parameters
  • xk (numpy.array. Dimension n.) – current point

  • gk (numpy.array. Dimension n.) – vector g involved in the quadratic model definition.

  • H (numpy.array. Dimension n x n.) – matrix H involved in the quadratic model definition.

  • direction (numpy.array. Dimension n.) – direction along which the GCP is searched.

Returns

generalized Cauchy point based on inexact line search.

Return type

numpy.array. Dimension n.

Raises
intersect(otherBounds)[source]

Create a bounds object representing the intersection of two regions.

Parameters

otherBounds (class bioBounds) – other bound object that must be intersected.

Returns

bound object, intersection of the two.

Return type

class bioBounds

Raises

biogeme.exceptions.biogemeError – if the dimensions are inconsistent

intersectionWithTrustRegion(x, delta)[source]

Create a bioBounds object representing the intersection between the feasible domain and the trust region.

Parameters
  • x (numpy.array) – center of the trust region

  • delta (float) – radius of the tust region (infinity norm)

Returns

intersection between the feasible region and the trus region

Return type

class bioBounds

Raises

biogeme.exceptions.biogemeError – if the dimensions are inconsistent

lowerBounds

List of lower bounds

maximumStep(x, d)[source]

Calculates the maximum step that can be performed along a direction while staying feasible.

Parameters
  • x (numpy.array) – reference point

  • d (numpy.array) – direction

Returns

the largest alpha such that x + alpha * d is feasible and the list of indices achieving this value.

Return type

float, int

Raises

biogeme.exceptions.biogemeError – if the point is infeasible

n

number of optimization variables

project(x)[source]

Project a point onto the feasible domain defined by the bounds.

Parameters

x (numpy.array) – point to project

Returns

projected point

Return type

numpy.array

Raises

biogeme.exceptions.biogemeError – if the dimensions are inconsistent

subspace(selectedVariables)[source]

Generate a bioBounds object for selected variables

Parameters

selectedVariables (numpy.array(bool)) – boolean vector. If an entry is True, the corresponding variables is considered.

Returns

bound object

Return type

class bioBounds

Raises

biogeme.exceptions.biogemeError – if the dimensions are inconsistent

upperBounds

List of upper bounds

biogeme.algorithms.cauchyNewtonDogleg(g, H)[source]

Calculate the Cauchy, the Newton and the dogleg points.

The Cauchy point is defined as

\[d_c = - \frac{\nabla f(x)^T \nabla f(x)} {\nabla f(x)^T \nabla^2 f(x) \nabla f(x)} \nabla f(x)\]

The Newton point \(d_n\) verifies Newton equation:

\[H_s d_n = - \nabla f(x)\]

where \(H_s\) is a positive definite matrix generated with the method by Schnabel and Eskow (1999).

The Dogleg point is

\[d_d = \eta d_n\]

where

\[\eta = 0.2 + 0.8 \frac{\alpha^2}{\beta |\nabla f(x)^T d_n|}\]

and \(\alpha= \nabla f(x)^T \nabla f(x)\), \(\beta=\nabla f(x)^T \nabla^2 f(x)\nabla f(x).\)

Parameters
  • g (numpy.array) – gradient \(\nabla f(x)\)

  • H (numpy.array) – hessian \(\nabla^2 f(x)\)

Returns

tuple with Cauchy point, Newton point, Dogleg point

Return type

numpy.array, numpy.array, numpy.array

Raises

biogeme.exceptions.biogemeError – if the quadratic model is not convex.

biogeme.algorithms.dogleg(g, H, delta)[source]

Find an approximation of the trust region subproblem using the dogleg method

Parameters
  • g (numpy.array) – gradient of the quadratic model.

  • H (numpy.array) – hessian of the quadratic model.

  • delta (float) – radius of the trust region.

Returns

d, diagnostic where

  • d is an approximate solution of the trust region subproblem

  • diagnostic is the nature of the solution:

    • -2 if negative curvature along Newton direction

    • -1 if negative curvature along Cauchy direction (i.e. along the gradient)

    • 1 if partial Cauchy step

    • 2 if Newton step

    • 3 if partial Newton step

    • 4 if Dogleg

Return type

numpy.array, int

class biogeme.algorithms.functionToMinimize[source]

Bases: object

This is an abstract class. The actual function to minimize must be implemented in a concrete class deriving from this one.

abstract f(batch=None)[source]

Calculate the value of the function

Parameters

batch (float) – for data driven functions (such as a log likelikood function), it is possible to approximate the value of the function using a sample of the data called a batch. This argument is a value between 0 and 1 representing the percentage of the data that should be used for thre random batch. If None, the full data set is used. Default: None pass

Returns

value of the function

Return type

float

abstract f_g(batch=None)[source]

Calculate the value of the function and the gradient

Parameters

batch (float) – for data driven functions (such as a log likelikood function), it is possible to approximate the value of the function using a sample of the data called a batch. This argument is a value between 0 and 1 representing the percentage of the data that should be used for the random batch. If None, the full data set is used. Default: None pass

Returns

value of the function and the gradient

Return type

tuple float, numpy.array

abstract f_g_bhhh(batch=None)[source]

Calculate the value of the function, the gradient and the BHHH matrix

Parameters

batch (float) – for data driven functions (such as a log likelikood function), it is possible to approximate the value of the function using a sample of the data called a batch. This argument is a value between 0 and 1 representing the percentage of the data that should be used for the random batch. If None, the full data set is used. Default: None pass

Returns

value of the function, the gradient and the BHHH

Return type

tuple float, numpy.array, numpy.array

abstract f_g_h(batch=None)[source]

Calculate the value of the function, the gradient and the Hessian

Parameters

batch (float) – for data driven functions (such as a log likelikood function), it is possible to approximate the value of the function using a sample of the data called a batch. This argument is a value between 0 and 1 representing the percentage of the data that should be used for the random batch. If None, the full data set is used. Default: None pass

Returns

value of the function, the gradient and the Hessian

Return type

tuple float, numpy.array, numpy.array

abstract setVariables(x)[source]

Set the values of the variables for which the function has to be calculated.

Parameters

x (numpy.array) – values

biogeme.algorithms.inverseBfgs(Hinv, d, y)[source]

Update the inverse BFGS matrix. Formula (13.13) of Bierlaire (2015)

Parameters
  • Hinv (numpy.array (2D)) – current approximation of the inverse of the Hessian

  • d (numpy.array (1D)) – difference between two consecutive iterates.

  • y (numpy.array (1D)) – difference between two consecutive gradients.

Returns

updated approximation of the inverse of the Hessian.

Return type

numpy.array (2D)

biogeme.algorithms.lineSearch(fct, x, f, g, d, alpha0=1.0, beta1=0.0001, beta2=0.99, lbd=2.0)[source]

Calculate a step along a direction that satisfies both Wolfe conditions

Parameters
  • fct (optimization.functionToMinimize) – object to calculate the objective function and its derivatives.

  • x (numpy.array) – current iterate.

  • f (float) – value of the objective function at the current iterate.

  • g (numpy.array) – value of the gradient at the current iterate.

  • d (numpy.array) – descent direction.

  • alpha0 (float) – first step to test.

  • beta1 (float) – parameter of the first Wolfe condition.

  • beta2 (float) – parameter of the second Wolfe condition.

  • lbd (float) – expansion factor for a short step.

Returns

a step verifing both Wolfe conditions

Return type

float

Raises
biogeme.algorithms.newtonLineSearch(fct, x0, eps=6.06273418136464e-06, maxiter=100)[source]

Newton method with inexact line search (Wolfe conditions)

Parameters
  • fct (optimization.functionToMinimize) – object to calculate the objective function and its derivatives.

  • x0 (numpy.array) – starting point

  • eps (float) – the algorithm stops when this precision is reached. Default: \(\varepsilon^{\frac{1}{3}}\)

  • maxiter (int) – the algorithm stops if this number of iterations is reached. Defaut: 100

Returns

x, messages

  • x is the solution generated by the algorithm,

  • messages is a dictionary describing information about the lagorithm

Return type

numpay.array, dict(str:object)

biogeme.algorithms.newtonTrustRegion(fct, x0, delta0=1.0, eps=6.06273418136464e-06, dl=False, maxiter=1000, eta1=0.01, eta2=0.9)[source]

Newton method with trust region

Parameters
  • fct (optimization.functionToMinimize) – object to calculate the objective function and its derivatives.

  • x0 (numpy.array) – starting point

  • delta0 (float) – initial radius of the trust region. Default: 100.

  • eps (float) – the algorithm stops when this precision is reached. Default: \(\varepsilon^{\frac{1}{3}}\)

  • dl (bool) – If True, the Dogleg method is used to solve the trut region subproblem. If False, the truncated conjugate gradient is used. Default: False.

  • maxiter (int) – the algorithm stops if this number of iterations is reached. Default: 1000.

  • eta1 (float) – threshold for failed iterations. Default: 0.01.

  • eta2 (float) – threshold for very successful iterations. Default 0.9.

Returns

tuple x, messages, where

  • x is the solution found,

  • messages is a dictionary reporting various aspects related to the run of the algorithm.

Return type

numpy.array, dict(str:object)

biogeme.algorithms.relativeChange(x, xpred, typx)[source]

Calculates the relative change.

It is typically used as stopping criterion.

Parameters
  • x (numpy.array) – current iterate.

  • xpred (numpy.array) – previous iterate.

  • typx (numpy.array) – typical value for x.

Returns

relative change:

\[\max_i \frac{|(x_k)_i - (x_{k-1})_i|}{\max(|(x_k)_i|, \text{typx}_i)}\]
Return type

float

biogeme.algorithms.relativeGradient(x, f, g, typx, typf)[source]

Calculates the relative gradients.

It is typically used as stopping criterion.

Parameters
  • x (numpy.array) – current iterate.

  • f (float) – value of f(x)

  • g (numpy.array) – \(\nabla f(x)\), gradient of f at x

  • typx (numpy.array) – typical value for x.

  • typf (float) – typical value for f.

Returns

relative gradient

\[\max_{i=1,\ldots,n}\frac{(\nabla f(x))_i \max(x_i,\text{typx}_i)} {\max(|f(x)|, \text{typf})}\]
Return type

float

biogeme.algorithms.schnabelEskow(A, tau=6.06273418136464e-06, taubar=3.6756745753887175e-11, mu=0.1)[source]

Modified Cholesky factorization by Schnabel and Eskow (1999).

If the matrix is ‘safely’ positive definite, the output is the classical Cholesky factor. If not, the diagonal elements are inflated in order to make it positive definite. The factor \(L\) is such that \(A + E = PLL^TP^T\), where \(E\) is a diagonal matrix contaninig the terms added to the diagonal, \(P\) is a permutation matrix, and \(L\) is w lower triangular matrix.

Parameters
Returns

tuple \(L\), \(E\), \(P\), where \(A + E = PLL^TP^T\).

Return type

numpy.array, numpy.array, numpy.array

Raises
biogeme.algorithms.simpleBoundsNewtonAlgorithm(fct, bounds, x0, proportionTrueHessian=1.0, infeasibleConjugateGradient=False, delta0=1.0, tol=6.06273418136464e-06, steptol=1e-05, cgtol=6.06273418136464e-06, maxiter=1000, eta1=0.01, eta2=0.9, enlargingFactor=10)[source]

Trust region algorithm for problems with simple bounds

Parameters
  • fct (optimization.functionToMinimize) – object to calculate the objective function and its derivatives.

  • bounds (class bounds) – bounds on the variables

  • x0 (numpy.array) – starting point

  • proportionTrueHessian (float) – proportion of the iterations where the true hessian is calculated. When not, the BFGS update is used. If 1.0, it is used for all iterations. If 0.0, it is not used at all.

  • infeasibleConjugateGradient (bool) – if True, the conjugate gradient algorithm may generate until termination. The result will then be projected on the feasible domain. If False, the algorithm stops as soon as an infeasible iterate is generated. Default: False.

  • delta0 (float) – initial radius of the trust region. Default: 100.

  • tol (float) – the algorithm stops when this precision is reached. Default: \(\varepsilon^{\frac{1}{3}}\)

  • steptol (float) – the algorithm stops when the relative change in x is below this threshold. Basically, if p significant digits of x are needed, steptol should be set to 1.0e-p. Default: \(10^{-5}\)

  • cgtol (float) – the conjugate gradient algorithm stops when this precision is reached. Default: \(\varepsilon^{\frac{1}{3}}\)

  • maxiter (int) – the algorithm stops if this number of iterations is reached. Default: 1000.

  • eta1 (float) – threshold for failed iterations. Default: 0.01.

  • eta2 (float) – threshold for very successful iterations. Default 0.9.

  • enlargingFactor (float) – if an iteration is very successful, the radius of the trust region is multiplied by this factor. Default 10.

Returns

x, messages

  • x is the solution generated by the algorithm,

  • messages is a dictionary describing information about the lagorithm

Return type

numpay.array, dict(str:object)

Raises

biogeme.exceptions.biogemeError – if the dimensions of the matrix initBfgs do not match the length of x0.

biogeme.algorithms.truncatedConjugateGradient(g, H, delta)[source]

Find an approximation of the trust region subproblem using the truncated conjugate gradient method

Parameters
  • g (numpy.array) – gradient of the quadratic model.

  • H (numpy.array) – hessian of the quadrartic model.

  • delta (float) – radius of the trust region.

Returns

d, diagnostic, where

  • d is the approximate solution of the trust region subproblem,

  • diagnostic is the nature of the solution:

    • 1 for convergence,

    • 2 if out of the trust region,

    • 3 if negative curvature detected.

    • 4 if a numerical problem has been encountered

Return type

numpy.array, int

biogeme.algorithms.truncatedConjugateGradientSubspace(xk, gk, Hk, delta, bounds, infeasibleIterate=False, tol=6.06273418136464e-06)[source]

Find an approximation of the solution of the trust region subproblem using the truncated conjugate gradient method within the subspace of free variables. Free variables are those corresponding to inactive constraints at the generalized Cauchy point.

Parameters
  • xk (numpy.array) – current iterate.

  • gk (numpy.array) – gradient of the quadratic model.

  • Hk (numpy.array) – hessian of the quadrartic model.

  • delta (float) – radius of the trust region.

  • bounds (class bioBounds) – bounds on the variables.

  • infeasibleIterate (bool) – if True, the algorithm may continue until termination. The result will then be projected on the feasible domain. If False, the algorithm stops as soon as an infeasible iterate is generated. Default: False.

  • tol (float) – tolerance used for stopping criterion.

Returns

d, diagnostic, where

  • d is the approximate solution of the trust region subproblem,

  • diagnostic is the nature of the solution:

    • 1 for convergence,

    • 2 if out of the trust region,

    • 3 if negative curvature detected.

    • 4 if a numerical problem has been encountered

Return type

numpy.array, int

Raises

biogeme.exceptions.biogemeError – if the dimensions are inconsistent

biogeme.algorithms.trustRegionIntersection(dc, d, delta)[source]

Calculates the intersection with the boundary of the trust region.

Consider a trust region of radius \(\delta\), centered at \(\hat{x}\). Let \(x_c\) be in the trust region, and \(d_c = x_c - \hat{x}\), so that \(\|d_c\| \leq \delta\). Let \(x_d\) be out of the trust region, and \(d_d = x_d - \hat{x}\), so that \(\|d_d\| \geq \delta\). We calculate \(\lambda\) such that

\[\| d_c + \lambda (d_d - d_c)\| = \delta\]
Parameters
  • dc (numpy.array) – xc - xhat.

  • d (numpy.array) – dd - dc.

  • delta (float) – radius of the trust region.

Returns

\(\lambda\) such that \(\| d_c + \lambda (d_d - d_c)\| = \delta\)

Return type

float