module optimization: examples of use of each function

This webpage is for programmers who need examples of use of the functions of the class. The examples are designed to illustrate the syntax. They do not correspond to any meaningful model. For examples of models, visit biogeme.epfl.ch.

In [1]:
import datetime
print(datetime.datetime.now())
2019-09-25 14:00:46.945438
In [2]:
import biogeme.version as ver
print(ver.getText())
biogeme 3.2.2 [September 25, 2019]
Version entirely written in Python
Home page: http://biogeme.epfl.ch
Submit questions to https://groups.google.com/d/forum/biogeme
Michel Bierlaire, Transport and Mobility Laboratory, Ecole Polytechnique Fédérale de Lausanne (EPFL)

In [3]:
import biogeme.optimization as opt
In [4]:
import numpy as np
In [5]:
import biogeme.messaging as msg
logger = msg.bioMessage()
logger.setDetailed()

Modified Cholesky factorization by Schnabel and Eskow (1999)

Example by Eskow and Schnabel, 1991

In [6]:
A = np.array([[0.3571,-0.1030,0.0274,-0.0459],[-0.1030,0.2525,0.0736,-0.3845],[0.0274,0.0736,0.2340,-0.2878],[-0.0459,-0.3845,-0.2878,0.5549]])
A
Out[6]:
array([[ 0.3571, -0.103 ,  0.0274, -0.0459],
       [-0.103 ,  0.2525,  0.0736, -0.3845],
       [ 0.0274,  0.0736,  0.234 , -0.2878],
       [-0.0459, -0.3845, -0.2878,  0.5549]])
In [7]:
L,E,P = opt.schnabelEskow(A)
L
Out[7]:
array([[ 0.7449161 ,  0.        ,  0.        ,  0.        ],
       [-0.06161768,  0.59439319,  0.        ,  0.        ],
       [-0.38635223,  0.00604629,  0.46941286,  0.        ],
       [-0.51616551, -0.22679419, -0.26511936,  0.00152451]])

The factor $L$ is such that $A + E = PLL^TP^T$. Therefore, the expression below should be the null matrix.

In [8]:
P @ L @ L.T @ P.T - E - A
Out[8]:
array([[-5.55111512e-17, -1.38777878e-17,  0.00000000e+00,
         0.00000000e+00],
       [-1.38777878e-17,  5.55111512e-17,  0.00000000e+00,
         0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
         0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
         0.00000000e+00]])

Example by Schnabel and Eskow (1999)

In [9]:
A2 = np.array([[1890.3,-1705.6,-315.8,3000.3],[-1705.6,1538.3,284.9,-2706.6],[-315.8,284.9,52.5,-501.2],[3000.3,-2706.6,-501.2,4760.8]])
A2
Out[9]:
array([[ 1890.3, -1705.6,  -315.8,  3000.3],
       [-1705.6,  1538.3,   284.9, -2706.6],
       [ -315.8,   284.9,    52.5,  -501.2],
       [ 3000.3, -2706.6,  -501.2,  4760.8]])
In [10]:
L,E,P = opt.schnabelEskow(A2)
L
Out[10]:
array([[ 6.89985507e+01,  0.00000000e+00,  0.00000000e+00,
         0.00000000e+00],
       [-7.26392069e+00,  3.19413321e-01,  0.00000000e+00,
         0.00000000e+00],
       [-3.92269109e+01, -1.28891153e-01,  4.44731720e-01,
         0.00000000e+00],
       [ 4.34835220e+01,  1.90522168e-01,  3.34584739e-01,
         1.71484739e-03]])

The factor $L$ is such that $A + E = PLL^TP^T$. Therefore, the expression below should be the null matrix.

In [11]:
P @ L @ L.T @ P.T - E - A2
Out[11]:
array([[ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
         0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
         0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
         0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
        -9.09494702e-13]])

Rosenbrock problem

In [12]:
def rosenbrock(x,hessian=False):
    n = len(x)
    f = sum(100.0 * (x[i+1]-x[i]**2)**2 + (1.0-x[i])**2 for i in range(n-1))
    g = np.zeros(n)
    for i in range(n-1):
        g[i] = g[i] - 400 * x[i] * (x[i+1]-x[i]**2)  - 2 * (1-x[i])
        g[i+1] = g[i+1] + 200 * (x[i+1]-x[i]**2)
    if hessian:
        H = np.zeros((n,n))
        for i in range(n-1):
            H[[i],[i]] = H[[i],[i]] - 400 * x[i+1] + 1200 * x[i]**2 + 2
            H[[i+1],[i]] = H[[i+1],[i]] - 400 * x[i]
            H[[i],[i+1]] = H[[i],[i+1]] - 400 * x[i]
            H[[i+1],[i+1]] = H[[i+1],[i+1]] + 200
        return f,g,H
    else:
        return f,g,None

Line search algorithm

In [13]:
x = np.array([-1.5,1.5])
def function(x):
    f,g,h = rosenbrock(x)
    return f,g
f,g = function(x)
alpha,nfev = opt.lineSearch(function,x,-g)
print(f"alpha={alpha} nfev={nfev}")
alpha=0.0009765625 nfev=12

Newton with line search

In [14]:
x0 = np.array([-1.5,1.5])
xstar,nit,nfev,msg = opt.newtonLineSearch(rosenbrock,x0)
xstar
Out[14]:
array([1., 1.])
In [15]:
print(f"nit={nit} nfev={nfev} {msg}")
nit=23 nfev=79 Relative gradient = 8.914327054272941e-11 <= 6.06273418136464e-06

Biogeme example

In [16]:
import pandas as pd
import biogeme.biogeme as bio
import biogeme.database as db
from biogeme.expressions import *
df = pd.DataFrame({'Person':[1,1,1,2,2],
                   'Exclude':[0,0,1,0,1],
                   'Variable1':[1,2,3,4,5],
                   'Variable2':[10,20,30,40,50],
                   'Choice':[1,2,3,1,2],
                   'Av1':[0,1,1,1,1],
                   'Av2':[1,1,1,1,1],
                   'Av3':[0,1,1,1,1]})
myData = db.Database('test',df)

Choice=Variable('Choice')
Variable1=Variable('Variable1')
Variable2=Variable('Variable2')
beta1 = Beta('beta1',0,None,None,0)
beta2 = Beta('beta2',0,None,None,0)
V1 = beta1 * Variable1
V2 = beta2 * Variable2
V3 = 0
V ={1:V1,2:V2,3:V3}

likelihood = bioLogLogit(V,av=None,choice=Choice)
myBiogeme = bio.BIOGEME(myData,likelihood)
myBiogeme.modelName = 'simpleExample'
print(myBiogeme)
[14:00:47] < General >   Remove 5 unused variables from the database as only 3 are used.
simpleExample: database [test]{'loglike': bioLogLogit1:(beta1(0) * Variable1),2:(beta2(0) * Variable2),3:`0`)}
simpleExample: database [test]{'loglike': bioLogLogit1:(beta1(0) * Variable1),2:(beta2(0) * Variable2),3:`0`)}

First with the default optimization algorithm, from scipy. We include an irrelevant parameter to illustrate the warning.

In [17]:
results = myBiogeme.estimate(algoParameters={'myparam':3})
results.getEstimatedParameters()
[14:00:47] < General >   Log likelihood:  -5.493061
[14:00:47] < General >   Minimize with tol 9.999999999999999e-05
[14:00:47] < General >   Log likelihood:  -5.493061 Gradient norm:      2e+01
[14:00:47] < General >   Log likelihood:  -80.00009 Gradient norm:      8e+01
[14:00:47] < General >   Log likelihood:  -6.407938 Gradient norm:      5e+01
[14:00:47] < General >   Log likelihood:  -5.338499 Gradient norm:          1
[14:00:47] < General >   Log likelihood:  -5.337302 Gradient norm:          1
[14:00:47] < General >   Log likelihood:  -5.335695 Gradient norm:          1
[14:00:47] < General >   Log likelihood:  -5.329761 Gradient norm:          3
[14:00:47] < General >   Log likelihood:  -5.318083 Gradient norm:          5
[14:00:47] < General >   Log likelihood:  -5.295959 Gradient norm:          6
[14:00:47] < General >   Log likelihood:  -5.274077 Gradient norm:          4
[14:00:47] < General >   Log likelihood:   -5.26784 Gradient norm:          1
[14:00:47] < General >   Log likelihood:  -5.266889 Gradient norm:        0.3
[14:00:47] < General >   Log likelihood:    -5.2668 Gradient norm:       0.01
[14:00:47] < General >   Log likelihood:  -5.266799 Gradient norm:      0.002
[14:00:47] < General >   Log likelihood:  -5.266799 Gradient norm:      2e-05
[14:00:47] < General >   Log likelihood:  -5.266799 Gradient norm:      2e-05
[14:00:48] < General >   Results saved in file simpleExample~02.html
[14:00:48] < General >   Results saved in file simpleExample~03.pickle
/Users/michelbierlaire/anaconda3/envs/python37/lib/python3.7/site-packages/biogeme-3.2.2-py3.7-macosx-10.9-x86_64.egg/biogeme/optimization.py:89: OptimizeWarning: Unknown solver options: myparam
  results =  sc.minimize(f_and_grad,initBetas,bounds=bounds,jac=True,options=opts)
Out[17]:
Value Std err t-test p-value Rob. Std err Rob. t-test Rob. p-value
beta1 0.144546 0.391017 0.369666 0.711632 0.366198 0.394720 0.693049
beta2 0.023502 0.036947 0.636086 0.524720 0.034280 0.685573 0.492982
In [18]:
print(f"nit={results.data.numberOfIterations} nfev={results.data.numberOfFunctionEval}: {results.data.optimizationMessage}")
nit=12 nfev=15: b'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL'
In [19]:
results = myBiogeme.estimate(algorithm=opt.newtonLineSearchForBiogeme)
results.getEstimatedParameters()
[14:00:48] < General >   Log likelihood:  -5.493061
[14:00:48] < General >   Log likelihood:  -5.493061 Gradient norm:      2e+01
[14:00:48] < General >   Log likelihood:  -5.493061 Gradient norm:      2e+01
[14:00:48] < General >   Log likelihood:  -5.271161 Gradient norm:        0.8
[14:00:48] < General >   Log likelihood:  -5.271161 Gradient norm:        0.8
[14:00:48] < General >   Log likelihood:  -5.271161 Gradient norm:        0.8
[14:00:48] < General >   Log likelihood:  -5.266802 Gradient norm:       0.08
[14:00:48] < General >   Log likelihood:  -5.266802 Gradient norm:       0.08
[14:00:48] < General >   Log likelihood:  -5.266802 Gradient norm:       0.08
[14:00:48] < General >   Log likelihood:  -5.266799 Gradient norm:      2e-06
[14:00:48] < General >   Log likelihood:  -5.266799 Gradient norm:      2e-06
[14:00:48] < General >   Log likelihood:  -5.266799 Gradient norm:      2e-06
[14:00:48] < General >   Results saved in file simpleExample~03.html
[14:00:48] < General >   Results saved in file simpleExample~04.pickle
Out[19]:
Value Std err t-test p-value Rob. Std err Rob. t-test Rob. p-value
beta1 0.144545 0.391017 0.369665 0.711632 0.366198 0.394720 0.693050
beta2 0.023502 0.036947 0.636086 0.524720 0.034280 0.685573 0.492982
In [20]:
print(f"nit={results.data.numberOfIterations} nfev={results.data.numberOfFunctionEval}: {results.data.optimizationMessage}")
nit=3 nfev=9: Relative gradient = 3.4306790844032097e-07 <= 6.06273418136464e-06

Changing the requested precision

In [21]:
results = myBiogeme.estimate(algorithm=opt.newtonLineSearchForBiogeme,algoParameters = {'tolerance': 0.1})
results.getEstimatedParameters()
[14:00:48] < General >   Log likelihood:  -5.493061
[14:00:48] < General >   Log likelihood:  -5.493061 Gradient norm:      2e+01
[14:00:48] < General >   Log likelihood:  -5.493061 Gradient norm:      2e+01
[14:00:48] < General >   Log likelihood:  -5.271161 Gradient norm:        0.8
[14:00:48] < General >   Log likelihood:  -5.271161 Gradient norm:        0.8
[14:00:48] < General >   Log likelihood:  -5.271161 Gradient norm:        0.8
[14:00:48] < General >   Log likelihood:  -5.266802 Gradient norm:       0.08
[14:00:48] < General >   Log likelihood:  -5.266802 Gradient norm:       0.08
[14:00:48] < General >   Log likelihood:  -5.266802 Gradient norm:       0.08
[14:00:48] < General >   Results saved in file simpleExample~04.html
[14:00:48] < General >   Results saved in file simpleExample~05.pickle
Out[21]:
Value Std err t-test p-value Rob. Std err Rob. t-test Rob. p-value
beta1 0.144317 0.390742 0.369340 0.711874 0.365691 0.394641 0.693108
beta2 0.023428 0.036930 0.634377 0.525835 0.034256 0.683887 0.494047
In [22]:
print(f"nit={results.data.numberOfIterations} nfev={results.data.numberOfFunctionEval}: {results.data.optimizationMessage}")
nit=2 nfev=6: Relative gradient = 0.015380029144199428 <= 0.1

CFSQP algorithm

In [23]:
results = myBiogeme.estimate(algorithm=None)
results.getEstimatedParameters()
[14:00:48] < General >   Log likelihood:  -5.493061
[14:00:48] < General >   Log likelihood:  -5.266799 Gradient norm:     0.0003
[14:00:48] < General >   Results saved in file simpleExample~05.html
[14:00:48] < General >   Results saved in file simpleExample~06.pickle
Out[23]:
Value Std err t-test p-value Rob. Std err Rob. t-test Rob. p-value
beta1 0.144544 0.391018 0.369661 0.711635 0.366199 0.394714 0.693054
beta2 0.023502 0.036947 0.636090 0.524718 0.034280 0.685578 0.492979
In [24]:
print(f"nit={results.data.numberOfIterations} nfev={results.data.numberOfFunctionEval}: {results.data.optimizationMessage}")
nit=7 nfev=20: b'Normal termination. Obj: 6.05545e-06 Const: 6.05545e-06'

Newton with trust region

In [25]:
x0 = np.array([-1.5,1.5])
xstar,nit,nfev,msg = opt.newtonTrustRegion(rosenbrock,x0)
xstar
[14:00:48] < Detailed >  f=    6.1675 relgrad=   1.2 delta= 2e+02 ++
[14:00:48] < Detailed >  f=    6.1675 relgrad=   1.2 delta=   3.7 -
[14:00:48] < Detailed >  f=    6.1675 relgrad=   1.2 delta=   1.8 -
[14:00:48] < Detailed >  f=    6.1675 relgrad=   1.2 delta=  0.92 -
[14:00:48] < Detailed >  f=  5.433107 relgrad=   9.3 delta=  0.92 +
[14:00:48] < Detailed >  f=  4.266973 relgrad=   2.7 delta=   1.8 ++
[14:00:48] < Detailed >  f=  4.266973 relgrad=   2.7 delta=  0.56 -
[14:00:48] < Detailed >  f=  3.606177 relgrad=   6.1 delta=  0.56 +
[14:00:48] < Detailed >  f=  2.820959 relgrad=   3.2 delta=   1.1 ++
[14:00:48] < Detailed >  f=  2.783838 relgrad=     7 delta=   1.1 +
[14:00:48] < Detailed >  f=  1.658737 relgrad=   1.9 delta=   2.2 ++
[14:00:48] < Detailed >  f=  1.658737 relgrad=   1.9 delta=   0.4 -
[14:00:48] < Detailed >  f=  1.658737 relgrad=   1.9 delta=   0.2 -
[14:00:48] < Detailed >  f=  1.272544 relgrad=   3.2 delta=   0.4 ++
[14:00:48] < Detailed >  f=  1.018202 relgrad=   9.3 delta=   0.4 +
[14:00:48] < Detailed >  f= 0.6558529 relgrad=   2.2 delta=  0.79 ++
[14:00:48] < Detailed >  f= 0.6558529 relgrad=   2.2 delta=  0.18 -
[14:00:48] < Detailed >  f= 0.4644766 relgrad=     7 delta=  0.35 ++
[14:00:48] < Detailed >  f= 0.2968939 relgrad=   4.6 delta=  0.71 ++
[14:00:48] < Detailed >  f= 0.1855262 relgrad=   5.8 delta=   1.4 ++
[14:00:48] < Detailed >  f=0.09881178 relgrad=   2.3 delta=   2.8 ++
[14:00:48] < Detailed >  f=0.05800672 relgrad=   7.9 delta=   5.7 ++
[14:00:48] < Detailed >  f=0.01907516 relgrad=  0.58 delta=    11 ++
[14:00:48] < Detailed >  f= 0.0114917 relgrad=   6.1 delta=    11 +
[14:00:48] < Detailed >  f=0.0005979128 relgrad= 0.047 delta=    23 ++
[14:00:48] < Detailed >  f=3.210182e-05 relgrad=  0.36 delta=    45 ++
[14:00:48] < Detailed >  f=4.95668e-09 relgrad=0.00013 delta=    91 ++
[14:00:48] < Detailed >  f=2.471163e-15 relgrad=3.1e-06 delta=1.8e+02 ++
Out[25]:
array([0.99999999, 0.99999998])
In [26]:
print(f"nit={nit} nfev={nfev} {msg}")
nit=28 nfev=28 Relative gradient = 3.1444351059358915e-06 <= 6.06273418136464e-06

Biogeme example

In [27]:
trBiogeme = bio.BIOGEME(myData,likelihood)
[14:00:48] < General >   Remove 0 unused variables from the database as only 3 are used.
In [28]:
results = trBiogeme.estimate(algorithm=opt.newtonTrustRegionForBiogeme)
results.getEstimatedParameters()
[14:00:48] < General >   Log likelihood:  -5.493061
[14:00:48] < General >   Log likelihood:  -5.493061 Gradient norm:      2e+01
[14:00:48] < General >   Log likelihood:  -5.271161 Gradient norm:        0.8
[14:00:48] < Detailed >  f=  5.271161 relgrad=  0.14 delta= 2e+02 ++
[14:00:48] < General >   Log likelihood:  -5.266802 Gradient norm:       0.08
[14:00:48] < Detailed >  f=  5.266802 relgrad= 0.015 delta= 4e+02 ++
[14:00:48] < General >   Log likelihood:  -5.266799 Gradient norm:      2e-06
[14:00:48] < Detailed >  f=  5.266799 relgrad=3.4e-07 delta= 8e+02 ++
[14:00:48] < General >   Log likelihood:  -5.266799 Gradient norm:      2e-06
[14:00:48] < General >   Results saved in file biogemeModelDefaultName.html
[14:00:48] < General >   Results saved in file biogemeModelDefaultName.pickle
Out[28]:
Value Std err t-test p-value Rob. Std err Rob. t-test Rob. p-value
beta1 0.144545 0.391017 0.369665 0.711632 0.366198 0.394720 0.693050
beta2 0.023502 0.036947 0.636086 0.524720 0.034280 0.685573 0.492982
In [29]:
print(f"nit={results.data.numberOfIterations} nfev={results.data.numberOfFunctionEval}: {results.data.optimizationMessage}")
nit=3 nfev=3: Relative gradient = 3.4306790844032097e-07 <= 6.06273418136464e-06

We illustrate the parameters. We use the truncated conjugate gradient instead of dogleg for the trust region subproblem, starting with a small trust region of radius 0.001, and a maximum of 3 iterations.

In [30]:
results = trBiogeme.estimate(algorithm=opt.newtonTrustRegionForBiogeme,algoParameters={'dogleg':False,'radius':0.001,'maxiter':3})
results.getEstimatedParameters()
[14:00:48] < General >   Log likelihood:  -5.493061
[14:00:48] < General >   Log likelihood:  -5.493061 Gradient norm:      2e+01
[14:00:48] < General >   Log likelihood:  -5.473675 Gradient norm:      2e+01
[14:00:48] < Detailed >  f=  5.473675 relgrad=   3.4 delta= 0.002 ++
[14:00:48] < General >   Log likelihood:  -5.438633 Gradient norm:      2e+01
[14:00:48] < Detailed >  f=  5.438633 relgrad=     3 delta= 0.004 ++
[14:00:48] < General >   Log likelihood:  -5.383858 Gradient norm:      1e+01
[14:00:48] < Detailed >  f=  5.383858 relgrad=   2.1 delta= 0.008 ++
[14:00:48] < General >   Log likelihood:  -5.383858 Gradient norm:      1e+01
[14:00:48] < General >   Results saved in file biogemeModelDefaultName~00.html
[14:00:48] < General >   Results saved in file biogemeModelDefaultName~00.pickle
Out[30]:
Value Std err t-test p-value Rob. Std err Rob. t-test Rob. p-value
beta1 0.000053 0.348093 0.000151 0.99988 0.308910 0.000170 0.999864
beta2 0.007000 0.032583 0.214830 0.82990 0.030019 0.233173 0.815627
In [31]:
print(f"nit={results.data.numberOfIterations} nfev={results.data.numberOfFunctionEval}: {results.data.optimizationMessage}")
nit=3 nfev=3: Maximum number of iterations reached: 3

Changing the requested precision

In [32]:
results = trBiogeme.estimate(algorithm=opt.newtonTrustRegionForBiogeme,algoParameters={'tolerance':0.1})
results.getEstimatedParameters()
[14:00:48] < General >   Log likelihood:  -5.493061
[14:00:48] < General >   Log likelihood:  -5.493061 Gradient norm:      2e+01
[14:00:48] < General >   Log likelihood:  -5.271161 Gradient norm:        0.8
[14:00:48] < Detailed >  f=  5.271161 relgrad=  0.14 delta= 2e+02 ++
[14:00:48] < General >   Log likelihood:  -5.266802 Gradient norm:       0.08
[14:00:48] < Detailed >  f=  5.266802 relgrad= 0.015 delta= 4e+02 ++
[14:00:48] < General >   Log likelihood:  -5.266802 Gradient norm:       0.08
[14:00:48] < General >   Results saved in file biogemeModelDefaultName~01.html
[14:00:48] < General >   Results saved in file biogemeModelDefaultName~01.pickle
Out[32]:
Value Std err t-test p-value Rob. Std err Rob. t-test Rob. p-value
beta1 0.144317 0.390742 0.369340 0.711874 0.365691 0.394641 0.693108
beta2 0.023428 0.036930 0.634377 0.525835 0.034256 0.683887 0.494047
In [33]:
print(f"nit={results.data.numberOfIterations} nfev={results.data.numberOfFunctionEval}: {results.data.optimizationMessage}")
nit=2 nfev=2: Relative gradient = 0.015380029144207524 <= 0.1
In [ ]: