This webpage is for programmers who need examples of how to use the functions of the BIOGEME class. The examples are designed to illustrate the syntax; they do not correspond to any meaningful model. For examples of models, visit biogeme.epfl.ch.
import datetime
print(datetime.datetime.now())
import biogeme.version as ver
print(ver.getText())
import biogeme.biogeme as bio
import biogeme.database as db
import pandas as pd
import numpy as np
from biogeme.expressions import Beta, Variable, exp
Define the verbosity of Biogeme
import biogeme.messaging as msg
logger = msg.bioMessage()
logger.setDetailed()
df = pd.DataFrame({'Person': [1, 1, 1, 2, 2],
                   'Exclude': [0, 0, 1, 0, 1],
                   'Variable1': [1, 2, 3, 4, 5],
                   'Variable2': [10, 20, 30, 40, 50],
                   'Choice': [1, 2, 3, 1, 2],
                   'Av1': [0, 1, 1, 1, 1],
                   'Av2': [1, 1, 1, 1, 1],
                   'Av3': [0, 1, 1, 1, 1]})
myData = db.Database('test',df)
Variable1=Variable('Variable1')
Variable2=Variable('Variable2')
beta1 = Beta('beta1',-1.0,-3,3,0)
beta2 = Beta('beta2',2.0,-3,10,0)
likelihood = -beta1**2 * Variable1 - exp(beta2*beta1) * Variable2 - beta2**4
simul = beta1 / Variable1 + beta2 / Variable2
dictOfExpressions = {'loglike':likelihood,'beta1':beta1,'simul':simul}
myBiogeme = bio.BIOGEME(myData,dictOfExpressions)
myBiogeme.modelName = 'simpleExample'
print(myBiogeme)
Note that, by default, Biogeme removes the unused variables from the database to optimize space.
myBiogeme.database.data.columns
myBiogeme.calculateInitLikelihood()
x = myBiogeme.betaInitValues
xplus = [v+1 for v in x]
print(xplus)
myBiogeme.calculateLikelihood(xplus,scaled=True)
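As a sanity check, the two likelihood values above can be reproduced by hand. The sketch below does not call Biogeme. It assumes that the log likelihood is the sum of the formula over the rows of the database, and that scaled=True divides this sum by the number of observations. The helper name loglike_by_hand is hypothetical.

```python
import numpy as np

# Columns of the example database that appear in the formula.
variable1 = np.array([1, 2, 3, 4, 5], dtype=float)
variable2 = np.array([10, 20, 30, 40, 50], dtype=float)


def loglike_by_hand(beta1, beta2, scaled=False):
    """Sum the example formula over all rows (hypothetical helper).

    Assumption: scaled=True divides the sum by the number of rows.
    """
    per_row = (-beta1**2 * variable1
               - np.exp(beta2 * beta1) * variable2
               - beta2**4)
    total = per_row.sum()
    return total / len(per_row) if scaled else total


# Initial values of the parameters: beta1 = -1, beta2 = 2.
init_loglike = loglike_by_hand(-1.0, 2.0)
# Shifted values (xplus): beta1 = 0, beta2 = 3.
shifted_loglike = loglike_by_hand(0.0, 3.0, scaled=True)
print(init_loglike, shifted_loglike)
```

Under these assumptions, the first value is -95 - 150 exp(-2), roughly -115.3003, and the second is -111.0.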
It is possible to calculate the likelihood using only a sample of the data.
myBiogeme.calculateLikelihood(xplus, scaled=True, batch=0.5)
myBiogeme.database.data
myBiogeme.calculateLikelihood(xplus, scaled=True, batch=0.6)
myBiogeme.database.data
By default, each observation has the same probability of being selected in the sample. It is possible to make the selection probability proportional to the values of a column of the database, by setting the attribute columnForBatchSamplingWeights to the name of that column.
myBiogeme.columnForBatchSamplingWeights = 'Variable2'
myBiogeme.calculateLikelihood(xplus, scaled=True, batch=0.6)
myBiogeme.database.data
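The effect of the batch size and of the sampling weights can be imitated directly with pandas. This is only a sketch of the idea, assuming that a batch is a random subset of the rows drawn without replacement. It relies on DataFrame.sample and does not call Biogeme, so the rows selected here need not match those selected above. The seed 42 is arbitrary.

```python
import pandas as pd

df = pd.DataFrame({
    'Variable1': [1, 2, 3, 4, 5],
    'Variable2': [10, 20, 30, 40, 50],
})

# Uniform sampling: every row has the same selection probability.
uniform_batch = df.sample(frac=0.6, random_state=42)

# Weighted sampling: the selection probability is proportional to Variable2.
weighted_batch = df.sample(frac=0.6, weights='Variable2', random_state=42)

print(uniform_batch)
print(weighted_batch)
```

With five rows and a fraction of 0.6, each batch contains three rows. In the weighted case, rows with large values of Variable2 are more likely to be kept.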
f,g,h,bhhh = myBiogeme.calculateLikelihoodAndDerivatives(xplus,scaled=True,hessian=True,bhhh=True)
print(f'f={f}')
print(f'g={g}')
print(f'h={h}')
print(f'bhhh={bhhh}')
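Because the example formula is simple, its derivatives can be written analytically and compared with the output above. The sketch below is not Biogeme code. It assumes that the gradient and the Hessian are sums over the observations, that BHHH is the sum of the outer products of the per-observation gradients, and that scaled=True divides each quantity by the number of observations.

```python
import numpy as np

variable1 = np.array([1, 2, 3, 4, 5], dtype=float)
variable2 = np.array([10, 20, 30, 40, 50], dtype=float)
beta1, beta2 = 0.0, 3.0  # the values stored in xplus
n = len(variable1)
e = np.exp(beta2 * beta1)

# Gradient of each row's contribution, one row per observation.
grad_rows = np.column_stack([
    -2 * beta1 * variable1 - beta2 * e * variable2,
    -beta1 * e * variable2 - 4 * beta2**3 * np.ones(n),
])
g = grad_rows.sum(axis=0)

# Second derivatives, summed over the observations.
h11 = np.sum(-2 * variable1 - beta2**2 * e * variable2)
h12 = np.sum(-(1 + beta1 * beta2) * e * variable2)
h22 = np.sum(-beta1**2 * e * variable2 - 12 * beta2**2)
h = np.array([[h11, h12], [h12, h22]])

# BHHH: sum of the outer products of the per-observation gradients.
bhhh = grad_rows.T @ grad_rows

# Assumption: scaled=True divides every quantity by the number of observations.
print(g / n, h / n, bhhh / n, sep='\n')
```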
Now the unscaled version
f,g,h,bhhh = myBiogeme.calculateLikelihoodAndDerivatives(xplus,scaled=False,hessian=True,bhhh=True)
print(f'f={f}')
print(f'g={g}')
print(f'h={h}')
print(f'bhhh={bhhh}')
Using only a sample of the data
f,g,h,bhhh = myBiogeme.calculateLikelihoodAndDerivatives(xplus,scaled=True,batch=0.5,hessian=True,bhhh=True)
print(f'f={f}')
print(f'g={g}')
print(f'h={h}')
print(f'bhhh={bhhh}')
myBiogeme.likelihoodFiniteDifferenceHessian(xplus)
f,g,h,gdiff,hdiff = myBiogeme.checkDerivatives(verbose=True)
print(f'f={f}')
print(f'g={g}')
print(f'h={h}')
print(f'gdiff={gdiff}')
print(f'hdiff={hdiff}')
hdiff
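The idea behind such a derivative check can be sketched without Biogeme: evaluate the analytical gradient, approximate it by finite differences, and inspect the difference. The sketch below uses central differences with a step of 1e-6, which may differ from the scheme used by Biogeme. All function names are hypothetical.

```python
import numpy as np

variable1 = np.array([1, 2, 3, 4, 5], dtype=float)
variable2 = np.array([10, 20, 30, 40, 50], dtype=float)


def loglike(x):
    """Example formula summed over the rows, with x = [beta1, beta2]."""
    b1, b2 = x
    return np.sum(-b1**2 * variable1 - np.exp(b2 * b1) * variable2 - b2**4)


def analytical_gradient(x):
    """Hand-derived gradient of loglike."""
    b1, b2 = x
    e = np.exp(b2 * b1)
    return np.array([
        np.sum(-2 * b1 * variable1 - b2 * e * variable2),
        np.sum(-b1 * e * variable2 - 4 * b2**3),
    ])


def finite_difference_gradient(f, x, step=1e-6):
    """Central finite-difference approximation of the gradient of f at x."""
    x = np.asarray(x, dtype=float)
    grad = np.zeros_like(x)
    for i in range(len(x)):
        shift = np.zeros_like(x)
        shift[i] = step
        grad[i] = (f(x + shift) - f(x - shift)) / (2 * step)
    return grad


x = np.array([-1.0, 2.0])  # initial values of beta1 and beta2
g = analytical_gradient(x)
g_fd = finite_difference_gradient(loglike, x)
gdiff = g - g_fd
print(gdiff)
```

Entries of gdiff that are close to zero suggest that the analytical gradient is consistent with the function.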
During estimation, it is possible to save intermediate results, in case the estimation must be interrupted.
results = myBiogeme.estimate(bootstrap=10,saveIterations=True)
results.getEstimatedParameters()
The saved intermediate results can be retrieved as follows.
Formula before
myBiogeme.loglike
Retrieving the values
myBiogeme.loadSavedIteration()
myBiogeme.loglike
A file name can be given. If the file does not exist, the statement is ignored.
myBiogeme.loadSavedIteration(filename='fileThatDoesNotExist.txt')
# Simulate with the default values for the parameters
simulationWithDefaultBetas = myBiogeme.simulate()
simulationWithDefaultBetas
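To see what the simulation computes, the 'beta1' and 'simul' expressions can be evaluated row by row with pandas. This sketch assumes that simulation simply evaluates each expression for every observation, using the initial values of the parameters. It omits the 'loglike' entry for brevity.

```python
import pandas as pd

df = pd.DataFrame({
    'Variable1': [1, 2, 3, 4, 5],
    'Variable2': [10, 20, 30, 40, 50],
})
beta1, beta2 = -1.0, 2.0  # initial values given in the Beta definitions

# One column per expression, one row per observation.
simulated = pd.DataFrame({
    'beta1': beta1,
    'simul': beta1 / df['Variable1'] + beta2 / df['Variable2'],
})
print(simulated)
```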
# Simulate with the estimated values for the parameters
print(results.getBetaValues())
simulationWithEstimatedBetas = myBiogeme.simulate(results.getBetaValues())
simulationWithEstimatedBetas
drawsFromBetas = results.getBetasForSensitivityAnalysis(myBiogeme.freeBetaNames)
left, right = myBiogeme.confidenceIntervals(drawsFromBetas)
left
right
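The principle of simulation-based confidence intervals can be sketched with NumPy. The sketch assumes that the bounds are empirical quantiles of the simulated quantity across the draws of the parameters, here the 5% and 95% quantiles for a 90% interval. The draws below come from a normal distribution with a made-up mean and covariance, chosen only for illustration. They are not estimation results.

```python
import numpy as np

rng = np.random.default_rng(0)
variable1 = np.array([1, 2, 3, 4, 5], dtype=float)
variable2 = np.array([10, 20, 30, 40, 50], dtype=float)

# Hypothetical draws of (beta1, beta2); mean and covariance are made up.
mean = np.array([-1.0, 2.0])
cov = np.array([[0.04, 0.0], [0.0, 0.09]])
draws = rng.multivariate_normal(mean, cov, size=200)

# Evaluate simul = beta1 / Variable1 + beta2 / Variable2 for every draw.
# The result has one row per draw and one column per observation.
simulated = draws[:, [0]] / variable1 + draws[:, [1]] / variable2

# Empirical 5% and 95% quantiles across the draws, for each observation.
left = np.quantile(simulated, 0.05, axis=0)
right = np.quantile(simulated, 0.95, axis=0)
print(left)
print(right)
```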