Results

Estimation results

biogeme.results module

Implementation of the classes containing and processing the estimation results.

author

Michel Bierlaire

date

Tue Mar 26 16:50:01 2019

class biogeme.results.GeneralStatistic(value, format)

Bases: tuple

format

Alias for field number 1

value

Alias for field number 0

class biogeme.results.beta(name, value, bounds)[source]

Bases: object

Class gathering the information related to the parameters of the model

__init__(name, value, bounds)[source]

Constructor

Parameters
  • name (string) – name of the parameter.

  • value (float) – value of the parameter.

  • bounds (float,float) – tuple (l,u) with lower and upper bounds

bootstrap_pValue

p-value calculated from bootstrap

bootstrap_stdErr

Std error calculated from bootstrap

bootstrap_tTest

t-test calculated from bootstrap

isBoundActive(threshold=1e-06)[source]

Check if one of the two bounds is ‘numerically’ active. Being numerically active means that the distance between the value of the parameter and one of its bounds is below the threshold.

Parameters

threshold (float) – distance below which the bound is considered to be active. Default: \(10^{-6}\)

Returns

True if one of the two bounds is numerically active.

Return type

bool

Raises

biogemeError – if threshold is negative.
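
The check described above can be sketched in a few lines. This is an illustrative stand-alone function, not the library's code: the name is_bound_active is hypothetical, ValueError stands in for biogemeError, a bound of None is assumed to mean "no bound", and the comparison is assumed to be inclusive.

```python
def is_bound_active(value, lower, upper, threshold=1e-6):
    """Return True if value lies within threshold of one of its bounds."""
    if threshold < 0:
        # Stand-in for biogemeError (assumption of this sketch).
        raise ValueError(f"Threshold ({threshold}) must be non-negative")
    # A bound equal to None is treated as absent (assumption).
    if lower is not None and abs(value - lower) <= threshold:
        return True
    if upper is not None and abs(value - upper) <= threshold:
        return True
    return False


# A parameter sitting 1e-7 away from its upper bound is numerically active.
near_upper = is_bound_active(0.9999999, 0.0, 1.0)
# A parameter in the interior of its interval is not.
interior = is_bound_active(0.5, 0.0, 1.0)
```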

lb

Lower bound

name

Name of the parameter

pValue

p-value

robust_pValue

Robust p-value

robust_stdErr

Robust standard error

robust_tTest

Robust t-test

setBootstrapStdErr(se)[source]

Records the standard error calculated by bootstrap, and calculates and records the corresponding t-statistic and p-value

Parameters

se (float) – standard error calculated by bootstrap.

setRobustStdErr(se)[source]

Records the robust standard error, and calculates and records the corresponding t-statistic and p-value

Parameters

se (float) – robust standard error

setStdErr(se)[source]

Records the standard error, and calculates and records the corresponding t-statistic and p-value

Parameters

se (float) – standard error.
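
The three setters follow the same pattern: store the standard error, divide the estimate by it to get the t-statistic, and convert that to a two-sided p-value. A minimal sketch of the pattern, assuming a simplified class (BetaSketch and its attribute names are hypothetical) and an assumed guard for a zero standard error; the p-value uses erfc, which is algebraically the same as the formula documented for calcPValue:

```python
import math
import sys


class BetaSketch:
    """Simplified stand-in for a parameter object (hypothetical)."""

    def __init__(self, name, value):
        self.name = name
        self.value = value
        self.std_err = None
        self.t_test = None
        self.p_value = None

    def set_std_err(self, se):
        """Record se, then derive the t-statistic and the p-value."""
        self.std_err = se
        # Guard against division by zero (assumed behaviour, not confirmed).
        self.t_test = self.value / se if se != 0 else sys.float_info.max
        # erfc(|t| / sqrt(2)) equals 2 * (1 - Phi(|t|)).
        self.p_value = math.erfc(abs(self.t_test) / math.sqrt(2.0))


b = BetaSketch("B_TIME", 1.0)
b.set_std_err(0.5)  # t-statistic is 1.0 / 0.5 = 2.0
```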

stdErr

Standard error

tTest

t-test

ub

Upper bound

value

Current value

class biogeme.results.bioResults(theRawResults=None, pickleFile=None)[source]

Bases: object

Class managing the estimation results

__init__(theRawResults=None, pickleFile=None)[source]

Constructor

Parameters
  • theRawResults (biogeme.results.rawResults) – object with the results of the estimation. Default: None.

  • pickleFile (string) – name of the file containing the raw results in pickle format. It can be a URL. Default: None.

Raises

biogeme.exceptions.biogemeError – if no data is provided.

data

Object of type biogeme.results.rawResults containing the raw estimation results.

getBetaValues(myBetas=None)[source]

Retrieve the values of the estimated parameters, by names.

Parameters

myBetas (list(string)) – names of the requested parameters. If None, all available parameters will be reported. Default: None.

Returns

dict containing the values, where the keys are the names.

Return type

dict(string:float)

Raises

biogeme.exceptions.biogemeError – if some requested parameters are not available.

getBetasForSensitivityAnalysis(myBetas, size=100, useBootstrap=True)[source]

Generate draws from the distribution of the estimates, for sensitivity analysis.

Parameters
  • myBetas (list(string)) – names of the parameters for which draws are requested.

  • size (int) – number of draws. If useBootstrap is True, the value is ignored and a warning is issued. Default: 100.

  • useBootstrap (bool) – if True, the bootstrap estimates are directly used. The advantage is that it does not rely on the assumption that the estimates follow a normal distribution. Default: True.

Raises

biogeme.exceptions.biogemeError – if useBootstrap is True and the bootstrap results are not available

Returns

list of dict. Each dict has as many entries as parameters. The list has as many entries as draws.

Return type

list(dict)
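
To make the shape of the return value concrete, the sketch below turns a B x K table of bootstrap estimates into a list of B dicts restricted to the requested parameters. It only illustrates the data layout for the useBootstrap=True case; the function name, the use of plain lists instead of a numpy array, and the KeyError are assumptions of this sketch.

```python
def bootstrap_rows_to_draws(names, bootstrap_rows, requested):
    """Convert B rows of K estimates into B dicts keyed by parameter name."""
    missing = [n for n in requested if n not in names]
    if missing:
        # Stand-in for the library's own exception (assumption).
        raise KeyError(f"Unknown parameters: {missing}")
    column = {name: k for k, name in enumerate(names)}
    return [{n: row[column[n]] for n in requested} for row in bootstrap_rows]


# Three bootstrap iterations (B = 3) for two parameters (K = 2).
names = ["ASC_CAR", "B_TIME"]
rows = [[0.10, -1.20], [0.12, -1.15], [0.08, -1.31]]
draws = bootstrap_rows_to_draws(names, rows, ["B_TIME"])
```

Each element of draws can then be passed to whatever simulation routine evaluates the model for one set of parameter values.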

getBootstrapVarCovar()[source]

Obtain the bootstrap variance covariance matrix as a Pandas data frame.

Returns

bootstrap variance covariance matrix, or None if not available

Return type

pandas.DataFrame

getCorrelationResults(subset=None)[source]

Get the statistics about pairs of coefficients as a Pandas dataframe

Parameters

subset (list(str)) – produce the results only for a subset of parameters. If None, all the parameters are involved. Default: None

Returns

Pandas data frame with the correlation results

Return type

pandas.DataFrame

getEstimatedParameters(onlyRobust=True)[source]

Gather the estimated parameters and the corresponding statistics in a Pandas dataframe.

Parameters

onlyRobust (bool) – if True, only the robust statistics are included

Returns

Pandas dataframe with the results

Return type

pandas.DataFrame

getF12(robustStdErr=True)[source]

F12 is a format used by the software ALOGIT to report estimation results.

Parameters

robustStdErr (bool) – if True, the robust standard errors are reported. If False, the Rao-Cramer standard errors are reported.

Returns

results in F12 format

Return type

string

getGeneralStatistics()[source]

Format the results in a dict

Returns

dict with the results. The keys describe each content. Each element is a GeneralStatistic tuple, with the value and its preferred formatting.

Example:

'Init log likelihood': (-115.30029248549191, '.7g')

Return type

dict(string:float,string)
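
Since each entry pairs a value with a format specification, rendering the dict only requires applying one to the other. A small sketch using the entry shown in the example; the namedtuple is rebuilt locally so the block runs without biogeme installed, and the render helper is hypothetical:

```python
from collections import namedtuple

# Local re-creation of the (value, format) tuple documented above.
GeneralStatistic = namedtuple("GeneralStatistic", ["value", "format"])

stats = {
    "Init log likelihood": GeneralStatistic(-115.30029248549191, ".7g"),
}


def render(statistics):
    """Format each statistic with its own preferred format specification."""
    return [f"{key}: {s.value:{s.format}}" for key, s in statistics.items()]


lines = render(stats)
print("\n".join(lines))  # Init log likelihood: -115.3003
```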

getHtml()[source]

Get the results coded in HTML

Returns

HTML code

Return type

string

getLaTeX(onlyRobust=True)[source]

Get the results coded in LaTeX

Parameters

onlyRobust (bool) – if True, only the robust statistics are included

Returns

LaTeX code

Return type

string

getRobustVarCovar()[source]

Obtain the robust variance covariance matrix as a Pandas data frame.

Returns

robust variance covariance matrix

Return type

pandas.DataFrame

getVarCovar()[source]

Obtain the Rao-Cramer variance covariance matrix as a Pandas data frame.

Returns

Rao-Cramer variance covariance matrix

Return type

pandas.DataFrame

likelihood_ratio_test(other_model, significance_level=0.05)[source]

This function performs a likelihood ratio test between a restricted and an unrestricted model. The “self” model can be either the restricted or the unrestricted.

Parameters
  • other_model (biogeme.results.bioResults) – other model to perform the test.

  • significance_level (float) – level of significance of the test. Default: 0.05

Returns

a tuple containing:

  • a message with the outcome of the test

  • the statistic, that is, minus two times the difference between the loglikelihoods of the two models

  • the threshold of the chi square distribution.

Return type

LRTuple(str, float, float)
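
The computation behind the test can be sketched without the library. The version below is a hedged illustration, not biogeme's implementation: the function names and message wording are hypothetical, the model with more parameters is assumed to be the unrestricted one, the degrees of freedom are taken as the difference in the number of parameters, and the chi-square threshold is only computed for one or two degrees of freedom, where a closed form exists in the standard library.

```python
import math
from statistics import NormalDist


def chi2_threshold(df, significance_level):
    """Chi-square quantile at 1 - significance_level, for df in {1, 2} only."""
    if df == 1:
        # A chi-square(1) variable is the square of a standard normal.
        z = NormalDist().inv_cdf(1.0 - significance_level / 2.0)
        return z * z
    if df == 2:
        # The chi-square(2) CDF is 1 - exp(-x / 2).
        return -2.0 * math.log(significance_level)
    raise NotImplementedError("This sketch handles 1 or 2 degrees of freedom")


def likelihood_ratio_test(ll_a, k_a, ll_b, k_b, significance_level=0.05):
    """Return (message, statistic, threshold) for two nested models."""
    if k_a == k_b:
        raise ValueError("The models must differ in their number of parameters")
    # The model with fewer parameters is taken as the restricted one.
    (ll_r, k_r), (ll_u, k_u) = sorted([(ll_a, k_a), (ll_b, k_b)], key=lambda m: m[1])
    statistic = -2.0 * (ll_r - ll_u)
    threshold = chi2_threshold(k_u - k_r, significance_level)
    verdict = "rejected" if statistic > threshold else "not rejected"
    message = f"H0 {verdict} at level {100 * significance_level:.1f}%"
    return message, statistic, threshold


message, statistic, threshold = likelihood_ratio_test(-100.0, 3, -95.0, 4)
```

Passing the two models in the opposite order gives the same result, which mirrors the statement that the "self" model can be either the restricted or the unrestricted one.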

numberOfFreeParameters()[source]

This is the number of estimated parameters, minus those that are at their bounds

printGeneralStatistics()[source]

Print the general statistics of the estimation.

Returns

general statistics

Example:

Number of estimated parameters: 2
Sample size:    5
Excluded observations:  0
Init log likelihood:    -67.08858
Final log likelihood:   -67.06549
Likelihood ratio test for the init. model:      0.04618175
Rho-square for the init. model: 0.000344
Rho-square-bar for the init. model:     -0.0295
Akaike Information Criterion:   138.131
Bayesian Information Criterion: 137.3499
Final gradient norm:    3.9005E-07
Bootstrapping time:     0:00:00.042713
Nbr of threads: 16

Return type

str
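
The derived statistics in the example output are consistent with the usual textbook definitions. The sketch below recomputes them from the four primary quantities printed above (initial and final log likelihood, number of parameters, sample size). The formulas are the standard ones and reproduce the printed values up to rounding; whether biogeme evaluates them in exactly this way is an assumption, and the function name is hypothetical.

```python
import math


def fit_statistics(init_loglike, final_loglike, n_params, sample_size):
    """Recompute the derived goodness-of-fit statistics."""
    return {
        "Likelihood ratio test for the init. model":
            -2.0 * (init_loglike - final_loglike),
        "Rho-square for the init. model":
            1.0 - final_loglike / init_loglike,
        "Rho-square-bar for the init. model":
            1.0 - (final_loglike - n_params) / init_loglike,
        "Akaike Information Criterion":
            2.0 * n_params - 2.0 * final_loglike,
        "Bayesian Information Criterion":
            n_params * math.log(sample_size) - 2.0 * final_loglike,
    }


# Primary quantities copied from the example output above.
stats = fit_statistics(-67.08858, -67.06549, n_params=2, sample_size=5)
for name, value in stats.items():
    print(f"{name}: {value:.4g}")
```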

shortSummary()[source]

Provides a short summary of the estimation results

writeF12(robustStdErr=True)[source]

Write the results in an F12 file.

writeHtml()[source]

Write the results in an HTML file.

writeLaTeX()[source]

Write the results in a LaTeX file.

writePickle()[source]

Dump the data in a file in pickle format.

Returns

name of the file.

Return type

string

biogeme.results.calcPValue(t)[source]

Calculates the p value of a parameter from its t-statistic.

The formula is

\[2(1-\Phi(|t|))\]

where \(\Phi(\cdot)\) is the CDF of a normal distribution.

Parameters

t (float) – t-statistic

Returns

p-value

Return type

float
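
The formula can be evaluated with the standard library alone, writing the standard normal CDF in terms of the error function. This is a sketch of the formula, not the library's source; the names normal_cdf and calc_p_value are chosen for illustration.

```python
import math


def normal_cdf(x):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def calc_p_value(t):
    """Two-sided p-value of a t-statistic under the normal approximation."""
    return 2.0 * (1.0 - normal_cdf(abs(t)))


p_zero = calc_p_value(0.0)   # a t-statistic of 0 gives a p-value of 1
p_196 = calc_p_value(1.96)   # close to the conventional 5% level
```

Because the formula uses |t|, the sign of the estimate does not affect the p-value.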

biogeme.results.compileEstimationResults(dict_of_results, statistics=('Number of estimated parameters', 'Sample size', 'Final log likelihood', 'Akaike Information Criterion', 'Bayesian Information Criterion'), include_parameter_estimates=True)[source]

Compile estimation results into a common table

Parameters
  • dict_of_results (dict(str:bioResults) or dict(str:str)) – dictionary where the keys are the names of the models, and the values are either the estimation results, or the name of the pickle file where to find them.

  • statistics (tuple(str)) – list of statistics to include in the summary table

  • include_parameter_estimates (bool) – if True, the parameter estimates are included.

Returns

pandas dataframe with the requested results.

Return type

pandas.DataFrame

class biogeme.results.rawResults(theModel, betaValues, fgHb, bootstrap=None)[source]

Bases: object

Class containing the raw results from the estimation

F12FileName

Name of the F12 output file

H

Value of the hessian of the loglik. function

__init__(theModel, betaValues, fgHb, bootstrap=None)[source]

Constructor

Parameters
  • theModel (biogeme.BIOGEME) – object with the model

  • betaValues (list(float)) – list containing the estimated values of the parameters

  • fgHb (float,numpy.array, numpy.array, numpy.array) –

    tuple f,g,H,bhhh containing

    • f: the value of the function,

    • g: the gradient,

    • H: the second derivative matrix,

    • bhhh: the BHHH matrix.

  • bootstrap (numpy.array) –

    output of the bootstrapping. numpy array, of size B x K, where

    • B is the number of bootstrap iterations

    • K is the number of parameters to estimate

    Default: None.

betaNames

Names of the parameters

betaValues

Values of the parameters

betas

List of objects of type results.beta

bhhh

Value of the BHHH matrix of the loglikelihood function

bootstrap

output of the bootstrapping. numpy array, of size B x K, where

  • B is the number of bootstrap iterations

  • K is the number of parameters to estimate

bootstrap_time

Time needed to perform the bootstrap

dataname

Name of the database

drawsProcessingTime

Time needed to process the draws

excludedData

Number of excluded data

g

Value of the gradient of the loglik. function

gradientNorm

Norm of the gradient

htmlFileName

Name of the HTML output file

initLogLike

Value of the loglikelihood function at the initial values of the parameters

latexFileName

Name of the LaTeX output file

logLike

Value of the loglikelihood function

modelName

Name of the model

monteCarlo

True if the model involved Monte Carlo integration

nparam

Number of parameters

nullLogLike

Value of the loglikelihood function for the equal-probability model

numberOfDraws

Number of draws for Monte Carlo integration

numberOfObservations

Number of observations

numberOfThreads

Number of threads used for parallel computing

optimizationMessages

Diagnostics given by the optimization algorithm

pickleFileName

Name of the pickle output file

sampleSize

Sample size (number of individuals if panel data)

secondOrderTable

Second order statistics

typesOfDraws

Types of draws for Monte Carlo integration

userNotes

User notes