Biogeme is a open source Python package designed for the maximum likelihood estimation of parametric models in general, with a special emphasis on discrete choice models. It relies on the package Python Data Analysis Library called Pandas.
Biogeme used to be a stand alone software package, written in C++. All the material related to the previous versions of Biogeme are available on the old webpage.
BIOGEME is distributed free of charge. We ask each user
- to register to Biogeme's users group, and
- to mention explicitly the use of the package when publishing results, using the following reference:
- New optimization algorithms are
available for estimation See the documentation of
estimatefunction, and the
optimizationmodule. See also an example.
- Stochastic log likelihood
- It is now possible to calculate the log likelihood function on a sample (a batch) of the full data file. This is particularly useful with large databases. It can be used in the implementation of a stochastic gradient algorithm, for instance. See documentation.
- User's notes
- It is possible to include your own notes in the HTML
file using the
userNotesparameter of the
biogemeobject. See documentation. See example.
- It is possible to have Biogeme suggesting the scales
of the variables in the database using
suggestScalesparameter of the
biogemeobject. See documentation.
- A new function
quickEstimateperforms the estimation of the parameters, and skips the calculation of the statistics. See documentation.
- A new function in the
databasemodule allows to split the database in order to prepare an estimation and a validation sets, for out-of-sample validation. See documentation. It is used by the new function
biogememodule. See documentation. See example.
- A new function allows to extract all the messages
generated during a
run. See documentation. See
is also possible to make the logger temporarily silent
using the functions
Several versions of Biogeme have been developed over the years. Several names of animals appear: Gnu, Bison, Python, and now, Pandas.
- Version -1: HieLoW
Around 1990, Michel Bierlaire wrote a software package called HieLoW: Hierarchical Logit for Windows. It was written in Borland C++, and was the first discrete choice estimation software with a graphical user interface. It was designed for the estimation of logit and nested logit models. The user had to specify the models through a graphical user interface. This software was distributed by Stratec SA, Brussels.
- Version 1: BisonBiogeme
Around 2000, the first version of Biogeme was released. Written in GNU C++, it was the first open source discrete choice software. It was designed to estimate the parameters of a list of predetermined discrete choice models such as logit, binary probit, nested logit, cross-nested logit, multivariate extreme value models, discrete and continuous mixtures of multivariate extreme value models, models with nonlinear utility functions, models designed for panel data, and heteroscedastic models. The modeling language was designed to be simple, and was developed using a a general-purpose parser generator called GNU Bison. Later, it will be referred to as BisonBiogeme. The distributions can be found here.
- Version 2: PythonBiogeme
Around 2010, a more flexible version was designed for general purpose parametric models. The modeling language was extended, and based on the Python language. A series of discrete choice models were precoded for an easy use. Also written in GNU C++, the distributions can be found here.
- Version 3: PandasBiogeme
In 2018, a completely new version of the software was released. It was not anymore a standalone executable, but a Python package. The package is written in Python, with the exception of the core calculations of the models, that are written in C++ for the sake of efficiency. The motivation was to combine the simplicity of the usage (especially for teaching purposes), with the sophistication provided by Python (for research and applications purposes). Morever, the management of the data relies on the package Pandas, which has become the workhorse of data scientists. Therefore, the name PandasBiogeme has been adopted. It is distributed on the Python Package Index repository.
I would like to thank the following persons who played various roles in the development of Biogeme along the years. The list is certainly not complete, and I apologize for those who are omitted: Alexandre Alahi, Nicolas Antille, Gianluca Antonini, Kay Axhausen, John Bates, Denis Bolduc, David Bunch, Andrew Daly, Anna Fernandez Antolin, Mamy Fetiarison, Mogens Fosgerau, Emma Frejinger, Carmine Gioia, Marie-Hélène Godbout, Stephane Hess, Tim Hillel, Richard Hurni, Eva Kazagli, Jasper Knockaert, Xinjun Lai, Gael Lederrey, Virginie Lurkin, Nicholas Molyneaux, Carolina Osorio, Meritxell Pacheco Paneque, Thomas Robin, Pascal Scheiben, Matteo Sorci, Michael Thémans, Joan Walker.
I would like to express a special thank to Moshe Ben-Akiva and Daniel McFadden for their friendship, and for the immense influence that they had and still have on my work.