Xie, S., Hillel, T., and Jin, Y. (2021)

An Early Stopping Bayesian Data Assimilation Approach for Mixed-Logit Estimation

The mixed-logit model is a flexible tool in transportation choice analysis, which provides valuable insights into inter and intra-individual behavioural heterogeneity. However, applications of mixed-logit models are limited by the high computational and data requirements for model estimation. When estimating on small samples, the Bayesian estimation approach becomes vulnerable to over and under-fitting. This is problematic for investigating the behaviour of specific population sub-groups or market segments with low data availability. Similar challenges arise when transferring an existing model to a new location or time period, e.g., when estimating post-pandemic travel behaviour. We propose an Early Stopping Bayesian Data Assimilation (ESBDA) simulator for estimation of mixed-logit which combines a Bayesian statistical approach with Machine Learning methodologies. The aim is to improve the transferability of mixed-logit models and to enable the estimation of robust choice models with low data availability. This approach can provide new insights into choice behaviour where the traditional estimation of mixed-logit models was not possible due to low data availability, and open up new opportunities for investment and planning decisions support. The ESBDA estimator is benchmarked against the Direct Application approach, a basic Bayesian simulator with random starting parameter values and a Bayesian Data Assimilation (BDA) simulator without early stopping. The ESBDA approach is found to effectively overcome under and over-fitting and non-convergence issues in simulation. Its resulting models clearly outperform those of the reference simulators in predictive accuracy. Furthermore, models estimated with ESBDA tend to be more robust, with significant parameters with signs and values consistent with behavioural theory, even when estimated on small samples.

Download PDF