Ortelli, N., de Lapparent, M., and Bierlaire, M. (2023)
Faster estimation of discrete choice models via weighted dataset reduction
11th Symposium of the European Association for Research in Transportation, Zurich, Switzerland
When estimating discrete choice models, the prospect of using ever-larger datasets is limited by the poor scalability of maximum likelihood estimation. This paper proposes a simple and fast dataset reduction method designed to preserve the richness of the observations originally present in a dataset while reducing its size. Our approach leverages locality-sensitive hashing to create clusters of similar observations, from which representative observations are then sampled and weighted. We demonstrate the efficacy of our approach by applying it to a real-world mode choice dataset; the results confirm that a carefully selected and weighted subsample of observations can provide close-to-identical estimation results while being, by definition, less computationally demanding.
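The pipeline described in the abstract (hash, cluster, sample, weight) can be illustrated with a minimal sketch. The abstract does not specify the hash family, the number of representatives drawn per cluster, or the weighting rule, so everything below is an assumption made for illustration: signed random projections serve as the locality-sensitive hash, one representative is drawn uniformly from each bucket, and its weight is the bucket size. The names `lsh_reduce` and `weighted_log_likelihood` are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def lsh_reduce(rows, n_planes=4, seed=0):
    """Bucket rows with signed random projections, then sample and weight.

    Assumption: this hash family and the one-representative-per-bucket rule
    are illustrative choices, not the paper's confirmed method.
    Returns (reps, weights): indices of sampled rows and their bucket sizes.
    """
    rng = random.Random(seed)
    dim = len(rows[0])
    # Each random hyperplane through the origin contributes one hash bit.
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_planes)]

    buckets = defaultdict(list)
    for idx, row in enumerate(rows):
        # The signature records on which side of each hyperplane the row lies.
        signature = tuple(
            sum(p * x for p, x in zip(plane, row)) >= 0.0 for plane in planes
        )
        buckets[signature].append(idx)

    reps, weights = [], []
    for members in buckets.values():
        reps.append(rng.choice(members))  # one sampled representative per bucket
        weights.append(len(members))      # it stands in for the whole bucket
    return reps, weights


def weighted_log_likelihood(log_probs, reps, weights):
    """Weighted log-likelihood over the reduced sample (hypothetical helper).

    log_probs[i] is the log-probability of the chosen alternative for row i.
    """
    return sum(w * log_probs[i] for i, w in zip(reps, weights))


# Synthetic data, for demonstration only: 1000 observations, 3 attributes.
data_rng = random.Random(1)
data = [[data_rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(1000)]
reps, weights = lsh_reduce(data)
```

With `n_planes=4` there are at most 16 buckets, so the reduced sample holds at most 16 weighted observations, and the weights sum to the original sample size. More hyperplanes give finer clusters and a larger reduced sample, which is the trade-off between estimation speed and fidelity that the abstract alludes to.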