Lederrey, G., Hillel, T., and Bierlaire, M.

DATGAN: Integrating expert knowledge into deep learning for population synthesis

Speaker: Lederrey Gael

STRC 2021

September 14, 2021

Agent-based simulations and activity-based models used to analyse nationwide transport networks require detailed synthetic populations. As these applications grow more complex, they require more precise synthetic data. However, standard statistical techniques such as Iterative Proportional Fitting (IPF) and Gibbs sampling fail to provide data of sufficient quality; for example, they fail to generate rare combinations of attributes, known in the literature as sampling zeros. Researchers have therefore been investigating deep learning techniques such as Generative Adversarial Networks (GANs) for population synthesis. These methods have already shown great success in other fields. However, one fundamental limitation is that GANs are purely data-driven, so expert knowledge cannot be integrated into the data generation process. This can lead to a lack of representativity in the generated data, the introduction of bias, and overfitting to noise in the sample. To address these limitations, we present the Directed Acyclic Tabular GAN (DATGAN), which integrates expert knowledge into deep learning models for synthetic populations. This approach allows the interactions between variables to be specified explicitly using a Directed Acyclic Graph (DAG). The DAG is then converted into a network of modified Long Short-Term Memory (LSTM) cells. Two types of multi-input LSTM cells have been developed to support this structure in the generator. The DATGAN is tested on the Chicago travel survey dataset. We show that our model outperforms state-of-the-art methods on machine learning efficacy and statistical metrics.
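To illustrate what specifying variable interactions with a DAG could look like, here is a minimal Python sketch. It is not the DATGAN implementation: the variable names, the `parents` dictionary, and both helper functions are hypothetical and chosen only for illustration. The sketch derives an order in which variables could be generated (every parent before its children) and lists the variables with more than one parent, which are the ones that would plausibly call for a multi-input cell.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical expert-specified DAG: each variable maps to the variables
# it depends on (its parents). These names are illustrative assumptions,
# not the columns of the Chicago travel survey.
parents = {
    "age": [],
    "gender": [],
    "license": ["age"],
    "income": ["age", "gender"],
    "mode_choice": ["license", "income"],
}


def generation_order(parents):
    """Return the variables so that every parent precedes its children.

    Raises graphlib.CycleError if the graph is not acyclic.
    """
    return list(TopologicalSorter(parents).static_order())


def multi_input_nodes(parents):
    """Return the variables that have more than one parent, sorted by name."""
    return sorted(v for v, ps in parents.items() if len(ps) > 1)


order = generation_order(parents)
multi = multi_input_nodes(parents)
```

Representing the graph as a child-to-parents mapping keeps the acyclicity check implicit: an expert-supplied structure containing a cycle is rejected when the order is computed, before any model would be built from it.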

