TY - JOUR
T1 - Scalable Population Synthesis with Deep Generative Modeling
AU - Borysov, Stanislav S.
AU - Rich, Jeppe
AU - Pereira, Francisco Camara
PY - 2019
Y1 - 2019
N2 - Population synthesis is concerned with the generation of synthetic yet realistic representations of populations. It is a fundamental problem in the modeling of transport where the synthetic populations of micro agents represent a key input to most agent-based models. In this paper, a new methodological framework for how to grow pools of micro agents is presented. This is accomplished by adopting a deep generative modeling approach from machine learning based on a Variational Autoencoder (VAE) framework. Compared to the previous population synthesis approaches based on Iterative Proportional Fitting (IPF), Markov Chain Monte Carlo (MCMC) sampling or traditional generative models, the proposed method allows unparalleled scalability with respect to the number and types of attributes. In contrast to the approaches that rely on approximating the joint distribution in the observed data space, VAE learns its compressed latent representation. The advantage of the compressed representation is that it avoids the problem of the generated samples being trapped in local minima when the number of attributes becomes large. The problem is illustrated using the Danish National Travel Survey data, where the Gibbs sampler fails to generate a population with 21 attributes (corresponding to the 121-dimensional joint distribution). At the same time, VAE shows acceptable performance when 47 attributes (corresponding to the 357-dimensional joint distribution) are used. Moreover, VAE allows for growing agents that are virtually different from those in the original data but have similar statistical properties and correlation structure. The presented approach will help modelers to generate better and richer populations with a high level of detail, including smaller zones, personal details and travel preferences.
M3 - Journal article
JO - ArXiv
JF - ArXiv
IS - arXiv:1808.06910
ER -