Abstract
Smart card data from Automatic Fare Collection (AFC) systems and timetable information, such as Automatic Vehicle Location (AVL) data, are used in combination by practitioners and researchers to gain a deeper understanding of the public transit network. In some cases, AVL data are unavailable because records are missing from the system. In such cases, people resort to the scheduled timetable, such as the General Transit Feed Specification (GTFS), to match smart card data to the transit network. Since delays and changes to the timetable are not contained in the scheduled timetable, this can result in wrong matches between the smart card data and the transit network. This paper shows how the uncertainty of arrival and departure times affects passenger-to-train assignments and proposes a method for estimating the missing arrival times of trains when the recorded timetable information is not available. The method uses knowledge of how tap-outs are distributed in a hierarchical, latent Bayesian model to predict the arrival times of trains. Evaluated on 15,136 train arrivals, the model can infer 70% of the arrival times with an average error of 28 to 32 seconds, depending on the station.
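To illustrate why a scheduled timetable can mismatch passengers and trains, the following is a minimal sketch, not the paper's Bayesian model. It assumes a simple rule in which each tap-out is assigned to the latest train that arrived at least a minimum walking time earlier. The function name `assign_to_trains`, the `min_walk` parameter, and all times are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right


def assign_to_trains(tap_outs, arrivals, min_walk=0):
    """Assign each tap-out to the index of the latest train that arrived
    at least `min_walk` seconds before it; None if no such train exists.
    This rule is an assumption for illustration, not the paper's method."""
    arrivals = sorted(arrivals)
    assignment = []
    for t in tap_outs:
        # Number of trains that arrived no later than (t - min_walk), minus one
        i = bisect_right(arrivals, t - min_walk) - 1
        assignment.append(i if i >= 0 else None)
    return assignment


# Hypothetical times in seconds after an arbitrary reference point
scheduled = [0, 300, 600]        # timetable (GTFS-like) arrivals
actual = [0, 390, 600]           # second train runs 90 s late
tap_outs = [40, 350, 430, 650]   # smart card tap-out times

by_schedule = assign_to_trains(tap_outs, scheduled, min_walk=30)
by_actual = assign_to_trains(tap_outs, actual, min_walk=30)

# Tap-outs whose assigned train changes once the delay is taken into account
mismatches = [t for t, a, b in zip(tap_outs, by_schedule, by_actual) if a != b]
print(by_schedule, by_actual, mismatches)
```

In this toy case, the tap-out at 350 s is attributed to the second train under the schedule, although that train had not yet arrived; with the actual arrival times it is attributed to the first train.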
Original language | English |
---|---|
Journal | IEEE Open Journal of Intelligent Transportation Systems |
Volume | 2 |
Pages (from-to) | 160-172 |
ISSN | 2687-7813 |
Publication status | Published - 2021 |
Keywords
- AFC
- Automatic Fare Collection
- AVL
- Automatic Vehicle Location
- Bayes Statistics
- Machine Learning
- Missing Data
- Smart Card
- Train Logs