Application of data clustering to railway delay pattern recognition

Fabrizio Cerreto*, Bo Friis Nielsen, Otto Anker Nielsen, Steven Harrod

*Corresponding author for this work

Research output: Contribution to journal › Journal article › Research › peer-review

Abstract

K-means clustering is employed to identify recurrent delay patterns on a high-traffic railway line north of Copenhagen, Denmark. The clusters identify behavioral patterns in the very large (“big data”) data sets generated automatically and continuously by the railway signal system. The results show where recurrent delay patterns take place and therefore where corrective actions are necessary. Delay profiles and delay-change profiles are generated from timestamps to compare different train runs and to partition the set of observations into groups of similar elements. K-means clustering can identify and discriminate different patterns affecting the same stations, which is difficult with previous approaches based on visual inspection. Classical methods of univariate analysis do not reveal these patterns. The demonstrated methodology is scalable and can be applied to any system of transport.
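The profile construction and partitioning described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the number of clusters, the distance measure, or the exact profile definition, so the function names (`delay_profile`, `delay_change_profile`, `k_means`), the choice of two clusters, the squared Euclidean distance, and the timestamps below are all assumptions made for illustration.

```python
import random

def delay_profile(scheduled, actual):
    """Delay in seconds at each measurement point (actual minus scheduled)."""
    return [a - s for s, a in zip(scheduled, actual)]

def delay_change_profile(profile):
    """Change in delay between consecutive measurement points."""
    return [b - a for a, b in zip(profile, profile[1:])]

def squared_distance(p, q):
    return sum((x - y) ** 2 for x, y in zip(p, q))

def k_means(points, k, iterations=100, seed=0):
    """Plain Lloyd's algorithm on lists of floats; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each profile goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical timestamps (seconds) at four measurement points.
scheduled = [0, 300, 600, 900]
runs = [
    [0, 300, 600, 900],    # on time
    [10, 305, 610, 905],   # small, stable delay
    [0, 300, 780, 1080],   # delay picked up before the third point
    [5, 310, 790, 1085],   # same pattern, slightly different size
]

profiles = [delay_profile(scheduled, run) for run in runs]
change_profiles = [delay_change_profile(p) for p in profiles]
labels, centroids = k_means(change_profiles, k=2)
print(labels)
```

In this toy data the two runs that pick up delay before the third measurement point end up in one cluster and the two punctual runs in the other. Clustering the delay-change profiles rather than the raw delays groups runs by where delay is gained or lost, not by how late they are overall.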
Original language: English
Article number: 6164534
Journal: Journal of Advanced Transportation
Volume: 2018
Number of pages: 18
ISSN: 0197-6729
DOI: 10.1155/2018/6164534
Publication status: Published, 2018

Cite this

@article{fa76387110224edba740fc2ea9ffb600,
title = "Application of data clustering to railway delay pattern recognition",
abstract = "K-means clustering is employed to identify recurrent delay patterns on a high traffic railway line north of Copenhagen, Denmark. The clusters identify behavioral patterns in the very large (“big data”) data sets generated automatically and continuously by the railway signal system. The results reveal where corrective actions are necessary, showing where recurrent delay patterns take place. Delay profiles and delay-change profiles are generated from timestamps to compare different train runs, and to partition the set of observations into groups of similar elements. K-means clustering can identify and discriminate different patterns affecting the same stations, which is otherwise difficult in previous approaches based on visual inspection. Classical methods of univariate analysis do not reveal these patterns. The demonstrated methodology is scalable and can be applied to any system of transport.",
author = "Fabrizio Cerreto and Nielsen, {Bo Friis} and Nielsen, {Otto Anker} and Steven Harrod",
year = "2018",
doi = "10.1155/2018/6164534",
language = "English",
volume = "2018",
journal = "Journal of Advanced Transportation",
issn = "0197-6729",
publisher = "John Wiley {\&} Sons Ltd",

}

Application of data clustering to railway delay pattern recognition. / Cerreto, Fabrizio; Nielsen, Bo Friis; Nielsen, Otto Anker; Harrod, Steven.

In: Journal of Advanced Transportation, Vol. 2018, 6164534, 2018.

TY  - JOUR
T1  - Application of data clustering to railway delay pattern recognition
AU  - Cerreto, Fabrizio
AU  - Nielsen, Bo Friis
AU  - Nielsen, Otto Anker
AU  - Harrod, Steven
PY  - 2018
Y1  - 2018
N2  - K-means clustering is employed to identify recurrent delay patterns on a high traffic railway line north of Copenhagen, Denmark. The clusters identify behavioral patterns in the very large (“big data”) data sets generated automatically and continuously by the railway signal system. The results reveal where corrective actions are necessary, showing where recurrent delay patterns take place. Delay profiles and delay-change profiles are generated from timestamps to compare different train runs, and to partition the set of observations into groups of similar elements. K-means clustering can identify and discriminate different patterns affecting the same stations, which is otherwise difficult in previous approaches based on visual inspection. Classical methods of univariate analysis do not reveal these patterns. The demonstrated methodology is scalable and can be applied to any system of transport.
AB  - K-means clustering is employed to identify recurrent delay patterns on a high traffic railway line north of Copenhagen, Denmark. The clusters identify behavioral patterns in the very large (“big data”) data sets generated automatically and continuously by the railway signal system. The results reveal where corrective actions are necessary, showing where recurrent delay patterns take place. Delay profiles and delay-change profiles are generated from timestamps to compare different train runs, and to partition the set of observations into groups of similar elements. K-means clustering can identify and discriminate different patterns affecting the same stations, which is otherwise difficult in previous approaches based on visual inspection. Classical methods of univariate analysis do not reveal these patterns. The demonstrated methodology is scalable and can be applied to any system of transport.
U2  - 10.1155/2018/6164534
DO  - 10.1155/2018/6164534
M3  - Journal article
VL  - 2018
JO  - Journal of Advanced Transportation
JF  - Journal of Advanced Transportation
SN  - 0197-6729
M1  - 6164534
ER  -