Analytical, Big Data and Simulation Models of Railway Delays

Fabrizio Cerreto

    Research output: Book/Report - Ph.D. thesis



    Punctuality of railway networks depends on several factors, associated with either the planning phase or the operation phase. In the planning phase, robust timetables are designed to withstand the variability in operation and to contain the generation of primary delays. Railway planners also seek timetable stability, so that primary delays are absorbed, their propagation into secondary delays is reduced, and the system returns quickly to the unperturbed condition. In addition, primary delays that occur in the operation phase can be reduced by improving the industrial processes behind the railway services. Understanding how delays are generated and how they propagate is central to the efficient design of robust timetables and of corrective measures for service production processes. The purpose of this study is to examine the phenomena related to delays in railways, from both theoretical and empirical perspectives. The theoretical structure of delays is examined in analytical models. The effects of selected timetabling decisions are investigated in simulation models. Empirical studies on delay records from the realized operation are provided to identify recurrent patterns in delay generation and recovery.

    In the first section, the study evaluates commonly used indicators of timetable stability and robustness and compares their sensitivity to changes in traffic volume, traffic heterogeneity, and infrastructure layout. The comparison includes analytical measures based on the timetable structure and measures based on simulation of operation under known perturbations. On the one hand, ex-ante analytical measures typically focus on traffic heterogeneity and line exploitation, and often consider individual characteristics of the timetable in isolation. For instance, delay recovery is usually modeled through either running time supplements or headway buffers between trains. On the other hand, simulation of operation mimics the behavior of railway systems and provides more detailed insight. Simulation tools allow different types of measurements, such as the individual train delays recorded at different timing points, which can be evaluated with different methods. The accuracy of simulation comes, though, at the price of a higher demand for computational time and resources. In this section, aggregate delay as a function of primary delays is measured in a microsimulation environment, and it is described as a valid indicator of timetable reliability. However, the extensive calculation performed in microsimulation makes this method unsuitable for applications where the speed of calculation counts. For instance, online decision support tools need responses within a few seconds, and heuristic optimization algorithms often require repeated calculations, so the overall response time grows quickly. In this thesis, methods to reduce the amount of simulation are also investigated, based on the same robustness measures under evaluation.

    The first section of this thesis identifies a valid measure of timetable robustness in the aggregate line delay related to known incidents. One of the major obstacles to the application of this type of measure in real-time traffic management and optimization is its dependence on simulation, which is a time-consuming process. The following section presents alternative methods that combine analytical and simulation models to estimate the aggregate line delay as a function of primary delays with reduced resource requirements, paving the way to applications that require prompt responses.

    In the second section, an analytical model is presented that describes delay propagation as a closed-form function, allowing quick calculation of the reliability indicators identified in the previous section, including aggregate line delay. Analytical models are typically much faster than microsimulation and are therefore more suitable for optimization environments and online decision support tools. The mathematical model provides insight into the relationship between primary delays and the consequent total disturbance on railway lines. This relationship is described by a composite polynomial whose degree ranges from first to third, depending on the magnitude of the primary delay relative to the size of the study domain. Timetable design parameters can be adjusted in this model, and different settings can be quickly compared. The robustness given by different values of running time supplements, headway buffers, and punctuality thresholds can be assessed. The model is initially formulated for homogeneous traffic on railway lines. It is later integrated with stochastic simulation to support heterogeneous traffic and to include the delay generation process. This process consists of three parts. The first part, the incident simulation, mimics events that block the railway, such as a temporary track blockage or a signal failure, described by the distributions of their start time and duration. In the second part, the model generates primary delays by combining the incident with the timetable structure.
    Lastly, the primary delay is propagated to the subsequent trains and the downstream stations. In the stochastic simulation model for heterogeneous traffic, the total delay is estimated as the consequence of an incident that affects an individual train service, and a weighted average is then used to derive the total delay function associated with the whole timetable. In addition to the aggregate line delay, the model provides the individual delays of every train recorded at each station and can therefore be extended to implement several metrics.

    Both the analytical and the simulation models presented in the previous sections rely on simplifying assumptions. One of the most influential assumptions, and also one of the most frequent, is that trains always use all the slack available in the timetable to recover from delays, in the absence of further circulation conflicts. In reality, delay recovery is itself a stochastic process, governed by several factors, including driving behavior, rolling stock performance, and passenger comfort. Furthermore, the possible recovery depends on the allocation of timetable slack along the path. In the timetabling phase, railway planners typically allocate the slack according to general rules from practice. Investigating recurrent patterns in delay development and recovery in railway operation can improve this process, giving the opportunity to tailor the slack to the specific characteristics of individual train services. The whole railway operation can also be improved by identifying the factors that cause recurrent delays, so that individual critical processes can be fixed and specific delay mitigation measures can be designed.

    In the third section, this study finally analyses empirical records from railway operation to extract information for modeling and to identify systematic delays that require specific countermeasures. Distributions of realized running times are studied to understand the real maximum performance of trains and the minimum feasible running time on a line section. The actual use of running time supplements to recover from delays highlights locations where timetable slack is lacking or excessive. In this way, the real potential for delay recovery available in the timetable can be determined to support robustness analyses of the timetable. Big data techniques are subsequently applied to empirical records to identify recurrent delay patterns that can be associated with specific service characteristics, such as time factors and rolling stock performance. Timestamps from railway operation are arranged in delay profiles of individual service runs, which are then classified into clusters of services that develop their delay in similar ways. The method identifies locations where the delay changes recurrently in the same way, which may suggest changes in the schedules or in the processes linked to the railway operation. The classification uses the K-means clustering method, which finds application in very different fields and is generally appreciated for its simplicity and speed. The resulting classes of delay profiles are finally linked to the characteristics of individual trains, so that specific and focused corrective measures can be designed for the railway service production processes.
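    The clustering of delay profiles described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the naive initialization with the first k profiles, and the sample profiles (delay in seconds at four successive timing points) are all assumptions made for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length delay profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(profiles, k, max_iterations=100):
    """Plain Lloyd's algorithm on equal-length delay profiles.

    Assumption: centroids are initialized with the first k profiles,
    which keeps the sketch deterministic but is not a recommended
    initialization for real data.
    """
    centroids = [list(p) for p in profiles[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: attach each profile to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in profiles
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical delay profiles, one per service run.
profiles = [
    [0, 30, 60, 90],     # delay builds up along the line
    [5, 35, 70, 100],
    [0, 25, 55, 95],
    [120, 80, 40, 0],    # initial delay is recovered along the line
    [110, 75, 30, 5],
    [130, 90, 45, 10],
]
labels, centroids = kmeans(profiles, k=2)
```

    On this toy input the runs that accumulate delay and the runs that recover delay end up in two separate clusters. With real records, the profiles would first have to be aligned on a common set of timing points.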
    In summary, based on the knowledge developed in this study, it is possible to design robust timetables and to investigate the influence of selected parameters already in the planning phase. The study contributes to the literature with an analytical delay propagation model and with the application of data analysis to the realized operation, and it also covers methods for the appraisal of service reliability. The total delay generated on a railway line as a function of primary delays is identified as the indicator that is most sensitive to variations in traffic volume and infrastructure improvements. Methods to estimate this measure without using microsimulation are proposed, making analyses quicker and opening the possibility of including such statistics in online applications and optimization models. Additionally, the empirical analyses presented permit the identification of recurrent delay patterns in railway operation, supporting the design of dedicated corrective measures for production processes.
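    To make the idea of total line delay as a function of a primary delay concrete, the following is a small sketch under strong assumptions. It is not the closed-form model of the thesis. It assumes homogeneous traffic, a single primary delay on the first train at the first station, and full use of slack: a train recovers `supplement` seconds between consecutive stations, and a following train inherits the delay of its predecessor reduced by `buffer` seconds. All names and values are hypothetical.

```python
def propagate(primary_delay, n_trains, n_stations, supplement, buffer):
    """Return delay[i][j], the delay in seconds of train i at station j.

    Assumptions (illustrative only): identical trains, one primary delay
    on train 0 at station 0, and all available slack is always used.
    """
    delay = [[0.0] * n_stations for _ in range(n_trains)]
    delay[0][0] = float(primary_delay)
    for i in range(n_trains):
        for j in range(n_stations):
            if i == 0 and j == 0:
                continue  # the primary delay itself is given, not derived
            # Own delay from the previous station, reduced by the supplement.
            own = delay[i][j - 1] - supplement if j > 0 else 0.0
            # Knock-on delay from the preceding train, reduced by the buffer.
            knock_on = delay[i - 1][j] - buffer if i > 0 else 0.0
            delay[i][j] = max(own, knock_on, 0.0)
    return delay


def aggregate_delay(primary_delay, n_trains, n_stations, supplement, buffer):
    """Sum of all train delays recorded at all stations."""
    table = propagate(primary_delay, n_trains, n_stations, supplement, buffer)
    return sum(sum(row) for row in table)


# Hypothetical setting: 3 trains, 3 stations, 60 s supplement, 120 s buffer.
total = aggregate_delay(300, n_trains=3, n_stations=3, supplement=60, buffer=120)
```

    Evaluating `aggregate_delay` over a range of primary delays and slack settings gives a quick, simulation-free way to compare timetable variants within the limits of these assumptions.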
    Original language: English
    Number of pages: 180
    Publication status: Published - 2018

