Learning-based control of AMoD in competitive environments

Research output: Chapter in Book/Report/Conference proceeding › Conference abstract in proceedings › Research › peer-review


Abstract

Autonomous Mobility-on-Demand (AMoD), where customers request trips from their origin and are assigned an autonomous vehicle from a fleet to take them to their destination, has the potential to play a crucial role in future sustainable transport. AMoD offers passengers a personalized mobility service while eliminating the maintenance and parking costs associated with owning a private vehicle. Due to its high flexibility, AMoD is gaining enormous popularity around the world. However, a core challenge for the AMoD paradigm lies in the spatio-temporal nature of urban mobility, where trip origins and destinations are asymmetrically distributed (e.g., commuting downtown in the morning and vice-versa in the evening), making the overall system imbalanced and sensitive to disturbances. Operators can try to overcome this issue by manually rebalancing vehicles to anticipate future demand, or by developing dynamic pricing strategies that encourage or discourage trips between particular origin-destination pairs to promote a more desirable distribution of the vehicle supply. This, however, presents a challenging control problem. While the problems of vehicle rebalancing and dynamic pricing have traditionally been tackled through the lens of heuristics and optimization (Zardini et al., 2022), the most recent literature focuses on learning-based approaches, mainly due to their scalability and their ability to handle dynamic, stochastic environments; see Qin et al. (2022) for an extended survey. However, existing approaches consider a single-operator scenario. In modern liberal economies, this assumption is highly unrealistic. Therefore, in this work, we consider a multi-operator scenario, which we formulate as a multi-agent reinforcement learning (RL) problem, where each agent centrally controls the vehicles in its own fleet without having knowledge of the competitors' states and actions.
To the best of our knowledge, this is the first work to demonstrate that learning-based approaches are robust to the added stochasticity in the environment: the agents are able to rebalance their fleets and dynamically set prices while accounting for the interplay with their competitors, and we empirically show that the learned policies converge to an equilibrium. Furthermore, we leverage this multi-agent RL setup to empirically study the market dynamics and the achieved equilibrium (e.g., regarding fleet size and the introduction of new competitors).
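The multi-operator formulation described above can be illustrated with a toy sketch. All names, dynamics, and hyperparameters below are hypothetical simplifications for illustration, not the system studied in the paper: two operators run independent tabular Q-learners, each observing only its own fleet distribution over a handful of zones (no access to the competitor's state or actions), while stochastic trip requests are served by whichever operator has a vehicle at the origin zone.

```python
import random


class AMoDMarket:
    """Toy two-operator AMoD market (hypothetical, for illustration only).

    Each operator controls a fleet spread over a few zones. One stochastic
    trip request arrives per step; an operator can only serve it with a
    vehicle already in the origin zone, so rebalancing matters.
    """

    def __init__(self, zones=3, fleet_size=6, seed=0):
        self.rng = random.Random(seed)
        self.zones = zones
        # fleets[i][z] = number of vehicles operator i has in zone z
        self.fleets = [[fleet_size // zones] * zones for _ in range(2)]

    def observe(self, i):
        # Partial observability: operator i sees only its own fleet.
        return tuple(self.fleets[i])

    def step(self, actions):
        # actions[i] = (src, dst): operator i moves one vehicle src -> dst.
        rewards = [0, 0]
        for i, (src, dst) in enumerate(actions):
            if self.fleets[i][src] > 0:
                self.fleets[i][src] -= 1
                self.fleets[i][dst] += 1
        # One stochastic request; an operator with a vehicle at the origin
        # (ties broken at random) serves it and earns reward 1.
        origin = self.rng.randrange(self.zones)
        dest = self.rng.randrange(self.zones)
        candidates = [i for i in range(2) if self.fleets[i][origin] > 0]
        if candidates:
            winner = self.rng.choice(candidates)
            rewards[winner] = 1
            self.fleets[winner][origin] -= 1
            self.fleets[winner][dest] += 1
        return rewards


def independent_q_learning(episodes=200, steps=30, eps=0.2, alpha=0.1, gamma=0.9):
    """Each operator learns its own Q-table from its own observations only."""
    zones = 3
    actions = [(s, d) for s in range(zones) for d in range(zones)]
    Q = [{}, {}]  # one independent Q-table per operator
    for ep in range(episodes):
        env = AMoDMarket(zones=zones, seed=ep)
        for _ in range(steps):
            obs = [env.observe(i) for i in range(2)]
            chosen = []
            for i in range(2):
                qs = Q[i].setdefault(obs[i], [0.0] * len(actions))
                if random.random() < eps:  # epsilon-greedy exploration
                    chosen.append(random.randrange(len(actions)))
                else:
                    chosen.append(max(range(len(actions)), key=qs.__getitem__))
            rewards = env.step([actions[a] for a in chosen])
            for i in range(2):
                nxt = Q[i].setdefault(env.observe(i), [0.0] * len(actions))
                qs = Q[i][obs[i]]
                qs[chosen[i]] += alpha * (rewards[i] + gamma * max(nxt) - qs[chosen[i]])
    return Q
```

From each agent's perspective, the competitor is folded into the environment's stochasticity, which is why robustness of the learned policies to this added stochasticity is the central question; the actual paper additionally learns pricing actions, which this sketch omits.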
Original language: English
Title of host publication: Proceedings of the 12th Triennial Symposium on Transportation Analysis conference
Publication date: 2025
Publication status: Published - 2025
Event: 12th Triennial Symposium on Transportation Analysis conference - Okinawa, Japan
Duration: 22 Jun 2025 - 27 Jun 2025
Conference number: 12
https://tristan2025.org/

Conference

Conference: 12th Triennial Symposium on Transportation Analysis conference
Number: 12
Country/Territory: Japan
City: Okinawa
Period: 22/06/2025 - 27/06/2025
Internet address: https://tristan2025.org/

