GESA: A GEneral Scenario-Agnostic Reinforcement Learning for Traffic Signal Control

Haoyuan Jiang, Ziyue Li*, Zhishuai Li, Lei Bai, Hangyu Mao, Wolfgang Ketter, Rui Zhao

*Corresponding author for this work

Research output: Contribution to journal, Conference article, Popular

Abstract

Reinforcement learning (RL) can automatically learn a better policy through a trial-and-error paradigm and has been adopted to revolutionize and optimize traditional traffic signal control systems, which are usually based on handcrafted methods. However, most existing RL-based models are based on either a single scenario or multiple independent scenarios, where each scenario has a separate simulation environment with a predefined road network topology and traffic signal settings. These models are trained and tested in the same scenario, so they are strictly tied to the specific setting and sacrifice model generalization heavily. While a few recent models can be trained on multiple scenarios, they require a huge amount of manual labor to label the intersection structure, which hinders the model's generalization. In this work, we aim for a general framework that eliminates heavy labeling and models a variety of scenarios simultaneously. To this end, we propose a GEneral Scenario-Agnostic (GESA) reinforcement learning framework for traffic signal control with: (1) a general plug-in module that maps all different intersections into a unified structure, freeing us from the heavy manual labor of specifying the structure of intersections; (2) a unified state and action space design that keeps the model input and output consistently structured; (3) large-scale co-training with multiple scenarios, leading to a generic traffic signal control algorithm. GESA can automatically handle intersections of various structures from various cities without human labeling, and it co-trains a generalist agent to control traffic signals for multiple cities together, which also demonstrates superior transferability in zero-shot settings. In experiments, we demonstrate that our algorithm is the first that can be co-trained with seven different scenarios without manual annotation, and that it achieves 13.27% higher rewards than baselines. When dealing with a new scenario, our model can still achieve 9.39% higher rewards. The code, scenarios, and demos are available here. The full paper is available at [1].
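
To make the idea of a unified intersection structure concrete, the following is a minimal sketch and not the authors' module: the abstract does not describe how GESA performs the mapping. The sketch assumes each intersection is summarized by the compass bearing and queue length of its approaches, snaps every approach to one of four hypothetical canonical slots (N, E, S, W), and zero-pads missing slots so that a three-way and a four-way intersection yield inputs of the same shape. The names `Approach`, `slot_for`, and `unify` are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Hypothetical canonical layout: every intersection is mapped onto these 4 slots.
SLOTS = ("N", "E", "S", "W")


@dataclass
class Approach:
    bearing_deg: float  # direction of the approach arm, clockwise from north
    queue_length: int   # vehicles currently waiting on this approach


def slot_for(bearing_deg: float) -> int:
    """Snap a bearing to the index of the nearest compass slot in SLOTS."""
    return int(((bearing_deg % 360) + 45) // 90) % 4


def unify(approaches: Sequence[Approach]) -> Tuple[List[float], List[int]]:
    """Return a fixed-length state and a mask marking which slots are real.

    Missing approaches are zero-padded; the mask lets a downstream model
    ignore them. Two approaches landing in the same slot is treated as an
    error in this simplified sketch.
    """
    state = [0.0] * len(SLOTS)
    mask = [0] * len(SLOTS)
    for approach in approaches:
        i = slot_for(approach.bearing_deg)
        if mask[i]:
            raise ValueError(f"two approaches map to slot {SLOTS[i]}")
        state[i] = float(approach.queue_length)
        mask[i] = 1
    return state, mask


# A T-junction with no southern arm still produces a length-4 state.
t_junction = [Approach(10, 3), Approach(95, 5), Approach(265, 2)]
state, mask = unify(t_junction)
```

Here `state` has the same length for any intersection with at most four approaches, and `mask` records which entries are padding. A real system would also need a matching convention for the action space (for example, masking signal phases that do not exist at a given intersection), which this sketch leaves out.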

Original language: English
Journal: CEUR Workshop Proceedings
Volume: 3827
Publication status: Published - 2024
Event: 3rd International Workshop on Spatio-Temporal Reasoning and Learning, STRL 2024 - Jeju Island, Korea, Republic of
Duration: 5 Aug 2024 → …

Bibliographical note

Publisher Copyright:
© 2024 Copyright for this paper by its authors.
