Deep reinforcement learning for transit signal priority in a connected environment

Meng Long, Xiexin Zou, Yue Zhou, Edward Chung

Research output: Journal article publication › Journal article › Academic research › peer-review

16 Citations (Scopus)


Transit signal priority (TSP) is an effective measure to reduce traffic congestion and improve bus efficiency in metropolises. Connected vehicle technology and reinforcement learning (RL) can provide traffic signal control systems with, respectively, more detailed and accurate information and more robust algorithms, enabling smarter TSP strategies. This paper proposes an extended Dueling Double Deep Q-learning with invalid action masking (eD3QNI) algorithm for TSP in a connected environment. The algorithm considers multiple conflicting bus priority requests, the constraints on the traffic light, and the phase skipping rule, aiming to reduce the person delay of buses. Its performance is evaluated by simulation for a single intersection under two traffic demands and with random bus arrivals, schedule deviations, and occupancies. Results demonstrate that eD3QNI produces lower average person delay and schedule delay than fixed-time signal control, active TSP strategies, and other common RL methods. They also show that the invalid action masking (IAM) method is superior to the usual variable decision points (VDP) method in terms of convergence speed, performance improvement, and the application of domain knowledge to the RL algorithm. The penetration rate of connected buses does not affect the convergence speed of the proposed method, and a higher penetration rate yields better performance. Moreover, under the proposed method, different reward functions can be incorporated as desired to realize different operational goals for the TSP strategies.
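
The idea behind invalid action masking in a Double Q-learning setting can be illustrated with a short sketch. This is not the authors' eD3QNI implementation: the abstract does not give the state, action, or reward definitions, so the four-phase action set, the Q-values, the mask, and the function names `masked_greedy_action` and `masked_double_q_target` are all assumptions made for illustration. The sketch only shows the general technique of excluding disallowed signal phases before the greedy choice.

```python
import numpy as np


def masked_greedy_action(q_values, valid_mask):
    """Pick the highest-valued action among those flagged as valid."""
    q = np.asarray(q_values, dtype=float)
    mask = np.asarray(valid_mask, dtype=bool)
    if not mask.any():
        raise ValueError("at least one action must be valid")
    # Invalid actions are set to -inf so argmax can never select them.
    masked_q = np.where(mask, q, -np.inf)
    return int(np.argmax(masked_q))


def masked_double_q_target(reward, gamma, next_q_online, next_q_target,
                           next_valid_mask, done):
    """Double Q-learning target with the next action chosen under the mask."""
    if done:
        return float(reward)
    # The online network selects the action; the target network evaluates it.
    a_star = masked_greedy_action(next_q_online, next_valid_mask)
    return float(reward + gamma * np.asarray(next_q_target, dtype=float)[a_star])


# Hypothetical 4-phase intersection: phase 2 has the largest Q-value but is
# assumed to be disallowed (for example by a phase-skipping constraint).
q_online = [1.0, 0.5, 3.0, 2.0]
q_target = [0.8, 0.4, 2.5, 1.5]
valid = [True, True, False, True]

action = masked_greedy_action(q_online, valid)
target = masked_double_q_target(
    reward=-1.0, gamma=0.9,
    next_q_online=q_online, next_q_target=q_target,
    next_valid_mask=valid, done=False,
)
print(action, target)
```

In this toy case the masked choice falls on phase 3 rather than the higher-valued but disallowed phase 2, which is how domain constraints can be encoded without changing the decision points of the agent.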

Original language: English
Article number: 103814
Journal: Transportation Research Part C: Emerging Technologies
Publication status: Published - Sept 2022


Keywords

  • Connected environment
  • Invalid action masking
  • Reinforcement learning
  • Traffic signal control
  • Transit signal priority

ASJC Scopus subject areas

  • Civil and Structural Engineering
  • Automotive Engineering
  • Transportation
  • Management Science and Operations Research

