TY - GEN
T1 - Hierarchical Reinforcement Learning with Opponent Modeling for Distributed Multi-agent Cooperation
AU - Liang, Zhixuan
AU - Cao, Jiannong
AU - Jiang, Shan
AU - Saxena, Divya
AU - Xu, Huafeng
N1 - Funding Information:
The research is supported by Hong Kong RGC TRS (project T41-603), RIF (projects R5009-21 and R5060-19), CRF (projects C5026-18G and C5018-20GF), and GRF (project 15204921).
Publisher Copyright:
© 2022 IEEE.
PY - 2022/10
Y1 - 2022/10
N2 - Many real-world applications, such as network packet routing and the coordination of autonomous vehicles, can be formulated as multi-agent cooperation problems. Deep reinforcement learning (DRL) offers a promising approach to multi-agent cooperation through the interaction between agents and their environments. However, traditional DRL solutions suffer from the high dimensionality of the joint policy search when multiple agents act in continuous action spaces. Moreover, the constantly changing policies of the agents make training non-stationary. To tackle these issues, we propose a hierarchical reinforcement learning approach that combines high-level decision-making with low-level individual control for efficient policy search. In particular, the cooperation of multiple agents can be learned efficiently in the high-level discrete action space, while the low-level individual control reduces to single-agent reinforcement learning. In addition to hierarchical reinforcement learning, we propose an opponent modeling network that models other agents' policies during the learning process. In contrast to end-to-end DRL approaches, our approach reduces learning complexity by hierarchically decomposing the overall task into sub-tasks. To evaluate its efficiency, we conduct a real-world case study on a cooperative lane-change scenario. Both simulation and real-world experiments show the superiority of our approach in terms of collision rate and convergence speed.
KW - Deep Reinforcement Learning
KW - Hierarchical Reinforcement Learning
KW - Multi-agent Cooperation
UR - http://www.scopus.com/inward/record.url?scp=85140909443&partnerID=8YFLogxK
U2 - 10.1109/ICDCS54860.2022.00090
DO - 10.1109/ICDCS54860.2022.00090
M3 - Conference article published in proceeding or book
AN - SCOPUS:85140909443
T3 - Proceedings - International Conference on Distributed Computing Systems
SP - 884
EP - 894
BT - Proceedings - 2022 IEEE 42nd International Conference on Distributed Computing Systems, ICDCS 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 42nd IEEE International Conference on Distributed Computing Systems, ICDCS 2022
Y2 - 10 July 2022 through 13 July 2022
ER -