TY - JOUR
T1 - A mean-field Markov decision process model for spatial-temporal subsidies in ride-sourcing markets
AU - Zhu, Zheng
AU - Ke, Jintao
AU - Wang, Hai
N1 - Funding Information:
This work is partially supported by the China National Natural Science Foundation grant 71890974, the Hong Kong Research Grants Council under projects HKUST16208920 and NHKUST627/18, and the Hong Kong University of Science and Technology – Didi Chuxing (HKUST-DiDi) Joint Laboratory. The third author gratefully acknowledges support from the Lee Kong Chian (LKC) Fellowship awarded by Singapore Management University. The opinions in this paper do not necessarily reflect the official views of the HKUST-DiDi Joint Laboratory. The authors are responsible for all statements. The authors would also like to acknowledge all the reviewers for their constructive comments.
Publisher Copyright:
© 2021 Elsevier Ltd
PY - 2021/8
Y1 - 2021/8
N2 - Ride-sourcing services are increasingly popular because of their ability to accommodate on-demand travel needs. A critical issue faced by ride-sourcing platforms is the supply-demand imbalance, as a result of which drivers may spend substantial time on idle cruising and picking up remote passengers. Some platforms attempt to mitigate the imbalance by providing relocation guidance to idle drivers, who may have their own self-relocation strategies and decline to follow the suggestions. Platforms then seek to induce drivers to system-desirable locations by offering them subsidies. This paper proposes a mean-field Markov decision process (MF-MDP) model to depict the dynamics in ride-sourcing markets with mixed agents, whereby the platform aims to optimize system-level objectives using spatial-temporal subsidies with predefined subsidy rates, and a number of drivers aim to maximize their individual income by following certain self-relocation strategies. To solve the model more efficiently, we further develop a representative-agent reinforcement learning algorithm that uses a representative driver to model the decision-making process of multiple drivers. This approach is shown to achieve significant computational advantages, faster convergence, and better performance. Using case studies, we demonstrate that by providing spatial-temporal subsidies, the platform is able to effectively balance a short-term objective of maximizing immediate revenue with a long-term objective of maximizing service rate, while drivers earn higher income.
AB - Ride-sourcing services are increasingly popular because of their ability to accommodate on-demand travel needs. A critical issue faced by ride-sourcing platforms is the supply-demand imbalance, as a result of which drivers may spend substantial time on idle cruising and picking up remote passengers. Some platforms attempt to mitigate the imbalance by providing relocation guidance to idle drivers, who may have their own self-relocation strategies and decline to follow the suggestions. Platforms then seek to induce drivers to system-desirable locations by offering them subsidies. This paper proposes a mean-field Markov decision process (MF-MDP) model to depict the dynamics in ride-sourcing markets with mixed agents, whereby the platform aims to optimize system-level objectives using spatial-temporal subsidies with predefined subsidy rates, and a number of drivers aim to maximize their individual income by following certain self-relocation strategies. To solve the model more efficiently, we further develop a representative-agent reinforcement learning algorithm that uses a representative driver to model the decision-making process of multiple drivers. This approach is shown to achieve significant computational advantages, faster convergence, and better performance. Using case studies, we demonstrate that by providing spatial-temporal subsidies, the platform is able to effectively balance a short-term objective of maximizing immediate revenue with a long-term objective of maximizing service rate, while drivers earn higher income.
KW - Markov decision process
KW - Mean-field
KW - Mixed agents
KW - Ride-sourcing
KW - Subsidy
UR - http://www.scopus.com/inward/record.url?scp=85110352014&partnerID=8YFLogxK
U2 - 10.1016/j.trb.2021.06.014
DO - 10.1016/j.trb.2021.06.014
M3 - Journal article
AN - SCOPUS:85110352014
SN - 0191-2615
VL - 150
SP - 540
EP - 565
JO - Transportation Research Part B: Methodological
JF - Transportation Research Part B: Methodological
ER -