Cooperative Multi-Agent Reinforcement Learning in Express System

CIKM '20: The 29th ACM International Conference on Information and Knowledge Management, Virtual Event, Ireland, October 2020

Abstract
Express systems are widely deployed in many major cities. One important task in such a system is picking up packages from customers on time. Because pick-up requests arrive in real time and many couriers are picking up packages simultaneously, dispatching couriers so that they cooperate and complete more pick-up tasks over a long period is important but challenging. In this paper, we propose a reinforcement learning based framework to learn courier dispatching policies. First, we divide the city into independent regions, within each of which a constant number of couriers pick up packages at the same time. Besides reducing problem complexity, city division has practical operational benefits. We then focus on each region separately. For each region, we propose a Cooperative Multi-Agent Reinforcement Learning model, CMARL, to learn the optimal courier dispatching policy in that region. CMARL aims to maximize the total number of pick-up tasks completed by all couriers over a long period. Our model achieves this by combining two Markov Decision Processes (MDPs), one to guarantee cooperation among couriers and the other to ensure long-term optimization. After obtaining the value functions of these two MDPs, we design a new value function that trades off between them, from which the courier dispatching policy is inferred. Experiments on real-world road network data and historical express data from Beijing confirm the superiority of our model over nine baselines.
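
The abstract does not give the form of the combined value function. As a rough illustration only, the sketch below assumes a simple convex combination of two per-action value estimates followed by a greedy choice; the names combine_values, greedy_action, q_coop, q_long, and weight are hypothetical and are not taken from the paper.

```python
from typing import Dict, Hashable


def combine_values(
    q_coop: Dict[Hashable, float],
    q_long: Dict[Hashable, float],
    weight: float = 0.5,
) -> Dict[Hashable, float]:
    """Blend two per-action value estimates.

    ASSUMPTION: a convex combination is used here purely for illustration;
    the paper's actual trade-off function is not stated in the abstract.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if q_coop.keys() != q_long.keys():
        raise ValueError("both estimates must cover the same actions")
    # weight = 1.0 uses only the cooperation estimate,
    # weight = 0.0 uses only the long-term estimate.
    return {a: weight * q_coop[a] + (1.0 - weight) * q_long[a] for a in q_coop}


def greedy_action(
    q_coop: Dict[Hashable, float],
    q_long: Dict[Hashable, float],
    weight: float = 0.5,
) -> Hashable:
    """Pick the action with the highest combined value."""
    combined = combine_values(q_coop, q_long, weight)
    return max(combined, key=combined.get)
```

With hypothetical estimates q_coop = {"A": 1.0, "B": 0.0} and q_long = {"A": 0.0, "B": 2.0}, a weight of 1.0 selects "A" while a weight of 0.5 selects "B", which shows how such a weight would shift a dispatching decision between the two objectives.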
Keywords
Express system, multi-agent reinforcement learning, cooperation