A Deep Reinforcement Learning Approach for Online Parcel Assignment

Hao Zeng, Qiong Wu, Kunpeng Han, Junying He, Haoyuan Hu

AAMAS '23: Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems (2023)

Abstract
In this paper, we investigate the online parcel assignment (OPA) problem, in which each stochastically generated parcel order must be assigned to a candidate route for delivery, with the objective of minimizing the total delivery cost under certain business constraints. The OPA problem is challenging because of its stochastic nature: each parcel's candidate routes, which depend on the parcel's attributes, are unknown until its order is placed, and the total parcel volume to be assigned is not known in advance. To tackle this problem, we propose an algorithm based on deep reinforcement learning, named PPO-OPA, that shows competitive performance. More specifically, we introduce a novel Markov Decision Process (MDP) to model the decision-making process in the OPA problem, and develop a policy gradient algorithm that adopts attention networks for policy evaluation. With a dedicated reward function, our proposed algorithm achieves a lower total cost with a smaller violation of constraints than the traditional method used in the industry, which assigns parcels to candidate routes proportionally. In addition, the performance of our proposed algorithm is comparable to that of the Primal-Dual algorithm, although the latter assumes that the total parcel volume is known in advance, which is unrealistic in practice.
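The abstract does not spell out how the proportional industry baseline works. As a rough illustration only, the Python sketch below shows one plausible reading: each arriving parcel goes to whichever of its candidate routes is currently furthest below a target share of the parcels assigned so far. The function name `assign_proportionally`, the `target_share` mapping, and the route labels are all assumptions made for this example and are not taken from the paper.

```python
from collections import defaultdict


def assign_proportionally(parcels, target_share):
    """Hypothetical proportional baseline (an assumption, not the paper's code).

    parcels: iterable of candidate-route lists, revealed one parcel at a time.
    target_share: dict mapping route -> desired fraction of all parcels.
    Returns the list of chosen routes and the final per-route counts.
    """
    counts = defaultdict(int)  # parcels assigned to each route so far
    assignment = []
    for candidates in parcels:
        total = sum(counts.values()) + 1  # volume including this parcel
        # Pick the candidate whose actual count lags its target count the most.
        # On ties, max() keeps the first candidate in the list.
        best = max(candidates, key=lambda r: target_share[r] * total - counts[r])
        counts[best] += 1
        assignment.append(best)
    return assignment, dict(counts)


if __name__ == "__main__":
    # Illustrative data: candidate routes differ per parcel, as in the OPA setting.
    stream = [["A", "B"], ["B", "C"], ["A", "C"], ["A", "B", "C"]]
    shares = {"A": 0.5, "B": 0.3, "C": 0.2}
    print(assign_proportionally(stream, shares))
```

This rule looks only at route shares and ignores delivery cost entirely, which matches the abstract's description of the baseline as a proportional method that a cost-aware learned policy improves on.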