A Simulation Based Online Planning Algorithm for Multi-Agent Cooperative Environments.

International Joint Conference on Autonomous Agents and Multi-agent Systems (2022)

Abstract
The Multi-agent Markov Decision Process (MMDP) has been an effective way of modelling sequential decision making in multi-agent cooperative environments. However, challenges such as the exponential size of the joint action space and dynamic changes in the environment limit the efficacy of proposed solutions. This paper proposes a scalable and robust algorithm that can effectively solve MMDPs in real time. Simulation, pruning, and prediction are the three key components of the algorithm. The simulation component enables real-time solutions by using a novel iterative pruning technique, which in turn makes use of the prediction component trained with self-play data. The algorithm is self-sustaining: it generates new training data from simulation and gradually improves. Furthermore, we show empirical results demonstrating the capabilities of the algorithm and compare them with existing MMDP solvers.
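To make the scaling problem concrete, the following is a minimal sketch, not the paper's algorithm. It shows how the joint action space grows exponentially with the number of agents, and how a single pruning pass guided by per-agent scores, followed by Monte Carlo rollouts, can shrink the search. The paper's pruning is iterative and its predictor is learned from self-play; here the scores are simply passed in. All names (`joint_action_count`, `prune_joint_actions`, `plan`, `simulate`) are hypothetical and chosen only for illustration.

```python
import itertools
import random


def joint_action_count(n_agents, n_actions):
    """Size of the joint action space when each agent has n_actions choices."""
    return n_actions ** n_agents


def prune_joint_actions(per_agent_scores, keep):
    """Keep the `keep` highest-scoring actions per agent (scores stand in for
    a learned predictor) and return the joint actions of the pruned space."""
    kept = [
        sorted(range(len(scores)), key=lambda a: scores[a], reverse=True)[:keep]
        for scores in per_agent_scores
    ]
    return list(itertools.product(*kept))


def plan(per_agent_scores, simulate, keep=2, rollouts=20, seed=0):
    """Pick the pruned joint action with the best average simulated return.

    `simulate(joint_action, rng)` is an assumed user-supplied function that
    returns one sampled return for the given joint action.
    """
    rng = random.Random(seed)
    candidates = prune_joint_actions(per_agent_scores, keep)

    def estimate(joint_action):
        total = sum(simulate(joint_action, rng) for _ in range(rollouts))
        return total / rollouts

    return max(candidates, key=estimate)
```

With 4 agents and 5 actions each, the unpruned joint space has 5^4 = 625 entries; keeping 2 actions per agent reduces the candidates to 2^4 = 16, which is what makes rollout-based evaluation affordable at decision time in a sketch like this.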