Population-aware Online Mirror Descent for Mean-Field Games by Deep Reinforcement Learning
arXiv (2024)
Abstract
Mean Field Games (MFGs) have the ability to handle large-scale multi-agent
systems, but learning Nash equilibria in MFGs remains a challenging task. In
this paper, we propose a deep reinforcement learning (DRL) algorithm that
achieves population-dependent Nash equilibrium without the need for averaging
or sampling from history, inspired by Munchausen RL and Online Mirror Descent.
Through the design of an additional inner-loop replay buffer, the agents can
effectively learn to achieve Nash equilibrium from any distribution, mitigating
catastrophic forgetting. The resulting policy can be applied to various initial
distributions. Numerical experiments on four canonical examples demonstrate
that our algorithm has better convergence properties than state-of-the-art
algorithms, in particular a DRL version of Fictitious Play for
population-dependent policies.
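The core update hinted at in the abstract combines a Munchausen-style log-policy bonus with a soft (entropy-regularised) Bellman target, which is known to implement Online Mirror Descent implicitly. The sketch below is a minimal, hypothetical illustration of that target computation with NumPy; the function names, the `1e-8` stabiliser, and the scalar hyperparameters are assumptions for illustration, and in the paper's MFG setting the Q-function would additionally condition on the population distribution.

```python
import numpy as np

def softmax_policy(q, tau):
    # Soft-max policy pi(a|s) proportional to exp(q/tau),
    # computed with the usual max-shift for numerical stability.
    z = (q - q.max()) / tau
    p = np.exp(z)
    return p / p.sum()

def munchausen_target(r, gamma, tau, alpha, q_s, a, q_next):
    """One-step Munchausen/OMD-style target (hypothetical sketch).

    r       : immediate reward
    gamma   : discount factor
    tau     : entropy temperature
    alpha   : Munchausen scaling coefficient
    q_s     : Q-values at the current state (array over actions)
    a       : index of the action actually taken
    q_next  : Q-values at the next state (array over actions)
    """
    pi_s = softmax_policy(q_s, tau)
    pi_next = softmax_policy(q_next, tau)
    # Munchausen bonus: scaled log-probability of the taken action
    # (practical implementations usually clip this term from below).
    bonus = alpha * tau * np.log(pi_s[a] + 1e-8)
    # Soft expected next value: expectation under pi minus entropy penalty.
    soft_v_next = np.sum(pi_next * (q_next - tau * np.log(pi_next + 1e-8)))
    return r + bonus + gamma * soft_v_next
```

As `tau` shrinks and with `alpha = 0`, the target falls back to the ordinary greedy Bellman target `r + gamma * max(q_next)`; the `alpha`-weighted log-policy term is what turns repeated application of this update into a mirror-descent step on the policy, removing the need to average or sample past policies.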