Adaptive Learning Rates for Multi-Agent Reinforcement Learning

AAMAS '23: Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems (2023)

Abstract
In multi-agent reinforcement learning (MARL), the learning rates of the actors and the critic are typically hand-tuned and fixed. This not only requires heavy tuning but, more importantly, limits learning. Optimizers that adapt learning rates to gradient patterns have been proposed for general optimization, but they do not take the characteristics of MARL into account. In this paper, we propose AdaMa to bring adaptive learning rates to cooperative MARL. AdaMa evaluates the contribution of each actor's update to the improvement of the Q-value and adaptively adjusts the actors' learning rates in the direction that maximally improves the Q-value. AdaMa can also dynamically balance the learning rates between the critic and the actors according to their varying effects on learning. Moreover, AdaMa can incorporate a second-order approximation to capture the contribution of pairwise actors' updates and thus adjust the actors' learning rates more accurately. Empirically, we show that AdaMa accelerates learning and improves performance in a variety of multi-agent scenarios. More importantly, AdaMa does not require heavy hyperparameter tuning and thus significantly reduces the training cost. Visualizations of the learning rates during training clearly explain how and why AdaMa works.
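
The abstract does not give AdaMa's exact update rule, so the following is only a minimal sketch of the first-order idea it describes: approximate the change in Q by a first-order Taylor expansion and allocate more learning rate to actors whose updates contribute more to improving Q. The function name `adapt_actor_lrs`, the `lr_budget` parameter, and the uniform fallback are illustrative assumptions, not the authors' method; the second-order pairwise variant mentioned in the abstract is not shown here.

```python
# Hedged sketch (not the authors' code): first-order adaptive actor
# learning rates in the spirit of AdaMa's abstract.
import numpy as np

def adapt_actor_lrs(grad_q_wrt_actions, action_deltas, lr_budget=1e-3):
    """Distribute a learning-rate budget across actors in proportion to
    each actor's first-order contribution to the Q-value improvement:
    dQ ~= sum_i lr_i * <dQ/da_i, delta_a_i>.
    """
    # Per-actor first-order contribution to the change in Q.
    contribs = np.array([
        float(np.dot(g.ravel(), d.ravel()))
        for g, d in zip(grad_q_wrt_actions, action_deltas)
    ])
    # Keep only update directions that increase Q to first order.
    positive = np.clip(contribs, 0.0, None)
    total = positive.sum()
    if total <= 0.0:
        # Assumed fallback: uniform split when no update helps.
        return np.full(len(contribs), lr_budget / len(contribs))
    return lr_budget * positive / total

# Example: three actors with scalar actions; only actor 0's update
# improves Q to first order, so it receives the whole budget.
grads = [np.array([0.5]), np.array([-0.2]), np.array([0.1])]
deltas = [np.array([1.0]), np.array([1.0]), np.array([-1.0])]
print(adapt_actor_lrs(grads, deltas))  # [1e-3, 0.0, 0.0]
```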