GHQ: grouped hybrid Q-learning for cooperative heterogeneous multi-agent reinforcement learning

Complex & Intelligent Systems (2024)

Abstract
Previous deep multi-agent reinforcement learning (MARL) algorithms have achieved impressive results, typically in symmetric and homogeneous scenarios. However, asymmetric heterogeneous scenarios are common in practice and usually harder to solve. This paper focuses on the cooperative heterogeneous MARL problem in asymmetric heterogeneous maps of the StarCraft Multi-Agent Challenge (SMAC) environment. Recent mainstream approaches use policy-based actor-critic algorithms to solve the heterogeneous MARL problem with various individual agent policies. However, these approaches lack a formal definition and further analysis of the heterogeneity problem. Therefore, a formal definition of the Local Transition Heterogeneity (LTH) problem is given first, which allows the LTH problem to be studied in the SMAC environment. To comprehensively reveal and study the LTH problem, new asymmetric heterogeneous maps in SMAC are designed, and baseline algorithms are observed to perform poorly on them. The authors then propose the Grouped Individual-Global-Max (GIGM) consistency and a novel MARL algorithm, Grouped Hybrid Q-Learning (GHQ). GHQ separates agents into several groups and maintains individual parameters for each group. To enhance cooperation between groups, GHQ maximizes the mutual information between trajectories of different groups. A novel hybrid structure for value factorization in GHQ is also proposed. Finally, experiments on the original and the new maps show the superior performance of GHQ compared to other state-of-the-art algorithms.
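The sketch below illustrates the general idea of grouped value factorization described in the abstract: each group of heterogeneous agents keeps its own Q-network parameters, and per-agent greedy Q-values are combined into a joint value through a state-conditioned mixer with non-negative weights (an IGM-style monotonicity condition, applied per group as GIGM in the paper). It is a minimal, hypothetical illustration based only on the abstract; the class names (GroupQNet, MonotonicMixer), group sizes, and the mixing scheme are assumptions, not the authors' actual GHQ architecture, and the mutual-information objective between group trajectories is omitted.

```python
# Minimal sketch of grouped value factorization (assumed, not the GHQ code).
import torch
import torch.nn as nn

class GroupQNet(nn.Module):
    """Q-network shared by agents within one group; each group keeps its
    own copy of these parameters (per the abstract's grouping idea)."""
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs):            # obs: [batch, n_agents, obs_dim]
        return self.net(obs)           # -> [batch, n_agents, n_actions]

class MonotonicMixer(nn.Module):
    """State-conditioned mixer with non-negative weights, so the joint
    argmax stays consistent with per-agent argmax (IGM-style condition)."""
    def __init__(self, n_agents: int, state_dim: int):
        super().__init__()
        self.w = nn.Linear(state_dim, n_agents)   # state-dependent weights
        self.b = nn.Linear(state_dim, 1)          # state-dependent bias

    def forward(self, agent_qs, state):           # agent_qs: [batch, n_agents]
        w = torch.abs(self.w(state))               # enforce monotonicity
        return (w * agent_qs).sum(dim=-1, keepdim=True) + self.b(state)

# Toy usage with two heterogeneous groups (e.g. ground vs. air units).
batch, state_dim = 8, 20
groups = {"ground": (3, 12, 9), "air": (2, 16, 6)}  # (n_agents, obs_dim, n_actions)
qnets = {g: GroupQNet(o, a) for g, (n, o, a) in groups.items()}
state = torch.randn(batch, state_dim)

chosen_qs = []
for g, (n, o, a) in groups.items():
    obs = torch.randn(batch, n, o)                 # placeholder observations
    q = qnets[g](obs)                              # per-agent Q-values
    chosen_qs.append(q.max(dim=-1).values)         # greedy action values
agent_qs = torch.cat(chosen_qs, dim=1)             # [batch, total_agents]

mixer = MonotonicMixer(n_agents=agent_qs.shape[1], state_dim=state_dim)
q_tot = mixer(agent_qs, state)                     # joint value for TD learning
print(q_tot.shape)                                  # torch.Size([8, 1])
```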
Keywords
Heterogeneous multi-agent reinforcement learning, Cooperative multi-agent reinforcement learning, Value function factorization, StarCraft II multi-agent challenge