Qatten: A General Framework for Cooperative Multiagent Reinforcement Learning

arXiv (2020)

Abstract
In many real-world settings, a team of cooperative agents must learn to coordinate their behavior given only private observations and under communication constraints. Deep multiagent reinforcement learning (Deep-MARL) algorithms have shown strong performance on these realistic and difficult problems, but still face challenges. One line of work is multiagent value decomposition, which decomposes the global shared multiagent Q-value $Q_{tot}$ into individual Q-values $Q^{i}$ that guide each agent's behavior. However, previous work achieves this decomposition heuristically, without a solid theoretical grounding: VDN assumes an additive form, while QMIX adopts an implicit, hard-to-interpret mixing network. In this paper, for the first time, we theoretically derive a linear decomposition from $Q_{tot}$ into the individual $Q^{i}$. Based on this theoretical finding, we introduce a multi-head attention mechanism to approximate each term in the decomposition formula, with theoretical justification for each component. Experiments show that our method outperforms state-of-the-art MARL methods on the widely adopted StarCraft benchmarks across different scenarios, and an analysis of the attention weights provides further insight.
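One plausible reading of the linear decomposition the abstract refers to (the notation $H$, $\lambda_{i,h}$, and $c(s)$ is ours, introduced here for illustration) is a state-dependent weighted sum of individual Q-values across $H$ attention heads:

$$Q_{tot}(s, \boldsymbol{a}) \approx c(s) + \sum_{h=1}^{H} \sum_{i=1}^{N} \lambda_{i,h}(s)\, Q^{i}(\tau^{i}, a^{i})$$

Below is a minimal PyTorch sketch of a mixing network in this spirit, assuming each head computes its weights $\lambda_{i,h}$ via attention between a state-conditioned query and per-agent key features. All class and argument names are hypothetical, not taken from the paper's code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionMixer(nn.Module):
    """Sketch of a multi-head attention mixer: combines per-agent Q-values
    into Q_tot using state-conditioned attention weights, one set per head.
    (Illustrative only; layer sizes and structure are assumptions.)"""

    def __init__(self, n_agents, state_dim, unit_dim, n_heads=4, embed_dim=32):
        super().__init__()
        self.n_heads = n_heads
        # One query network per head, conditioned on the global state s.
        self.query_nets = nn.ModuleList(
            nn.Sequential(nn.Linear(state_dim, embed_dim), nn.ReLU(),
                          nn.Linear(embed_dim, embed_dim))
            for _ in range(n_heads)
        )
        # Shared key projection over per-agent features (e.g. unit observations).
        self.key_net = nn.Linear(unit_dim, embed_dim)
        # State-dependent constant term c(s).
        self.constant_net = nn.Sequential(
            nn.Linear(state_dim, embed_dim), nn.ReLU(), nn.Linear(embed_dim, 1)
        )

    def forward(self, agent_qs, state, unit_feats):
        # agent_qs:   (batch, n_agents)   individual Q^i values
        # state:      (batch, state_dim)  global state
        # unit_feats: (batch, n_agents, unit_dim) per-agent features
        keys = self.key_net(unit_feats)               # (B, N, E)
        q_tot = self.constant_net(state).squeeze(-1)  # (B,) -> c(s)
        scale = keys.shape[-1] ** 0.5
        for h in range(self.n_heads):
            query = self.query_nets[h](state).unsqueeze(2)   # (B, E, 1)
            scores = torch.bmm(keys, query).squeeze(-1)      # (B, N)
            lam = F.softmax(scores / scale, dim=-1)          # weights lambda_{i,h}
            q_tot = q_tot + (lam * agent_qs).sum(-1)         # add weighted sum of Q^i
        return q_tot
```

Because $Q_{tot}$ remains linear (and, with softmax weights, monotonically increasing) in each $Q^{i}$, the argmax over joint actions factorizes per agent, preserving the decentralized-execution property that VDN and QMIX also rely on.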