Learning Multiple Coordinated Agents under Directed Acyclic Graph Constraints

Jaeyeon Jang, Diego Klabjan, Han Liu, Nital S. Patel, Xiuqi Li, Balakrishnan Ananthanarayanan, Husam Dauod, Tzung-Han Juang

CoRR (2023)

Abstract
This paper proposes a novel multi-agent reinforcement learning (MARL) method to learn multiple coordinated agents under directed acyclic graph (DAG) constraints. Unlike existing MARL approaches, our method explicitly exploits the DAG structure between agents to achieve more effective learning performance. Theoretically, we propose a novel surrogate value function based on a MARL model with synthetic rewards (MARLM-SR) and prove that it serves as a lower bound of the optimal value function. Computationally, we propose a practical training algorithm that exploits the new notions of a leader agent and a reward generator and distributor agent to guide the decomposed follower agents to better explore the parameter space in environments with DAG constraints. Empirically, we use four DAG environments, including a real-world scheduling environment for one of Intel's high-volume packaging and test factories, to benchmark our method and show that it outperforms the other non-DAG approaches.
Keywords
multiple coordinated agents, directed acyclic graph