Evolutionary Dynamics Of Regret Minimization

ECML PKDD'10: Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part II(2010)

引用 18|浏览17
暂无评分
摘要
Learning in multi-agent systems (MAS) is a complex task. Current learning theory for single-agent systems does not extend to multi-agent problems. In a MAS the reinforcement an agent receives may depend on the actions taken by the other agents present in the system. Hence, the Markov property no longer holds and convergence guarantees are lost. Currently there does not exist a general formal theory describing and elucidating the conditions under which algorithms for multi-agent learning (MAL) are successful. Therefore it is important to fully understand the dynamics of multi-agent reinforcement learning, and to be able to analyze learning behavior in terms of stability and resilience of equilibria. Recent work has considered the replicator dynamics of evolutionary game theory for this purpose. In this paper we contribute to this framework. More precisely, we formally derive the evolutionary dynamics of the Regret Minimization polynomial weights learning algorithm, which will be described by a system of differential equations. Using these equations we can easily investigate parameter settings and analyze the dynamics of multiple concurrently learning agents using regret minimization. In this way it is clear why certain attractors are stable and potentially preferred over others, and what the basins of attraction look like. Furthermore, we experimentally show that the dynamics predict the real learning behavior and we test the dynamics also in non-self play, comparing the polynomial weights algorithm against the previously derived dynamics of Q-learning and various Linear Reward algorithms in a set of benchmark normal form games.
更多
查看译文
关键词
current learning theory,multi-agent learning,multi-agent reinforcement learning,real learning behavior,multi-agent system,evolutionary game theory,general formal theory,Regret Minimization polynomial weight,evolutionary dynamic,polynomial weights algorithm,Evolutionary dynamic,regret minimization
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要