Nonparametric Bayesian Learning of Other Agents' Policies in Interactive POMDPs
Autonomous Agents and Multi-Agent Systems (2015)
Abstract
We consider an autonomous POMDP agent facing a multi-agent environment with unknown opponents that are modeled as finite state controllers. The agent first learns the models from (imperfectly) observed behavior, and subsequently exploits them in planning for its own optimal policy by constructing an interactive POMDP. In the learning phase, Bayesian nonparametric methods are used to sample from the posterior distribution over the infinite-dimensional space of all possible controllers, resulting in models whose size scales with the complexity of observed behavior. Experimental results show that learning improves the agent's performance, which increases with the amount of data collected during the learning phase.
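To make the notion of an opponent "modeled as a finite state controller" concrete, here is a minimal sketch of such a controller. It is an illustration only, not the paper's model or learning procedure: the class name `FiniteStateController`, the helpers `tit_for_tat` and `run`, the action labels `"C"`/`"D"`, and the use of deterministic node transitions are all assumptions chosen for brevity.

```python
import random
from dataclasses import dataclass


@dataclass
class FiniteStateController:
    """Hypothetical controller: each node emits an action, observations move between nodes."""
    # action_probs[node][action] = probability of emitting `action` while in `node`
    action_probs: dict
    # next_node[(node, observation)] = successor node (deterministic here, an assumption)
    next_node: dict
    node: int = 0

    def act(self, rng):
        # Sample an action from the current node's action distribution.
        actions, weights = zip(*self.action_probs[self.node].items())
        return rng.choices(actions, weights=weights, k=1)[0]

    def observe(self, observation):
        # Move to the successor node selected by the received observation.
        self.node = self.next_node[(self.node, observation)]


def tit_for_tat():
    # Two-node example: node 0 cooperates ("C"), node 1 defects ("D");
    # the controller moves to the node that mirrors the last observed move.
    return FiniteStateController(
        action_probs={0: {"C": 1.0}, 1: {"D": 1.0}},
        next_node={(0, "C"): 0, (0, "D"): 1, (1, "C"): 0, (1, "D"): 1},
    )


def run(controller, observations, seed=0):
    # Roll the controller forward: act first, then update on each observation.
    rng = random.Random(seed)
    actions = []
    for obs in observations:
        actions.append(controller.act(rng))
        controller.observe(obs)
    return actions


print(run(tit_for_tat(), ["C", "D", "D", "C"]))  # ['C', 'C', 'D', 'D']
```

In this sketch the number of nodes is fixed by hand at two. The abstract's nonparametric approach instead leaves the number of nodes open and infers it from observed behavior.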
Keywords
Multiagent Systems, Opponent Modeling, Probabilistic Inference, Bayesian Nonparametrics