A Semi-Markov Decision Model with Inverse Reinforcement Learning for Recognizing the Destination of a Maneuvering Agent in Real Time Strategy Games

IEEE Access (2020)

Abstract
Recognizing the destination of a maneuvering agent is important for creating intelligent AI players in Real Time Strategy (RTS) games. Among the different problem formulations, goal recognition can be solved as a model-based planning problem using off-the-shelf planners. However, a common shortcoming of these frameworks is that they lack a model of action duration, even though in real-world scenarios an agent may take several steps to transition between grid cells. To address this problem, this paper proposes a semi-Markov decision model (SMDM) that explicitly models the duration of an action. Moreover, most existing works do not build a behavioral model of the observed agent, and almost none model individual behavioral preferences, which limits recognition accuracy. In this paper, Inverse Reinforcement Learning (IRL) is adopted to learn opponent behavior for the destination recognition problem. To adapt to dynamic environments, the Maximum Entropy Inverse Reinforcement Learning (MaxEnt IRL) method is modified by defining a fitness index that measures the quality of a weight vector and by using the Nelder-Mead simplex search to find the optimal weights. In the experiments, we build the game scenario in the Unreal Engine 4 environment and collect movement trajectories from human players across several tasks to evaluate the performance of our methods. The results show that the IRL-based recognizer can identify the destination effectively even when the intention changes midway, and it outperforms other models on several of the most frequently used metrics.
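The core algorithmic idea in the abstract is to recover the weights of a linear reward over trajectory features by searching weight space with the Nelder-Mead method rather than a gradient-based MaxEnt IRL update. The sketch below illustrates that idea only; it is not the paper's implementation. The grid world, the two-dimensional feature map, the softmax (maximum-entropy) policy, and the fitness definition (the gap between demonstrated and model-expected feature counts) are all illustrative assumptions, and SciPy's Nelder-Mead optimizer stands in for the paper's polyhedron search.

```python
"""Minimal sketch: fit a linear reward R(s) = w . phi(s) to a demonstration
by minimizing a 'fitness' gap with Nelder-Mead. All modeling choices here
are illustrative assumptions, not the paper's actual environment or features."""
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp

N = 5                                  # 5x5 grid, states indexed 0..24
GOAL = 24                              # assumed destination cell
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right

def step(s, a):
    """Deterministic grid transition; moves off the grid stay in place."""
    r, c = divmod(s, N)
    dr, dc = ACTIONS[a]
    return min(max(r + dr, 0), N - 1) * N + min(max(c + dc, 0), N - 1)

def phi(s):
    """Assumed features: scaled negative distance to goal, constant step cost."""
    r, c = divmod(s, N)
    gr, gc = divmod(GOAL, N)
    return np.array([-(abs(r - gr) + abs(c - gc)) / (2 * N), 1.0])

PHI = np.stack([phi(s) for s in range(N * N)])                        # (25, 2)
NEXT = np.array([[step(s, a) for a in range(len(ACTIONS))]
                 for s in range(N * N)])                              # (25, 4)

def softmax_policy(w, iters=80, beta=5.0, gamma=0.9):
    """Soft value iteration for the linear reward R = PHI @ w."""
    R = PHI @ w
    V = np.zeros(N * N)
    for _ in range(iters):
        Q = R[:, None] + gamma * V[NEXT]                              # (25, 4)
        V = logsumexp(beta * Q, axis=1) / beta
    Q = R[:, None] + gamma * V[NEXT]
    P = np.exp(beta * (Q - Q.max(axis=1, keepdims=True)))
    return P / P.sum(axis=1, keepdims=True)

def expected_features(policy, start=0, horizon=12):
    """Expected feature counts of rollouts from `start` under the policy."""
    d = np.zeros(N * N)
    d[start] = 1.0
    total = np.zeros(PHI.shape[1])
    for _ in range(horizon):
        total += d @ PHI
        nxt = np.zeros(N * N)
        for s in range(N * N):
            for a in range(len(ACTIONS)):
                nxt[NEXT[s, a]] += d[s] * policy[s, a]
        d = nxt
    return total

def fitness(w, demo_features):
    """Hypothetical fitness index: gap between demo and model feature counts."""
    return np.linalg.norm(expected_features(softmax_policy(w)) - demo_features)

# One hand-made demonstration walking toward the goal along the diagonal.
demo = [0, 1, 6, 7, 12, 13, 18, 19, 24, 24, 24, 24]
demo_features = np.sum([phi(s) for s in demo], axis=0)

result = minimize(fitness, x0=np.array([1.0, -0.1]), args=(demo_features,),
                  method="Nelder-Mead")
print("recovered reward weights:", result.x)
```

With the recovered weights, the induced softmax policy can be evaluated against several candidate destinations, and the destination whose reward best explains the observed partial trajectory would be reported as the recognition result.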
Keywords
Real time strategy games, goal recognition, inverse reinforcement learning