Application of Deep Reinforcement Learning in Guandan Game

Jiahong Pan, Zhongtian Zhang, Hengheng Shen, Yi Zeng, Lei Wu

2022 34th Chinese Control and Decision Conference (CCDC) (2022)

Abstract
In recent years, imperfect-information games have become an important touchstone for testing the level of artificial intelligence. Many real-world scenarios, such as economic transactions, military games, and autonomous driving, are imperfect-information games, so studying them has considerable practical significance. Guandan is an imperfect-information card game for four players divided into two teams. The large amount of hidden information in Guandan leads to a high-dimensional game state. Reinforcement learning algorithms are efficient at strategy search in computer games, but they fail to converge under the imperfect information and high-dimensional state space that Guandan presents. To address these problems, this paper introduces the Proximal Policy Optimization (PPO) algorithm, based on deep reinforcement learning, to handle imperfect information and the high-dimensional state and action spaces. It enables the agent to perceive high-dimensional information and make decisions according to the information it acquires. The experimental results show that the decision model based on PPO reaches a higher level of play than models based on the Policy Gradient and A2C algorithms, which demonstrates that the system has a self-learning ability to improve its level of play in Guandan.
Keywords
Imperfect Information Game, Guandan, Deep Reinforcement Learning, Proximal Policy Optimization Algorithm, Self-Learning