Alpha-Mini: Minichess Agent with Deep Reinforcement Learning

Michael Sun, Robert Tan

arXiv (2021)

Abstract
We train an agent to compete in the game of Gardner minichess, a downsized variation of chess played on a 5x5 board. We motivate and apply Proximal Policy Optimization (PPO), a state-of-the-art actor-critic method, with Generalized Advantage Estimation (GAE). Our initial task centered on training the agent against a random agent. Once we obtained reasonable performance, we used a version of the iterative policy improvement employed by AlphaGo to pit the agent against increasingly stronger versions of itself, and evaluated the resulting performance gain. The final agent achieves a near-perfect win rate (0.97) against a random agent. We also explore the effects of pretraining the network using a collection of positions obtained via self-play.
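For readers unfamiliar with Generalized Advantage Estimation, the following is a minimal sketch of the standard GAE recursion for a single episode. It is not taken from the paper: the function name `gae_advantages`, the argument layout, and the default values of `gamma` and `lam` are illustrative assumptions, and the authors' implementation and hyperparameters may differ.

```python
def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Compute GAE advantages for one episode (illustrative sketch).

    rewards: list of r_0 .. r_{T-1}
    values:  list of V(s_0) .. V(s_T); the last entry is the bootstrap
             value, which is 0.0 when the episode ended in a terminal state.
    gamma, lam: discount and GAE mixing parameters (assumed defaults,
                not values reported by the paper).
    """
    if len(values) != len(rewards) + 1:
        raise ValueError("values must have exactly one more entry than rewards")

    advantages = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the accumulated future advantage.
    for t in reversed(range(len(rewards))):
        # One-step temporal-difference error.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # Exponentially weighted sum of future TD errors.
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


if __name__ == "__main__":
    # A three-move game won on the last move, with an uninformed critic.
    print(gae_advantages([0.0, 0.0, 1.0], [0.0, 0.0, 0.0, 0.0], gamma=0.5, lam=1.0))
```

With `lam=0` the estimate reduces to the one-step TD error, and with `lam=1` it becomes the discounted return minus the value baseline; intermediate values trade bias against variance.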
Keywords
deep reinforcement learning,minichess agent,alpha-mini