Generating Behavior-Diverse Game AIs with Evolutionary Multi-Objective Deep Reinforcement Learning

IJCAI, pp. 3371-3377, 2020.

Keywords:
game company, Markov Decision Process, Justice Online, population-based training, diverse behavior
We introduce a new framework, named EMOGI, which can automatically generate desirable styles with almost no domain knowledge.

Abstract:

Generating diverse behaviors for game artificial intelligence (Game AI) has been long recognized as a challenging task in the game industry. Designing a Game AI with a satisfying behavioral characteristic (style) heavily depends on the domain knowledge and is hard to achieve manually. Deep reinforcement learning sheds light on advancing t...

Introduction
  • Gaming is at the heart of the entertainment business, and the game market is rapidly growing along with fierce competition.
  • According to the latest Global Games Market Report [Newzoo, 2019], there are over 2.5 billion active gamers across the world, and over $164 billion will be spent on games in 2020.
  • With such a vast and competitive market, game qualities such as entertainment value and attractiveness become especially important, as they greatly determine the success of a game.
  • A more effective approach to creating behavior-diverse Game AIs is therefore of great importance to game companies.
Highlights
  • Gaming is at the heart of the entertainment business, and the game market is rapidly growing along with fierce competition
  • Evaluation results of the Game AIs generated by EMOGI (Evolutionary Multi-Objective Game Intelligence) and A3C are summarized in Tab. 1
  • Both A3C and EMOGI are trained using the same number of runs
  • The busy and lazy AIs created by EMOGI achieve a higher busy degree (0.954) and lazy degree (0.966) than the ones learned by A3C, and are still able to win the game. This demonstrates the effectiveness of EMOGI in learning complex styles automatically. Another advantage is that the neutral AI generated by EMOGI achieves a 0.557 busy degree and a 0.531 lazy degree, making it a suitable neutral Game AI, which is hard to obtain with A3C
  • This paper proposes EMOGI, which aims to efficiently generate behavior-diverse Game AIs by leveraging evolutionary algorithms (EA), Pareto multi-objective optimization (PMOO), and deep reinforcement learning (DRL)
  • Empirical results show the effectiveness of EMOGI in creating diverse and complex behaviors
Methods
  • EMOGI maintains a population of policies that are trained with deep reinforcement learning (DRL); A3C is used as the baseline for comparison in the experiments.
  • Behavioral styles are expressed as multiple objectives, and Pareto multi-objective optimization (PMOO), in the spirit of NSGA-II [Deb et al., 2000], selects the policies that best trade off these objectives.
  • An evolutionary algorithm (EA) then perturbs the selected policies, akin to population-based training [Jaderberg et al., 2017], so that diverse behaviors emerge with almost no domain knowledge; a minimal sketch of such a loop is given below.
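A minimal sketch of how such an evolutionary multi-objective loop could be organized is shown below. The evaluate and mutate helpers, the toy weight vectors, and the two objective scores are illustrative assumptions, not the paper's actual components; in EMOGI the objective values would come from playing game episodes, and the policies would additionally be updated by DRL.

    # Minimal sketch of an evolutionary multi-objective population loop
    # (illustrative simplification; EMOGI's actual operators and DRL updates differ).
    import numpy as np

    def dominates(a, b):
        # True if objective vector a Pareto-dominates b (all >=, at least one >).
        return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

    def pareto_front(scores):
        # Indices of the non-dominated individuals in the population.
        return [i for i, s in enumerate(scores)
                if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)]

    def evaluate(weights, rng):
        # Hypothetical stand-in: the two objective scores (e.g. win rate and a
        # style degree) would really come from playing episodes in the game.
        return rng.random(), rng.random()

    def mutate(weights, rng, sigma=0.05):
        # Gaussian perturbation of policy weights (stand-in for the EA variation step).
        return weights + sigma * rng.standard_normal(weights.shape)

    rng = np.random.default_rng(0)
    population = [rng.standard_normal(16) for _ in range(8)]      # toy "policy" weights

    for generation in range(5):
        scores = [evaluate(p, rng) for p in population]           # multi-objective scores
        elites = [population[i] for i in pareto_front(scores)]    # keep Pareto-optimal policies
        # Refill the population by mutating elites; a DRL update step would go here.
        population = elites + [mutate(elites[i % len(elites)], rng)
                               for i in range(len(population) - len(elites))]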
Results
  • The busy and lazy AIs created by EMOGI achieve a higher busy degree (0.954) and lazy degree (0.966) than the ones learned by A3C, and are still able to win the game
  • This demonstrates the effectiveness of EMOGI in learning complex styles automatically.
  • Another advantage is that the neutral AI generated by EMOGI achieves a 0.557 busy degree and a 0.531 lazy degree, making it a suitable neutral Game AI, which is hard to obtain with A3C; a rough illustration of how such degree metrics could be computed follows below
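The exact definitions of the busy and lazy degrees are not reproduced on this page; the snippet below is only a hypothetical illustration of how a normalized style degree could be computed from an episode's action trace, with the action categories chosen purely for the example.

    # Hypothetical busy/lazy degree computed from an episode's action trace
    # (illustrative only; the paper's actual behavior indicators may be defined differently).
    from collections import Counter

    ACTIVE_ACTIONS = {"attack", "move", "cast_skill"}   # assumed "busy" actions
    PASSIVE_ACTIONS = {"idle", "wait"}                  # assumed "lazy" actions

    def degree(action_trace, action_set):
        # Fraction of steps spent on actions from action_set, in [0, 1].
        counts = Counter(action_trace)
        total = sum(counts.values())
        return sum(counts[a] for a in action_set) / total if total else 0.0

    trace = ["attack", "move", "idle", "attack", "wait", "move"]
    print(degree(trace, ACTIVE_ACTIONS), degree(trace, PASSIVE_ACTIONS))  # ~0.667, ~0.333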
Conclusion
  • This paper proposes EMOGI, aiming to efficiently generate behavior-diverse Game AIs by leveraging evolutionary algorithms (EA), Pareto multi-objective optimization (PMOO), and deep reinforcement learning (DRL); a simplified sketch of the Pareto selection step appears after this list.
  • Empirical results show the effectiveness of EMOGI in creating diverse and complex behaviors.
  • To deploy AIs in commercial games, the robustness of the generated AIs is worth investigating as future work [Sun et al., 2020].
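Since NSGA-II [Deb et al., 2000] is cited for the multi-objective component, the sketch below shows a crowding-distance computation in that style; it is a simplified illustration assuming all candidates already lie on a single non-dominated front, not the paper's actual selection code.

    # Crowding-distance computation in the style of NSGA-II [Deb et al., 2000];
    # a simplified sketch assuming all points already lie on one non-dominated front.
    import numpy as np

    def crowding_distance(front_scores):
        # front_scores: (n, m) array of n individuals and m objective values.
        # Boundary points get infinite distance so they are always kept.
        front_scores = np.asarray(front_scores, dtype=float)
        n, m = front_scores.shape
        dist = np.zeros(n)
        for k in range(m):
            order = np.argsort(front_scores[:, k])
            span = front_scores[order[-1], k] - front_scores[order[0], k]
            dist[order[0]] = dist[order[-1]] = np.inf
            if span > 0:
                dist[order[1:-1]] += (front_scores[order[2:], k]
                                      - front_scores[order[:-2], k]) / span
        return dist

    # Prefer well-spread policies when truncating a front to a fixed population size.
    scores = [[0.9, 0.1], [0.7, 0.5], [0.5, 0.6], [0.1, 0.95]]
    print(crowding_distance(scores))   # approximately [inf, 1.09, 1.28, inf]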
Tables
  • Table 1: Evaluation results (averaged over 10 runs) for the relevant indicators of the Game AIs generated in the Atari game
  • Table 2: Evaluation results (averaged over 30 runs) for the relevant indicators of the Game AIs generated in the Justice Online (JO) game
Funding
  • The work is supported by the National Key R&D Program of China (Grant No. 2018YFB1701700), the National Natural Science Foundation of China (Grant Nos. 61702362, U1836214), the New Generation of Artificial Intelligence Science and Technology Major Project of Tianjin (Grant No. 19ZXZNGX00010), the Singapore National Research Foundation under its National Cybersecurity R&D Program (Grant No. NRF2018NCR-NCR005-0001), the National Satellite of Excellence in Trustworthy Software Systems (Grant No. NRF2018NCR-NSOE003-0001), and an NRF Investigatorship (Grant No. NRFI06-2020-0022)
References
  • [Agapitos et al., 2008] Alexandros Agapitos, Julian Togelius, Simon M. Lucas, Jurgen Schmidhuber, and Andreas Konstantinidis. Generating diverse opponents with multiobjective evolution. In IEEE Symposium on Computational Intelligence and Games. IEEE, 2008.
  • [Alt, 2004] Greg Alt. The Suffering: A game AI case study. In Challenges in Game AI Workshop, Nineteenth National Conference on Artificial Intelligence, pages 134–138, 2004.
  • [Deb and Agrawal, 1994] Kalyanmoy Deb and Ram Bhusan Agrawal. Simulated binary crossover for continuous search space. Complex Systems, 9(2):115–148, 1994.
  • [Deb et al., 2000] Kalyanmoy Deb, Samir Agrawal, Amrit Pratap, and T. Meyarivan. A fast elitist non-dominated sorting genetic algorithm for multi-objective optimisation: NSGA-II. In PPSN, 2000.
  • [Drachen et al., 2009] Anders Drachen, Alessandro Canossa, and Georgios N. Yannakakis. Player modeling using self-organization in Tomb Raider: Underworld. In IEEE Symposium on Computational Intelligence and Games, pages 1–8. IEEE, 2009.
  • [Holmgard et al., 2014] Christoffer Holmgard, Antonios Liapis, Julian Togelius, and Georgios N. Yannakakis. Personas versus clones for player decision modeling. In International Conference on Entertainment Computing, pages 159–166, 2014.
  • [Isla, 2008] Damian Isla. Halo 3: Building a better battle. In Game Developers Conference (GDC), 2008.
  • [Jaderberg et al., 2017] Max Jaderberg, Valentin Dalibard, Simon Osindero, Wojciech M. Czarnecki, Jeff Donahue, Ali Razavi, Oriol Vinyals, Tim Green, Iain Dunning, Karen Simonyan, et al. Population based training of neural networks. arXiv preprint arXiv:1711.09846, 2017.
  • [Lehman and Stanley, 2011] Joel Lehman and Kenneth O. Stanley. Evolving a diversity of creatures through novelty search and local competition. In Genetic and Evolutionary Computation Conference, pages 211–218, 2011.
  • [Li et al., 2019] Ang Li, Ola Spyra, Sagi Perel, Valentin Dalibard, Max Jaderberg, Chenjie Gu, David Budden, Tim Harley, and Pramod Gupta. A generalized framework for population based training. CoRR, 2019.
  • [Millington, 2019] Ian Millington. AI for Games, Third Edition. CRC Press, 2019.
  • [Mnih et al., 2015] Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, and Demis Hassabis. Human-level control through deep reinforcement learning. Nature, 518(7540):529–533, 2015.
  • [Mnih et al., 2016] Volodymyr Mnih, Adria Puigdomenech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, and Koray Kavukcuoglu. Asynchronous methods for deep reinforcement learning. In ICML, 2016.
  • [Mouret and Clune, 2015] Jean-Baptiste Mouret and Jeff Clune. Illuminating search spaces by mapping elites. arXiv preprint arXiv:1504.04909, 2015.
  • [Newzoo, 2019] Newzoo. Global games market report. https://newzoo.com/solutions/standard/market-forecasts/global-games-market-report, 2019.
  • [Oh et al., 2019] Inseok Oh, Seungeun Rho, Sangbin Moon, Seongho Son, Hyoil Lee, and Jinyun Chung. Creating pro-level AI for a real-time fighting game using deep reinforcement learning. arXiv preprint, April 2019.
  • [Ortega et al., 2013] Juan Ortega, Noor Shaker, Julian Togelius, and Georgios N. Yannakakis. Imitating human playing styles in Super Mario Bros. Entertainment Computing, 4(2):93–104, 2013.
  • [Rockstar Games, 2018] Rockstar Games. Red Dead Redemption 2. https://www.rockstargames.com/reddeadredemption2/, 2018.
  • [Suay et al., 2016] Halit Bener Suay, Tim Brys, Matthew E. Taylor, and Sonia Chernova. Learning from demonstration for shaping through inverse reinforcement learning. In AAMAS, 2016.
  • [Sun et al., 2020] Jianwen Sun, Tianwei Zhang, Xiaofei Xie, Lei Ma, Yan Zheng, Kangjie Chen, and Yang Liu. Stealthy and efficient adversarial attacks against deep reinforcement learning. In Proceedings of the AAAI Conference on Artificial Intelligence, 2020.
  • [Sutton and Barto, 2018] Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction. MIT Press, 2018.
  • [Szita et al., 2009] Istvan Szita, Marc Ponsen, and Pieter Spronck. Effective and diverse adaptive game AI. IEEE Transactions on Computational Intelligence and AI in Games, 1(1):16–27, 2009.
  • [Zheng et al., 2018] Yan Zheng, Zhaopeng Meng, Jianye Hao, Zongzhang Zhang, Tianpei Yang, and Changjie Fan. A deep Bayesian policy reuse approach against non-stationary agents. In Advances in Neural Information Processing Systems, pages 954–964, 2018.
  • [Zheng et al., 2019] Yan Zheng, Xiaofei Xie, Ting Su, Lei Ma, Jianye Hao, Zhaopeng Meng, Yang Liu, Ruimin Shen, Yingfeng Chen, and Changjie Fan. Wuji: Automatic online combat game testing using evolutionary deep reinforcement learning. In International Conference on Automated Software Engineering (ASE), pages 772–784, 2019.