Challenges for learning in complex environments

Genetic and Evolutionary Computation Conference (2019)

Abstract
Deep reinforcement learning has rapidly grown as a research field with far-reaching potential for artificial intelligence. Games and simple physical simulations have been used as the main benchmark domains for many fundamental developments. As the field matures, it is important to develop more sophisticated learning systems with the aim of solving more complex real-world tasks, but problems like catastrophic forgetting remain critical, and important capabilities such as skill composition through curriculum learning remain unsolved. Continual learning is an important challenge for reinforcement learning, because RL agents are trained sequentially, in interactive environments, and are especially vulnerable to the phenomena of catastrophic forgetting and catastrophic interference. Successful methods for continual learning have broad potential, because they could enable agents to learn multiple skills, potentially enabling complex behaviors. In particular, while deep learning has shown excellent progress towards training systems to perform with human or superhuman ability on various tasks (domains like vision, speech, and language as well as games such as StarCraft and Go), the resulting systems are still slow, compared to humans, to respond to new information or to non-stationarities in the environment. Learning algorithms do exist that can quickly adapt to new data, but these are often at odds with large-scale deep learning systems. Meta-learning is one example of a learning paradigm that may not face this dilemma and thus holds promise as a framework for supporting fast and slow learning in a single learner.
In this framework, one could view the learning process as having two levels of optimisation: an outer loop, which might adapt slowly towards a "species" level of optimisation, tailored for an environment, a morphology, and a family of skills or tasks; and an inner loop, which allows an individual agent to adapt and diversify more quickly in response to a lifetime of experiences. I would argue that model-free deep reinforcement learning is an effective algorithm for optimising the outer loop of this process, but it may be less successful as an algorithm for effective lifelong learning, which is the inner loop of the process.
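The two-level view above can be illustrated with a toy sketch. This is not the method proposed in the talk: it swaps in a Reptile-style first-order meta-update on a one-dimensional quadratic loss purely to show the shape of a slow outer loop wrapped around a fast inner loop. The names `inner_loop`, `outer_loop`, `task_targets`, and all step sizes are hypothetical choices made for illustration.

```python
import random


def inner_loop(theta, target, steps=5, lr=0.3):
    """Fast adaptation (assumed toy task): a few gradient steps on (theta - target)^2."""
    for _ in range(steps):
        grad = 2.0 * (theta - target)  # derivative of the quadratic loss
        theta -= lr * grad
    return theta


def outer_loop(task_targets, meta_steps=200, meta_lr=0.05, seed=0):
    """Slow adaptation: nudge a shared starting point toward each adapted solution.

    This is a Reptile-style update, used here only as a stand-in for the
    outer-loop optimiser; the abstract itself suggests model-free deep RL.
    """
    rng = random.Random(seed)
    meta_theta = 0.0  # shared initialisation, the "species-level" parameter
    for _ in range(meta_steps):
        target = rng.choice(task_targets)         # sample one task
        adapted = inner_loop(meta_theta, target)  # "lifetime" adaptation
        meta_theta += meta_lr * (adapted - meta_theta)
    return meta_theta


# Hypothetical family of tasks: each task is defined by its optimum.
task_targets = [2.0, 3.0, 4.0]
meta_theta = outer_loop(task_targets)
```

After the outer loop has run, `meta_theta` sits within the range spanned by the task optima, so a handful of inner-loop steps from it reaches any single task's optimum faster than starting from zero would.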
Keywords
Machine learning, Deep learning, Meta-learning