An Adversarial Objective for Scalable Exploration

2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021

Abstract
Collecting new experience is costly in many robotic tasks, so an important problem for robotics is how to explore a new environment efficiently, learning as much as possible in as few trials as possible. In this paper, we propose an exploration method for learning a dynamics model. Our key idea is to use the score given by a discriminator network as the objective for a planner that chooses actions, with the planner minimizing this score. The discriminator is optimized jointly with a prediction model, which enables our active learning approach to sample sequences of observations and actions whose predictions the discriminator considers least realistic. Comparable existing exploration methods have large compute requirements, so they cannot operate in many prediction-planning pipelines used in robotic learning without hardware modifications to standard robotics platforms. The primary contribution of our adversarial exploration method is therefore scalability. In simulated environments, we demonstrate that the performance advantage of our adversarial exploration approach over leading model-based exploration strategies grows as compute is restricted. We further demonstrate that our adversarial method scales to a robotic manipulation prediction-planning pipeline, where we improve sample efficiency and prediction performance for a domain transfer problem.
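The planning objective described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the random-shooting planner, the toy dynamics in `predict_rollout`, the toy `discriminator_score`, and all parameter names are assumptions made for illustration, and the joint training of the discriminator and prediction model is omitted. The sketch only shows the selection step, in which the planner prefers the action sequence whose predicted rollout receives the lowest discriminator score.

```python
import numpy as np

def predict_rollout(state, actions):
    """Hypothetical stand-in for a learned dynamics model: s' = s + a."""
    states = [state]
    for a in actions:
        states.append(states[-1] + a)
    return np.stack(states[1:])  # shape (horizon, state_dim)

def discriminator_score(states, actions):
    """Hypothetical stand-in for a learned discriminator.
    Higher means 'more realistic'; here, predictions far from the
    origin are treated as less realistic. `actions` is accepted to
    mirror a discriminator that sees observations and actions, but
    this toy version does not use it."""
    return float(np.mean(1.0 / (1.0 + np.sum(states ** 2, axis=1))))

def plan_adversarial(state, horizon=5, num_candidates=64, seed=0):
    """Random-shooting planner (an assumption, not from the paper):
    sample candidate action sequences and return the one whose
    predicted rollout gets the LOWEST discriminator score."""
    rng = np.random.default_rng(seed)
    candidates = rng.uniform(
        -1.0, 1.0, size=(num_candidates, horizon, state.shape[0])
    )
    scores = np.array([
        discriminator_score(predict_rollout(state, acts), acts)
        for acts in candidates
    ])
    best = int(np.argmin(scores))
    return candidates[best], scores

# Usage: pick an exploratory action sequence from a 2-D start state.
best_actions, scores = plan_adversarial(np.zeros(2))
```

In a full pipeline, the chosen actions would be executed, and the resulting experience would be used to update both the prediction model and the discriminator before planning again.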
Keywords
Curiosity, Scalability, Active learning, Discriminator, Machine learning, Robotics, Computer science, Adversarial system, Transfer problem, Pipeline transport, Artificial intelligence