Beyond Lowest-Warping Cost Action Selection In Trajectory Transfer

IEEE International Conference on Robotics and Automation (ICRA), 2015

Abstract
We consider the problem of learning from demonstrations to manipulate deformable objects. Recent work [1], [2], [3] has shown promising results that enable robotic manipulation of deformable objects through learning from demonstrations. This approach generalizes from a single demonstration to new test situations, and uses a nearest-neighbor strategy to select which demonstration to adapt to a given test situation. Such a nearest-neighbor approach, however, ignores important aspects of the problem: the brittleness (versus robustness) of demonstrations when generalized through this process, and the extent to which a demonstration makes progress towards the goal. In this paper, we frame the problem of selecting which demonstration to transfer as an options Markov decision process (MDP). We present max-margin Q-function estimation: an approach to learning a Q-function from expert demonstrations. Our learned policies account for variability in the robustness of demonstrations and for the sequential nature of our tasks. We developed two knot-tying benchmarks to experimentally validate the effectiveness of our proposed approach. The selection strategy described in [2] achieves success rates of 70% and 54% on these benchmarks, respectively; our approach performs significantly better, with success rates of 88% and 76%, respectively.
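The core technical idea named in the abstract is learning a Q-function whose value for the expert's chosen demonstration exceeds the value of every alternative by a margin. The sketch below is one plausible reading of that idea, assuming a linear Q-function over joint state-action features; the feature map `phi`, the dataset layout, and all hyperparameters are illustrative assumptions, not details taken from the paper.

```python
# A minimal sketch of max-margin Q-function estimation with a linear
# Q(s, a) = w . phi(s, a). Everything here (phi, data format, step size)
# is an assumption for illustration, not the paper's implementation.
import numpy as np

def phi(state, action):
    """Joint state-action feature vector (assumed given; state and
    action are equal-length numpy arrays in this sketch)."""
    return np.concatenate([state, action, state * action])

def train_max_margin_q(expert_data, dim, margin=1.0, lr=0.01, epochs=100):
    """Learn weights w so that Q(s, a*) >= Q(s, a) + margin for every
    non-expert action a, via subgradient steps on the hinge loss.

    expert_data: list of (state, expert_action, candidate_actions).
    """
    w = np.zeros(dim)
    for _ in range(epochs):
        for state, a_star, candidates in expert_data:
            q_star = w @ phi(state, a_star)
            for a in candidates:
                if np.array_equal(a, a_star):
                    continue
                # Hinge-loss violation of the margin constraint.
                violation = margin + w @ phi(state, a) - q_star
                if violation > 0:
                    # Push the expert action's score up, the rival's down.
                    w += lr * (phi(state, a_star) - phi(state, a))
    return w

def select_demo(w, state, candidates):
    """Greedy policy: pick the demonstration (option) with highest Q."""
    return max(candidates, key=lambda a: w @ phi(state, a))
```

In the paper's options-MDP framing, each "action" is a stored demonstration to transfer; a policy like `select_demo` replaces the lowest-warping-cost nearest-neighbor rule, letting the learned Q-values capture robustness and task progress rather than warping cost alone.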
Keywords
Markov processes, end effectors, learning by example, Markov decision process, end-effector, learning from demonstrations, max-margin Q-function estimation, trajectory transfer, warping cost action selection