Learning Policies by Learning Rules

IEEE Robotics and Automation Letters (2022)

Abstract
Efficiently learning interpretable policies for complex tasks from demonstrations is a challenging problem. We present Hierarchical Inference with Logical Options (HILO), a novel learning algorithm that learns to imitate expert demonstrations by learning the rules that the expert is following. The rules are represented as linear temporal logic (LTL) formulas, which are interpretable and capable of encoding complex behaviors. Unlike previous works, which learn rules from high-level propositions, HILO learns rules by taking both propositions and low-level trajectories as input. It does this by defining a Bayesian model over LTL formulas, propositions, and low-level trajectories. The Bayesian model bridges the gap from formula to low-level trajectory by using a planner to find an optimal policy for a given LTL formula. Stochastic variational inference is then used to find a posterior distribution over formulas and policies given expert demonstrations. We show that by learning rules from both propositions and low-level states, HILO outperforms previous work on a rule-learning task and on four planning tasks while needing less data. We also validate HILO in the real world by teaching a robotic arm a complex packing task.
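To make the inference step concrete, below is a minimal sketch of the variational idea over a small discrete set of candidate formulas. It is not the authors' implementation: it assumes the per-formula demonstration log-likelihoods (the hypothetical `log_lik` values, which in HILO would come from planning a policy for each candidate formula) are already computed, and it fits a categorical variational posterior by maximizing the ELBO with PyTorch. All names and numbers are illustrative.

```python
import math
import torch

# Hypothetical setup, for illustration only: K candidate LTL formulas, each
# already mapped by a planner to a policy whose log-likelihood on the expert
# demonstrations has been precomputed. The values below are made up.
K = 4
log_lik = torch.tensor([-12.0, -3.5, -9.1, -7.8])   # log p(demos | formula_k)
log_prior = torch.full((K,), -math.log(K))          # uniform prior over formulas

# Variational posterior q(formula) = Categorical(softmax(logits)).
logits = torch.zeros(K, requires_grad=True)
opt = torch.optim.Adam([logits], lr=0.1)

for _ in range(200):
    q = torch.softmax(logits, dim=0)
    # ELBO = E_q[log p(demos | phi) + log p(phi) - log q(phi)]; with a small
    # discrete formula space the expectation is computed exactly, no sampling.
    elbo = (q * (log_lik + log_prior - torch.log(q))).sum()
    opt.zero_grad()
    (-elbo).backward()
    opt.step()

print("Posterior over candidate formulas:", torch.softmax(logits, dim=0).tolist())
```

In this toy discrete case the ELBO optimum recovers the exact posterior q(φ) ∝ p(demos | φ) p(φ); the paper's model applies stochastic variational inference to the much larger joint space of formulas and low-level policies, where exact enumeration is infeasible.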
Keywords
Imitation learning, learning from demonstration, probabilistic inference, task and motion planning