Rethinking Mutual Information for Language Conditioned Skill Discovery on Imitation Learning
CoRR (2024)
Abstract
Language-conditioned robot behavior plays a vital role in executing complex
tasks by associating human commands or instructions with perception and
actions. The ability to compose long-horizon tasks based on unconstrained
language instructions necessitates the acquisition of a diverse set of
general-purpose skills. However, acquiring inherent primitive skills in a
coupled and long-horizon environment without external rewards or human
supervision presents significant challenges. In this paper, we analyze the
relationship between skills and language instructions from a mathematical
perspective, employing two forms of mutual information within the framework of
language-conditioned policy learning. To maximize the mutual information
between language and skills in an unsupervised manner, we propose an end-to-end
imitation learning approach called Language Conditioned Skill Discovery
(LCSD). Specifically, we utilize vector quantization to learn discrete latent
skills and leverage skill sequences of trajectories to reconstruct high-level
semantic instructions. Through extensive experiments on language-conditioned
robotic navigation and manipulation tasks, encompassing BabyAI, LORel, and
CALVIN, we demonstrate the superiority of our method over prior works. Our
approach exhibits enhanced generalization capabilities towards unseen tasks,
improved skill interpretability, and notably higher rates of task completion
success.
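The abstract mentions using vector quantization to learn discrete latent skills. As a minimal sketch of the core operation such methods rely on (not the paper's actual implementation; all names and the toy codebook below are illustrative), a continuous latent is mapped to its nearest codebook entry, yielding a discrete skill index:

```python
import numpy as np

def vq_lookup(z, codebook):
    """Nearest-neighbor vector quantization: map a continuous latent z
    to the closest entry in a learnable codebook of skill embeddings.

    z:        (d,) continuous latent vector
    codebook: (K, d) matrix, one row per discrete skill code
    Returns (skill index, quantized vector).
    """
    dists = np.linalg.norm(codebook - z, axis=1)  # distance to each code
    k = int(np.argmin(dists))                     # discrete skill index
    return k, codebook[k]

# Toy example: 3 skill codes in a 2-D latent space
codebook = np.array([[ 0.0, 0.0],
                     [ 1.0, 1.0],
                     [-1.0, 1.0]])
k, z_q = vq_lookup(np.array([0.9, 1.1]), codebook)
```

In VQ-VAE-style training, gradients are typically passed through the non-differentiable lookup with a straight-through estimator, and the codebook is updated to track the encoder outputs; the sketch above shows only the forward quantization step.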