Diffusing in Someone Else's Shoes: Robotic Perspective Taking with Diffusion
arXiv (2024)
Abstract
Humanoid robots can benefit from their similarity to the human shape by
learning from humans. When humans teach other humans how to perform actions,
they often demonstrate the actions, and the learner then tries to imitate
the demonstration. Being able to mentally transfer from a demonstration seen
from a third-person perspective to how it should look from a first-person
perspective is fundamental for this ability in humans. As this is a challenging
task, it is often simplified for robots by creating a demonstration in the
first-person perspective. Creating these demonstrations requires more effort
but makes imitation easier. We introduce a novel diffusion model aimed
at enabling the robot to learn directly from third-person demonstrations.
Our model is capable of learning and generating the first-person perspective
from the third-person perspective by translating the size and rotations of
objects and the environment between the two perspectives. This allows us to utilise
the benefits of easy-to-produce third-person demonstrations and easy-to-imitate
first-person demonstrations. The model can either represent the first-person
perspective as an RGB image or calculate the joint values. Our approach
significantly outperforms other image-to-image models in this task.