Learning Concept-Based Causal Transition and Symbolic Reasoning for Visual Planning
arxiv(2023)
摘要
Visual planning simulates how humans make decisions to achieve desired goals
in the form of searching for visual causal transitions between an initial
visual state and a final visual goal state. It has become increasingly
important in egocentric vision with its advantages in guiding agents to perform
daily tasks in complex environments. In this paper, we propose an interpretable
and generalizable visual planning framework consisting of i) a novel
Substitution-based Concept Learner (SCL) that abstracts visual inputs into
disentangled concept representations, ii) symbol abstraction and reasoning that
performs task planning via the self-learned symbols, and iii) a Visual Causal
Transition model (ViCT) that grounds visual causal transitions to semantically
similar real-world actions. Given an initial state, we perform goal-conditioned
visual planning with a symbolic reasoning method fueled by the learned
representations and causal transitions to reach the goal state. To verify the
effectiveness of the proposed model, we collect a large-scale visual planning
dataset based on AI2-THOR, dubbed as CCTP. Extensive experiments on this
challenging dataset demonstrate the superior performance of our method in
visual task planning. Empirically, we show that our framework can generalize to
unseen task trajectories, unseen object categories, and real-world data.
Further details of this work are provided at
https://fqyqc.github.io/ConTranPlan/.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要