Dynamic Reasoning for Movie QA: A Character-Centric Approach

IEEE Transactions on Multimedia (2023)

Abstract
Movie story understanding requires modeling and reasoning about characters and their relationships with their surroundings and with each other as the story unfolds. In Movie QA, this poses two challenges: effectively capturing the visual moments relevant to a question in long videos, and efficiently navigating the web of dynamic, contextual, character-centric relationships over time. This paper presents a novel character-centric method that efficiently supports reasoning about relational dynamics for Movie QA. Central to the method is a Time-Evolving Conditional CHaracter-centric graph network (TECH), which models the characters, objects, and their question-conditioned relationships in space-time. TECH first maps the raw video data into a question-focused temporal neural graph over visual entities within and across shots, and then distills the graph into a character-centric network that gives rise to the answer. At the core of this graph reasoning machine, TECH uses a two-stage process to refine the features of movie characters and their relationships, using their interactions with the surroundings as contextual information. TECH draws its efficiency over long videos from a “skim and scan” technique that rapidly localizes the most query-relevant moments in the movie. Tested on three large-scale datasets, TECH shows clear advantages over recent state-of-the-art models.
Keywords
movie question answering, dynamic relational reasoning, character-centric modeling, query-conditioned graph network