Counterfactual Scenario-relevant Knowledge-enriched Multi-modal Emotion Reasoning

ACM Transactions on Multimedia Computing, Communications, and Applications(2023)

引用 0|浏览165
暂无评分
摘要
Multi-modal video emotion reasoning (MERV) has recently attracted increasing attention due to its potential application in human-computer interaction. This task needs to not only recognize utterance-level emotions for conspicuous speakers, but also perceive the emotions of non-speakers in videos. Existing methods focus on modeling multi-modal multi-level contexts to capture emotion-relevant clues from the complex scenarios in videos. However, the context information is far from enough to infer the emotion labels of non-speakers due to the large gap between the scenario situation and emotions labels. Inspired by the observation that humans can find solutions to complex problems with the leverage of experience and knowledge, we propose SK-MER , a Scenario-relevant Knowledge-enhanced Multi-modal Emotion Reasoning framework for MERV task, which can leverage external knowledge to enhance the video scenario understanding and emotion reasoning. Specifically, we use scenario concepts extracted from videos to build knowledge subgraphs from external knowledge bases. The knowledge subgraphs are then utilized to obtain scenario-relevant knowledge representations through dynamic knowledge graph attention. Next, we incorporate the knowledge representations into context modeling to enhance emotion reasoning with external scenario-relevant knowledge. In addition, we propose a counterfactual knowledge representation learning approach to obtain more effective scenario-relevant knowledge representations. Extensive experimental results on MEmoR dataset show that the proposed SK-MER framework achieves new state-of-the-art results.
更多
查看译文
关键词
Neural networks, emotion reasoning, knowledge enhancement, counterfactual
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要