CH-MEAD: A Chinese Multimodal Conversational Emotion Analysis Dataset with Fine-Grained Emotion Taxonomy

Yu-Ping Ruan, Shu-Kai Zheng, Jiantao Huang, Xiaoning Zhang, Yulong Liu, Taihao Li

2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2023)

Abstract
Emotion recognition in conversations (ERC) is important for building human-like chatbots. Much research has been devoted to English, but a large-scale multimodal ERC dataset is still missing for the Chinese community. Moreover, emotion states in the dynamic flow of a conversation are typically subtle and complex, yet the emotion taxonomy most commonly adopted for ERC is based on Ekman's six basic emotion categories [1], which cannot cover the diverse set of emotions that arise in multi-turn dialogues, such as anxiety, puzzlement, and so on. In this paper, we present CH-MEAD, a Chinese multimodal conversational emotion analysis dataset containing 25,292 video segments (utterances) labeled with 26 emotion categories (including neutral) and collected from more than 400 Chinese TV series, movies, and shows. To the best of our knowledge, CH-MEAD is the first large-scale Chinese multimodal conversational dataset with a fine-grained emotion taxonomy; it is well suited to studying speakers' subtle and complex emotion states in dialogues and will also promote the development of multimodal ERC in the Chinese research community. Based on the CH-MEAD dataset, we propose a speaker-aware contextual multimodal fusion (SCMF) network and demonstrate its effectiveness over current SOTA models in our experiments.