Sentiment-Aware Automatic Speech Recognition Pre-Training for Enhanced Speech Emotion Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Abstract
We propose a novel multi-task pre-training method for Speech Emotion Recognition (SER). We pre-train the SER model simultaneously on Automatic Speech Recognition (ASR) and sentiment classification tasks to make the acoustic ASR model more "emotion aware". We generate targets for the sentiment classification task using a text-to-sentiment model trained on publicly available data. Finally, we fine-tune the acoustic ASR model on emotion-annotated speech data. We evaluated the proposed approach on the MSP-Podcast dataset, where we achieved the best reported concordance correlation coefficient (CCC) of 0.41 for valence prediction.
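
The abstract reports results as a concordance correlation coefficient (CCC), which measures agreement between predicted and annotated values by penalizing both low correlation and a shift in mean or scale. The following is a minimal sketch of the standard CCC definition, assuming population (biased) variance and covariance; the function name `concordance_cc` is illustrative and is not taken from the paper, and the paper's exact evaluation code may differ.

```python
def concordance_cc(predictions, targets):
    """Concordance correlation coefficient between two equal-length sequences.

    CCC = 2 * cov(p, t) / (var(p) + var(t) + (mean(p) - mean(t)) ** 2)
    Assumption: population (divide-by-n) statistics are used.
    """
    n = len(predictions)
    if n == 0 or n != len(targets):
        raise ValueError("inputs must be non-empty and of equal length")

    mean_p = sum(predictions) / n
    mean_t = sum(targets) / n

    # Population variance of each sequence and their covariance.
    var_p = sum((p - mean_p) ** 2 for p in predictions) / n
    var_t = sum((t - mean_t) ** 2 for t in targets) / n
    cov = sum((p - mean_p) * (t - mean_t)
              for p, t in zip(predictions, targets)) / n

    denom = var_p + var_t + (mean_p - mean_t) ** 2
    if denom == 0:
        # Both sequences are constant and identical; CCC is undefined here.
        raise ValueError("CCC is undefined for identical constant sequences")
    return 2 * cov / denom


if __name__ == "__main__":
    # Toy valence scores, made up for illustration only.
    predicted = [1.0, 2.0, 3.0]
    annotated = [2.0, 3.0, 4.0]
    print(concordance_cc(predicted, annotated))
```

In the toy example the predictions are perfectly correlated with the annotations but offset by a constant, so the CCC falls below 1 even though the Pearson correlation would be exactly 1. This sensitivity to bias is why CCC is commonly preferred for dimensional emotion attributes such as valence.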
Keywords
Speech emotion recognition, automatic speech recognition, sentiment analysis, pre-training