Hierarchical Classifier Design For Speech Emotion Recognition In The Mixed-Cultural Environment

JOURNAL OF EXPERIMENTAL & THEORETICAL ARTIFICIAL INTELLIGENCE(2021)

引用 5|浏览9
暂无评分
摘要
Recognition of emotion in speech is a difficult task due to many speaker factors like gender, age, and the cultural background (nationality, ethnicity, and region) as well as the acoustical environment. Among these factors, the cultural background of the speaker has a strong influence on the expression of emotion. The reason for the unsatisfactory performance of an emotion recognition engine built using mixed-cultural samples can be traced back to this. To address this issue, a two-level hierarchical engine has been designed to identify emotion from the speech of different cultural backgrounds. The first level of the hierarchical engine is a culture identification system, which identifies the corpus of an input utterance. As most of the speakers involved in the construction of a specific corpus are from the same locality and cultural background, we assume that a corpus represents the cultural background of the speakers of the corpus constructed. Based on the response of the first level classifier, the input utterance is forwarded to an appropriate corpus-specific emotion recognition engine, in the second level. Each corpus-specific emotion recognition system is a discriminative, multiclass SVM classifier, trained with the emotional utterances of that particular corpus. The system has been tested with five different corpora, collected from diverse cultural backgrounds, namely EMO-DB, SAVEE, IITKGP-SEC, Spanish corpus S0329, and CMU's Woogles corpus. The system achieved an accuracy of 82.01% which is an improvement of 13.38% over monolithic approaches.
更多
查看译文
关键词
Speech emotion classification, hierarchical classification system, integrated corpus environment
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要