Transfer Subspace Learning For Unsupervised Cross-Corpus Speech Emotion Recognition

IEEE Access (2021)

Abstract
In many practical applications, a speech emotion recognition model learned on a source (training) domain degrades significantly when applied to a novel target (testing) domain because of the mismatch between the two domains. Aiming at learning a better model for the target domain, this paper investigates the problem of unsupervised cross-corpus speech emotion recognition (SER), in which the training and testing speech signals come from two different speech emotion corpora. The training speech signals are labeled, while the labels of the testing speech signals are entirely unknown. To deal with this problem, we propose a simple yet effective method called transfer subspace learning (TRaSL). TRaSL learns a projection matrix that transforms the source and target speech signals from the original feature space to the label space, where the transformed source and target signals share similar feature distributions. Consequently, a classifier learned on the labeled source speech signals can effectively predict the emotional states of the unlabeled target speech signals. To evaluate the proposed TRaSL method, we carry out extensive cross-corpus SER experiments on four speech emotion corpora: IEMOCAP, EmoDB, eNTERFACE, and AFEW 4.0. Compared with recent state-of-the-art cross-corpus SER methods, TRaSL achieves more satisfactory overall results.
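The abstract only outlines the idea at a high level, so the following is a minimal NumPy sketch of that idea, not the authors' published formulation: a ridge regression of labeled source features onto the label space, regularized by a simple mean-discrepancy term between projected source and target data. The objective, its closed-form solution, and the parameter names (lam, mu) are assumptions introduced here purely for illustration.

```python
# Sketch of a label-space projection with distribution alignment, in the spirit
# of TRaSL as described in the abstract. The specific objective below is an
# assumption, not the paper's method:
#   min_P ||P Xs - Ys||_F^2 + lam ||P||_F^2 + mu ||P (mean(Xs) - mean(Xt))||_2^2
import numpy as np

def learn_projection(Xs, Ys, Xt, lam=1.0, mu=1.0):
    """Xs: (d, ns) labeled source features; Ys: (c, ns) one-hot source labels;
    Xt: (d, nt) unlabeled target features. Returns P of shape (c, d)."""
    d_feat = Xs.shape[0]
    # Mean difference between source and target features (crude alignment term).
    diff = Xs.mean(axis=1, keepdims=True) - Xt.mean(axis=1, keepdims=True)  # (d, 1)
    # Closed-form solution: P = Ys Xs^T (Xs Xs^T + lam I + mu diff diff^T)^{-1}
    A = Xs @ Xs.T + lam * np.eye(d_feat) + mu * (diff @ diff.T)
    P = np.linalg.solve(A, (Ys @ Xs.T).T).T  # A is symmetric, so this is B A^{-1}
    return P

# Usage: project both corpora into the label space, then train any classifier
# (e.g., a linear SVM) on P @ Xs and apply it to P @ Xt.
```

Both corpora are projected by the same matrix, so a classifier fit on the projected labeled source data can be applied directly to the projected unlabeled target data; the paper's actual distribution-matching criterion may differ from the mean-difference penalty assumed above.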
Keywords
Speech recognition, Training, Testing, Emotion recognition, Support vector machines, Feature extraction, Transforms, Cross-corpus speech emotion recognition, subspace learning, transfer learning, domain adaptation