FF-GLAM-cs: a fusion framework based on GLAM with channel shuffle for speech emotion recognition

Jinfeng Wang, Zhishen Zheng, Yong Liang, Jing Qin, Wenzhong Wang

International Journal of Machine Learning and Cybernetics (2024)

Abstract

With the support of artificial intelligence, speech emotion recognition has been integrated into people's daily lives in the form of smart speakers. Some high-accuracy models have poor applicability because of their large size, while the accuracy of lightweight models is unsatisfactory. In this article, a fusion framework based on GLobal-Aware Multiscale with channel shuffle (FF-GLAM-cs) is proposed, which fuses multiple lightweight models to achieve both a small size and high accuracy. Channel shuffle is added to address the computational duplication caused by multiscale convolution. In addition, a fuzzy integral fusion method is adopted that describes the interactions among classifiers, which traditional fusion methods ignore. The impact of different combinations of classifiers is analyzed. In experiments, the performance of the new model and the effect of fusion are verified by applying the model to four speech emotion datasets. The model is validated and analyzed with parameter and ablation testing, kernel model comparisons and fusion verification. The results show that FF-GLAM-cs is superior to state-of-the-art methods in terms of accuracy and efficiency. In particular, the fusion module yields a marked improvement in accuracy. The source code of this work is available at https://github.com/zhishen33/GLAM-cs-and-FI-for-SER.git.
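
The two mechanisms named in the abstract can be illustrated with a minimal sketch. It is not taken from the authors' repository: the function names `channel_shuffle`, `choquet_integral` and `additive_measure` are hypothetical, the shuffle follows the generic reshape-transpose-flatten pattern, and the Choquet integral is assumed here as one common form of fuzzy integral, since the abstract does not say which form the paper uses.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups (generic channel shuffle).

    Equivalent to reshaping the channel axis to (groups, per_group),
    transposing to (per_group, groups), and flattening again.
    """
    count = len(channels)
    if count % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = count // groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


def choquet_integral(scores, measure):
    """Fuse per-classifier scores for one class with a Choquet integral.

    scores  -- list of confidences, one per classifier
    measure -- callable mapping a frozenset of classifier indices to a
               fuzzy-measure value in [0, 1]; measure(all indices) == 1
    """
    # Visit classifiers from the most to the least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    total = 0.0
    coalition = set()
    for rank, idx in enumerate(order):
        coalition.add(idx)
        # Next-lower score, or 0 after the last classifier.
        nxt = scores[order[rank + 1]] if rank + 1 < len(order) else 0.0
        total += (scores[idx] - nxt) * measure(frozenset(coalition))
    return total


# Hypothetical additive measure: with no interaction between classifiers,
# the Choquet integral reduces to a plain weighted average.
WEIGHTS = [0.5, 0.3, 0.2]


def additive_measure(subset):
    return sum(WEIGHTS[i] for i in subset)


shuffled = channel_shuffle([0, 1, 2, 3, 4, 5], groups=2)
fused = choquet_integral([0.6, 0.9, 0.3], additive_measure)
```

With a non-additive measure, in which the value of a coalition differs from the sum of its members' values, the same function weights classifiers according to how they interact, which is the property the abstract attributes to fuzzy integral fusion.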

Keywords

Speech emotion recognition, Multiscale features, Fuzzy integral, Deep learning
