GM-TCNet: Gated Multi-scale Temporal Convolutional Network using Emotion Causality for Speech Emotion Recognition

Speech Communication (2022)

• This paper proposes GM-TCNet, a novel network architecture for Speech Emotion Recognition (SER) based on dilated causal convolutions and a gating mechanism.
• A novel emotional causality representation learning component is designed to capture the dynamics of emotion across the time domain and to better model speech emotions at the frame level. It is also well suited to building reliable long-term sentimental dependencies. To the best of our knowledge, this is the first attempt at applying a causality learning method to SER.
• GM-TCNet uses skip connections across all Gated Convolution Blocks. This gives the network a multi-scale temporal receptive field, which improves its generalization ability. Moreover, a new dilation rate distribution across blocks is designed to obtain a larger receptive field that better fits SER applications.
• The proposed GM-TCNet approach achieves state-of-the-art results on four widely studied datasets compared with other advanced approaches.
Speech Emotion Recognition, Temporal Convolution Network, Emotion Causality, Multi-Scale, Gating Mechanism
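
The highlights name three ingredients: dilated causal convolutions, a gating mechanism, and skip connections across all gated blocks. The following is a minimal pure-Python sketch of how these pieces can fit together, not the authors' implementation. Several things here are assumptions made for illustration: the gate is taken to be the common tanh/sigmoid form, the skip connections are modeled as a plain sum of block outputs, and the dilation rates (1, 2, 4, 8) and all function names are hypothetical. The highlights do not state the paper's actual dilation rate distribution.

```python
import math


def causal_dilated_conv(x, kernel, dilation):
    """1-D causal dilated convolution over a list of floats.

    y[t] = sum_k kernel[k] * x[t - k * dilation], treating x as zero
    before t = 0, so y[t] never depends on future samples.
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        y.append(acc)
    return y


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gated_block(x, filter_kernel, gate_kernel, dilation):
    """Assumed gating form: tanh(filter branch) * sigmoid(gate branch)."""
    f = causal_dilated_conv(x, filter_kernel, dilation)
    g = causal_dilated_conv(x, gate_kernel, dilation)
    return [math.tanh(a) * sigmoid(b) for a, b in zip(f, g)]


def multi_scale_output(x, kernels, dilations):
    """Stack gated blocks and sum every block's output (skip connections).

    Each block sees a different temporal span, so the sum mixes
    features from several receptive-field sizes.
    """
    skip_sum = [0.0] * len(x)
    h = x
    for (fk, gk), d in zip(kernels, dilations):
        h = gated_block(h, fk, gk, d)
        skip_sum = [s + v for s, v in zip(skip_sum, h)]
    return skip_sum


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked causal convs, one conv per block."""
    return 1 + (kernel_size - 1) * sum(dilations)


# Hypothetical configuration: kernel size 2, dilations doubling per block.
dilations = [1, 2, 4, 8]
kernels = [([0.5, 0.5], [1.0, -1.0])] * len(dilations)
frames = [0.5, -1.0, 2.0, 0.0, 1.0, 3.0]
output = multi_scale_output(frames, kernels, dilations)
```

With kernel size 2, the four hypothetical dilations give a receptive field of 1 + (1 + 2 + 4 + 8) = 16 frames at the deepest block, while shallower blocks contribute shorter spans to the summed output. That mix of spans is one way to read the "multi-scale temporal receptive field" described in the highlights.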