A cascaded spatiotemporal attention network for dynamic facial expression recognition

Yaoguang Ye, Yongqi Pan,Yan Liang,Jiahui Pan

Applied Intelligence(2022)

引用 1|浏览5
暂无评分
摘要
Dynamic facial expression recognition (DFER) is a promising research area because it concerns the dynamic change pattern of facial expressions, but it is difficult to effectively capture the facial appearances and dynamic temporal information of each image in an image sequence. In this paper, a cascaded spatiotemporal attention network (CSTAN) is proposed to learn and integrate spatial and temporal emotional information in the process of facial expression change. Three types of attention modules are embedded into the cascaded network to enable it to extract more informative spatiotemporal features for the DFER task in different dimensions. A channel attention module helps the network focus on the meaningful spatial feature maps for the DFER task, a spatial attention module focuses on the regions of interest among the spatial feature maps, and a temporal attention module aims to explore the dynamic temporal information when an expression changes. The experimental results on three public facial expression recognition datasets prove the good performance of the CSTAN, and it can extract representative spatiotemporal features. Meanwhile, the visualization results reveal that the CSTAN can locate regions of interest and contributing timesteps, which illustrates the effectiveness of the multidimensional attention modules.
更多
查看译文
关键词
Dynamic facial expression recognition, Spatiotemporal features, Cascaded network, Attention module
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要