Cross-Task Multi-Branch Vision Transformer for Facial Expression and Mask Wearing Classification
arXiv (2024)
Abstract
With wearing masks becoming a new cultural norm, facial expression
recognition (FER) while taking masks into account has become a significant
challenge. In this paper, we propose a unified multi-branch vision transformer
for facial expression recognition and mask wearing classification tasks. Our
approach extracts shared features for both tasks using a dual-branch
architecture that obtains multi-scale feature representations. Furthermore, we
propose a cross-task fusion phase that processes tokens for each task with
separate branches, while exchanging information using a cross attention module.
Because the cross-task fusion phase is simple yet effective, our proposed
framework has lower overall complexity than using a separate network for each
task. Extensive experiments demonstrate that our proposed model performs
better than or on par with various state-of-the-art methods on both the facial
expression recognition and facial mask wearing classification tasks.
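
The cross-task fusion idea, in which each task keeps its own token branch and the two branches exchange information through cross attention, can be illustrated with a minimal sketch. The abstract does not give the module's details, so everything below is an assumption made for illustration: a single attention head, no learned query/key/value projections, a residual connection, and the hypothetical names `cross_attention` and `cross_task_fusion`. It is not the authors' implementation.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, context):
    """Single-head scaled dot-product attention without learned projections.

    queries: tokens of one task branch, a list of d-dimensional vectors.
    context: tokens of the other branch, with the same dimensionality d.
    Returns one updated vector per query token.
    """
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)
    updated = []
    for q in queries:
        # Similarity of this query token to every token of the other branch.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in context]
        weights = softmax(scores)
        # Weighted average of the other branch's tokens.
        mixed = [sum(w * v[i] for w, v in zip(weights, context)) for i in range(dim)]
        # Residual connection (an assumption) keeps the branch's own information.
        updated.append([qi + mi for qi, mi in zip(q, mixed)])
    return updated


def cross_task_fusion(expr_tokens, mask_tokens):
    """Each branch attends to the other; both read the pre-update tokens."""
    new_expr = cross_attention(expr_tokens, mask_tokens)
    new_mask = cross_attention(mask_tokens, expr_tokens)
    return new_expr, new_mask


# Toy input: 5 expression-branch tokens and 3 mask-branch tokens, dimension 8.
random.seed(0)
expr_tokens = [[random.gauss(0, 1) for _ in range(8)] for _ in range(5)]
mask_tokens = [[random.gauss(0, 1) for _ in range(8)] for _ in range(3)]
fused_expr, fused_mask = cross_task_fusion(expr_tokens, mask_tokens)
print(len(fused_expr), len(fused_expr[0]), len(fused_mask), len(fused_mask[0]))
```

In this sketch each branch keeps its own token count and dimensionality after fusion, so the two branches can have different sequence lengths while still exchanging information.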