Robust Light-Weight Facial Affective Behavior Recognition with CLIP
arXiv (2024)
Abstract
Human affective behavior analysis aims to delve into human expressions and
behaviors to deepen our understanding of human emotions. Basic expression
categories (EXPR) and Action Units (AUs) are two essential components in this
analysis, which categorize emotions and break down facial movements into
elemental units, respectively. Despite advancements, existing approaches in
expression classification and AU detection often necessitate complex models and
substantial computational resources, limiting their applicability in everyday
settings. In this work, we introduce the first lightweight framework adept at
efficiently tackling both expression classification and AU detection. This
framework employs a frozen CLIP image encoder alongside a trainable multilayer
perceptron (MLP), enhanced with Conditional Value at Risk (CVaR) for robustness
and a loss landscape flattening strategy for improved generalization.
Experimental results on the Aff-Wild2 dataset demonstrate superior performance
in comparison to the baseline while maintaining minimal computational demands,
offering a practical solution for affective behavior analysis. The code is
available at https://github.com/Purdue-M2/Affective_Behavior_Analysis_M2_PURDUE
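
To make the CVaR idea mentioned above concrete, here is a minimal sketch of one common empirical estimate: the mean of the worst fraction of per-sample losses. It is an illustration under stated assumptions, not the authors' implementation. The function name `cvar`, the parameter name `alpha`, and the choice to round the tail size up are all hypothetical; the abstract does not specify how CVaR is computed or applied in training.

```python
import math


def cvar(losses, alpha):
    """Empirical CVaR: mean of the worst ceil(alpha * n) losses.

    `alpha` is the tail fraction in (0, 1]. With alpha = 1 this is the
    ordinary mean; smaller alpha focuses on the hardest samples.
    The name and rounding rule are assumptions made for illustration.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    k = math.ceil(alpha * len(losses))          # size of the worst-case tail
    worst = sorted(losses, reverse=True)[:k]    # k largest per-sample losses
    return sum(worst) / k


# Hypothetical per-sample losses from one mini-batch.
per_sample_losses = [1.0, 2.0, 3.0, 4.0]
print(cvar(per_sample_losses, 0.5))  # average of the two largest losses
```

Minimizing such a tail average instead of the plain mean puts more weight on hard examples, which is the usual motivation for using CVaR as a robustness objective.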