ZS-VAT: Learning Unbiased Attribute Knowledge for Zero-Shot Recognition Through Visual Attribute Transformer.

IEEE Transactions on Neural Networks and Learning Systems (2024)

Abstract
In zero-shot learning (ZSL), attribute knowledge plays a vital role in transferring knowledge from seen classes to unseen classes. However, most existing ZSL methods learn biased attribute knowledge, which usually results in biased attribute prediction and a decline in zero-shot recognition performance. To solve this problem and learn unbiased attribute knowledge, we propose a visual attribute Transformer for zero-shot recognition (ZS-VAT), which is an effective and interpretable Transformer designed specifically for ZSL. In ZS-VAT, we design an attribute-head self-attention (AHSA) that is capable of learning unbiased attribute knowledge. Specifically, each attribute head in AHSA first transforms the local features into attribute-reinforced features and then accumulates the attribute knowledge from all corresponding reinforced features, reducing the mutual influence between attributes and avoiding information loss. AHSA finally preserves unbiased attribute knowledge through attribute embeddings. We also propose an attribute fusion model (AFM) that learns to recover the correct category knowledge from the attribute knowledge. In particular, AFM takes all features from AHSA as input and generates global embeddings. We carried out experiments to demonstrate that the attribute knowledge from AHSA and the category knowledge from AFM are able to assist each other. During the final semantic prediction, we combine the attribute embedding prediction (AEP) and global embedding prediction (GEP). We evaluated the proposed scheme on three benchmark datasets. ZS-VAT outperformed the state-of-the-art generalized ZSL (GZSL) methods on two datasets and achieved competitive results on the other dataset.
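The abstract does not give the exact formulation of AHSA, but the core idea — one attention head per attribute, each pooling local features into its own attribute embedding so that attributes do not interfere through a single shared attention map — can be sketched as follows. The per-attribute learnable query vectors and the scaled dot-product scoring are assumptions for illustration, not the paper's actual design.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attribute_head_attention(local_feats, attr_queries):
    """Per-attribute attention pooling (hypothetical sketch of AHSA).

    local_feats:  (R, D) local/region features from a visual backbone
    attr_queries: (A, D) one learnable query per attribute (assumed)
    returns:      (A, D) attribute embeddings, one per attribute
    """
    d = local_feats.shape[1]
    # (A, R): each attribute scores every local feature independently,
    # so attention for one attribute does not mix with another's.
    scores = attr_queries @ local_feats.T / np.sqrt(d)
    attn = softmax(scores, axis=-1)
    # Accumulate attribute-reinforced features into attribute embeddings.
    return attn @ local_feats

# Example with CUB-like shapes: 49 regions, 64-dim features, 312 attributes.
rng = np.random.default_rng(0)
feats = rng.standard_normal((49, 64))
queries = rng.standard_normal((312, 64))
attr_emb = attribute_head_attention(feats, queries)
```

In a full model along these lines, the attribute embeddings would feed the attribute embedding prediction (AEP) branch, while a fusion module over all head outputs (the role AFM plays) would produce the global embeddings for GEP.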
Keywords
Attribute, Transformer, unbiased attribute knowledge, zero-shot learning (ZSL)