Top-K Visual Tokens Transformer: Selecting Tokens for Visible-Infrared Person Re-Identification

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)(2023)

Abstract
Visible-infrared person re-identification (VI-ReID) is an important and challenging task. Existing works mainly focus on reducing the modality gap with Convolutional Neural Networks (CNNs). However, features extracted by CNNs may contain identity-irrelevant information, which reduces their discriminative power. To address this issue, this paper introduces a Top-K Visual Tokens Transformer (TVTR) framework, which uses a top-k visual token selection module to select the k most discriminative visual patches, reducing the distraction of identity-irrelevant information and learning discriminative features. Furthermore, a global-local circle loss is developed to optimize TVTR toward cross-modality positive concentration and negative separation. Experimental results on the SYSU-MM01 and RegDB datasets demonstrate the superiority of our method. The source code will be released.
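
The abstract does not say how patches are scored or how k is chosen, so the following is only a minimal sketch of generic top-k token selection, not the TVTR module itself. It assumes each patch token already has a scalar relevance score (for example, the attention a class token pays to that patch). The function name `select_top_k_tokens` and its parameters are hypothetical.

```python
from typing import List, Sequence, Tuple


def select_top_k_tokens(
    tokens: Sequence[Sequence[float]],
    scores: Sequence[float],
    k: int,
) -> Tuple[List[int], List[List[float]]]:
    """Keep the k tokens with the highest scores (hypothetical helper)."""
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    if not 0 < k <= len(tokens):
        raise ValueError("k must be between 1 and the number of tokens")
    # Rank token indices by score, highest first. sorted() is stable,
    # so tied scores keep the lower index first.
    ranked = sorted(range(len(scores)), key=lambda i: -scores[i])
    # Return the kept indices in their original (spatial) order.
    kept = sorted(ranked[:k])
    return kept, [list(tokens[i]) for i in kept]


# Four toy patch embeddings with assumed relevance scores.
patch_tokens = [[0.1, 0.2], [0.9, 0.8], [0.4, 0.3], [0.7, 0.6]]
patch_scores = [0.05, 0.50, 0.10, 0.35]
kept_indices, kept_tokens = select_top_k_tokens(patch_tokens, patch_scores, k=2)
print(kept_indices)  # [1, 3]
```

Only the kept tokens would be passed on to later layers, so low-scoring patches (presumably background or other identity-irrelevant regions) no longer contribute to the final feature.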
Keywords
Person re-identification, cross-modality, visible-infrared, vision transformer