Person Retrieval with Conv-Transformer.

ICME 2021

Abstract
Part-level features obtained by uniform partitioning have attracted much attention in person re-identification. However, standard uniform part partitions may lead to within-part inconsistency across different samples, as shown in Figure 1. Attention mechanisms (e.g., refined part pooling) have been proposed to refine part division with enhanced consistency. Unfortunately, such mechanisms adopt single-headed convolutional structures and therefore fail to fuse fine-grained part information. Besides, convolution-based schemes can maintain local positional information but cannot effectively preserve relative positions between parts. This paper proposes a new CNN-Transformer hybrid architecture called Person Retrieval with Conv-Transformer (PRCT). We integrate the multi-head self-attention and positional embedding modules, which are the core ingredients of the non-convolutional Transformer, with a CNN-based part-feature extractor to maintain more precise within-part consistency during feature aggregation. With PRCT, we can effectively eliminate part misalignments when matching different samples. We conduct extensive evaluations on the MSMT17, DukeMTMC-ReID, and Market-1501 datasets and obtain state-of-the-art performance.
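The abstract combines two standard ingredients, multi-head self-attention and positional embeddings, applied over part-level features. The sketch below is only an illustration of that general idea, not the paper's PRCT implementation: it treats each of P uniformly partitioned part vectors as a token, adds a stand-in positional embedding, and runs minimal multi-head scaled dot-product attention (with identity projections for brevity) in pure Python. All names and dimensions here are hypothetical.

```python
import math
import random

def matmul(A, B):
    # naive matrix multiply: A is m*k, B is k*n
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def softmax(row):
    # numerically stable softmax over one score row
    m = max(row)
    e = [math.exp(x - m) for x in row]
    s = sum(e)
    return [x / s for x in e]

def self_attention(X):
    # single-head scaled dot-product attention over part features X (P x d);
    # query/key/value projections are omitted (identity) to keep the sketch short
    d = len(X[0])
    scores = [[sum(X[i][t] * X[j][t] for t in range(d)) / math.sqrt(d)
               for j in range(len(X))] for i in range(len(X))]
    A = [softmax(r) for r in scores]      # attention weights, each row sums to 1
    return matmul(A, X)                   # attended part features, P x d

def multi_head(X, heads):
    # split the feature dimension into `heads` chunks, attend per head, concatenate
    d = len(X[0])
    assert d % heads == 0
    step = d // heads
    outs = [self_attention([row[h * step:(h + 1) * step] for row in X])
            for h in range(heads)]
    return [[v for h in range(heads) for v in outs[h][i]] for i in range(len(X))]

# toy usage: P = 6 part vectors of dimension d = 4 (e.g., from a CNN part extractor),
# with a simple stand-in for learned positional embeddings
random.seed(0)
P, d = 6, 4
parts = [[random.gauss(0, 1) for _ in range(d)] for _ in range(P)]
pos = [[0.1 * p] * d for p in range(P)]   # placeholder positional embedding per part
X = [[a + b for a, b in zip(pr, pe)] for pr, pe in zip(parts, pos)]
out = multi_head(X, heads=2)
print(len(out), len(out[0]))              # 6 4
```

The positional embedding is what lets attention distinguish, say, a head part from a leg part even after the content is mixed across parts; in the actual PRCT these embeddings would be learned jointly with the network.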
Keywords
ReID, Attention, Transformer