Real-world Person Re-Identification via Degradation Invariance Learning

CVPR, pp. 14072-14082, 2020.

Keywords: Natural Science Foundation of China; low-resolution person; mean Average Precision; Degradation Decomposition Generative Adversarial Network; degradation invariance

Abstract:

Person re-identification (Re-ID) in real-world scenarios usually suffers from various degradation factors, e.g., low resolution, weak illumination, blurring and adverse weather. On the one hand, these degradations lead to severe discriminative information loss, which significantly obstructs identity representation learning; on the other...

Introduction
  • Person re-identification (Re-ID) is a pedestrian retrieval task for non-overlapping camera networks.
  • It is very challenging since the same identity captured by different cameras usually has significant variations in human pose, view, illumination conditions, resolution and so on.
  • On the one hand, these degradations lead to poor visual appearance and discriminative information loss, making representation learning more difficult; on the other hand, they cause feature mismatch and greatly reduce retrieval performance
Highlights
  • Person re-identification (Re-ID) is a pedestrian retrieval task for non-overlapping camera networks
  • There are still some practical issues that need to be solved for real-world surveillance scenarios, and low quality images caused by various degradation factors are one of them
  • We propose a Degradation-Invariant representation learning framework for real-world person Re-ID, named DI-REID
  • We propose a degradation invariance learning framework to extract robust identity representations for real-world person Re-ID
  • We aim to propose a general degradation invariance learning network against various real-world degradations with limited supervision
  • To evaluate our approach on person Re-ID task against various real-world degradations, we focus on two major degradation factors, i.e., resolution and illumination
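The two degradation factors highlighted above, resolution and illumination, can be simulated on an image array for intuition. The sketch below is an illustrative toy, not the authors' actual data pipeline; the helper names and parameters (e.g., the downsampling factor and gamma value) are assumptions:

```python
import numpy as np

def degrade_resolution(img: np.ndarray, factor: int = 4) -> np.ndarray:
    """Nearest-neighbour downsample, then upsample back to the original size,
    discarding high-frequency detail (a crude low-resolution degradation)."""
    h, w = img.shape[:2]
    small = img[::factor, ::factor]                       # downsample
    rows = np.repeat(np.arange(small.shape[0]), factor)[:h]
    cols = np.repeat(np.arange(small.shape[1]), factor)[:w]
    return small[np.ix_(rows, cols)]                      # upsample

def degrade_illumination(img: np.ndarray, gamma: float = 3.0) -> np.ndarray:
    """Darken an image with values in [0, 1] via a gamma curve (gamma > 1 darkens)."""
    return np.clip(img, 0.0, 1.0) ** gamma

img = np.random.rand(128, 64)        # a toy 128x64 grayscale "pedestrian" crop
lr = degrade_resolution(img)
dark = degrade_illumination(img)
assert lr.shape == img.shape         # same size, but detail is lost
assert dark.mean() < img.mean()      # darker on average
```

Self-supervised schemes along these lines generate clean/degraded training pairs without extra identity labels, which matches the paper's goal of limited supervision.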
Methods
  • Note that the approach outperforms the best competitor [28] by 8.4% at rank-1 on the only real-world cross-resolution dataset, CAVIAR.
  • This demonstrates the effectiveness of the approach against real-world resolution degradation.
  • To demonstrate that DI-REID is capable of dealing with various real-world degradations, an extended evaluation on the real-world MSMT17 dataset is performed for cross-illumination Re-ID.
  • It is worth mentioning that the authors only use the illumination degradation prior without introducing extra structural or semantic priors of human body parts
Results
  • The authors use low-resolution images as the main demonstration, and experiments show that the approach is able to achieve state-of-the-art performance on several Re-ID benchmarks.
  • The authors introduce a new direction to improve the performance of person re-identification affected by various image degradations in real-world scenarios
Conclusion
  • The authors propose a degradation-invariance feature learning framework for real-world person Re-ID.
  • With the capability of disentangled representation and self-supervised learning, the method is able to capture and remove real-world degradation factors without extra labeled data.
  • The authors consider integrating other semi-supervised feature representation methods, e.g., graph embedding [53], to better extract pedestrian features from noisy real-world data
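The disentangling idea in the conclusion can be made concrete with a minimal sketch: an embedding is split into an identity factor and a degradation factor, and the identity factor is pushed to match across a clean/degraded pair of the same person. The function names, the split point, and the L2 loss form are assumptions for illustration, not the paper's exact architecture or losses:

```python
import numpy as np

def split_embedding(z: np.ndarray, id_dim: int = 128):
    """Split a D-dim embedding into (identity, degradation) factors."""
    return z[:id_dim], z[id_dim:]

def invariance_loss(z_clean: np.ndarray, z_degraded: np.ndarray) -> float:
    """L2 distance between the identity factors of a clean/degraded pair;
    minimizing it encourages degradation-invariant identity features."""
    id_a, _ = split_embedding(z_clean)
    id_b, _ = split_embedding(z_degraded)
    return float(np.sum((id_a - id_b) ** 2))

rng = np.random.default_rng(0)
z = rng.standard_normal(256)
z_pair = z.copy()
z_pair[128:] += 1.0                       # same identity, shifted degradation factor
assert invariance_loss(z, z_pair) == 0.0  # identity factor untouched -> zero loss
```

A perfectly disentangled encoder would route all degradation information into the second factor, so the identity factor stays constant across degradations, exactly as in the toy pair above.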
Tables
  • Table1: Cross-resolution Re-ID performance (%) compared to the state-of-the-art methods on the MLR-CUHK03, MLR-VIPeR and CAVIAR datasets, respectively
  • Table2: Cross-illumination Re-ID performance (%) compared to the state-of-the-art methods on the MSMT17 dataset
  • Table3: Ablation Study on the CAVIAR dataset
Related work
  • Since our work is related to feature representation learning and GANs, we first briefly summarize these two lines of work.

    2.1. Feature Representation Learning

    Person re-identification, including image-based Re-ID [24, 57] and video-based Re-ID [54, 31], is a very challenging task due to dramatic variations of human pose, camera view, occlusion, illumination, resolution and so on. An important objective of Re-ID is to learn identity representations, which are robust enough for the interference factors mentioned above. These interference factors can be roughly divided into high-level variations and low-level variations.

    Feature learning against high-level variations. Such variations include pose, view, occlusion, etc. Since these variations tend to be spatially sensitive, one typical solution is to leverage local features, e.g., pre-defined regional partition [39, 42, 43, 34], multi-scale feature fusion [33, 4, 60], attention-based models [2, 3, 25, 35, 56] and semantic parts extraction [11, 39, 22, 55]. These methods usually require auxiliary tasks, such as pose estimation or human parsing. This research line has been fully explored and will not be discussed in detail here. In this work, we focus on the low-level variation problem.
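The pre-defined regional partition idea cited above (PCB-style [43]) can be sketched in a few lines: split a backbone feature map into horizontal stripes and average-pool each, yielding one part-level descriptor per stripe. Shapes and names here are illustrative assumptions, not the cited paper's exact implementation:

```python
import numpy as np

def part_pooled_features(feat: np.ndarray, parts: int = 6) -> np.ndarray:
    """feat: (C, H, W) feature map -> (parts, C) stripe descriptors,
    one averaged descriptor per horizontal band of the map."""
    c, h, w = feat.shape
    stripes = np.array_split(np.arange(h), parts)   # horizontal bands of rows
    return np.stack([feat[:, idx, :].mean(axis=(1, 2)) for idx in stripes])

feat = np.random.rand(256, 24, 8)   # a toy ResNet-style feature map
desc = part_pooled_features(feat)
assert desc.shape == (6, 256)       # six part-level descriptors
```

Because each stripe descriptor covers a fixed vertical region, such features are sensitive to body alignment, which is why these methods often lean on auxiliary pose or parsing tasks.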
Funding
  • This work was supported by the National Key R&D Program of China under Grant 2017YFB1300201 and 2017YFB1002203, the National Natural Science Foundation of China (NSFC) under Grants 61622211, U19B2038, 61901433, 61620106009 and 61732007 as well as the Fundamental Research Funds for the Central Universities under Grant WK2100100030
Reference
  • Slawomir Bak, Peter Carr, and Jean-Francois Lalonde. Domain adaptation through synthesis for unsupervised person re-identification. In Proceedings of the European Conference on Computer Vision (ECCV), pages 189–205, 2018.
  • Binghui Chen, Weihong Deng, and Jiani Hu. Mixed highorder attention network for person re-identification. In The IEEE International Conference on Computer Vision (ICCV), October 2019.
  • Tianlong Chen, Shaojin Ding, Jingyi Xie, Ye Yuan, Wuyang Chen, Yang Yang, Zhou Ren, and Zhangyang Wang. Abdnet: Attentive but diverse person re-identification. In The IEEE International Conference on Computer Vision (ICCV), October 2019.
  • Yanbei Chen, Xiatian Zhu, and Shaogang Gong. Person reidentification by deep learning multi-scale representations. In Proceedings of the IEEE International Conference on Computer Vision Workshops, pages 2590–2600, 2017.
  • Yun-Chun Chen, Yu-Jhe Li, Xiaofei Du, and YuChiang Frank Wang. Learning resolution-invariant deep representations for person re-identification. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 8215–8222, 2019.
Dong Seon Cheng, Marco Cristani, Michele Stoppa, Loris Bazzani, and Vittorio Murino. Custom pictorial structures for re-identification. In BMVC, volume 1, page 6, 2011.
  • Weijian Deng, Liang Zheng, Qixiang Ye, Guoliang Kang, Yi Yang, and Jianbin Jiao. Image-image domain adaptation with preserved self-similarity and domain-dissimilarity for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 994–1003, 2018.
  • Yixiao Ge, Zhuowan Li, Haiyu Zhao, Guojun Yin, Shuai Yi, Xiaogang Wang, et al. Fd-gan: Pose-guided feature distilling gan for robust person re-identification. In Advances in Neural Information Processing Systems, pages 1222–1233, 2018.
  • Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In Advances in neural information processing systems, pages 2672–2680, 2014.
  • Douglas Gray and Hai Tao. Viewpoint invariant pedestrian recognition with an ensemble of localized features. In European conference on computer vision, pages 262–275.
  • Jianyuan Guo, Yuhui Yuan, Lang Huang, Chao Zhang, JinGe Yao, and Kai Han. Beyond human parts: Dual partaligned representations for person re-identification. In The IEEE International Conference on Computer Vision (ICCV), October 2019.
  • Xiaojie Guo, Yu Li, and Haibin Ling. Lime: Low-light image enhancement via illumination map estimation. IEEE Transactions on image processing, 26(2):982–993, 2016.
  • Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.
  • Alexander Hermans, Lucas Beyer, and Bastian Leibe. In defense of the triplet loss for person re-identification. arXiv preprint arXiv:1703.07737, 2017.
  • Ruibing Hou, Bingpeng Ma, Hong Chang, Xinqian Gu, Shiguang Shan, and Xilin Chen. Interaction-and-aggregation network for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 9317–9326, 2019.
  • Ruibing Hou, Bingpeng Ma, Hong Chang, Xinqian Gu, Shiguang Shan, and Xilin Chen. Vrstc: Occlusion-free video person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7183–7192, 2019.
  • Xun Huang and Serge Belongie. Arbitrary style transfer in real-time with adaptive instance normalization. In ICCV, 2017.
  • Yukun Huang, Zheng-Jun Zha, Xueyang Fu, and Wei Zhang. Illumination-invariant person re-identification. In Proceedings of the 27th ACM International Conference on Multimedia, MM ’19, pages 365–373, New York, NY, USA, 2019. ACM.
  • Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A Efros. Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1125–1134, 2017.
  • Jiening Jiao, Wei-Shi Zheng, Ancong Wu, Xiatian Zhu, and Shaogang Gong. Deep low-resolution person reidentification. In Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
  • Xiao-Yuan Jing, Xiaoke Zhu, Fei Wu, Xinge You, Qinglong Liu, Dong Yue, Ruimin Hu, and Baowen Xu. Superresolution person re-identification with semi-coupled lowrank discriminant dictionary learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 695–704, 2015.
  • Mahdi M Kalayeh, Emrah Basaran, Muhittin Gokmen, Mustafa E Kamasak, and Mubarak Shah. Human semantic parsing for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1062–1071, 2018.
  • Igor Kviatkovsky, Amit Adam, and Ehud Rivlin. Color invariants for person reidentification. IEEE Transactions on pattern analysis and machine intelligence, 35(7):1622–1634, 2012.
  • Wei Li, Rui Zhao, Tong Xiao, and Xiaogang Wang. Deepreid: Deep filter pairing neural network for person reidentification. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 152–159, 2014.
  • Wei Li, Xiatian Zhu, and Shaogang Gong. Harmonious attention network for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2285–2294, 2018.
  • Xiang Li, Ancong Wu, and Wei-Shi Zheng. Adversarial open-world person re-identification. In Proceedings of the
  • Xiang Li, Wei-Shi Zheng, Xiaojuan Wang, Tao Xiang, and Shaogang Gong. Multi-scale learning for low-resolution person re-identification. In Proceedings of the IEEE International Conference on Computer Vision, pages 3765–3773, 2015.
  • Yu-Jhe Li, Yun-Chun Chen, Yen-Yu Lin, Xiaofei Du, and Yu-Chiang Frank Wang. Recover and identify: A generative dual model for cross-resolution person re-identification. In The IEEE International Conference on Computer Vision (ICCV), October 2019.
  • Kevin Lin, Dianqi Li, Xiaodong He, Zhengyou Zhang, and Ming-Ting Sun. Adversarial ranking for language generation. In Advances in Neural Information Processing Systems, pages 3155–3165, 2017.
  • Jiawei Liu, Zheng-Jun Zha, Di Chen, Richang Hong, and Meng Wang. Adaptive transfer network for cross-domain person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7202–7211, 2019.
  • [31] Jiawei Liu, Zheng-Jun Zha, Xuejin Chen, Zilei Wang, and Multimedia Computing, Communications, and Applications (TOMM), 15(1s):1–19, 2019.
  • [32] Jiawei Liu, Zheng-Jun Zha, Richang Hong, Meng Wang, and Yongdong Zhang. Deep adversarial graph attention convolution network for text-based person search. In Proceedings of the 27th ACM International Conference on Multimedia, pages 665–673, 2019.
  • [33] Jiawei Liu, Zheng-Jun Zha, QI Tian, Dong Liu, Ting Yao, Qiang Ling, and Tao Mei. Multi-scale triplet cnn for person re-identification. In Proceedings of the 24th ACM international conference on Multimedia, pages 192–196, 2016.
  • [34] Jiawei Liu, Zheng-Jun Zha, Hongtao Xie, Zhiwei Xiong, and Yongdong Zhang. Ca3net: Contextual-attentional attributeappearance network for person re-identification. In Proceedings of the 26th ACM international conference on Multimedia, pages 737–745, 2018.
  • [35] Xihui Liu, Haiyu Zhao, Maoqing Tian, Lu Sheng, Jing Shao, Shuai Yi, Junjie Yan, and Xiaogang Wang. Hydraplus-net: Attentive deep features for pedestrian analysis. In Proceedings of the IEEE international conference on computer vision, pages 350–359, 2017.
  • [36] Hao Luo, Youzhi Gu, Xingyu Liao, Shenqi Lai, and Wei Jiang. Bag of tricks and a strong baseline for deep person reidentification. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, June 2019.
  • [37] Liqian Ma, Qianru Sun, Stamatios Georgoulis, Luc Van Gool, Bernt Schiele, and Mario Fritz. Disentangled person image generation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages
  • [38] Shunan Mao, Shiliang Zhang, and Ming Yang. Resolutioninvariant person re-identification. arXiv preprint arXiv:1906.09748, 2019.
  • [39] Jiaxu Miao, Yu Wu, Ping Liu, Yuhang Ding, and Yi Yang. Pose-guided feature alignment for occluded person re-identification. In The IEEE International Conference on Computer Vision (ICCV), October 2019.
  • [40] Xuelin Qian, Yanwei Fu, Tao Xiang, Wenxuan Wang, Jie Qiu, Yang Wu, Yu-Gang Jiang, and Xiangyang Xue. Posenormalized image generation for person re-identification. In Proceedings of the European Conference on Computer Vision (ECCV), pages 650–667, 2018.
  • [41] Chi Su, Jianing Li, Shiliang Zhang, Junliang Xing, Wen Gao, and Qi Tian. Pose-driven deep convolutional model for person re-identification. In Proceedings of the IEEE International Conference on Computer Vision, pages 3960–3969, 2017.
  • [42] Yifan Sun, Qin Xu, Yali Li, Chi Zhang, Yikang Li, Shengjin Wang, and Jian Sun. Perceive where to focus: Learning visibility-aware part-level features for partial person reidentification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 393–402, 2019.
  • [43] Yifan Sun, Liang Zheng, Yi Yang, Qi Tian, and Shengjin Wang. Beyond part models: Person retrieval with refined part pooling (and a strong convolutional baseline). In Proceedings of the European Conference on Computer Vision (ECCV), pages 480–496, 2018.
  • [44] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1–9, 2015.
  • [45] Rahul Rama Varior, Gang Wang, Jiwen Lu, and Ting Liu. Learning invariant color features for person reidentification. IEEE Transactions on Image Processing, 25(7):3395–3410, 2016.
  • [46] Ruixing Wang, Qing Zhang, Chi-Wing Fu, Xiaoyong Shen, Wei-Shi Zheng, and Jiaya Jia. Underexposed photo enhancement using deep illumination estimation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 6849–6857, 2019.
  • [47] Zheng Wang, Ruimin Hu, Yi Yu, Junjun Jiang, Chao Liang, and Jinqiao Wang. Scale-adaptive low-resolution person reidentification via learning a discriminating surface. In IJCAI, pages 2669–2675, 2016.
  • [48] Zhixiang Wang, Zheng Wang, Yinqiang Zheng, Yung-Yu Chuang, and Shin’ichi Satoh. Learning to reduce dual-level discrepancy for infrared-visible person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 618–626, 2019.
  • [49] Zheng Wang, Mang Ye, Fan Yang, Xiang Bai, and Shin’ichi Satoh. Cascaded sr-gan for scale-adaptive low resolution person re-identification. In IJCAI, pages 3891–3897, 2018.
  • [50] Longhui Wei, Shiliang Zhang, Wen Gao, and Qi Tian. Person transfer gan to bridge domain gap for person reidentification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 79–88, 2018.
  • [51] Longhui Wei, Shiliang Zhang, Hantao Yao, Wen Gao, and Qi Tian. Glad: Global-local-alignment descriptor for pedestrian retrieval. In Proceedings of the 25th ACM international conference on Multimedia, pages 420–428. ACM, 2017.
  • [52] Zheng-Jun Zha, Jiawei Liu, Di Chen, and Feng Wu. Adversarial attribute-text embedding for person search with natural language query. IEEE Transactions on Multimedia, 2020.
  • [53] Hanwang Zhang, Zheng-Jun Zha, Yang Yang, Shuicheng Yan, and Tat-Seng Chua. Robust (semi) nonnegative graph embedding. IEEE transactions on image processing, 23(7):2996–3012, 2014.
  • [54] Wei Zhang, Shengnan Hu, Kan Liu, and Zhengjun Zha. Learning compact appearance representation for video-based person re-identification. IEEE Transactions on Circuits and Systems for Video Technology, 29(8):2442–2452, 2018.
  • [55] Haiyu Zhao, Maoqing Tian, Shuyang Sun, Jing Shao, Junjie Yan, Shuai Yi, Xiaogang Wang, and Xiaoou Tang. Spindle net: Person re-identification with human body region guided feature decomposition and fusion. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1077–1085, 2017.
  • [56] Liming Zhao, Xi Li, Yueting Zhuang, and Jingdong Wang. Deeply-learned part-aligned representations for person reidentification. In Proceedings of the IEEE International Conference on Computer Vision, pages 3219–3228, 2017.
  • [57] Liang Zheng, Liyue Shen, Lu Tian, Shengjin Wang, Jingdong Wang, and Qi Tian. Scalable person re-identification: A benchmark. In Proceedings of the IEEE International Conference on Computer Vision, pages 1116–1124, 2015.
  • [58] Zhedong Zheng, Xiaodong Yang, Zhiding Yu, Liang Zheng, Yi Yang, and Jan Kautz. Joint discriminative and generative learning for person re-identification. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
  • [59] Zhun Zhong, Liang Zheng, Zhedong Zheng, Shaozi Li, and Yi Yang. Camera style adaptation for person reidentification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5157– 5166, 2018.