Multiple Expert Brainstorming for Domain Adaptive Person Re-identification

Keywords:
id model, Natural Science Foundation of China, authority regularization, person re-ID, Cumulative Matching Characteristic

Abstract:

Often the best performing deep neural models are ensembles of multiple base-level networks; nevertheless, ensemble learning with respect to domain adaptive person re-ID remains unexplored. In this paper, we propose a multiple expert brainstorming network (MEB-Net) for domain adaptive person re-ID, opening up a promising direction about ...
Introduction
  • Person re-identification aims to match persons in an image gallery collected from non-overlapping camera networks [40], [14], [16].
  • The first category of domain adaptive methods attempts to align feature distributions between source and target domains [35], [39], aiming to minimize the inter-domain gap for optimal adaptation.
  • How to leverage the specific features and knowledge of multiple networks and optimally adapt them to an unlabelled target domain remains to be elaborated.
Highlights
  • We present a multiple expert brainstorming network (MEB-Net), which learns and adapts multiple networks with different architectures for optimal re-ID in an unlabelled target domain
  • We propose a novel multiple expert brainstorming network (MEB-Net) based on mutual learning among expert models, each of which is equipped with knowledge of an architecture
  • MEB-Net is trained in two stages: pre-training in source domains and adaptation in target domains
  • An authority regularization scheme was introduced to tackle the heterogeneity of experts
  • MEB-Net performs significantly better than all compared methods
  • Experiments demonstrated the effectiveness of MEB-Net for improving the discrimination ability of re-ID models
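The mutual learning among experts mentioned in the highlights can be illustrated with a soft-label cross-entropy, where one expert is trained to match the class probabilities predicted by the others. The following is a minimal sketch in the style of deep mutual learning, not the paper's exact loss: the function names, the toy logits, and the choice to use the other experts' raw predictions as targets are all assumptions made for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def soft_cross_entropy(student_logits, teacher_probs):
    """Cross-entropy of the student's prediction against soft teacher targets."""
    student_probs = softmax(student_logits)
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))

def mutual_loss(logits_per_expert, k):
    """Average soft cross-entropy of expert k against every other expert (assumed form)."""
    others = [softmax(l) for i, l in enumerate(logits_per_expert) if i != k]
    return sum(soft_cross_entropy(logits_per_expert[k], p) for p in others) / len(others)

# Hypothetical logits from three experts for one sample over three pseudo-identities.
logits = [[2.0, 0.5, -1.0], [1.5, 1.0, -0.5], [2.2, 0.1, -0.8]]
losses = [mutual_loss(logits, k) for k in range(3)]
```

Each expert receives its own loss value, so all experts are updated at once and none acts as a fixed teacher.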
Methods
  • Table 1 compares LOMO[20], BOW[47], UMDL[25], MMFA[21], TJ-AIDL[35], UCDA-CCE[26], ATNet[22], SPGAN+LMP[5], CamStyle[51], HHL[49], ECN[50], PDA-Net[18], PUL[6], UDAP[30], PCB-PAST[44], SSG[8], MMT-500[9] and MEB-Net (Ours), reporting mAP, R-1, R-5 and R-10 on Market-1501 and DukeMTMC-reID.
  • Mini-batch k-means clustering is used, and the number of groups Mt is set to 500 for all target datasets.
  • The authors compare MEB-Net with state-of-the-art methods including: hand-crafted feature approaches (LOMO[20], BOW[47], UMDL[25]), feature alignment based methods (MMFA[21], TJ-AIDL[35], UCDA-CCE[26]), GAN-based methods (SPGAN[5], ATNet[22], CamStyle[51], HHL[49], ECN[50] and PDA-Net[18]), and pseudo-label prediction based methods (PUL[6], UDAP[30], PCB-PAST[44], SSG[8], MMT[9]).
  • As Table 1 shows, MEB-Net outperforms hand-crafted feature approaches including LOMO, BOW and UMDL by large margins, as deep networks can learn more discriminative representations than hand-crafted features.
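The comparison above is reported in mAP and rank-k (R-1, R-5, R-10) accuracy of the Cumulative Matching Characteristic. As a rough illustration of how these two metrics are computed from ranked retrieval results, here is a simplified sketch. It assumes each query already has a gallery ranking marked with match/non-match flags; the function names and toy data are hypothetical, and benchmark protocols typically apply additional filtering (for example of same-camera gallery images) that is omitted here.

```python
def average_precision(ranked_matches):
    """AP for one query; ranked_matches[i] is True if the i-th result shares the query identity."""
    hits, precisions = 0, []
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0

def cmc_at_k(ranked_matches, k):
    """1.0 if a correct match appears within the top-k results, else 0.0."""
    return 1.0 if any(ranked_matches[:k]) else 0.0

def evaluate(all_ranked_matches, ks=(1, 5, 10)):
    """Mean AP and rank-k accuracy averaged over all queries."""
    n = len(all_ranked_matches)
    mean_ap = sum(average_precision(r) for r in all_ranked_matches) / n
    cmc = {k: sum(cmc_at_k(r, k) for r in all_ranked_matches) / n for k in ks}
    return mean_ap, cmc

# Two toy queries over a four-image gallery.
queries = [
    [True, False, True, False],   # AP = (1/1 + 2/3) / 2
    [False, True, False, False],  # AP = 1/2
]
mean_ap, cmc = evaluate(queries, ks=(1, 2))
```

Rank-k only asks whether any correct match appears in the top k, while mAP rewards ranking all correct matches early, which is why the two are usually reported together.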
Results
  • The authors use one dataset as the target domain and the other as the source domain.
  • MEB-Net is trained in two stages: pre-training in source domains and adaptation in target domains.
  • The authors adopt three architectures, DenseNet-121 [12], ResNet-50 [10] and Inception-v3 [32], as backbone networks for the three experts, and initialize them with parameters pre-trained on ImageNet [4].
  • The initial learning rate is set to 0.00035 and is decreased to 1/10 of its previous value at the 40th and 70th epochs of the 80 total epochs.
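The learning-rate schedule described above is a step decay, which can be written out as a small function. This sketch assumes epochs are counted from 1 and that each decay takes effect from the milestone epoch onward; the function and parameter names are illustrative.

```python
def learning_rate(epoch, base_lr=0.00035, milestones=(40, 70), gamma=0.1):
    """Step schedule: multiply base_lr by gamma once for each milestone already reached."""
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * (gamma ** passed)

# Learning rate for each of the 80 epochs (1-indexed, an assumption).
schedule = [learning_rate(e) for e in range(1, 81)]
```

Under these assumptions, epochs 1 to 39 run at 0.00035, epochs 40 to 69 at one tenth of that, and epochs 70 to 80 at one hundredth.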
Conclusion
  • The baseline model ensemble uses all networks to extract average features of unlabelled samples for pseudo-label prediction, but without mutual learning among them during adaptation in the target domain.
  • The baseline ensemble improves over single-model transfer because its pseudo-labels are more accurate.
  • MEB-Net performs significantly better than all compared methods.
  • This validates that MEB-Net provides a more effective ensemble method for domain adaptive person re-ID. The paper proposed a multiple expert brainstorming network (MEB-Net) for domain adaptive person re-ID.
  • The authors' approach efficiently assembles the discrimination capability of multiple networks while requiring only a single model at inference time
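The baseline ensemble described above averages the features of all networks and clusters the result to predict pseudo-labels. A minimal sketch of that idea follows. It assumes every expert outputs features of the same dimension and L2-normalizes them before averaging, and it substitutes a plain Lloyd's k-means for the mini-batch k-means used in the paper; all names and the toy features are hypothetical.

```python
import math
import random

def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def ensemble_features(features_per_expert):
    """Average L2-normalized features across experts, sample by sample."""
    n_experts = len(features_per_expert)
    averaged = []
    for sample_feats in zip(*features_per_expert):
        normed = [l2_normalize(f) for f in sample_feats]
        averaged.append([sum(dim) / n_experts for dim in zip(*normed)])
    return averaged

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster index (pseudo-label) per point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Move each non-empty centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(d) / len(members) for d in zip(*members)]
    return labels

# Two hypothetical experts, four unlabelled samples, 2-D features.
expert_a = [[1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.1, 0.9]]
expert_b = [[0.8, 0.2], [1.0, 0.1], [0.1, 0.8], [0.0, 1.0]]
feats = ensemble_features([expert_a, expert_b])
pseudo_labels = kmeans(feats, k=2)
```

In this baseline the experts only share pseudo-labels; they do not exchange predictions during training, which is the part MEB-Net adds.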
Summary
  • Objectives:

    The authors aim to leverage the labelled sample images in S and the unlabelled sample images in T to learn a transferred re-ID model for the target-domain T.
Tables
  • Table 1: Comparison with state-of-the-art methods for the adaptation on Market-1501 and on DukeMTMC-reID. The top-three results are highlighted with bold, italic, and underline fonts, respectively
  • Table 2: Ablation studies. Supervised Models: re-ID models trained using the labelled target-domain training images. Direct Transfer: re-ID models trained using the labelled source-domain training images. Lvot (Eq 12), ΘT (Eq 4), Lmid (Eq 6) and Lmtri (Eq 9) are described in Sec. 3.4. AR: Authority Regularization as described in Sec. 3.5
  • Table 3: mAP (%) of networks of different architectures for DukeMTMC-reID → Market-1501 transfer. Supervised: supervised models; Dire. tran.: direct transfer; Sing. tran.: single model transfer; Base. ens.: baseline ensemble
Related work
  • 2.1 Unsupervised Domain Adaptive Re-ID

    Unsupervised domain adaptation (UDA) for person re-ID defines a learning problem in which source domains are fully labeled while sample labels in target domains are entirely unknown. Methods for this problem have been extensively explored in recent years and take three typical approaches, as follows.

    Feature distribution alignment. In [21], Lin et al. proposed minimizing the distribution variation between the source's and the target's mid-level features based on the Maximum Mean Discrepancy (MMD) distance. Wang et al. [35] utilized additional attribute annotations to align feature distributions of source and target domains in a common space.

    Image-style transformation. GAN-based methods have been extensively explored for domain adaptive person re-ID [24], [49], [36], [5], [22]. HHL [49] simultaneously enforced camera invariance and domain connectedness to improve the generalization ability of models on the target set. PTGAN [36], SPGAN [5], ATNet [22] and PDA-Net [18] transferred images with identity labels from source into target domains to learn discriminative models.
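The Maximum Mean Discrepancy distance mentioned under feature distribution alignment measures how far apart two feature distributions are by comparing kernel averages within and across the two sets. The sketch below computes a biased estimate of squared MMD. The RBF kernel, the bandwidth `sigma`, and the toy feature sets are assumptions for illustration and are not taken from [21].

```python
import math

def rbf_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def mmd_squared(source, target, sigma=1.0):
    """Biased estimate of squared MMD between two sets of feature vectors."""
    def mean_kernel(xs, ys):
        return sum(rbf_kernel(x, y, sigma) for x in xs for y in ys) / (len(xs) * len(ys))
    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))

# Hypothetical 2-D features: one source set and two target sets at different offsets.
source = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1]]
near = [[0.1, 0.1], [0.2, 0.3], [0.3, 0.2]]
far = [[2.0, 2.0], [2.1, 2.2], [2.2, 2.1]]
mmd_near = mmd_squared(source, near)
mmd_far = mmd_squared(source, far)
```

A larger value indicates a larger gap between the two feature sets, so an alignment method of this kind would minimize such a quantity between source and target features.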
Funding
  • This work is partially supported by grants from the National Key R&D Program of China under grant 2017YFB1002400, and the National Natural Science Foundation of China (NSFC) under contracts No. 61825101, U1611461 and 61836012
Study subjects and analysis
As shown in Fig. 4, the models become stronger as the iterative clustering proceeds. The performance improves in the early epochs and converges after 20 epochs for both datasets.

Reference
  • Anil, R., Pereyra, G., Passos, A., Ormandi, R., Dahl, G.E., Hinton, G.E.: Large scale distributed neural network training through online distillation. arXiv preprint arXiv:1804.03235 (2018)
  • Bagherinezhad, H., Horton, M., Rastegari, M., Farhadi, A.: Label refinery: Improving imagenet classification through label progression. arXiv preprint arXiv:1805.02641 (2018)
  • Chen, T., Goodfellow, I., Shlens, J.: Net2net: Accelerating learning via knowledge transfer. arXiv preprint arXiv:1511.05641 (2015)
  • Deng, J., Dong, W., Socher, R., Li, L., Li, K., Li, F.: Imagenet: A large-scale hierarchical image database. In: IEEE CVPR (2009)
  • Deng, W., Zheng, L., Ye, Q., Kang, G., Yang, Y., Jiao, J.: Image-image domain adaptation with preserved self-similarity and domain-dissimilarity for person re-identification. In: IEEE CVPR (2018)
  • Fan, H., Zheng, L., Yan, C., Yang, Y.: Unsupervised person re-identification: Clustering and fine-tuning. TOMCCAP 14(4), 83:1–83:18 (2018)
  • Fan, H., Zheng, L., Yang, Y.: Unsupervised person re-identification: Clustering and fine-tuning. CoRR abs/1705.10444 (2017)
  • Fu, Y., Wei, Y., Wang, G., Zhou, Y., Shi, H., Huang, T.S.: Self-similarity grouping: A simple unsupervised cross domain adaptation approach for person re-identification. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV). pp. 6112–6121 (2019)
  • Ge, Y., Chen, D., Li, H.: Mutual mean-teaching: Pseudo label refinery for unsupervised domain adaptation on person re-identification. arXiv preprint arXiv:2001.01526 (2020)
  • He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (June 2016)
  • Hinton, G., Vinyals, O., Dean, J.: Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 (2015)
  • Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). pp. 4700–4708 (2017)
  • Huang, G., Sun, Y., Liu, Z., Sedra, D., Weinberger, K.Q.: Deep networks with stochastic depth. In: European conference on computer vision (ECCV). pp. 646–661. Springer (2016)
  • Jia, M., Zhai, Y., Lu, S., Ma, S., Zhang, J.: A similarity inference metric for RGB-infrared cross-modality person re-identification. In: IJCAI-20 (July 2020)
  • Jin, X., Lan, C., Zeng, W., Chen, Z.: Global distance-distributions separation for unsupervised person re-identification. arXiv preprint arXiv:2006.00752 (2020)
  • Jin, X., Lan, C., Zeng, W., Chen, Z., Zhang, L.: Style normalization and restitution for generalizable person re-identification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 3143–3152 (2020)
  • Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  • Li, Y.J., Lin, C.S., Lin, Y.B., Wang, Y.C.F.: Cross-dataset person re-identification via unsupervised pose disentanglement and adaptation. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV). pp. 7919–7929 (2019)
  • Li, Y., Yang, J., Song, Y., Cao, L., Luo, J., Li, L.J.: Learning from noisy labels with distillation. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV). pp. 1910–1918 (2017)
  • Liao, S., Hu, Y., Zhu, X., Li, S.Z.: Person re-identification by local maximal occurrence representation and metric learning. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (June 2015)
  • Lin, S., Li, H., Li, C., Kot, A.C.: Multi-task mid-level feature alignment network for unsupervised cross-dataset person re-identification. In: BMVC (2018)
  • Liu, J., Zha, Z.J., Chen, D., Hong, R., Wang, M.: Adaptive transfer network for cross-domain person re-identification. In: IEEE CVPR (2019)
  • Liu, Z., Wang, D., Lu, H.: Stepwise metric promotion for unsupervised video person re-identification. In: IEEE ICCV. pp. 2448–2457 (2017)
  • Lv, J., Wang, X.: Cross-dataset person re-identification using similarity preserved generative adversarial networks. In: Liu, W., Giunchiglia, F., Yang, B. (eds.) KSEM. pp. 171–183 (2018)
  • Peng, P., Xiang, T., Wang, Y., Pontil, M., Gong, S., Huang, T., Tian, Y.: Unsupervised cross-dataset transfer learning for person re-identification. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (June 2016)
  • Qi, L., Wang, L., Huo, J., Zhou, L., Shi, Y., Gao, Y.: A novel unsupervised camera-aware domain adaptation framework for person re-identification. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV). pp. 8080–8089 (2019)
  • Ristani, E., Solera, F., Zou, R.S., Cucchiara, R., Tomasi, C.: Performance measures and a data set for multi-target, multi-camera tracking. In: IEEE ECCV Workshops (2016)
  • Shen, Z., He, Z., Xue, X.: Meal: Multi-model ensemble via adversarial learning. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 33, pp. 4886–4893 (2019)
  • Singh, S., Hoiem, D., Forsyth, D.: Swapout: Learning an ensemble of deep architectures. In: Advances in neural information processing systems. pp. 28–36 (2016)
  • Song, L., Wang, C., Zhang, L., Du, B., Zhang, Q., Huang, C., Wang, X.: Unsupervised domain adaptive re-identification: Theory and practice. CoRR abs/1807.11334 (2018)
  • Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. The journal of machine learning research 15(1), 1929–1958 (2014)
  • Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). pp. 2818–2826 (2016)
  • Tarvainen, A., Valpola, H.: Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results. In: Advances in neural information processing systems. pp. 1195–1204 (2017)
  • Wan, L., Zeiler, M., Zhang, S., Le Cun, Y., Fergus, R.: Regularization of neural networks using dropconnect. In: International conference on machine learning. pp. 1058–1066 (2013)
  • Wang, J., Zhu, X., Gong, S., Li, W.: Transferable joint attribute-identity deep learning for unsupervised person re-identification. In: IEEE CVPR (2018)
  • Wei, L., Zhang, S., Gao, W., Tian, Q.: Person transfer gan to bridge domain gap for person re-identification. In: IEEE CVPR (2018)
  • Wu, J., Liao, S., Lei, Z., Wang, X., Yang, Y., Li, S.Z.: Clustering and dynamic sampling based unsupervised domain adaptation for person re-identification. In: IEEE ICME. pp. 886–891 (2019)
  • Wu, Y., Lin, Y., Dong, X., Yan, Y., Ouyang, W., Yang, Y.: Exploit the unknown gradually: One-shot video-based person re-identification by stepwise learning. In: IEEE CVPR (2018)
  • Yang, F., Yan, K., Lu, S., Jia, H., Xie, D., Yu, Z., Guo, X., Huang, F., Gao, W.: Part-aware progressive unsupervised domain adaptation for person re-identification. IEEE Transactions on Multimedia (2020)
  • Yang, F., Yan, K., Lu, S., Jia, H., Xie, X., Gao, W.: Attention driven person re-identification. Pattern Recognition 86, 143–155 (2019)
  • Ye, M., Ma, A.J., Zheng, L., Li, J., Yuen, P.C.: Dynamic label graph matching for unsupervised video re-identification. In: IEEE ICCV. pp. 5152–5160 (2017)
  • Yim, J., Joo, D., Bae, J., Kim, J.: A gift from knowledge distillation: Fast optimization, network minimization and transfer learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 4133–4141 (2017)
  • Zhai, Y., Lu, S., Ye, Q., Shan, X., Chen, J., Ji, R., Tian, Y.: Ad-cluster: Augmented discriminative clustering for domain adaptive person re-identification. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (June 2020)
  • Zhang, X., Cao, J., Shen, C., You, M.: Self-training with progressive augmentation for unsupervised cross-domain person re-identification. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV). pp. 8222–8231 (2019)
  • Zhang, Y., Xiang, T., Hospedales, T.M., Lu, H.: Deep mutual learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 4320–4328 (2018)
  • Zheng, L., Bie, Z., Sun, Y., Wang, J., Su, C., Wang, S., Tian, Q.: MARS: A video benchmark for large-scale person re-identification. In: ECCV. pp. 868–884 (2016)
  • Zheng, L., Shen, L., Tian, L., Wang, S., Wang, J., Tian, Q.: Scalable person re-identification: A benchmark. In: The IEEE International Conference on Computer Vision (ICCV) (December 2015)
  • Zheng, Z., Zheng, L., Yang, Y.: Unlabeled samples generated by gan improve the person re-identification baseline in vitro. In: IEEE ICCV (2017)
  • Zhong, Z., Zheng, L., Li, S., Yang, Y.: Generalizing a person retrieval model hetero- and homogeneously. In: ECCV. pp. 176–192 (2018)
  • Zhong, Z., Zheng, L., Luo, Z., Li, S., Yang, Y.: Invariance matters: Exemplar memory for domain adaptive person re-identification. In: IEEE CVPR (2019)
  • Zhong, Z., Zheng, L., Zheng, Z., Li, S., Yang, Y.: Camstyle: A novel data augmentation method for person re-identification. IEEE TIP 28(3), 1176–1190 (2019)