Do Better ImageNet Models Transfer Better?

    In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.

    Keywords:
    image classification, computer vision research, convolutional network, deep convolutional, classification dataset

    Abstract:

    Transfer learning has become a cornerstone of computer vision with the advent of ImageNet features, yet little work has been done to evaluate the performance of ImageNet architectures across different datasets. An implicit hypothesis in modern computer vision research is that models that perform better on ImageNet necessarily perform better on other vision tasks.


    Introduction
    • The last decade of computer vision research has pursued academic benchmarks as a measure of progress.
    • An implicit assumption behind this progress is that network architectures that perform better on ImageNet necessarily perform better on other vision tasks.
    • Another assumption is that better network architectures learn better features that can be transferred across vision-based tasks.
    Highlights
    • The last decade of computer vision research has pursued academic benchmarks as a measure of progress
    • Network architectures measured against this dataset have fueled much progress in computer vision research across a broad array of problems, including transferring to new datasets [17, 56], object detection [32], image segmentation [27, 7] and perceptual metrics of images [35]
    • We evaluated models on 12 image classification datasets ranging in training set size from 2,040 to 75,750 images (20 to 5,000 images per class; Table 1)
    • On the datasets we examine, we outperform all such methods by fine-tuning state-of-the-art convolutional neural networks (see the supplementary material)
    Methods
    • Much of the analysis in this work requires comparing accuracies across datasets of differing difficulty.
    • When fitting linear models to accuracy values across multiple datasets, the authors consider effects of model and dataset to be additive.
    • In this context, using untransformed accuracy as a dependent variable is problematic: the meaning of a 1% additive increase in accuracy differs depending on whether it is relative to a base accuracy of 50% or 99%.
    • The authors take the mean and standard error of the adjusted accuracy across datasets, and multiply the latter by a correction factor.
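One standard remedy, consistent with the adjusted-accuracy analysis described here, is to compare accuracies in log-odds (logit) space, where a fixed additive change means the same multiplicative change in odds at any base accuracy. The sketch below is illustrative rather than the authors' exact code; in particular, the helper names and the exact form of the Morey-style correction factor are assumptions:

```python
import math

def logit(p):
    """Log-odds transform: maps accuracy in (0, 1) to an unbounded scale."""
    return math.log(p / (1 - p))

# The same +1% absolute gain is a much larger change near the ceiling:
easy_gain = logit(0.51) - logit(0.50)   # gain from a 50% baseline, ~0.04 logits
hard_gain = logit(0.99) - logit(0.98)   # same absolute gain from 98%, ~0.70 logits
# hard_gain is roughly 17x easy_gain, so additive models fit better in logit space.

def mean_and_corrected_se(adjusted, n_models):
    """Mean and standard error of adjusted accuracies across datasets.

    The SE is scaled by a Morey-style factor sqrt(M / (M - 1)) for M models,
    an assumption about the correction applied after removing dataset effects.
    """
    d = len(adjusted)
    mean = sum(adjusted) / d
    var = sum((a - mean) ** 2 for a in adjusted) / (d - 1)
    se = math.sqrt(var / d) * math.sqrt(n_models / (n_models - 1))
    return mean, se
```

The correction compensates for variance lost when per-dataset means are subtracted before averaging.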
    Results
    • The authors examined 16 modern networks ranging in ImageNet (ILSVRC 2012 validation) top-1 accuracy from 71.6% to 80.8%.
    • Appendix A.3 provides training hyperparameters along with further details of each network, including the ImageNet top-1 accuracy, parameter count, dimension of the penultimate layer, input image size, and performance of retrained models.
    • The authors rescaled images to the same size used for ImageNet training.
    Conclusion
    • The authors' results suggest that the computer vision community has not simply overfit to ImageNet: there is a strong correlation between ImageNet top-1 accuracy and transfer accuracy, suggesting that better ImageNet architectures are capable of learning better, transferable representations.
    • However, a number of widely-used regularizers that improve ImageNet performance do not produce better representations.
    • These regularizers are harmful to the penultimate layer feature space, and have mixed effects when networks are fine-tuned.
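The headline correlation can be illustrated with a short self-contained sketch. The accuracy pairs below are made-up stand-ins, not the paper's measurements, and the rank-correlation helper ignores ties (the paper's analysis also works in logit space, which this sketch omits):

```python
def ranks(xs):
    """Rank of each value (0 = smallest); ties are not handled in this sketch."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0] * len(xs)
    for pos, idx in enumerate(order):
        r[idx] = pos
    return r

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical (ImageNet top-1, transfer) accuracy pairs -- illustrative only.
imagenet_top1 = [0.716, 0.735, 0.752, 0.770, 0.789, 0.808]
transfer_acc  = [0.801, 0.815, 0.810, 0.832, 0.840, 0.851]
rho = spearman(imagenet_top1, transfer_acc)
```

A rank correlation near 1 on such pairs is what "better ImageNet models transfer better" looks like numerically.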
    Tables
    • Table 1: Datasets examined in transfer learning
    Related work
    • ImageNet follows in a succession of progressively larger and more realistic benchmark datasets for computer vision. Each successive dataset was designed to address perceived issues with the size and content of previous datasets. Torralba and Efros [69] showed that many early datasets were heavily biased, with classifiers trained to recognize or classify objects on those datasets possessing almost no ability to generalize to images from other datasets.

      Early work using convolutional neural networks (CNNs) for transfer learning extracted fixed features from ImageNet-trained networks and used these features to train SVMs and logistic regression classifiers for new tasks [17, 56, 6]. These features could outperform hand-engineered features even for tasks very distinct from ImageNet classification [17, 56]. Following this work, several studies compared the performance of AlexNet-like CNNs of varying levels of computational complexity in a transfer learning setting with no fine-tuning. Chatfield et al. [6] found that, out of three networks, the two more computationally expensive networks performed better on PASCAL VOC. Similar work concluded that deeper networks produce higher accuracy across many transfer tasks, but wider networks produce lower accuracy [2]. More recent evaluation efforts have investigated transfer from modern CNNs to medical image datasets [51], and transfer of sentence embeddings to language tasks [12].
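The fixed-feature pipeline described above can be sketched end to end. The "feature extractor" here is a stand-in (a frozen random projection plus ReLU) for a real ImageNet-trained CNN's penultimate layer, and the classifier is a plain gradient-descent logistic regression; the toy data is purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_features(images, W):
    """Stand-in for a frozen ImageNet-trained CNN: a fixed random
    projection followed by a ReLU (purely illustrative)."""
    return np.maximum(images @ W, 0.0)

def train_logreg(X, y, lr=0.1, steps=500):
    """Binary logistic regression on frozen features, via gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
        w -= lr * (X.T @ (p - y)) / len(y)      # gradient of mean log loss
        b -= lr * float(np.mean(p - y))
    return w, b

# Toy "images": two classes with different means, 8 raw dimensions each.
X0 = rng.normal(-1.0, 1.0, size=(50, 8))
X1 = rng.normal(+1.0, 1.0, size=(50, 8))
images = np.vstack([X0, X1])
labels = np.concatenate([np.zeros(50), np.ones(50)])

W = rng.normal(size=(8, 16))       # frozen "pretrained" weights
feats = extract_features(images, W)
w, b = train_logreg(feats, labels)
acc = np.mean(((feats @ w + b) > 0) == labels)
```

The key property this mirrors is that only the small linear classifier is trained; the feature extractor never changes, which is what made these pipelines cheap enough to beat hand-engineered features on new tasks.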
    Reference
    • Pulkit Agrawal, Ross B. Girshick, and Jitendra Malik. Analyzing the performance of multilayer neural networks for object recognition. In European Conference on Computer Vision (ECCV), 2014.
    • H. Azizpour, A. S. Razavian, J. Sullivan, A. Maki, and S. Carlsson. Factors of transferability for a generic convnet representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(9):1790–1802, Sept 2016.
    • Herbert Bay, Andreas Ess, Tinne Tuytelaars, and Luc Van Gool. Speeded-up robust features (SURF). Computer Vision and Image Understanding, 110(3):346–359, 2008.
    • Thomas Berg, Jiongxin Liu, Seung Woo Lee, Michelle L Alexander, David W Jacobs, and Peter N Belhumeur. Birdsnap: Large-scale fine-grained visual categorization of birds. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2019–2026. IEEE, 2014.
    • Lukas Bossard, Matthieu Guillaumin, and Luc Van Gool. Food-101 – mining discriminative components with random forests. In European Conference on Computer Vision (ECCV), pages 446–461.
    • Ken Chatfield, Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman. Return of the devil in the details: delving deep into convolutional nets. In British Machine Vision Conference, 2014.
    • Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, and Alan L Yuille. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(4):834–848, 2018.
    • Wei-Yu Chen, Yen-Cheng Liu, Zsolt Kira, Yu-Chiang Frank Wang, and Jia-Bin Huang. A closer look at few-shot classification. In International Conference on Learning Representations, 2019.
    • Brian Chu, Vashisht Madhavan, Oscar Beijbom, Judy Hoffman, and Trevor Darrell. Best practices for fine-tuning visual classifiers to new domains. In Gang Hua and Hervé Jégou, editors, Computer Vision – ECCV 2016 Workshops, pages 435–442, Cham, 2016. Springer International Publishing.
    • Mircea Cimpoi, Subhransu Maji, Iasonas Kokkinos, Sammy Mohamed, and Andrea Vedaldi. Describing textures in the wild. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3606–3613. IEEE, 2014.
    • Mircea Cimpoi, Subhransu Maji, and Andrea Vedaldi. Deep filter banks for texture recognition and segmentation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3828–3836. IEEE, 2015.
    • Alexis Conneau and Douwe Kiela. SentEval: An evaluation toolkit for universal sentence representations. arXiv preprint arXiv:1803.05449, 2018.
    • Yin Cui, Feng Zhou, Jiang Wang, Xiao Liu, Yuanqing Lin, and Serge Belongie. Kernel pooling for convolutional neural networks. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
    • Jifeng Dai, Yi Li, Kaiming He, and Jian Sun. R-FCN: Object detection via region-based fully convolutional networks. In Advances in Neural Information Processing Systems, pages 379–387, 2016.
    • Navneet Dalal and Bill Triggs. Histograms of oriented gradients for human detection. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), volume 1, pages 886–893. IEEE, 2005.
    • Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. ImageNet: A large-scale hierarchical image database. In IEEE Conference on Computer Vision and Pattern Recognition, pages 248–255. IEEE, 2009.
    • Jeff Donahue, Yangqing Jia, Oriol Vinyals, Judy Hoffman, Ning Zhang, Eric Tzeng, and Trevor Darrell. DeCAF: A deep convolutional activation feature for generic visual recognition. In International Conference on Machine Learning, pages 647–655, 2014.
    • Nanqing Dong and Eric P Xing. Domain adaption in one-shot learning. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pages 573–588.
    • Mark Everingham, Luc Van Gool, Christopher KI Williams, John Winn, and Andrew Zisserman. The PASCAL Visual Object Classes (VOC) challenge. International Journal of Computer Vision, 88(2):303–338, 2010.
    • Li Fei-Fei, Rob Fergus, and Pietro Perona. Learning generative visual models from few training examples: An incremental Bayesian approach tested on 101 object categories. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshop on Generative-Model Based Vision, 2004.
    • Chelsea Finn, Pieter Abbeel, and Sergey Levine. Model-agnostic meta-learning for fast adaptation of deep networks. In International Conference on Machine Learning, pages 1126–1135, 2017.
    • Blair Hanley Frank. Google Brain chief: Deep learning takes at least 100,000 examples. In VentureBeat. https://venturebeat.com/2017/10/23/google-brain-chiefsays-100000-examples-is-enough-data-for-deep-learning/, 2017.
    • Yang Gao, Oscar Beijbom, Ning Zhang, and Trevor Darrell. Compact bilinear pooling. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 317–326, 2016.
    • Ross Girshick, Jeff Donahue, Trevor Darrell, and Jitendra Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 580–587, 2014.
    • Priya Goyal, Piotr Dollár, Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, and Kaiming He. Accurate, large minibatch SGD: training ImageNet in 1 hour. arXiv preprint arXiv:1706.02677, 2017.
    • Sam Gross and Michael Wilber. Training and investigating residual nets. In The Torch Blog. http://torch.ch/blog/2016/02/04/resnets.html, 2016.
    • Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. Mask R-CNN. In IEEE International Conference on Computer Vision (ICCV), pages 2980–2988. IEEE, 2017.
    • Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778, 2016.
    • Luis Herranz, Shuqiang Jiang, and Xiangyang Li. Scene recognition with CNNs: objects, scales and dataset bias. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 571–579, 2016.
    • Andrew G Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, and Hartwig Adam. MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861, 2017.
    • Gao Huang, Zhuang Liu, Laurens van der Maaten, and Kilian Q Weinberger. Densely connected convolutional networks. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2261–2269, 2017.
    • Jonathan Huang, Vivek Rathod, Chen Sun, Menglong Zhu, Anoop Korattikara, Alireza Fathi, Ian Fischer, Zbigniew Wojna, Yang Song, Sergio Guadarrama, and Kevin Murphy. Speed/accuracy trade-offs for modern convolutional object detectors. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
    • Mi-Young Huh, Pulkit Agrawal, and Alexei A. Efros. What makes ImageNet good for transfer learning? CoRR, abs/1608.08614, 2016.
    • Sergey Ioffe and Christian Szegedy. Batch normalization: accelerating deep network training by reducing internal covariate shift. In International Conference on Machine Learning, pages 448–456, 2015.
    • Justin Johnson, Alexandre Alahi, and Li Fei-Fei. Perceptual losses for real-time style transfer and super-resolution. In European Conference on Computer Vision (ECCV), pages 694–711.
    • Jonathan Krause, Jia Deng, Michael Stark, and Li Fei-Fei. Collecting a large-scale dataset of fine-grained cars. In Second Workshop on Fine-Grained Visual Categorization, 2013.
    • Alex Krizhevsky and Geoffrey Hinton. Learning multiple layers of features from tiny images. Technical report, University of Toronto, 2009.
    • Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1097–1105, 2012.
    • Brenden M Lake, Ruslan Salakhutdinov, and Joshua B Tenenbaum. Human-level concept learning through probabilistic program induction. Science, 350(6266):1332–1338, 2015.
    • Chen-Yu Lee, Saining Xie, Patrick Gallagher, Zhengyou Zhang, and Zhuowen Tu. Deeply-supervised nets. In Artificial Intelligence and Statistics, pages 562–570, 2015.
    • Zhichao Li, Yi Yang, Xiao Liu, Feng Zhou, Shilei Wen, and Wei Xu. Dynamic computational time for visual attention. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1199–1209, 2017.
    • Tsung-Yu Lin and Subhransu Maji. Visualizing and understanding deep texture representations. In IEEE International Conference on Computer Vision (ICCV), pages 2791–2799, 2016.
    • Tsung-Yu Lin, Aruni RoyChowdhury, and Subhransu Maji. Bilinear CNN models for fine-grained visual recognition. In IEEE International Conference on Computer Vision (ICCV), pages 1449–1457, 2015.
    • Dong C Liu and Jorge Nocedal. On the limited memory BFGS method for large scale optimization. Mathematical Programming, 45(1-3):503–528, 1989.
    • David G Lowe. Object recognition from local scale-invariant features. In IEEE International Conference on Computer Vision, volume 2, pages 1150–1157.
    • Laurens van der Maaten and Geoffrey Hinton. Visualizing data using t-SNE. Journal of Machine Learning Research, 9(Nov):2579–2605, 2008.
    • Dhruv Mahajan, Ross Girshick, Vignesh Ramanathan, Kaiming He, Manohar Paluri, Yixuan Li, Ashwin Bharambe, and Laurens van der Maaten. Exploring the limits of weakly supervised pretraining. In European Conference on Computer Vision (ECCV), pages 181–196, 2018.
    • S. Maji, J. Kannala, E. Rahtu, M. Blaschko, and A. Vedaldi. Fine-grained visual classification of aircraft. Technical report, 2013.
    • Nikhil Mishra, Mostafa Rohaninejad, Xi Chen, and Pieter Abbeel. A simple neural attentive meta-learner. In International Conference on Learning Representations, 2018.
    • Richard D. Morey. Confidence intervals from normalized data: A correction to Cousineau (2005). Tutorials in Quantitative Methods for Psychology, 4(2):61–64, 2008.
    • Romain Mormont, Pierre Geurts, and Raphaël Marée. Comparison of deep transfer learning strategies for digital pathology. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 2262–2271, 2018.
    • Maria-Elena Nilsback and Andrew Zisserman. Automated flower classification over a large number of classes. In Sixth Indian Conference on Computer Vision, Graphics & Image Processing (ICVGIP), pages 722–729. IEEE, 2008.
    • Omkar M Parkhi, Andrea Vedaldi, Andrew Zisserman, and CV Jawahar. Cats and dogs. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3498–3505. IEEE, 2012.
    • Yuxin Peng, Xiangteng He, and Junjie Zhao. Object-part attention model for fine-grained image classification. IEEE Transactions on Image Processing, 27(3):1487–1500, 2018.
    • Sachin Ravi and Hugo Larochelle. Optimization as a model for few-shot learning. In International Conference on Learning Representations, 2017.
    • Ali Sharif Razavian, Hossein Azizpour, Josephine Sullivan, and Stefan Carlsson. CNN features off-the-shelf: an astounding baseline for recognition. In IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pages 512–519. IEEE, 2014.
    • Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in Neural Information Processing Systems, pages 91–99, 2015.
    • Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, and Li Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision, 115(3):211–252, Dec 2015.
    • Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, and Liang-Chieh Chen. MobileNetV2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4510–4520, 2018.
    • Tyler Scott, Karl Ridgeway, and Michael C Mozer. Adapted deep embeddings: A synthesis of methods for k-shot inductive transfer learning. In Advances in Neural Information Processing Systems, pages 76–85, 2018.
    • Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. In International Conference on Learning Representations, 2015.
    • Jake Snell, Kevin Swersky, and Richard Zemel. Prototypical networks for few-shot learning. In Advances in Neural Information Processing Systems, pages 4080–4090, 2017.
    • Yang Song, Fan Zhang, Qing Li, Heng Huang, Lauren J O’Donnell, and Weidong Cai. Locally-transferred Fisher vectors for texture classification. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 4912–4920, 2017.
    • Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. Dropout: a simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 15(1):1929–1958, 2014.
    • Chen Sun, Abhinav Shrivastava, Saurabh Singh, and Abhinav Gupta. Revisiting unreasonable effectiveness of data in deep learning era. In IEEE International Conference on Computer Vision (ICCV), pages 843–852. IEEE, 2017.
    • Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, and Alexander A Alemi. Inception-v4, Inception-ResNet and the impact of residual connections on learning. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17), 2017.
    • Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper with convolutions. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015.
    • Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jon Shlens, and Zbigniew Wojna. Rethinking the Inception architecture for computer vision. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2818–2826, 2016.
    • Antonio Torralba and Alexei A Efros. Unbiased look at dataset bias. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1521–1528. IEEE, 2011.
    • Twan van Laarhoven. L2 regularization versus batch and weight normalization. CoRR, abs/1706.05350, 2017.
    • Oriol Vinyals, Charles Blundell, Tim Lillicrap, Daan Wierstra, et al. Matching networks for one shot learning. In Advances in Neural Information Processing Systems, pages 3630–3638, 2016.
    • Jianxiong Xiao, James Hays, Krista A Ehinger, Aude Oliva, and Antonio Torralba. SUN database: Large-scale scene recognition from abbey to zoo. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3485–3492. IEEE, 2010.
    • Hantao Yao, Shiliang Zhang, Yongdong Zhang, Jintao Li, and Qi Tian. Coarse-to-fine description for fine-grained visual categorization. IEEE Transactions on Image Processing, 25(10):4858–4872, 2016.
    • Jason Yosinski, Jeff Clune, Yoshua Bengio, and Hod Lipson. How transferable are features in deep neural networks? In Advances in Neural Information Processing Systems, pages 3320–3328, 2014.
    • Amir R Zamir, Alexander Sax, William Shen, Leonidas Guibas, Jitendra Malik, and Silvio Savarese. Taskonomy: Disentangling task transfer learning. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3712–3722, 2018.
    • Guodong Zhang, Chaoqi Wang, Bowen Xu, and Roger Grosse. Three mechanisms of weight decay regularization. In International Conference on Learning Representations, 2019.
    • Bolei Zhou, Agata Lapedriza, Aditya Khosla, Aude Oliva, and Antonio Torralba. Places: A 10 million image database for scene recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017.
    • Barret Zoph, Vijay Vasudevan, Jonathon Shlens, and Quoc V Le. Learning transferable architectures for scalable image recognition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 8697–8710, 2018.