Large-Scale Long-Tailed Recognition In An Open World
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2019), pp. 2532-2541
- While the natural data distribution contains head, tail, and open classes (Fig. 1), existing classification approaches focus mostly on the head [7, 28], the tail [51, 25], often in a closed setting [55, 31].
- The authors define OLTR as learning from long-tailed, open-ended distributed data and evaluating classification accuracy over a balanced test set that includes head, tail, and open classes in a continuous spectrum (Fig. 1)
- Our visual world is inherently long-tailed and open-ended: the frequency distribution of visual categories in our daily life is long-tailed, with a few common classes and many more rare classes, and we constantly encounter new visual concepts as we navigate in an open world.
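Such long-tailed frequency distributions are commonly modeled as power laws [80]. A minimal NumPy sketch of the imbalance; the Zipf exponent of 1 and the head/tail cut-offs here are illustrative assumptions, not values from the paper:

```python
import numpy as np

# Zipf-like class frequencies: a few head classes carry most of the mass,
# while the long tail of rare classes carries little per class.
n_classes = 1000
ranks = np.arange(1, n_classes + 1)
freq = 1.0 / ranks            # power-law frequencies (Zipf, exponent 1)
freq /= freq.sum()            # normalize to a probability distribution

head_share = freq[:10].sum()  # mass held by the 10 most common classes
tail_share = freq[100:].sum() # mass held by the 900 rarest classes
print(f"head: {head_share:.2f}, tail: {tail_share:.2f}")
```

Under this assumption the 10 head classes alone hold more probability mass than the 900 rarest classes combined, which is why a classifier trained on such data sees very few examples of most classes.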
[Fig. 1 labels: Open Long-Tailed Recognition; Imbalanced Classification; Open World; Few-Shot Learning; Head, Tail, and Open Classes]
- We develop an integrated Open Long-Tailed Recognition algorithm that maps an image to a feature space such that visual concepts can relate to each other based on a learned metric that respects the closed-world classification while acknowledging the novelty of the open world
- We introduce the Open Long-Tailed Recognition task that learns from natural long-tailed, open-ended data and optimizes the overall accuracy over a balanced test set
- We propose an integrated Open Long-Tailed Recognition algorithm, dynamic meta-embedding, in order to share visual knowledge between head and tail classes and to reduce confusion between tail and open classes
- Table 2 compares open-class detection error (%) for Softmax Pred., ODIN†, and the authors' method (Ours and Ours†)
- For the approaches most closely related to this work, the authors directly contrast their results against the numbers reported in the corresponding papers.
- Recall that the dynamic meta-embedding consists of three main components: memory feature, concept selector, and confidence calibrator.
- From Fig. 5 (b), the authors observe that the combination of the memory feature and concept selector leads to large improvements on all three shots.
- This is because the obtained memory feature transfers useful visual concepts among classes
- Another observation is that the confidence calibrator is the most effective on few-shot classes.
- The reachability estimation inside the confidence calibrator helps distinguish tail classes from open classes
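The reachability idea can be sketched in a few lines of NumPy. The inverse-minimum-distance form and the `scale` parameter below are illustrative assumptions about how such a calibrator could work, not the paper's exact formulation:

```python
import numpy as np

rng = np.random.default_rng(1)
n_classes, feat_dim = 5, 8
centroids = rng.normal(size=(n_classes, feat_dim))  # per-class memory centroids

def calibrated_logits(v, centroids, logits, scale=1.0):
    """Scale the classifier logits by reachability: the inverse distance
    from feature v to its nearest class centroid. Samples far from every
    centroid (likely open-class) get uniformly shrunken logits, so their
    max softmax probability falls and they are easier to reject."""
    reachability = scale / np.linalg.norm(centroids - v, axis=1).min()
    return reachability * logits

# A sample near centroid 0 keeps large logits; a far-away sample does not.
near = centroids[0] + 0.01 * rng.normal(size=feat_dim)
far = 100.0 * rng.normal(size=feat_dim)
logits = rng.normal(size=n_classes)
print(np.abs(calibrated_logits(near, centroids, logits)).max())
print(np.abs(calibrated_logits(far, centroids, logits)).max())
```

Shrinking all logits toward zero flattens the softmax toward uniform, which is exactly what lets a max-probability threshold separate open samples from confident tail-class samples.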
- In the tail classes, the recognition accuracy should remain as high as possible; on the other hand, as the number of instances drops to zero in the open set, the recognition accuracy relies on the sensitivity to distinguish unknown open classes from known tail classes.
An integrated OLTR algorithm should tackle the two seemingly contradictory aspects of recognition robustness and recognition sensitivity on a continuous category spectrum.
- The authors learn to retrieve a summary of memory activations from the direct feature, combined into a meta-embedding that is enriched particularly for the tail classes.
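This retrieval-and-combination step can be sketched as follows. The attention form, the `tanh` gate, and the random `selector_w` matrix are illustrative stand-ins for the learned modules, not the paper's exact parameterization:

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, feat_dim = 5, 8
memory = rng.normal(size=(n_classes, feat_dim))     # class-centroid memory
selector_w = rng.normal(size=(feat_dim, feat_dim))  # stand-in for a learned layer

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def meta_embedding(v_direct):
    """Enrich a direct feature with a summary of memory activations."""
    o = softmax(memory @ v_direct)      # attention weights over class centroids
    v_memory = o @ memory               # retrieved summary of the memory
    e = np.tanh(selector_w @ v_direct)  # concept selector: per-dimension gate
    return v_direct + e * v_memory      # dynamic meta-embedding

v_direct = rng.normal(size=feat_dim)
v_meta = meta_embedding(v_direct)
print(v_meta.shape)
```

The gate `e` lies in [-1, 1] per dimension, so the memory contribution can be amplified, suppressed, or sign-flipped feature-by-feature, which is what lets head-class knowledge flow to tail samples without overwriting their direct features.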
- Besides the overall top-1 classification accuracy over all classes, the authors calculate the accuracy on three disjoint subsets: many-shot classes, medium-shot classes, and few-shot classes
- This helps them understand the detailed characteristics of each method.
- The softmax probability threshold is initially set to 0.1; a more detailed analysis is provided in Sec. 4.3
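This evaluation protocol can be sketched as follows. The many/medium/few cut-offs (over 100, 20 to 100, and under 20 training images) follow the convention of the ImageNet-LT-style benchmarks, while the helper names and the toy data are illustrative:

```python
import numpy as np

def oltr_evaluate(probs, labels, train_counts, open_label=-1, threshold=0.1):
    """Reject a sample as open-class when its max softmax probability is
    below `threshold` (0.1, as in the text), else predict the argmax;
    then report accuracy separately per training-frequency subset."""
    preds = probs.argmax(axis=1)
    preds[probs.max(axis=1) < threshold] = open_label

    counts = np.array([train_counts.get(int(l), 0) for l in labels])
    known = labels != open_label

    def acc(mask):
        return float((preds[mask] == labels[mask]).mean()) if mask.any() else float("nan")

    return {
        "many-shot":   acc(known & (counts > 100)),
        "medium-shot": acc(known & (counts >= 20) & (counts <= 100)),
        "few-shot":    acc(known & (counts > 0) & (counts < 20)),
        "open":        acc(labels == open_label),
    }

def peaked(k, n, p=0.9):
    """Softmax-like distribution concentrated on class k."""
    v = np.full(n, (1.0 - p) / (n - 1))
    v[k] = p
    return v

# Toy batch over 12 classes: two confident known samples and one
# near-uniform sample (max prob 1/12 < 0.1) that should be rejected.
probs = np.stack([peaked(0, 12), peaked(1, 12), np.full(12, 1.0 / 12)])
labels = np.array([0, 1, -1])
train_counts = {0: 500, 1: 50}
print(oltr_evaluate(probs, labels, train_counts))
```

With this toy batch the few-shot subset is empty, so its accuracy comes out as NaN; on the real benchmarks every subset is populated.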
- The authors introduce the OLTR task that learns from natural long-tailed, open-ended data and optimizes the overall accuracy over a balanced test set.
- The authors propose an integrated OLTR algorithm, dynamic meta-embedding, in order to share visual knowledge between head and tail classes and to reduce confusion between tail and open classes.
- The authors validate the method on three curated large-scale OLTR benchmarks (ImageNet-LT, Places-LT and MS1M-LT).
- The authors' publicly available code and data would enable future research that is directly transferable to real-world applications
- Table1: Comparison between our proposed OLTR task and related existing tasks
- Table2: Open class detection error (%) comparison. It is performed on the standard open-set benchmark, CIFAR100 + TinyImageNet (resized). “†” denotes the setting where open samples are used to tune algorithmic parameters
- Table3: Benchmarking results on (a) ImageNet-LT and (b) Places-LT. Our approach provides a comprehensive treatment to all the many/medium/few-shot classes as well as the open classes, achieving substantial advantages on all aspects
- Table4: Benchmarking results on MegaFace (left) and SUN-LT (right). Our approach achieves the best performance on natural-world datasets when compared to other state-of-the-art methods. Furthermore, our approach achieves across-the-board improvements on both ‘male’ and ‘female’ sub-groups
- While OLTR has not been defined in the literature, there are three closely related tasks which are often studied in isolation: imbalanced classification, few-shot learning, and open-set recognition. Tab. 1 summarizes their differences.
Imbalanced Classification. Arising from long-tail distributions of natural data, it has been extensively studied [41, 61, 3, 30, 62, 34, 29, 49, 6]. Classical methods include under-sampling head classes, over-sampling tail classes, and data instance re-weighting. We refer the readers to  for a detailed review. Some recent methods include metric learning [22, 33], hard negative mining [10, 27], and meta learning [15, 55]. The lifted structure loss  introduces margins between many training instances. The range loss  enforces data in the same class to be close and those in different classes to be far apart. The focal loss  induces an online version of hard negative mining. MetaModelNet  learns a meta regression net from head classes and uses it to construct the classifier for tail classes.
- This research was supported, in part, by SenseTime Group Limited, NSF IIS 1835539, Berkeley Deep Drive, DARPA, and US Government fund through Etegent Technologies on Low-Shot Detection in Remote Sensing Imagery
- Jimmy Ba, Geoffrey E Hinton, Volodymyr Mnih, Joel Z Leibo, and Catalin Ionescu. Using fast weights to attend to the recent past. In NIPS, 2016. 2, 3
- Abhijit Bendale and Terrance E Boult. Towards open set deep networks. In CVPR, 2016. 3, 5, 6, 7
- Samy Bengio. The battle against the long tail. In Talk on Workshop on Big Data and Statistical Machine Learning, 2015. 2
- Yoshua Bengio, Aaron Courville, and Pascal Vincent. Representation learning: A review and new perspectives. TPAMI, 2013. 8
- Luca Bertinetto, Joao F Henriques, Jack Valmadre, Philip Torr, and Andrea Vedaldi. Learning feed-forward one-shot learners. In NIPS, 2016. 3
- Yin Cui, Yang Song, Chen Sun, Andrew Howard, and Serge Belongie. Large scale fine-grained categorization and domain-specific transfer learning. In CVPR, 2018. 2
- Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. Imagenet: A large-scale hierarchical image database. In CVPR, 2009. 1, 3, 5
- Jiankang Deng, Jia Guo, and Stefanos Zafeiriou. Arcface: Additive angular margin loss for deep face recognition. arXiv preprint arXiv:1801.07698, 2018
- Terrance DeVries and Graham W Taylor. Learning confidence for out-of-distribution detection in neural networks. arXiv preprint arXiv:1802.04865, 2018. 3
- Qi Dong, Shaogang Gong, and Xiatian Zhu. Class rectification hard mining for imbalanced deep learning. In ICCV, 2017. 2
- Yan Duan, John Schulman, Xi Chen, Peter L Bartlett, Ilya Sutskever, and Pieter Abbeel. Rl2: Fast reinforcement learning via slow reinforcement learning. arXiv preprint arXiv:1611.02779, 2016. 2, 3
- Chelsea Finn, Pieter Abbeel, and Sergey Levine. Modelagnostic meta-learning for fast adaptation of deep networks. arXiv preprint arXiv:1703.03400, 2017. 3
- Spyros Gidaris and Nikos Komodakis. Dynamic few-shot visual learning without forgetting. In CVPR, 2018. 3, 4, 5, 6, 7
- Yandong Guo, Lei Zhang, Yuxiao Hu, Xiaodong He, and Jianfeng Gao. Ms-celeb-1m: A dataset and benchmark for large-scale face recognition. In ECCV, 2016. 5
- David Ha, Andrew Dai, and Quoc V Le. Hypernetworks. arXiv preprint arXiv:1609.09106, 2016. 2
- Bharath Hariharan and Ross B Girshick. Low-shot visual recognition by shrinking and hallucinating features. In ICCV, 2017. 1, 3, 5
- Haibo He and Edwardo A Garcia. Learning from imbalanced data. TKDE, 2008. 2
- Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In CVPR, 2016. 1, 4, 5, 7, 8
- Dan Hendrycks and Kevin Gimpel. Baseline for detecting misclassified and out-of-distribution examples in neural networks. In ICLR, 2017. 6
- Geoffrey E Hinton and David C Plaut. Using fast weights to deblur old memories. In Proceedings of the ninth annual conference of the Cognitive Science Society, 1987. 3
- Yen-Chang Hsu, Zhaoyang Lv, and Zsolt Kira. Learning to cluster in order to transfer across domains and tasks. arXiv preprint arXiv:1711.10125, 2017. 4
- Chen Huang, Yining Li, Chen Change Loy, and Xiaoou Tang. Learning deep representation for imbalanced classification. In CVPR, 2016. 2, 7
- Ira Kemelmacher-Shlizerman, Steven M Seitz, Daniel Miller, and Evan Brossard. The megaface benchmark: 1 million faces for recognition at scale. In CVPR, 2016. 5
- Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. Imagenet classification with deep convolutional neural networks. In NIPS, 2012. 1, 3
- Brenden M Lake, Ruslan Salakhutdinov, and Joshua B Tenenbaum. Human-level concept learning through probabilistic program induction. Science, 2015. 1
- Shiyu Liang, Yixuan Li, and R Srikant. Enhancing the reliability of out-of-distribution image detection in neural networks. In ICLR, 2018. 3, 6
- Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollar. Focal loss for dense object detection. In ICCV, 2017. 2, 5, 6, 7
- Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollar, and C Lawrence Zitnick. Microsoft coco: Common objects in context. In ECCV, 2014. 1
- Ziwei Liu, Ping Luo, Shi Qiu, Xiaogang Wang, and Xiaoou Tang. Deepfashion: Powering robust clothes recognition and retrieval with rich annotations. In CVPR, 2016. 2
- Ziwei Liu, Ping Luo, Xiaogang Wang, and Xiaoou Tang. Deep learning face attributes in the wild. In ICCV, 2015. 2
- Zhongqi Miao, Kaitlyn M Gaynor, Jiayun Wang, Ziwei Liu, Oliver Muellerklein, Mohammad S Norouzzadeh, Alex McInturff, Rauri C. K. Bowie, Ran Nathan, Stella X. Yu, and Wayne M. Getz. A comparison of visual features used by humans and machines to classify wildlife. bioRxiv, 2018. 1
- Tsendsuren Munkhdalai and Hong Yu. Meta networks. arXiv preprint arXiv:1703.00837, 2017. 3
- Hyun Oh Song, Yu Xiang, Stefanie Jegelka, and Silvio Savarese. Deep metric learning via lifted structured feature embedding. In CVPR, 2016. 2, 5, 6, 7
- Wanli Ouyang, Xiaogang Wang, Cong Zhang, and Xiaokang Yang. Factors in finetuning deep model for object detection with long-tail distribution. In CVPR, 2016. 2
- Hang Qi, Matthew Brown, and David G Lowe. Low-shot learning with imprinted weights. In CVPR, 2018. 3, 4
- Siyuan Qiao, Chenxi Liu, Wei Shen, and Alan Yuille. Few-shot image recognition by predicting parameters from activations. In CVPR, 2018. 3
- Sachin Ravi and Hugo Larochelle. Optimization as a model for few-shot learning. In ICLR, 2017. 3
- William J Reed. The pareto, zipf and other power laws. Economics letters, 2001. 1
- Mengye Ren, Renjie Liao, Ethan Fetaya, and Richard S Zemel. Incremental few-shot learning with attention attractor networks. arXiv preprint arXiv:1810.07218, 2018. 3
- Sara Sabour, Nicholas Frosst, and Geoffrey E Hinton. Dynamic routing between capsules. In NIPS, 2017. 5
- Ruslan Salakhutdinov, Antonio Torralba, and Josh Tenenbaum. Learning to share visual appearance for multiclass object detection. In CVPR, 2011. 2
- Adam Santoro, Sergey Bartunov, Matthew Botvinick, Daan Wierstra, and Timothy Lillicrap. Meta-learning with memory-augmented neural networks. In ICML, 2016. 3
- Nikolay Savinov, Anton Raichuk, Raphael Marinier, Damien Vincent, Marc Pollefeys, Timothy Lillicrap, and Sylvain Gelly. Episodic curiosity through reachability. arXiv preprint arXiv:1810.02274, 2018. 4
- Walter J Scheirer, Anderson de Rezende Rocha, Archana Sapkota, and Terrance E Boult. Toward open set recognition. TPAMI, 2013. 3
- Jurgen Schmidhuber. Learning to control fast-weight memories: An alternative to dynamic recurrent networks. Neural Computation, 1992. 3
- Jurgen Schmidhuber. A neural network that embeds its own meta-levels. In ICNN, 1993. 3
- Li Shen, Zhouchen Lin, and Qingming Huang. Relay backpropagation for effective learning of deep convolutional neural networks. In ECCV, 2016. 5
- Jake Snell, Kevin Swersky, and Richard Zemel. Prototypical networks for few-shot learning. In NIPS, 2017. 1, 3, 4
- Grant Van Horn and Pietro Perona. The devil is in the tails: Fine-grained classification in the wild. arXiv preprint arXiv:1709.01450, 2017. 2
- Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. In NIPS, 2017. 4
- Oriol Vinyals, Charles Blundell, Tim Lillicrap, and Daan Wierstra. Matching networks for one shot learning. In NIPS, 2016. 1, 2, 3
- Xiaolong Wang, Ross Girshick, Abhinav Gupta, and Kaiming He. Non-local neural networks. arXiv preprint arXiv:1711.07971, 2017. 2, 4
- Yu-Xiong Wang, Ross Girshick, Martial Hebert, and Bharath Hariharan. Low-shot learning from imaginary data. arXiv preprint arXiv:1801.05401, 2018. 3, 5
- Yu-Xiong Wang and Martial Hebert. Learning to learn: Model regression networks for easy small sample learning. In ECCV, 2016. 5, 7
- Yu-Xiong Wang, Deva Ramanan, and Martial Hebert. Learning to model the tail. In NIPS, 2017. 1, 2, 5, 7
- Yandong Wen, Kaipeng Zhang, Zhifeng Li, and Yu Qiao. A discriminative feature learning approach for deep face recognition. In ECCV, 2016. 4
- Flood Sung, Yongxin Yang, Li Zhang, Tao Xiang, Philip HS Torr, and Timothy M Hospedales. Learning to compare: Relation network for few-shot learning. In CVPR, 2018. 3
- Matthew D Zeiler and Rob Fergus. Visualizing and understanding convolutional networks. In ECCV, 2014. 8
- Xiao Zhang, Zhiyuan Fang, Yandong Wen, Zhifeng Li, and Yu Qiao. Range loss for deep face recognition with longtailed training data. In CVPR, 2017. 2, 5, 7, 8
- Bolei Zhou, Agata Lapedriza, Aditya Khosla, Aude Oliva, and Antonio Torralba. Places: A 10 million image database for scene recognition. TPAMI, 2018. 5
- Xiangxin Zhu, Dragomir Anguelov, and Deva Ramanan. Capturing long-tail distributions of object subcategories. In CVPR, 2014. 2
- Xiangxin Zhu, Carl Vondrick, Charless C Fowlkes, and Deva Ramanan. Do we need more training data? IJCV, 2016. 2