Fine-grained pose prediction, normalization, and recognition.

arXiv: Computer Vision and Pattern Recognition(2015)

引用 76|浏览236
暂无评分
摘要
Pose variation and subtle differences in appearance are key challenges to fine-grained classification. While deep networks have markedly improved general recognition, many approaches to fine-grained recognition rely on anchoring networks to parts for better accuracy. Identifying parts to find correspondence discounts pose variation so that features can be tuned to appearance. To this end previous methods have examined how to find parts and extract pose-normalized features. These methods have generally separated fine-grained recognition into stages which first localize parts using hand-engineered and coarsely-localized proposal features, and then separately learn deep descriptors centered on inferred part positions. We unify these steps in an end-to-end trainable network supervised by keypoint locations and class labels that localizes parts by a fully convolutional network to focus the learning of feature representations for the fine-grained classification task. Experiments on the popular CUB200 dataset show that our method is state-of-the-art and suggest a continuing role for strong supervision.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要