Classifier Classifier Classifier Fine-grained classication Logo retrieval Original Image Background Reconstruction Bckg . Filtering ( Text Saliency ) Binarizing Text Saliency Character Detection Textual Cue Encoding Visual Cue Encoding Character Recognition

semanticscholar(2017)

引用 0|浏览1
暂无评分
摘要
This work focuses on fine-grained object classification using recognized scene text in natural images. While the state-of-the-art relies on visual cues only, this paper is the first work which proposes to combine textual and visual cues. Another novelty is the textual cue extraction. Unlike the state-of-theart text detection methods, we focus more on the background instead of text regions. Once text regions are detected, they are further processed by two methods to perform text recognition i.e. ABBYY commercial OCR engine and a state-of-the-art character recognition algorithm. Then, to perform textual cue encoding, biand trigrams are formed between the recognized characters by considering the proposed spatial pairwise constraints. Finally, extracted visual and textual cues are combined for fine-grained classification. The proposed method is validated on four publicly available datasets: ICDAR03, ICDAR13, Con-Text and Flickr-logo. We improve the state-of-the-art end-to-end character recognition by a large margin of 15% on ICDAR03. We show that textual cues are useful in addition to visual cues for fine-grained classification. We show that textual cues are also useful for logo retrieval. Adding textual cues outperforms visualand textual-only in finegrained classification (70.7% to 60.3%) and logo retrieval (57.4% to 54.8%).
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要