Robust Scene Text Detection Under Occlusion via Multi-scale Adaptive Deep Network.

IW-FCV (2023)

Abstract
Detecting occluded text in natural images remains a challenging problem in scene text detection: detectors are highly sensitive to occlusion, which dramatically degrades their performance. Although several works address occluded text in natural images, they still struggle to accurately identify word regions that are split apart by occlusion. In this paper, we first exploit salient attention maps from Gradient-weighted Class Activation Mapping++ (Grad-CAM++) to obtain knowledge of the important regions in an image. Furthermore, to effectively capture text instances at different scales and enhance feature representations, we design a MulTi-scale adaptive Deep network (MTD). In addition, we build an occluded-text dataset from the ICDAR 2015 benchmark, named the Realistic Occluded Text Detection (ROTD) dataset, and combine part of ROTD with ICDAR 2015 during training so that the model gains awareness of occluded text. Together, these contributions significantly improve detection accuracy on natural scenes containing partially occluded text. Our proposed method achieves state-of-the-art results on partially occluded text detection, with an F1-score of 69.6% on ISTD-OC and 78.7% on our ROTD, and attains a competitive F1-score of 82.4% on the ICDAR 2015 benchmark.
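The abstract only states that salient attention maps from Grad-CAM++ are used to highlight important regions; it gives no implementation details. Below is a minimal, illustrative sketch of how such Grad-CAM++ saliency maps can be computed in PyTorch, assuming a ResNet-50 backbone and its last residual block as the target layer (both are illustrative assumptions, not the authors' configuration).

import torch
import torch.nn.functional as F
from torchvision import models

class GradCAMpp:
    """Minimal Grad-CAM++ saliency extractor for a single image (sketch)."""
    def __init__(self, model, target_layer):
        self.model = model.eval()
        self.activations = None
        self.gradients = None
        target_layer.register_forward_hook(self._hook)

    def _hook(self, module, inputs, output):
        # Keep the feature map and attach a tensor hook to catch its gradient.
        self.activations = output.detach()
        output.register_hook(lambda g: setattr(self, "gradients", g.detach()))

    def __call__(self, x, class_idx=None):
        logits = self.model(x)
        if class_idx is None:
            class_idx = logits.argmax(dim=1).item()
        self.model.zero_grad()
        logits[0, class_idx].backward()

        A, dY = self.activations, self.gradients           # both (1, C, H, W)
        grad2, grad3 = dY ** 2, dY ** 3
        # Grad-CAM++ pixel-wise coefficients alpha, then channel weights.
        denom = 2.0 * grad2 + (A * grad3).sum(dim=(2, 3), keepdim=True)
        alpha = grad2 / (denom + 1e-8)
        weights = (alpha * F.relu(dY)).sum(dim=(2, 3), keepdim=True)

        cam = F.relu((weights * A).sum(dim=1, keepdim=True))  # (1, 1, h, w)
        cam = F.interpolate(cam, size=x.shape[2:], mode="bilinear",
                            align_corners=False)
        return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)

# Toy usage on a random input; in the paper's setting the resulting map would
# indicate salient (e.g. text-relevant) regions of a scene image.
model = models.resnet50(weights=None)
saliency = GradCAMpp(model, model.layer4[-1])(torch.randn(1, 3, 224, 224))
print(saliency.shape)  # torch.Size([1, 1, 224, 224])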
Keywords
robust scene text detection, occlusion, multi-scale