Flexible scene text recognition based on dual attention mechanism

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE(2021)

Abstract
Scene text recognition (STR), which extracts text from complex natural scenes, is a popular topic in computer vision. In this article, we propose an end-to-end trainable and flexible STR method based on a dual attention mechanism. The proposed method consists of four modules: a thin plate spline transformer that normalizes the original image, a Channel-Att (channel attention) feature extractor that obtains representative features, a bidirectional long short-term memory encoder that encodes sequential context features, and a Self-Att (self-attention) based decoder that predicts text labels. Results on seven benchmark datasets (IIIT, SVT, IC03, IC13, IC15, SVTP, and CUTE) show that the proposed method is comparable to 13 existing methods. In particular, its average text recognition accuracy is about 1.4% higher than that of the state-of-the-art method.
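The abstract does not spell out how the Channel-Att feature extractor weights its channels. As a rough illustration only, the sketch below assumes a squeeze-and-excitation-style gate, a common form of channel attention: each channel is averaged to a single value, passed through a small bottleneck, and the resulting sigmoid gate rescales that channel. The function name `channel_attention` and the weight matrices `w1` and `w2` are hypothetical and are not taken from the paper.

```python
import math


def channel_attention(feature_map, w1, w2):
    """Squeeze-and-excitation-style channel gate (an assumed form, not the paper's).

    feature_map: list of C channels, each an H x W list of floats.
    w1: R x C bottleneck weights (hypothetical parameters).
    w2: C x R expansion weights (hypothetical parameters).
    Returns (rescaled_feature_map, gates).
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, step 1: linear bottleneck followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(weights, squeezed)))
        for weights in w1
    ]
    # Excitation, step 2: linear expansion followed by a sigmoid gate per channel.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(weights, hidden))))
        for weights in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    rescaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return rescaled, gates
```

In a trained network the gate weights would be learned jointly with the rest of the model, so channels that carry more useful text features receive gates closer to 1.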
Keywords
scene text recognition, channel attention, self-attention