HCADecoder: A Hybrid CTC-Attention Decoder for Chinese Text Recognition

DOCUMENT ANALYSIS AND RECOGNITION, ICDAR 2021, PT III(2021)

引用 2|浏览7
暂无评分
摘要
Text recognition has attracted much attention and achieved exciting results on several commonly used public English datasets in recent years. However, most of these well-established methods, such as connectionist temporal classification (CTC)-based methods and attention-based methods, pay less attention to challenges on the Chinese scene, especially for long text sequences. In this paper, we exploit the characteristic of Chinese word frequency distribution and propose a hybrid CTC-Attention decoder (HCADecoder) supervised with bigram mixture labels for Chinese text recognition. Specifically, we first add high-frequency bigram subwords into the original unigram labels to construct the mixture bigram label, which can shorten the decoding length. Then, in the decoding stage, the CTC module outputs a preliminary result, in which confused predictions are replaced with bigram subwords. The attention module utilizes the preliminary result and outputs the final result. Experimental results on four Chinese datasets demonstrate the effectiveness of the proposed method for Chinese text recognition, especially for long texts. Code will be made publicly available(https://github.com/lukecsq/hybrid-CTC-Attention).
更多
查看译文
关键词
Chinese text recognition, CTC-Attention, Subword
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要