Two-step sequence transformer based method for Cham to Latin script transliteration

PROCEEDINGS OF THE 2023 INTERNATIONAL WORKSHOP ON HISTORICAL DOCUMENT IMAGING AND PROCESSING, HIP 2023(2023)

引用 0|浏览5
暂无评分
摘要
Fusion information between visual and textual information is an interesting way to better represent the features. In thiswork, we propose a method for the text line transliteration of Cham manuscripts by combining visual and textual modality. Instead of using a standard approach that directly recognizes the words in the image, we split the problem into two steps. Firstly, we propose a scenario for recognition where similar characters are considered as unique characters, then we use the transformer model which considers both visual and context information to adjust the prediction when it concerns similar characters to be able to distinguish them. Based on this two-step strategy, the proposed method consists of a sequence to sequence model and a multi-modal transformer. Thus, we can take advantage of both the sequence-to-sequence model and the transformer model. Extensive experiments prove that the proposed method outperforms the approaches of the literature on our Cham manuscripts dataset.
更多
查看译文
关键词
Transliteration,Historical documents,Cham manuscript images,Transformer,Sequence to Sequence
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要