Using Attributes For Word Spotting And Recognition In Polytonic Greek Documents

2015 13th International Conference on Document Analysis and Recognition (ICDAR)(2015)

Abstract
Word spotting and recognition are among the most important applications in document processing and text understanding today. In word spotting, the goal is to search a scanned document for instances of a specific word. In word recognition, the aim is to identify the transcription of the words in the document. While substantial work on both topics has been published, not all of the proposed methods are readily adaptable to scripts or languages other than the one they were designed for. This is especially true for documents written in polytonic Greek, a script used to write the Greek language over a period spanning approximately two millennia. In this work, we extend the attribute-based model for word spotting and recognition recently presented in [1] for use with polytonic Greek documents. To this end, we present three alternative ways to extend the model to handle the Greek alphabet and its various combinations of diacritic marks. We have run numerical experiments on polytonic machine-printed and handwritten documents for both word spotting and recognition. The proposed model is shown to outperform other state-of-the-art methods in word spotting trials. Regarding unconstrained handwritten word recognition in polytonic Greek, to the best of our knowledge, this is the first work to address the problem successfully.
Keywords
document processing,text understanding,word recognition,word spotting,document word transcription,polytonic Greek documents,Greek language,attribute-based model,Greek alphabet,diacritic marks,polytonic Greek script,polytonic machine-printed documents,polytonic handwritten documents