Flexible Sequence Matching technique

Pattern Recognition (2016)

Abstract
This paper presents a robust method for word spotting in degraded handwritten and printed document images. It introduces a new sequence matching technique, the Flexible Sequence Matching (FSM) algorithm, for this task. FSM was designed to incorporate the crucial characteristics of other sequence matching algorithms, in particular Dynamic Time Warping (DTW), Subsequence DTW (SSDTW), Minimal Variance Matching (MVM), and Continuous Dynamic Programming (CDP). In addition to multiple matching (many-to-one and one-to-many), FSM can skip outliers or noisy elements regardless of their positions in the target signal. More precisely, in the domain of word spotting, FSM can retrieve complete words or words that contain only a part of the query. Furthermore, because of its adaptable skipping capability, FSM is less sensitive to local variations in the spelling of words and to local degradation effects within the word image. The multiple matching capability of FSM helps it address the stretching effects of query and/or target images. Moreover, FSM is designed so that, with little modification, its architecture can be changed into that of DTW, MVM, SSDTW, or CDP-like techniques. To illustrate these possibilities for specific cases of word spotting, such as incorrect word segmentation and word-level local variations, we performed experiments on historical handwritten and printed document images. To demonstrate sub-sequence matching, noise skipping, and the ability to work in a multilingual paradigm with local spelling variations, we considered properly segmented lines of historical handwritten documents in different languages, and both improperly and properly segmented words in printed and handwritten historical documents.
The comparative experimental results in this paper show that FSM can match or outperform most DTW-based word spotting techniques in the literature, while providing more meaningful correspondences between elements.

Highlights
Flexible Sequence Matching (FSM) is introduced here and applied to word spotting.
FSM can do partial matching, skip outliers, and find one-to-one/many correspondences.
FSM is able to spot words inside segmented lines or from improperly segmented words.
FSM can handle word derivatives and spelling variations.
FSM can behave as other sequence matching techniques (DTW, MVM, CDP).
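
As a point of reference for the techniques named in the abstract, the sketch below shows textbook DTW and subsequence DTW on toy 1-D sequences. It is not the FSM algorithm, whose recurrence is not given here. The function name `dtw_cost`, the absolute-difference cost, and the sample values are assumptions made purely for illustration.

```python
import math


def dtw_cost(query, target, subsequence=False):
    """Return the minimal warping cost between two 1-D numeric sequences.

    subsequence=False: classic DTW, both sequences are aligned end to end.
    subsequence=True:  subsequence DTW, the query may start and end anywhere
                       in the target, so leading and trailing target elements
                       are skipped at no cost.
    Illustrative sketch only; this is not the FSM recurrence.
    """
    n, m = len(query), len(target)
    inf = math.inf
    # acc[i][j]: best cost of a warping path that aligns query[:i]
    # and ends at target element j-1.
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    if subsequence:
        # Free start: the path may begin at any target position.
        acc[0] = [0.0] * (m + 1)
    else:
        acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(query[i - 1] - target[j - 1])  # assumed local cost
            # Diagonal step = one-to-one; vertical and horizontal steps
            # give the many-to-one and one-to-many correspondences.
            acc[i][j] = d + min(acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1])
    # Free end for the subsequence variant: best cost over all end positions.
    return min(acc[n][1:]) if subsequence else acc[n][m]


query = [1, 2, 3]
target = [9, 9, 1, 2, 2, 3, 9]  # the query is embedded between noisy elements
print(dtw_cost(query, target))                    # penalised by the 9s
print(dtw_cost(query, target, subsequence=True))  # 0.0, the 9s are skipped
```

In this toy case the subsequence variant ignores noise only at the two ends of the target. Per the abstract, FSM is additionally meant to skip outliers regardless of their position in the target signal.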
Keywords
Flexible Sequence Matching (FSM), Dynamic Time Warping (DTW), Minimal Variance Matching (MVM), Subsequence DTW (SSDTW), Continuous Dynamic Programming (CDP), Word spotting, Historical documents, Handwritten documents, Printed documents, George Washington dataset