Historical Document Binarization Combining Semantic Labeling And Graph Cuts

IMAGE ANALYSIS, SCIA 2017, PT I(2017)

Abstract
Most data mining applications on collections of historical documents require binarization of the digitized images as a preprocessing step. Historical documents often suffer from degradations such as parchment aging, smudges, and bleed-through from the reverse side. The text is sometimes printed but more often handwritten. Mathematically modeling the appearance of the text, the background, and all kinds of degradation is challenging. In this work we treat binarization as a pixel classification problem. We first apply semantic segmentation using fully convolutional neural networks. To sharpen the result, we then apply a graph cut algorithm. The labels from the semantic segmentation serve as approximate estimates of the text and background, and the background probability map is used to prune edges in the graph cut. The results show a significant improvement over the state-of-the-art approach.
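The abstract does not spell out how the graph is built, so the following is only a minimal sketch of the refinement step under stated assumptions, not the authors' implementation. It assumes the network's output is already available as a per-pixel background probability map (`p_background`), uses negative log-probabilities as terminal costs, connects 4-neighbours with a constant `smoothness` weight, and interprets "pruning" as dropping a neighbour edge when the background probabilities of the two pixels differ by more than `prune_threshold`. The function names, both parameters, and the pruning rule are hypothetical; the max-flow solver is a plain Edmonds-Karp routine chosen for self-containment.

```python
import math
from collections import deque


def max_flow_min_cut(capacity, source, sink):
    """Edmonds-Karp max-flow. Returns the nodes on the source side of a minimum cut."""
    # Residual graph: copy forward capacities, add zero-capacity reverse edges.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0.0)
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            # No path left: everything still reachable is on the source side.
            return set(parent)
        # Find the bottleneck capacity along the path, then push flow.
        bottleneck, v = math.inf, sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u


def graph_cut_binarize(p_background, smoothness=1.0, prune_threshold=0.5):
    """Return a 2D list with 1 for text and 0 for background.

    p_background: 2D list of background probabilities (assumed network output).
    smoothness, prune_threshold: hypothetical parameters of this sketch.
    """
    rows, cols = len(p_background), len(p_background[0])
    eps = 1e-6  # keeps the logarithms finite
    source, sink = "text", "background"
    capacity = {source: {}, sink: {}}
    for r in range(rows):
        for c in range(cols):
            p = min(max(p_background[r][c], eps), 1.0 - eps)
            # Cutting source->pixel labels the pixel background: cost -log p.
            capacity[source][(r, c)] = -math.log(p)
            # Cutting pixel->sink labels the pixel text: cost -log(1 - p).
            capacity[(r, c)] = {sink: -math.log(1.0 - p)}
    for r in range(rows):
        for c in range(cols):
            for nr, nc in ((r, c + 1), (r + 1, c)):
                if nr >= rows or nc >= cols:
                    continue
                # Assumed pruning rule: no smoothing across a confident boundary.
                if abs(p_background[r][c] - p_background[nr][nc]) > prune_threshold:
                    continue
                capacity[(r, c)][(nr, nc)] = smoothness
                capacity[(nr, nc)][(r, c)] = smoothness
    text_side = max_flow_min_cut(capacity, source, sink)
    return [[1 if (r, c) in text_side else 0 for c in range(cols)]
            for r in range(rows)]
```

In this sketch, a weakly text-leaning pixel surrounded by confident background, such as `[[0.9, 0.9, 0.45, 0.9, 0.9]]`, is smoothed to background, while a one-pixel stroke such as `[[0.95, 0.05, 0.95]]` survives even with a large `smoothness` because its neighbour edges are pruned. Setting `prune_threshold=1.0` disables pruning, and the same stroke is then erased when `smoothness=5.0`.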
Keywords
Binarization, Semantic labeling, Deep learning, Graph cut, Zero shot learning