The Handwritten Sundanese Palm Leaf Manuscript Dataset From 15th Century

2017 14TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), VOL 1(2017)

Abstract
Several digitization campaigns have recently been carried out to preserve Sundanese palm leaf manuscripts. To support further access for education and research, a handwritten Sundanese palm leaf manuscript dataset, called the Lontar Sunda dataset, has been created. The dataset was constructed from 66 pages of 27 collections of Sundanese palm leaf manuscripts from the 15th century. The manuscripts used come from Garut, West Java, Indonesia. This paper presents the dataset, which is publicly available for scientific use. The ground truth includes binarized images, word-level annotations, and character-level annotations. The dataset provides data for testing word spotting, character/symbol recognition, and binarization methods, and is intended to facilitate the evaluation of such methods.
Keywords
Sundanese manuscript, palm leaf manuscript, dataset, image, isolated character