Grammar-based techniques for creating ground-truthed sketch corpora

International Journal on Document Analysis and Recognition (2010)

Abstract
Although publicly available, ground-truthed corpora have proven useful for training, evaluating, and comparing recognition systems in many domains, the availability of such corpora for sketch recognizers, and math recognizers in particular, is currently quite poor. This paper presents a general approach to creating large, ground-truthed corpora for structured sketch domains such as mathematics. In the approach, random sketch templates are generated automatically using a grammar model of the sketch domain. These templates are transcribed manually, then automatically annotated with ground-truth. The annotation procedure uses the generated sketch templates to find a matching between transcribed and generated symbols. A large, ground-truthed corpus of handwritten mathematical expressions presented in the paper illustrates the utility of the approach.
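
The template-generation step can be illustrated with a minimal sketch, assuming a toy context-free grammar for arithmetic expressions. The grammar, the `generate` function, and the depth cutoff are hypothetical illustrations and are not taken from the paper, whose grammar model of the sketch domain is richer than this. The idea shown is only that expanding a start symbol at random yields a template whose symbol sequence is known in advance, which is what later makes automatic ground-truth annotation possible.

```python
import random

# Hypothetical toy grammar (an assumption, not the paper's model).
# Each nonterminal maps to a list of productions; any symbol that is
# not a key of GRAMMAR is treated as a terminal symbol.
# Convention: the FIRST production of EXPR, TERM and ATOM leads to a
# terminal without recursion, so it can be used to force termination.
GRAMMAR = {
    "EXPR": [["TERM"], ["EXPR", "+", "TERM"], ["EXPR", "-", "TERM"]],
    "TERM": [["ATOM"], ["TERM", "*", "ATOM"], ["FRAC"]],
    "FRAC": [["frac(", "EXPR", ",", "EXPR", ")"]],
    "ATOM": [["x"], ["y"], ["2"], ["(", "EXPR", ")"]],
}


def generate(symbol, rng, depth=0, max_depth=6):
    """Randomly expand `symbol` into a list of terminal symbols."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal: emit as-is
    productions = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the cutoff, only the first production is allowed,
        # which keeps the recursion bounded.
        productions = productions[:1]
    production = rng.choice(productions)
    tokens = []
    for child in production:
        tokens.extend(generate(child, rng, depth + 1, max_depth))
    return tokens


if __name__ == "__main__":
    rng = random.Random(2010)  # fixed seed so templates are reproducible
    for _ in range(3):
        print(" ".join(generate("EXPR", rng)))
```

Because the generator records every terminal it emits, each template comes with its own symbol-level ground truth; the manual transcription and the symbol matching described in the abstract are separate steps that this sketch does not attempt.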
Keywords
Expression Tree, Label Algorithm, Terminal Symbol, Symbol Recognition, Nonterminal Symbol