Data Sufficiency for Online Writer Identification: A Comparative Study of Writer-Style Space vs. Feature Space Models

Pattern Recognition (2014)

Abstract
A key factor in building effective writer identification/verification systems is the amount of data required to build the underlying models. In this research we systematically examine data sufficiency bounds for two broad approaches to online writer identification: feature space models and writer-style space models. We report results from 40 experiments conducted on two publicly available datasets, and also test identification performance for the target models using two different feature functions. Our findings show that the writer-style space model gives higher identification performance for a given level of data and, further, achieves high performance levels at lower data cost. This model appears to require as few as 20 words per page to achieve identification performance close to 80%, and reaches more than 90% accuracy with higher levels of data enrollment.
Keywords
data analysis, formal verification, text analysis, data sufficiency, feature space model, online writer identification, writer identification-verification system, writer-style space model