CRF-based bibliography extraction from reference strings using a small amount of training data

2017 Twelfth International Conference on Digital Information Management (ICDIM), 2017

Abstract
The effective use of digital libraries demands maintenance of bibliographic databases. Useful bibliographic information appears in the reference fields of academic papers, so we are developing a method for automatically extracting bibliographic information from reference strings using a conditional random field (CRF). However, at least a few hundred labeled reference strings are necessary to train an accurate CRF. In this paper, we propose active learning and transfer learning techniques to reduce the amount of training data that CRFs require. We evaluate extraction accuracy and the associated training cost experimentally.
Keywords
bibliography extraction, CRF, confidence measure, active learning, transfer learning
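
The abstract pairs active learning with a confidence measure, and the general idea can be illustrated with a minimal Python sketch. This is not the authors' method: the abstract does not define their confidence measure, so the sketch assumes a simple one (the lowest per-token marginal probability in a reference string). The names `sequence_confidence` and `select_for_annotation`, and the toy probabilities, are hypothetical and stand in for the output of a trained CRF.

```python
from typing import Dict, List


def sequence_confidence(token_marginals: List[float]) -> float:
    """Assumed confidence measure: the probability of the least certain token.

    The paper's actual measure is not given in the abstract; this is a
    common stand-in used only for illustration.
    """
    return min(token_marginals)


def select_for_annotation(pool: Dict[str, List[float]], budget: int) -> List[str]:
    """Return the `budget` reference strings the model is least confident about.

    `pool` maps each unlabeled reference string (here, an identifier) to the
    per-token marginal probabilities of its predicted labels.
    """
    ranked = sorted(pool, key=lambda ref: sequence_confidence(pool[ref]))
    return ranked[:budget]


# Toy pool: hypothetical per-token marginals for three reference strings.
pool = {
    "ref_a": [0.99, 0.98, 0.97],
    "ref_b": [0.95, 0.40, 0.90],
    "ref_c": [0.80, 0.85, 0.70],
}

# The two least confident strings would be sent to a human annotator first.
chosen = select_for_annotation(pool, budget=2)
print(chosen)
```

In a full active learning loop, the selected strings would be labeled by hand, added to the training set, and the CRF retrained before the next selection round, so that annotation effort goes to the strings the model handles worst.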