CRF-based bibliography extraction from reference strings using a small amount of training data
2017 Twelfth International Conference on Digital Information Management (ICDIM)(2017)
摘要
The effective use of digital libraries demands maintenance of bibliographic databases. Useful bibliographic information appears in the reference fields of academic papers, so we are developing a method for automatic extraction of bibliographic information from reference strings using a conditional random field (CRF). However, at least a few hundred reference strings are necessary to learn an accurate CRF. In this paper, we propose active learning and transfer learning techniques to reduce the required training data for CRFs. We evaluate extraction accuracies and the associated training cost by experiments.
更多查看译文
关键词
bibliography extraction,CRF,confidence measure,active learning,transfer learning
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要