Evaluation of Header Metadata Extraction Approaches and Tools for Scientific PDF Documents

Mario Lipinski, Kevin Yao, Corinna Breitinger, Joeran Beel, Bela Gipp

JCDL (2013)

Abstract
This paper evaluates the performance of tools for the extraction of metadata from scientific articles. Accurate metadata extraction is an important task for automating the management of digital libraries. This comparative study is a guide for developers looking to integrate the most suitable and effective metadata extraction tool into their software. We shed light on the strengths and weaknesses of seven tools in common use. In our evaluation using papers from the arXiv collection, GROBID delivered the best results, followed by Mendeley Desktop. SciPlore Xtract, PDFMeat, and SVMHeaderParse also delivered good results depending on the metadata type to be extracted.
Keywords
Information Retrieval, Metadata Extraction, Evaluation, PDF