The overview of the NLM-Chem BioCreative VII track Full-text Chemical Identification and Indexing in PubMed articles

semanticscholar(2021)

引用 6|浏览0
暂无评分
摘要
The BioCreative NLM-Chem track calls for a community effort to fine-tune automated recognition of chemical names in biomedical literature. Chemical names are one of the most searched biomedical entities in PubMed and – as highlighted during the COVID-19 pandemic – their identification may significantly advance research in multiple biomedical subfields. While previous community challenges focused on identifying chemical names mentioned in titles and abstracts, the full text contains valuable additional detail. We organized the BioCreative NLM-Chem track to call for a community effort to address automated chemical entity recognition in full-text articles. The track consisted of two tasks: 1) Chemical Identification task, and 2) Chemical Indexing prediction task. For the Chemical Identification task, participants were expected to predict with high accuracy all chemicals mentioned in recently published full-text articles, both span (i.e., named entity recognition) and normalization (i.e., entity linking) using MeSH. For the Chemical Indexing task, participants identified which chemicals should be indexed as topics for the article's topic terms in the NLM article and indexing, i.e., appear in the listing of MeSH terms for the document. This manuscript summarizes the BioCreative NLM-Chem track. We received a total of 88 submissions in total from 17 teams worldwide. The highest performance achieved for the Chemical Identification task was 0.8672 f-score (0.8759 precision, 0.8587 recall) for strict NER performance and 0.8136 f-score (0.8621 precision, 0.7702 recall) for strict normalization performance. The highest performance achieved for the Chemical Indexing task was 0.4825 f-score (0.4397 precision, 0.5344 recall). The NLM-Chem track dataset and other challenge materials are publicly available at https://ftp.ncbi.nlm.nih.gov/pub/lu/BC7-NLM-Chem-track/. This community challenge demonstrated 1) the current substantial achievements in deep learning technologies can be utilized to further improve automated prediction accuracy, and 2) the Chemical Indexing task is substantially more challenging. We look forward to further development of biomedical text mining methods to respond to the rapid growth of biomedical literature. Keywords— biomedical text mining; natural language processing; artificial intelligence; machine learning; deep learning; text mining; chemical entity recognition; chemical indexing
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要