eMIND: Enabling automatic collection of protein variation impacts in Alzheimer’s disease from the literature

bioRxiv (Cold Spring Harbor Laboratory)(2023)

Abstract
Alzheimer’s disease and related dementias (AD/ADRDs) are among the most common forms of dementia, and yet no effective treatments have been developed. To gain insight into the disease mechanism, it is essential to capture the connection between genetic variations and their impacts at the disease and molecular levels. The scientific literature continues to be a main source for reporting experimental information about the impact of variants. Thus, developing automatic methods to identify publications and extract information from unstructured text would facilitate collecting and organizing this information for reuse. We developed eMIND, a deep learning-based text mining system that supports the automatic extraction of annotations of variants and their impacts in AD/ADRDs. In particular, we use this method to capture the impacts of protein-coding variants affecting a selected set of protein properties, such as protein activity/function, structure and post-translational modifications. We evaluated the efficacy of eMIND in extracting variant impact relations and obtained a recall of 0.84 and a precision of 0.94. The publications and extracted information are integrated into the UniProtKB computationally mapped bibliography to expand annotations on protein entries. eMIND’s text-mined output is presented using controlled vocabularies and ontologies for variant, disease and impact, along with the evidence sentences. A sample of annotated abstracts can be accessed at URL: .
Competing Interest Statement: The authors have declared no competing interest.
Keywords
protein variation impacts, Alzheimer