Text Mining Enhancements for Image Recognition of Gene Names and Gene Relations

Computational Intelligence Methods for Bioinformatics and Biostatistics (2022)

Abstract
The biological literature is growing rapidly, and with it the number of biological pathway figures published in those papers. Each pathway figure carries rich biological information in the form of gene names and gene relations. However, manually curating pathway figures requires tremendous time and labor. Advanced image understanding models may accelerate curation, but their accuracy still needs improvement. Because each pathway figure is associated with a paper, most of the gene names and gene relations in a figure also appear in the text of that paper, so text mining can be used to improve the image recognition results. In this paper, we apply a fuzzy matching method with different “gene dictionaries” to detect gene names, and we use gene co-occurrence in the plain text to suggest gene relations. We demonstrate that text mining methods improve the performance of image understanding for both gene name recognition and gene relation extraction. All data and code are available on GitHub (https://github.com/lyfer233/Text-Mining-Enhancements-for-Image-Recognition-of-Gene-Names-and-Gene-Relations).
Keywords
Text mining, Biological pathway, Gene name, Gene relation
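
The two text mining steps named in the abstract, fuzzy matching against a gene dictionary and co-occurrence counting in the paper text, can be illustrated with a minimal Python sketch. This is not the authors' implementation (their code is in the linked repository). The toy dictionary, the function names `correct_gene_name` and `cooccurrence_counts`, the similarity cutoff of 0.7, and the choice of sentence-level co-occurrence are all assumptions made for illustration, and the standard-library `difflib` matcher stands in for whatever fuzzy matching method the paper actually uses.

```python
import difflib
import itertools
import re
from collections import Counter

# Hypothetical toy dictionary; the paper's actual gene dictionaries are not reproduced here.
GENE_DICTIONARY = ["TP53", "MDM2", "AKT1", "MTOR", "EGFR"]


def correct_gene_name(ocr_token, dictionary=GENE_DICTIONARY, cutoff=0.7):
    """Return the closest dictionary entry for a token read from a figure,
    or None if no entry reaches the similarity cutoff (an assumed value)."""
    matches = difflib.get_close_matches(
        ocr_token.upper(), dictionary, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


def cooccurrence_counts(text, genes):
    """Count how often each pair of genes is mentioned in the same sentence.
    Sentence-level counting is an assumption; other window sizes are possible."""
    counts = Counter()
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        tokens = set(re.findall(r"[A-Za-z0-9]+", sentence.upper()))
        present = sorted(g for g in genes if g in tokens)
        for pair in itertools.combinations(present, 2):
            counts[pair] += 1
    return counts


if __name__ == "__main__":
    # A misread label such as "TP5E" is mapped back to a dictionary entry.
    print(correct_gene_name("TP5E"))  # TP53

    paper_text = (
        "MDM2 inhibits TP53 in tumor cells. "
        "AKT1 activates MTOR. "
        "TP53 and MDM2 form a feedback loop."
    )
    # Pairs that co-occur often are candidates for a relation in the figure.
    print(cooccurrence_counts(paper_text, GENE_DICTIONARY).most_common(1))
```

In a pipeline of this shape, the corrected names would replace low-confidence labels from the image model, and frequently co-occurring pairs would serve as hints for which relations to expect in the figure. How the paper combines these signals with the image model's output is not stated in the abstract.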