Exploiting Grammatical Relations for Protein Relation Extraction and Role Labeling

msra(2008)

Abstract
Automatic mining of protein interactions from natural language texts and automatic identification of the agent and target proteins (i.e., role labeling) are challenging problems that attract considerable attention because of the growing amount of biomedical text resources. We propose a novel approach that relies exclusively on parsing and dependency information. We strategically omit any context information such as keywords or parts of speech to abstract maximally from the given corpora, and examine whether the grammatical relations correspond to the semantic relations in the text and how close this correspondence is. In particular, we construct a feature vector for each sentence only from the grammatical relations and some parsing information. We then use the obtained vector with standard machine learning algorithms to decide whether a sentence describes a protein interaction and which roles the interaction participants play. Evaluation on benchmark datasets shows that our method is competitive with existing state-of-the-art algorithms for the extraction of protein interactions and gives promising results for protein role detection.
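To make the idea of a feature vector built only from grammatical relations concrete, here is a minimal sketch. It is an illustration under assumptions, not the paper's feature set: the relation inventory `RELATIONS`, the function name `relation_vector`, and the hand-written toy parse are all hypothetical. The sketch counts relation labels in a sentence's dependency triples and ignores the words themselves, so no keyword or part-of-speech information enters the vector.

```python
from collections import Counter

# Hypothetical, fixed inventory of grammatical relation labels
# (an assumption for illustration, not the paper's actual features).
RELATIONS = ["nsubj", "dobj", "prep", "conj", "amod"]


def relation_vector(dependencies):
    """Count each relation label in a sentence's dependency triples.

    `dependencies` is a list of (relation, head, dependent) tuples, such as
    a dependency parser might produce. Only the relation label is used;
    the head and dependent words are deliberately ignored.
    """
    counts = Counter(rel for rel, _head, _dep in dependencies)
    return [counts[rel] for rel in RELATIONS]


# Toy parse of "ProtA activates ProtB" (hand-written, not parser output).
example = [
    ("nsubj", "activates", "ProtA"),
    ("dobj", "activates", "ProtB"),
]
vector = relation_vector(example)
```

A vector of this kind could then be passed to any standard classifier to decide whether the sentence describes an interaction; the choice of classifier is left open here, as it is in the abstract.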
Keywords
natural language,machine learning,relation extraction,feature vector,part of speech