Protein Language Model Predicts Mutation Pathogenicity and Clinical Prognosis

bioRxiv (Cold Spring Harbor Laboratory) (2022)

Abstract
Accurately predicting the effects of mutations in cancer has the potential to improve existing treatments and identify novel therapeutic targets. In this paper, we provide the first evidence that large-scale pre-trained protein language models (PPLMs) are zero-shot predictors for two clinically relevant tasks: identifying disease-causing mutations and predicting patient survival rate. We then benchmark a series of state-of-the-art (SOTA) PPLMs on 2279 protein variants across 20 cancer-related genes. Our empirical results show that the PPLMs outperform the SOTA baseline, EVE [1], which is trained on multiple sequence alignment (MSA) data. We also demonstrate that the evolutionary index score, generated from the PPLM’s softmax layer, is a good indicator of both mutation pathogenicity and patient survival rate. This work takes a key step toward the clinical utility of large-scale PPLMs.
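The abstract does not define the evolutionary index score, so the following is only a minimal sketch of one common zero-shot scoring scheme, not the authors' implementation. It assumes the score for a single substitution is the log-ratio of the wild-type to the mutant softmax probability at the mutated position, so that a higher score means the model finds the mutant residue less plausible. The names `AMINO_ACIDS`, `log_softmax`, and `evolutionary_index`, and the alphabetical vocabulary order, are hypothetical choices made for illustration.

```python
import math

# Assumption: a 20-letter amino acid vocabulary in this fixed order.
# A real PPLM tokenizer will use its own vocabulary and ordering.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def evolutionary_index(logits, wild_type, mutant):
    """Assumed score: log p(wild_type) - log p(mutant) at one position.

    `logits` are the model's pre-softmax outputs for the mutated position,
    one value per entry of AMINO_ACIDS.
    """
    if len(logits) != len(AMINO_ACIDS):
        raise ValueError("expected one logit per amino acid")
    log_probs = log_softmax(logits)
    wt_lp = log_probs[AMINO_ACIDS.index(wild_type)]
    mt_lp = log_probs[AMINO_ACIDS.index(mutant)]
    return wt_lp - mt_lp


if __name__ == "__main__":
    # Toy logits that favor the wild-type residue "A" at this position.
    toy_logits = [0.0] * len(AMINO_ACIDS)
    toy_logits[AMINO_ACIDS.index("A")] = 2.0
    print(evolutionary_index(toy_logits, wild_type="A", mutant="C"))
```

Under this assumed convention, a score of zero means the model rates the mutant and wild-type residues as equally likely, and larger positive scores flag substitutions the model considers more surprising.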
Keywords
mutation, protein, prognosis, pathogenicity, language