A pre-trained large language model for translating single-cell transcriptome to proteome

bioRxiv (2023)

Abstract
Despite recent advances in single-cell proteome technology, it remains limited in throughput and proteome depth, is prone to batch effects, and is still costly. Inspired by the translation procedure in both natural language processing (NLP) and the genetic central dogma, we propose a pre-trained large language model named scTranslator (single-cell translator), which is align-free and generates the absent single-cell proteome by inferring it from the transcriptome. scTranslator acquires general knowledge of RNA-protein interactions through pre-training on substantial amounts of bulk and single-cell data. Systematic benchmarking confirms the accuracy, stability, and flexibility of scTranslator across various quantification techniques, cell types, and conditions. Furthermore, we apply scTranslator to various downstream analyses and applications, including interaction inference, gene pseudo-knockout, cell clustering, and cell origin recognition on pan-cancer data.

Competing Interest Statement: The authors have declared no competing interest.