Using Deep Learning to Classify Full-Length Transcriptome Sequences.

Weiguo Li, Junchi Ma, Cuiyuan Li, Ting Yu, Xuefeng Cui

2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (2023)

Abstract
The emergence of Third-Generation Sequencing (TGS) has revolutionized transcriptome sequencing by producing long reads that span multiple kilobases, which makes it possible to sequence entire transcripts. However, the high error rates associated with TGS make it difficult to accurately classify transcript sequences against reference sequences using traditional algorithms. Deep learning-based embedding methods can be trained to overcome these errors. In this study, we introduce trxCNN, a deep learning model that accurately classifies erroneous transcript sequences against reference sequences. Specifically, evaluations on simulated data show that trxCNN achieves a classification accuracy of 87.1%. This accuracy exceeds that of the Minimap2 and magicBLAST aligners, both designed for TGS data, by 10.7% and 9.0%, respectively. Furthermore, we provide evidence that trxCNN can accurately estimate transcript abundance. These findings suggest that deep learning methods have great potential to effectively process error-affected sequencing data.
Keywords
third-generation sequencing (TGS), full-length transcriptome sequencing, transcriptome classification, transcriptome abundance estimation, deep learning