CodAn: predictive models for the characterization of mRNA transcripts in Eukaryotes

biorxiv(2019)

引用 0|浏览10
暂无评分
摘要
Characterization of the coding sequences (CDSs) is an essential step on transcriptome annotation. Incorrect characterization of CDSs can lead to the prediction of non-existent proteins that can eventually compromise knowledge if databases are populated with similar incorrect predictions made in different genomes. Even though some recent methods have succeeded in correctly prediction of the stop codon position in strand-specific sequences, prediction of the complete CDS is still far from a gold standard. More importantly, prediction in strand-blind sequences and in partial sequences is deficient, presenting very low accuracy. Here, we present CodAn, a new computational approach to predict CDS and UTR, that significantly pushes the boundaries of CDS prediction in strand-blind and in partial sequences, increases strand-specific full-CDS predictions and matches or surpasses gold-standard results in strand-specific stop codon predictions. CodAn is freely available for download at .
更多
查看译文
关键词
transcriptome,computational prediction,gene annotation,coding sequence,untranslated sequence
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要