Ribosomal In-Frame Mis-Translation of Stop Codons in Multiple Open Reading Frames of Specific Human Long Non-Coding RNAs

2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (2019)

Abstract
One of the major discoveries of the early post-genomic era, as embodied by the gene catalogs of the FANTOM and ENCODE (Encyclopedia of DNA Elements) consortia, is that two-thirds of human genes do not encode known proteins. These ~40,000 non-protein-coding (non-coding RNA, ncRNA) human genes (www.gencodegenes.org) remain poorly understood. Long non-coding RNA (lncRNA) genes are the most numerous category of human ncRNA genes. Hundreds of lncRNAs have recently discovered functions and are now understood to be fundamental nuclear and cytoplasmic, epigenetic and post-transcriptional, positive and negative regulators of gene expression in normal cellular functions and in a wide range of human diseases. However, the functions, if any, of the vast majority of lncRNAs remain obscure. Significantly, an unconventional role for their transcripts as unexpected de facto messenger RNAs has not been formally excluded. Ribosome profiling (Riboseq) predicts translational potential; nonetheless, without independent evidence of proteins matching lncRNA open reading frames (ORFs), ribosome binding does not prove translation. We were the first to mass-spectrometrically document translation of specific lncRNAs in human cells (https://genome.cshlp.org/content/22/9/1646.long). We have now performed a global search for lncRNA translation in human MCF7 breast cancer cells, integrating strand-specific RNAseq, Riboseq, and deep mass spectrometry of trypsin-digested <15 kDa fractions post-UHPLC (Orbitrap MS/MS) in biological quadruplicates by two independent core facilities. We excluded known-protein matches. UCSC Genome Browser-assisted manual annotation of imperfect alignments between tryptic-digest peptides and three-frame lncRNA translations initially revealed three peptides hypothetically explicable by "stop-to-nonstop" in-frame replacement of stop codons by amino acids in two ORFs of the lncRNA MMP24-AS1.
To search for this phenomenon genome-wide, we designed and implemented an unprecedented computational pipeline, matching tryptic-digest spectra to wildcard-instead-of-stop versions of repeat-masked, six-frame, whole-genome translations. Along with singleton stop-to-nonstop events affecting four other lncRNAs, we identified 24 additional peptides with stop-to-nonstop in-frame substitutions from multiple MMP24-AS1 lncRNA ORFs. Only UAG and UGA stop codons were affected; UAA was not. All MMP24-AS1-matching spectra met the same significance thresholds as high-confidence known-protein signatures. Targeted resequencing of MMP24-AS1 genomic DNA and cDNA from the same samples did not reveal any mutations, polymorphisms, or sequencing-detectable RNA editing. We have therefore discovered an apparent gene-specific violation of the genetic code. It highlights the importance of matching peptides to whole-genome, not known-genes-only, ORFs in mass-spectrometry workflows, and suggests a new mechanism enhancing the combinatorial complexity of the proteome.
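The wildcard-instead-of-stop idea can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the abstract describes matching MS/MS spectra against repeat-masked whole-genome translations, whereas this toy version assumes a peptide sequence has already been called and simply aligns it, by exact string comparison, to the six-frame translation of a short DNA string, letting each stop codon match any residue. The function names (`six_frames`, `find_stop_to_nonstop`) and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case A/C/G/T string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def six_frames(dna):
    """Yield (strand, offset, codons) for each of the six reading frames."""
    dna = dna.upper()
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            codons = [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]
            yield strand, offset, codons


def find_stop_to_nonstop(peptide, dna):
    """Align a peptide to all six frames, treating stops as wildcards.

    Returns a list of (strand, offset, codon_index, events), where events
    lists each (stop_codon, substituted_residue) pair covered by the peptide.
    Alignments that cover no stop codon are ignored (hypothetical filter).
    """
    hits = []
    for strand, offset, codons in six_frames(dna):
        # Codons containing ambiguous bases map to '?', which never matches.
        protein = [CODON_TABLE.get(c, "?") for c in codons]
        for start in range(len(protein) - len(peptide) + 1):
            window = protein[start:start + len(peptide)]
            if all(a == p or a == "*" for a, p in zip(window, peptide)):
                events = [(codons[start + i], peptide[i])
                          for i, a in enumerate(window) if a == "*"]
                if events:
                    hits.append((strand, offset, start, events))
    return hits


# Toy example: ATG GCT TAG AAA CGT translates to M A * K R.
toy_dna = "ATGGCTTAGAAACGT"
hits = find_stop_to_nonstop("MAQKR", toy_dna)
```

Here `hits` records that the peptide aligns on the plus strand in frame 0 with the TAG stop read as Q. A real search would work from spectra rather than sequences, apply significance thresholds, and exclude known-protein matches first, as the abstract describes.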
Keywords
cDNA,MMP24-AS1 genomic DNA,ribosomal in-frame mistranslation,tryptic-digest-peptides,UCSC Genome Browser-assisted manual annotation,strand-specific RNAseq,human MCF7 breast cancer cells,lncRNA translation,specific lncRNAs,ribosome binding,ribosome profiling,de-facto messenger RNAs,human diseases,normal cellular functions,gene expression,negative regulators,positive regulators,post-transcriptional regulators,human ncRNA genes,long noncoding RNA genes,nonprotein-coding,gene catalogs,genetic code,MMP24-AS1-matching spectra,multiple MMP24-AS1 lncRNA ORFs,whole-genome translations,lncRNA MMP24-AS1