Using long-read RNA sequencing for the identification of novel transcripts in disease-causing muscle genes

M. Johari, G. Ravenscroft

Neuromuscular Disorders (2023)

Abstract
A fundamental limitation in asserting a gene-disease association is the interpretation of variants of uncertain significance (VUS), particularly when the genes are associated with multiple phenotypes. Species- and tissue-specific transcript expression plays a vital role in determining the pathogenicity of a variant. Although bulk RNA sequencing of multiple tissues has successfully revealed tissue-specific expression for many genes, short-read RNA sequencing has known limitations in characterizing disease-associated transcripts. Long-read RNA sequencing (LR-RNAseq) techniques aim to overcome these limitations by sequencing full-length individual transcript molecules, which makes splicing aberrations much easier to identify. In this study, we evaluated genes that are enriched in heart and skeletal muscle and associated with skeletal muscle diseases. We analyzed tissue-specific transcript expression of 28 genes in publicly available LR-RNAseq data from healthy adults (ENCODE): 17 cardiac muscle samples and two skeletal muscle samples. Using the established ENCODE LR-RNAseq analysis pipeline and TALON to calculate transcript abundance, we filtered for transcripts observed with at least five reads. As a result, we identified an average of 296 novel transcripts in these 28 genes. Approximately 216 of these differed from a known transcript model because of a novel putative start or end point. The remaining approximately 80 transcripts resulted from either a novel combination of known splice donors/acceptors or novel splice donors/acceptors. We are currently performing LR-RNAseq on skeletal muscle biopsies from 22 healthy human adults to explore transcript diversity and to better understand the exon usage and transcript heterogeneity arising in large muscle genes, e.g., NEB, TTN and OBSCN. Our initial data show that LR-RNAseq significantly improves our understanding of tissue specificity and is a valuable tool for interpreting VUS in neuromuscular disorders.
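The read-count filter and the tally of novel transcripts described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the records are made-up toy data, the dictionary layout is hypothetical, the novelty labels (Known, ISM, NIC, NNC) are borrowed from TALON-style categories, and the threshold is assumed to apply to reads summed across samples, which the abstract does not specify.

```python
from collections import Counter

MIN_READS = 5  # "at least five reads", as stated in the abstract

# Hypothetical toy records, NOT data from the study.
# "counts" holds one read count per sample; "novelty" uses TALON-style labels.
TOY_TRANSCRIPTS = [
    {"gene": "NEB", "transcript": "tx1", "novelty": "Known", "counts": [12, 3]},
    {"gene": "NEB", "transcript": "tx2", "novelty": "ISM", "counts": [4, 2]},
    {"gene": "TTN", "transcript": "tx3", "novelty": "NIC", "counts": [1, 1]},
    {"gene": "TTN", "transcript": "tx4", "novelty": "NNC", "counts": [0, 7]},
    {"gene": "OBSCN", "transcript": "tx5", "novelty": "ISM", "counts": [2, 1]},
]


def filter_and_tally(transcripts, min_reads=MIN_READS):
    """Keep transcripts with enough reads, then count the novel ones by category.

    Assumption: the threshold is applied to reads summed over all samples.
    """
    kept = [t for t in transcripts if sum(t["counts"]) >= min_reads]
    novel_by_category = Counter(
        t["novelty"] for t in kept if t["novelty"] != "Known"
    )
    return kept, novel_by_category


kept, novel_by_category = filter_and_tally(TOY_TRANSCRIPTS)
print([t["transcript"] for t in kept])
print(dict(novel_by_category))
```

Applying the threshold per sample instead of to the summed counts would be an equally plausible reading and would yield a stricter filter.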
Keywords
novel transcripts, RNA, muscle, genes, long-read, disease-causing