Leveraging functional annotation to identify genes associated with complex diseases.

PLOS Computational Biology (2020)

Abstract
To increase statistical power to identify genes associated with complex traits, a number of transcriptome-wide association study (TWAS) methods have been proposed that use gene expression as a mediating trait linking genetic variation and disease. These methods first predict expression levels based on inferred expression quantitative trait loci (eQTLs) and then identify expression-mediated genetic effects on diseases by associating phenotypes with the predicted expression levels. The success of these methods depends critically on the identification of eQTLs, which may not be functional in the corresponding tissue because of linkage disequilibrium (LD) and the correlation of gene expression between tissues. Here, we introduce a new method called T-GEN (Transcriptome-mediated identification of disease-associated Genes with Epigenetic aNnotation) to identify disease-associated genes by leveraging epigenetic information. By prioritizing SNPs with tissue-specific epigenetic annotation, T-GEN can better identify SNPs that are both statistically predictive and biologically functional. We found that a significantly higher percentage (an increase of 18.7% to 47.2%) of eQTLs identified by T-GEN are inferred to be functional by ChromHMM, and more are deleterious based on their Combined Annotation Dependent Depletion (CADD) scores. Applying T-GEN to 207 complex traits, we identified more trait-associated genes (an increase of 7.7% to 102%) than existing methods. Among the genes identified as associated with these traits, T-GEN better identifies genes with high (>0.99) pLI scores than other methods. When T-GEN was applied to late-onset Alzheimer's disease, we identified 96 genes located at 15 loci, including two novel loci not implicated in previous GWAS. We further replicated 50 genes in an independent GWAS, including one of the two novel loci.
Author summary
TWAS-like methods have been widely applied to understand disease etiology using eQTL data and GWAS results. However, it remains challenging to distinguish the true disease-associated genes from those in strong LD with them, largely because of the misidentification of eQTLs. Here we introduce a novel statistical method named T-GEN that identifies disease-associated genes while taking epigenetic information into account. Compared to current TWAS methods, T-GEN not only identified eQTLs with higher CADD scores and greater functional potential in gene-expression imputation models, but also identified more disease-associated genes across 207 traits and more genes with high (>0.99) pLI scores. Applying T-GEN to late-onset Alzheimer's disease identified 96 genes at 15 loci, including two novel loci. Of the 96 identified genes, 50 were further replicated in an independent GWAS.
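
The two-stage TWAS procedure described in the abstract (predict expression from eQTLs, then associate the predicted expression with the phenotype) can be illustrated with a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the function names, the simulated effect sizes, and in particular the annotation-weighted ridge penalty, which is only a simple stand-in for the idea of prioritizing annotated SNPs and is not T-GEN's actual model.

```python
import numpy as np

def fit_expression_weights(genotypes, expression, penalty):
    """Stage 1: ridge regression of expression on SNPs with a per-SNP penalty.

    A smaller penalty lets a SNP take a larger weight. Using annotation to set
    the penalty is an illustrative assumption, not the published T-GEN model.
    """
    X = genotypes - genotypes.mean(axis=0)
    y = expression - expression.mean()
    return np.linalg.solve(X.T @ X + np.diag(penalty), X.T @ y)

def association_z(predicted_expression, phenotype):
    """Stage 2: z-score from a simple linear regression of phenotype on predicted expression."""
    g = predicted_expression - predicted_expression.mean()
    p = phenotype - phenotype.mean()
    slope = (g @ p) / (g @ g)
    resid = p - slope * g
    se = np.sqrt((resid @ resid) / (len(p) - 2) / (g @ g))
    return slope / se

rng = np.random.default_rng(0)
n_ref, n_gwas, n_snps = 500, 2000, 20

# Hypothetical truth: only the first two SNPs affect expression.
true_effects = np.zeros(n_snps)
true_effects[:2] = [0.8, -0.6]

# Hypothetical binary epigenetic annotation: SNPs 0-2 lie in annotated regions.
annotated = np.zeros(n_snps, dtype=bool)
annotated[:3] = True
penalty = np.where(annotated, 1.0, 50.0)  # annotated SNPs are shrunk less

# Reference panel with genotypes and measured expression (as in an eQTL study).
ref_geno = rng.binomial(2, 0.3, size=(n_ref, n_snps)).astype(float)
ref_expr = ref_geno @ true_effects + rng.normal(size=n_ref)
weights = fit_expression_weights(ref_geno, ref_expr, penalty)

# GWAS cohort with genotypes and phenotype only; expression is never observed.
gwas_geno = rng.binomial(2, 0.3, size=(n_gwas, n_snps)).astype(float)
gwas_expr = gwas_geno @ true_effects + rng.normal(size=n_gwas)
phenotype = 0.3 * gwas_expr + rng.normal(size=n_gwas)

predicted = gwas_geno @ weights
z = association_z(predicted, phenotype)
print(f"gene-level association z-score: {z:.2f}")
```

In practice TWAS methods often work from GWAS summary statistics and an LD reference rather than individual-level genotypes; the individual-level version is used here only because it is the shortest way to show the two stages.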