Bioinformatics pipeline to identify candidate regulatory variants in late‐onset Alzheimer’s disease (LOAD) associated regions

Alzheimer's & Dementia (2023)

Abstract
Background Small insertions and deletions (indels) in the human genome are substantial contributors to genetic variation and impact human traits and diseases. The role of indels in late‐onset Alzheimer’s disease (LOAD) has been understudied. A few examples, such as the intronic poly‐T variant in the TOMM40 gene, suggest that systematic exploration of indels within LOAD risk regions will advance the understanding of the genetic architecture of LOAD. Previously, we developed a bioinformatics pipeline that characterizes and prioritizes candidate regulatory SNPs in enhancers located in LOAD‐GWAS regions. Here we extend the pipeline to the analysis of indels. The proposed pipeline progresses from indels located in LOAD‐GWAS regions to a filtered set of regulatory variants that have a predicted strong effect on transcription factor (TF) binding.
Method The pipeline used publicly available functional genomics data sources, primarily candidate cis‐regulatory elements (cCREs) from ENCODE and single‐cell RNA‐seq data from LOAD patient samples (synapse: syn22079621). For TF binding analysis, we employed motifs from MotifDb. In addition, we used various bioinformatics software, including motifbreakR.
Result We catalogued 1230 proximal CTCF‐bound candidate cis‐regulatory elements in LOAD‐GWAS regions, of which 912 showed epigenetic evidence in relevant brain tissue. We catalogued 426 indels in these cCREs. These indels disrupted the binding motifs of 391 TFs, 362 of which had snRNA‐seq data from LOAD samples. Of note, TF motifs within the APOE‐TOMM40, SPI1 and MS4A2 regions were significantly disrupted by the candidate regulatory indels. Among these TFs are RUNX3, SPI1 and SMAD3. Interestingly, these significant findings for the APOE‐TOMM40, SPI1 and MS4A2 regions are consistent with our prior results for SNPs.
Conclusion This study provides an analytical framework to catalogue noncoding indel variation in cis‐regulatory elements located in LOAD‐GWAS loci and characterize their likelihood to perturb TF binding. The approach integrates multiple data types to prioritize genes and variants for validation experiments using disease models and gene editing technologies.
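
The filtering funnel described in the abstract (indels in LOAD‐GWAS regions, then indels inside cCREs with brain evidence, then indels whose disrupted TFs have expression data) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the published pipeline relies on R tools such as motifbreakR and MotifDb, and every name below (Interval, CCRE, Indel, indels_in_ccres, prioritize, the brain_evidence flag) is a hypothetical stand‐in. The sketch assumes the TF disruption calls were computed elsewhere and are passed in as a plain mapping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


@dataclass(frozen=True)
class CCRE:
    region: Interval
    brain_evidence: bool  # hypothetical flag: epigenetic support in brain tissue


@dataclass(frozen=True)
class Indel:
    chrom: str
    pos: int  # 0-based position of the first reference base
    ref: str
    alt: str


def overlaps(indel: Indel, region: Interval) -> bool:
    """True if the reference span of the indel intersects the region."""
    indel_end = indel.pos + max(len(indel.ref), 1)
    return (
        indel.chrom == region.chrom
        and indel.pos < region.end
        and indel_end > region.start
    )


def indels_in_ccres(indels, ccres, require_brain_evidence=True):
    """Keep indels that fall inside at least one (optionally brain-supported) cCRE."""
    kept = []
    for indel in indels:
        for ccre in ccres:
            if require_brain_evidence and not ccre.brain_evidence:
                continue
            if overlaps(indel, ccre.region):
                kept.append(indel)
                break  # one overlapping cCRE is enough
    return kept


def prioritize(indels, disrupted_tfs, expressed_tfs):
    """Map each indel to the disrupted TFs that also have expression data.

    disrupted_tfs: dict of Indel -> set of TF names (assumed to come from a
                   separate motif-disruption step, not computed here).
    expressed_tfs: set of TF names with expression data available.
    """
    result = {}
    for indel in indels:
        tfs = disrupted_tfs.get(indel, set()) & expressed_tfs
        if tfs:
            result[indel] = tfs
    return result
```

A linear scan over cCREs is adequate for a few thousand elements; a real analysis at genome scale would more likely use an interval tree or a dedicated genomic‐ranges library.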
Keywords
alzheimers, regulatory variants