A Semi-Supervised Ensemble Approach to Rank Potential Causal Variants and Their Target Genes in Microglia for Alzheimer's Disease
bioRxiv (2022)
Abstract
Alzheimer's disease (AD) is a leading cause of death among individuals over 65. Although large genome-wide association studies (GWAS) have detected many AD genetic variants, only a limited number of causal genes have been confirmed. Conventional machine learning techniques integrate functional annotation data and GWAS signals to assign variants functional relevance probabilities. Yet a large proportion of genetic variation lies in the non-coding genome, where unsupervised and semi-supervised techniques have demonstrated a greater advantage. Furthermore, cell-type-specific approaches are needed to better understand disease etiology, and studying AD through a microglia-specific lens is more likely to reveal causal variants involved in immune pathways. Therefore, in this study, we developed a semi-supervised ensemble approach that uses microglia-specific data to prioritize non-coding variants, and their target genes, that play roles in immune-related AD mechanisms. We designed a transductive positive-unlabeled and negative-unlabeled learning model that employs a bagging technique to learn from unlabeled variants, generating multiple predicted probabilities of variant risk. We then aggregated these predictions using a combined homogeneous-heterogeneous ensemble framework. Applying our model to AD variant data, we identified 11 risk variants acting in well-known AD genes, such as TSPAN14, INPP5D, and MS4A2. These results validated our model's performance and demonstrated a need to study these genes in the context of microglial pathways. We also proposed further experimental study of 37 potential causal variants associated with less-known genes. Our work has utility in predicting AD-relevant genes and variants functioning in microglia, and it can be generalized for application to other complex diseases.
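To make the bagging idea in the abstract concrete, the following is a minimal sketch of transductive positive-unlabeled bagging, not the authors' implementation. In each round, a bootstrap sample of unlabeled items is treated as provisional negatives, a base scorer is fit against the known positives, and only the out-of-bag unlabeled items are scored; the per-round scores are then averaged. The nearest-centroid base scorer, the function names, the parameters, and the toy two-feature data are all assumptions chosen for illustration; the abstract does not specify the base learners or features.

```python
import math
import random


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def centroid_score(x, pos_c, neg_c):
    """Hypothetical base scorer: closer to the positive centroid -> nearer 1."""
    d_pos = math.dist(x, pos_c)
    d_neg = math.dist(x, neg_c)
    total = d_pos + d_neg
    return 0.5 if total == 0 else d_neg / total


def bagging_pu_scores(positives, unlabeled, n_rounds=200, sample_size=None, seed=0):
    """Average out-of-bag scores for every unlabeled item (transductive)."""
    rng = random.Random(seed)
    k = sample_size or len(positives)
    sums = [0.0] * len(unlabeled)
    counts = [0] * len(unlabeled)
    pos_c = centroid(positives)
    for _ in range(n_rounds):
        # Bootstrap (with replacement) a set of provisional negatives.
        drawn = [rng.randrange(len(unlabeled)) for _ in range(k)]
        in_bag = set(drawn)
        neg_c = centroid([unlabeled[i] for i in drawn])
        # Score only the items that were not used as provisional negatives.
        for i, x in enumerate(unlabeled):
            if i not in in_bag:
                sums[i] += centroid_score(x, pos_c, neg_c)
                counts[i] += 1
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]


# Toy data (assumed): the first two unlabeled items resemble the positives.
positives = [[2.0, 2.1], [1.9, 2.0], [2.1, 1.9]]
unlabeled = [[2.0, 2.0], [1.8, 2.2], [0.0, 0.1], [0.1, -0.1], [-0.1, 0.0], [0.2, 0.1]]
scores = bagging_pu_scores(positives, unlabeled)
ranking = sorted(range(len(unlabeled)), key=lambda i: scores[i], reverse=True)
```

A negative-unlabeled counterpart could be sketched by swapping the roles of the labeled set and the provisional class, and the two sets of scores could then be combined in a further ensemble step; how the paper performs that aggregation is not detailed in the abstract.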
### Competing Interest Statement
The authors have declared no competing interest.
Keywords
rank potential causal variants,microglia,ensemble approach,alzheimer,target genes,semi-supervised