Machine learning for Alzheimer’s disease classification from genomic tiling variants

Alzheimer's & Dementia (2022)

Abstract

Background
As machine learning (ML) technologies advance, they have offered new ways to understand and predict Alzheimer’s disease (AD). Recent efforts in AD research have aimed to capitalize on ML for AD genomics. Whole-genome tiling (WGT) establishes a new representation of whole-genome sequencing (WGS) data, one that has been put forward in support of ML and precision medicine.

Method
A comprehensive description, workflow, and publicly available WGT data can be found at https://curii.co/su92l-j7d0g-swtofxa2rct8495 . In this work, we analyzed the ADNI WGT data. We first performed quality control, imputation, and one-hot encoding on each tile and its respective tile variants (Figure 1). Then, using the GWAS results from an independent cohort, we mapped the ten single-nucleotide polymorphisms (SNPs) most significantly associated with AD to their respective tiles. Using these tile data, we performed AD classification with an XGBoost algorithm. We then repeated this process with covariates (age, sex, and ethnicity) included alongside the tiling variants. A total of 1,545 subjects (474 cases, 1,071 controls) were studied.

Result
Mapping the 10 most significant SNPs from the independent GWAS to the WGT data yielded 10 different tiles, which together encapsulated 356 SNPs. In our comparative analysis, we made AD diagnosis classifications using the 10 tiles of WGT data, the 356 SNPs in WGS data, covariate data (age, sex, and ethnicity), or a combination of genomic data (tiles or SNPs) and covariates. Our results (Figure 2) show that WGT performs comparably to SNPs, indicating that the patterns embedded within individual SNPs and those within WGT data have similar discriminative power.

Conclusion
Our pilot investigation of a small set of targeted, WGT-represented genomic data for ML-based AD classification demonstrated performance similar to that of the SNP data. These results suggest that, although WGT data encapsulate different information than WGS data, they yield comparable results in AD classification. Our study of WGT in AD machine learning shows the potential and novelty of WGT and warrants further exploration of how it can best be used.
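The imputation and one-hot encoding step named in the Method can be illustrated with a minimal sketch. The abstract does not specify the imputation strategy or the data layout, so everything here is an assumption made for illustration: each subject is a row of integer tile-variant identifiers, missing calls are `None`, and missing values are filled with the most common observed variant at that tile position. The names `impute_mode`, `one_hot_encode`, and `toy_calls` are hypothetical and are not taken from the authors' workflow.

```python
from collections import Counter
from typing import List, Optional, Tuple


def impute_mode(column: List[Optional[int]]) -> List[int]:
    """Fill missing calls (None) with the most common observed variant.

    Mode imputation is an assumed strategy; the abstract does not say
    which imputation method was used.
    """
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("tile position has no observed variants")
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is None else v for v in column]


def one_hot_encode(
    calls: List[List[Optional[int]]],
) -> Tuple[List[List[int]], List[str]]:
    """One-hot encode a subjects x tile-positions matrix of variant ids.

    Returns a binary feature matrix (one row per subject) and the
    matching feature names, one per (tile position, variant) pair.
    """
    n_tiles = len(calls[0])
    feature_names: List[str] = []
    encoded_columns: List[List[int]] = []
    for j in range(n_tiles):
        column = impute_mode([row[j] for row in calls])
        # One indicator column per distinct variant seen at this position.
        for variant in sorted(set(column)):
            feature_names.append(f"tile{j}_var{variant}")
            encoded_columns.append([1 if v == variant else 0 for v in column])
    # Transpose from per-feature columns back to per-subject rows.
    features = [list(row) for row in zip(*encoded_columns)]
    return features, feature_names


# Toy data: 4 subjects, 2 tile positions, None marks a missing call.
toy_calls = [
    [0, 1],
    [1, None],
    [0, 2],
    [None, 1],
]
features, feature_names = one_hot_encode(toy_calls)
```

A binary matrix of this shape, optionally with covariate columns appended, is the kind of input a gradient-boosted tree classifier such as XGBoost accepts. The sketch stops before model fitting because the abstract gives no hyperparameters or evaluation protocol.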
Keywords
alzheimers, disease classification, learning