Characterizing substructure via mixture modeling of genetic similarity in large-scale summary statistics

bioRxiv (2024)

Abstract
Genetic summary data are broadly accessible and highly useful for applications including risk prediction, causal inference, fine mapping, and the incorporation of external controls. Nevertheless, collapsing individual-level data into groups masks intra- and inter-sample heterogeneity (e.g., population structure), leading to confounding, reduced power, and bias. Unaccounted-for substructure limits the usability of summary data, especially for understudied or admixed populations. Here, we present Summix2, a comprehensive set of methods and software based on a computationally efficient mixture model of genetic similarity to estimate and adjust for substructure in genetic summary data. In extensive simulations and applications to public data, Summix2 provides accurate and precise estimates, detecting regions of selection, identifying fine-scale population structure, and disentangling latent genetic disease risk. Summix2 increases the robust use of diverse publicly available summary data, resulting in improved and more equitable research to address population structure, disease etiology, and risk prediction.

Competing Interest Statement
The authors have declared no competing interest.
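
Illustrative sketch
The abstract describes a mixture model that estimates substructure from summary data. One common way to frame such a model, assumed here purely for illustration and not taken from the paper, is to treat each observed allele frequency as a weighted average of reference-group allele frequencies and to estimate the weights by least squares under the constraint that they are non-negative and sum to one. The function names, the toy frequencies, and the projected-gradient solver below are all hypothetical choices; the actual Summix2 estimator and software interface may differ.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {p : p_k >= 0, sum(p) == 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for i, u_i in enumerate(u, start=1):
        cumulative += u_i
        t = (cumulative - 1.0) / i
        if u_i - t > 0:
            theta = t  # keep the threshold from the last index that qualifies
    return [max(x - theta, 0.0) for x in v]


def estimate_proportions(observed, reference, iterations=2000):
    """Estimate mixing proportions by constrained least squares.

    observed:  list of allele frequencies, one per SNP (hypothetical input).
    reference: list of rows, one per SNP; each row holds that SNP's allele
               frequency in every reference group (hypothetical input).
    Minimizes sum_j (observed_j - sum_k pi_k * reference_jk)^2 over the
    probability simplex using projected gradient descent.
    """
    n_groups = len(reference[0])
    pi = [1.0 / n_groups] * n_groups  # start from equal proportions

    # Conservative step size: the squared Frobenius norm of the reference
    # matrix bounds the largest eigenvalue of R^T R.
    lipschitz = 2.0 * sum(r * r for row in reference for r in row)
    step = 1.0 / lipschitz

    for _ in range(iterations):
        grad = [0.0] * n_groups
        for o, row in zip(observed, reference):
            residual = sum(p * r for p, r in zip(pi, row)) - o
            for k, r in enumerate(row):
                grad[k] += 2.0 * residual * r
        pi = project_to_simplex([p - step * g for p, g in zip(pi, grad)])
    return pi


# Toy example: two reference groups, six SNPs, made-up frequencies.
reference = [
    [0.1, 0.9],
    [0.8, 0.2],
    [0.3, 0.6],
    [0.5, 0.5],
    [0.9, 0.1],
    [0.2, 0.7],
]
true_pi = [0.7, 0.3]
observed = [sum(p * r for p, r in zip(true_pi, row)) for row in reference]

estimated = estimate_proportions(observed, reference)
print([round(p, 4) for p in estimated])
```

In this noise-free toy setting the observed frequencies are an exact mixture of the reference columns, so the estimate should recover the proportions used to simulate them. Real summary data would add sampling noise and would require harmonized variants across the observed and reference panels.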