GARSA: An integrative pipeline for genome wide association studies and polygenic risk score inference in admixed human populations

Fernando Rossi, Jose L. Patane, Vinicius de Souza, Jennifer Montoya Neyra, Rogerio Rosa, José Eduardo Krieger, Samantha Kuwada Teixeira

bioRxiv (Cold Spring Harbor Laboratory) (2023)

Abstract

Genome-wide association studies (GWAS) and polygenic risk scores (PRS) are multistep analytical tools used to identify genetic variants and to assess their contribution to phenotypes and diseases. These analyses are evolving and are becoming instrumental in understanding the genetic architecture of complex phenotypes and diseases. Nevertheless, to date, no single solution incorporates all major steps of these analyses together with robust correction for population bias. Here, we describe a semi-automated pipeline that unifies the steps involved in GWAS and PRS and builds on widely used software. Our pipeline handles quality control (QC), GWAS, and PRS steps and manages different types of input and output files. Furthermore, it includes robust bias correction steps: inference of a kinship matrix with correction for population structure; principal component analysis (PCA) with detection and removal of outlier variants, followed by re-projection of related individuals (if desired); generation of PCA figures that assist in choosing the number of principal components (PCs) for association analysis; availability of mixed models; use of the recommended GWAS software based on population size; and a Markov chain Monte Carlo (MCMC) method to estimate the best set of PRS parameters. Finally, we tested the GARSA pipeline in a family-based Brazilian admixed population and demonstrated that the implemented corrections indeed mitigate bias in downstream analysis. The pipeline can be run in personal or server-side environments.

Availability: The development version (open source) is available at https://github.com/LGCM-OpenSource/GARSA

Contact: Fernando P. N. Rossi - fernando.rossi@hc.fm.usp.br; José S. L. Patané - jose.patane@hc.fm.usp.br

Supplementary information: Supplementary tutorial.
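As background for the PRS step mentioned in the abstract, a polygenic score is commonly defined as the sum, over variants, of a GWAS effect size multiplied by the individual's effect-allele dosage. The block below is a minimal sketch of that textbook additive definition only; it is not GARSA's implementation, and the function name, variant IDs, and weights are hypothetical values chosen for illustration.

```python
from typing import Mapping


def polygenic_risk_score(
    dosages: Mapping[str, float],
    effect_sizes: Mapping[str, float],
) -> float:
    """Additive PRS: sum of (effect size x effect-allele dosage).

    Only variants present in both mappings contribute; variants missing
    from either side are skipped (an assumption of this sketch).
    """
    shared = dosages.keys() & effect_sizes.keys()
    return sum(effect_sizes[v] * dosages[v] for v in shared)


# Hypothetical per-variant effect sizes (e.g. from GWAS summary statistics).
weights = {"rs1": 0.5, "rs2": -0.25, "rs3": 1.0}

# Hypothetical effect-allele dosages (0, 1, or 2 copies) for one individual.
person = {"rs1": 2, "rs2": 1, "rs4": 0}

score = polygenic_risk_score(person, weights)
print(score)
```

Here only rs1 and rs2 appear in both mappings, so only those two variants contribute to the score. Real pipelines additionally choose which variants to include and how to shrink their weights, which is the kind of parameter tuning the abstract attributes to the MCMC step.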
Keywords
genome wide association studies, polygenic risk score inference