Detecting and Adjusting for Hidden Biases due to Phenotype Misclassification in Genome-Wide Association Studies.

medRxiv: the preprint server for health sciences (2023)

Abstract
With the advent of healthcare-based genotyped biobanks, genome-wide association studies (GWAS) leverage larger sample sizes, incorporate patients with diverse ancestries, and introduce noisier phenotypic definitions. Yet the extent and impact of phenotypic misclassification in large-scale datasets are not currently well understood, owing to a lack of statistical methods for estimating the relevant parameters from empirical data. Here, we develop a statistical method and scalable software, PheMED (Phenotypic Measurement of Effective Dilution), to quantify phenotypic misclassification across GWAS using only summary statistics. We illustrate how the parameters estimated by PheMED relate to the negative and positive predictive values of the labeled phenotype, compared to ground truth, and how misclassification of the phenotype yields diluted effect sizes of variant-phenotype associations. Furthermore, we apply our methodology to detect multiple instances of statistically significant dilution in real-world data. We demonstrate how effective dilution biases downstream GWAS replication and heritability analyses despite the use of current best practices, and provide a dilution-aware meta-analysis approach that outperforms existing methods. Consequently, we anticipate that PheMED will be a valuable tool for researchers to address phenotypic data quality issues both within and across cohorts.
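The link between predictive values and diluted associations can be illustrated with a small calculation. The sketch below is not PheMED's estimator, which works from GWAS summary statistics and is not specified in the abstract. It is a textbook-style illustration under one stated assumption: mislabeling is independent of genotype given true disease status. Under that assumption, the allele-frequency difference between labeled cases and labeled controls shrinks by a factor of PPV + NPV - 1. The function names and all numeric values are hypothetical.

```python
def labeled_allele_frequencies(f_case, f_control, ppv, npv):
    """Allele frequencies among *labeled* cases and controls.

    Assumption (not from the source): mislabeling is independent of
    genotype given true status, so each labeled group is a simple
    mixture of true cases and true controls.
    """
    # Labeled cases: a fraction ppv are true cases, the rest true controls.
    f_labeled_case = ppv * f_case + (1.0 - ppv) * f_control
    # Labeled controls: a fraction npv are true controls, the rest true cases.
    f_labeled_control = npv * f_control + (1.0 - npv) * f_case
    return f_labeled_case, f_labeled_control


def dilution_factor(ppv, npv):
    """Factor by which the case-control frequency difference shrinks."""
    return ppv + npv - 1.0


# Hypothetical values chosen only for illustration.
f_case, f_control = 0.30, 0.25   # true allele frequencies
ppv, npv = 0.80, 0.95            # predictive values of the phenotype label

obs_case, obs_control = labeled_allele_frequencies(f_case, f_control, ppv, npv)
true_diff = f_case - f_control
observed_diff = obs_case - obs_control

print(f"true difference:     {true_diff:.4f}")
print(f"observed difference: {observed_diff:.4f}")
print(f"ratio:               {observed_diff / true_diff:.4f}")
print(f"PPV + NPV - 1:       {dilution_factor(ppv, npv):.4f}")
```

With a perfectly labeled phenotype (PPV = NPV = 1) the factor is 1 and nothing is lost; as labels get noisier the factor falls toward 0 and the apparent association fades with it.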
Keywords
phenotype misclassification, hidden biases, association, genome-wide