Measuring the severity of multi-collinearity in high dimensions

arXiv (2022)

Abstract
Multi-collinearity is a widespread phenomenon in modern statistical applications and, when ignored, can negatively impact model selection and statistical inference. Classic tools and measures that were developed for "$n>p$" data are neither applicable nor interpretable in the high-dimensional regime. Here we propose 1) new individualized measures that can be used to visualize patterns of multi-collinearity, and subsequently 2) global measures to assess the overall burden of multi-collinearity without restricting the dimensions of the observed data. We applied these measures in genomic applications to investigate patterns of multi-collinearity in genetic variations across individuals with diverse ancestral backgrounds. The measures were able to visually distinguish genomic regions of excessive multi-collinearity and to contrast the level of multi-collinearity between different continental populations.
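To illustrate why classic "$n>p$" measures break down, the sketch below uses the variance inflation factor (VIF) as one example of such a tool. The choice of VIF, the simulated Gaussian data, and the function names `sample_correlation` and `classic_vif` are assumptions made for illustration and are not taken from the paper. The VIF is the diagonal of the inverse sample correlation matrix, and with $n$ observations that matrix has rank at most $n-1$, so it cannot be inverted once $p \geq n$.

```python
import numpy as np


def sample_correlation(n, p, seed=0):
    """Sample correlation matrix (p x p) of n simulated Gaussian observations.

    The simulated data are a stand-in for real predictors (an assumption).
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    return np.corrcoef(X, rowvar=False)


def classic_vif(R):
    """Classic VIFs: the diagonal of the inverse correlation matrix.

    Raises ValueError when R is rank deficient, which always happens
    when the number of predictors is at least the number of observations.
    """
    p = R.shape[0]
    rank = np.linalg.matrix_rank(R)
    if rank < p:
        raise ValueError(
            f"correlation matrix has rank {rank} < p = {p}; VIF is undefined"
        )
    return np.diag(np.linalg.inv(R))


if __name__ == "__main__":
    # Low-dimensional case (n > p): VIFs exist and are at least 1.
    print(classic_vif(sample_correlation(n=200, p=5)))

    # High-dimensional case (p > n): the matrix is singular.
    try:
        classic_vif(sample_correlation(n=20, p=50))
    except ValueError as err:
        print("VIF failed:", err)
```

This shows only the failure of one classic measure; it does not implement the individualized or global measures proposed in the paper.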
Keywords
severity, high dimensions, multi-collinearity