Gene Set Overlap: An Impediment to Achieving High Specificity in Over-representation Analysis

Biomedical Engineering Systems and Technologies (2019)

Abstract
Gene set analysis methods are widely used to analyze data from high-throughput "omics" technologies. One drawback of these methods is their low specificity, that is, their high false positive rate. Over-representation analysis is one of the most commonly used gene set analysis methods. In this paper, we propose a systematic approach to investigate the hypothesis that gene set overlap is an underlying cause of low specificity in over-representation analysis. We quantify gene set overlap and show that it is a ubiquitous phenomenon across gene set databases. Statistical analysis indicates a strong negative correlation between gene set overlap and the specificity of over-representation analysis. We conclude that gene set overlap is an underlying cause of the low specificity. This result highlights the importance of considering gene set overlap in gene set analysis and explains the lack of specificity of methods that ignore gene set overlap. This research also establishes the direction for developing new gene set analysis methods.
Keywords
Gene Expression, Gene Set Analysis, Gene Set Enrichment, Gene Set Overlap, Specificity
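
To make the two quantities named in the abstract concrete, here is a minimal Python sketch of (a) an overlap score between two gene sets and (b) an over-representation p-value. It rests on assumptions that are not taken from the paper: the Jaccard index is used as one common overlap measure (the paper's own measure may differ), over-representation is tested with the standard upper-tail hypergeometric test, and the function names and toy gene identifiers are hypothetical.

```python
from math import comb


def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| (assumed overlap measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def ora_p_value(gene_set, de_genes, universe):
    """Upper-tail hypergeometric p-value for over-representation.

    Probability of seeing at least the observed number of differentially
    expressed (DE) genes in gene_set if DE genes were drawn at random
    from the universe.
    """
    universe = set(universe)
    gene_set = set(gene_set) & universe
    de_genes = set(de_genes) & universe
    N, K, n = len(universe), len(gene_set), len(de_genes)
    k = len(gene_set & de_genes)  # observed overlap with the DE list
    tail = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)


# Toy data (hypothetical gene identifiers).
universe = [f"g{i}" for i in range(1, 21)]
set_a = ["g1", "g2", "g3", "g4", "g5"]
set_b = ["g3", "g4", "g5", "g6", "g7"]   # shares three genes with set_a
de_genes = ["g1", "g2", "g3", "g4"]      # all DE genes lie in set_a

print("overlap(A, B):", jaccard(set_a, set_b))
print("p(A):", ora_p_value(set_a, de_genes, universe))
print("p(B):", ora_p_value(set_b, de_genes, universe))
```

In this toy setup every differentially expressed gene belongs to set A, yet set B still receives a smaller p-value than a disjoint set of the same size would, purely because it shares genes with A. This is the kind of effect the abstract attributes to gene set overlap.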