Using Spearman's correlation coefficients for exploratory data analysis on big dataset.

Concurrency and Computation: Practice and Experience(2016)

Cited by 145 | Views 78
Abstract
Correlation analysis is both popular and useful in social networking research, particularly in exploratory data analysis. In this paper, three well-known and widely used correlation coefficients, the Pearson product-moment correlation coefficient and the Spearman and Kendall rank correlation coefficients, are compared from definition to application domain. Based on the characteristics of the pump's vibration dataset, the nonparametric, distribution-free Spearman rank correlation coefficient is introduced to analyze the relationship between the pump's working state and each of the 207,880 variables. For states I and II, states I and III, states II and III, and all three states, in different files, the percentage of variables with high Spearman's correlation coefficients and tables listing exactly those variables are obtained, which is of important value for future research on the unsupervised machine learning system. Copyright © 2015 John Wiley & Sons, Ltd.
Keywords
exploratory data analysis, correlation analysis, Spearman correlation coefficient, p-value, vibration analysis, pump state
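The abstract describes computing Spearman's rank correlation between the pump's working state and each variable, then keeping the variables with high coefficients. A minimal sketch of that kind of screening follows. It is not the authors' implementation: the function names (`average_ranks`, `spearman`, `screen_variables`), the 0.8 threshold, and the toy data are all assumptions made for illustration. Spearman's coefficient is computed here as the Pearson correlation of average ranks, which handles ties.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation; NaN if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation applied to the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def screen_variables(columns, state, threshold=0.8):
    """Names of columns whose |rho| with the state labels reaches the threshold.

    `columns` maps a (hypothetical) variable name to its list of measurements;
    `threshold` is an illustrative cutoff, not a value taken from the paper.
    """
    selected = []
    for name, values in columns.items():
        rho = spearman(values, state)
        if abs(rho) >= threshold:  # NaN compares False, so constants are skipped
            selected.append(name)
    return selected


# Toy data: six samples labelled with working states 1, 2, and 3.
state = [1, 1, 2, 2, 3, 3]
columns = {
    "var_a": [1, 2, 3, 4, 5, 6],  # rises with the state
    "var_b": [5, 1, 4, 2, 6, 3],  # little monotonic relationship
}
high = screen_variables(columns, state)
```

With hundreds of thousands of variables, a vectorised routine such as `scipy.stats.spearmanr`, which also reports a p-value, would likely be the more practical choice; the pure-Python version above is only meant to make the computation explicit.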