Multi-variable Discretization Based on Extended Maximum Information Coefficient
2020 7th International Conference on Information Science and Control Engineering (ICISCE), 2020
Abstract
Supervised data discretization partitions continuous variables according to a classification label. The label is usually assumed to be a categorical variable, but in some application scenarios it may be continuous. The usual way to handle this case is to first convert the label into a categorical variable and then discretize the target variable. However, the error introduced in the first step may be magnified in the second. We therefore propose a data discretization method based on an extended maximum information coefficient that can handle multiple variables and combines the two steps into one. We analyze the mathematical properties and computation strategy of the method, and give examples using data sets from the World Health Organization.
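To make the idea of choosing the label partition and the variable partition together more concrete, here is a minimal Python sketch. It is not the method of the paper: it assumes a toy setting with exactly one cut point on the variable and one on the continuous label (a 2x2 grid), and it scores each grid by mutual information normalized by log2 of the smaller bin count, in the spirit of the standard maximal information coefficient. The function names `grid_mutual_information` and `best_joint_cuts` are hypothetical.

```python
import numpy as np

def grid_mutual_information(x, y, x_cuts, y_cuts):
    """Mutual information (in bits) of x and y after binning at the given cut points."""
    xi = np.digitize(x, x_cuts)  # bin index of each x value
    yi = np.digitize(y, y_cuts)  # bin index of each y value
    joint = np.zeros((len(x_cuts) + 1, len(y_cuts) + 1))
    np.add.at(joint, (xi, yi), 1)  # count samples per grid cell
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)  # marginal of binned x
    py = joint.sum(axis=0, keepdims=True)  # marginal of binned y
    nz = joint > 0  # skip empty cells, which contribute zero
    return float((joint[nz] * np.log2(joint[nz] / (px @ py)[nz])).sum())

def best_joint_cuts(x, y):
    """Assumed toy search: one cut on x and one on y, chosen jointly (2x2 grid)."""
    def candidates(v):
        u = np.unique(v)
        return (u[:-1] + u[1:]) / 2  # midpoints between consecutive distinct values

    best = (-1.0, None, None)
    for cx in candidates(x):
        for cy in candidates(y):
            mi = grid_mutual_information(x, y, [cx], [cy])
            score = mi / np.log2(2)  # normalize by log2(min(bins_x, bins_y)), here 1
            if score > best[0]:
                best = (score, cx, cy)
    return best  # (score, cut on x, cut on y)
```

Because both cuts are searched in the same loop, the label is never fixed to a categorical variable before the target variable is partitioned, which is the contrast with the two-step approach described above. The exhaustive search is quadratic in the number of distinct values and is only meant for small illustrations.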
Keywords
association analysis, data discretization, maximum information coefficient