Discretization of Unlabeled Data using RST & Clustering

Journal of Informatics and Mathematical Sciences(2019)

引用 0|浏览0
暂无评分
摘要
An algorithm can be applied on numerical or continuous attributes as well as on nominal or discrete value. If input to an algorithm required only attributes of nominal or discrete type then continuous attributes of the dataset need to be discretize before applying such algorithm. Discretization method can be of two types namely supervised and unsupervised. Supervised methods of dicretization utilize class labels of the dataset while in unsupervised method class labels are totally disregarded. In many literatures it has been shown that supervised methods gives good discretization result. Supervised algorithms cannot apply if dataset is unlabeled. In real life, many dataset do not have class (label) attribute and only unsupervised discretization methods are applicable in such cases. This paper presents discretization schemes for unlabeled data based on RST (Rough Set Theory) and clustering. The experiments have been performed to compare the proposed technique with other discretization methods for labeled data on two benchmark datasets. Two parameters Class-Attribute Interdependence Redundancy and the total number of intervals have been used to compare the proposed techniques with other existing techniques. The results display a satisfactory tradeoff between the information loss and number of intervals for the proposed method.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要