Efficient Biased Sampling for Approximate Clustering and Outlier Detection in Large Datasets

IEEE Transactions on Knowledge and Data Engineering (2002)

Citations: 296 | Views: 9
Abstract: We investigate the use of biased sampling according to the density of the dataset to speed up general data mining tasks, such as clustering and outlier detection, in large multidimensional datasets. In density-biased sampling, the probability that a given point is included in the sample depends on the local density of the dataset. We propose a general technique for density-biased sampling that can factor in user requirements to sample for properties of interest, and can be ...
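To make the idea of density-dependent inclusion probabilities concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's method: local density is approximated by counting points in fixed-width grid cells (a simple stand-in for whatever density estimator the authors use), and the bias is controlled by a hypothetical `exponent` parameter, so that a point's inclusion probability is proportional to its cell count raised to that power. The names `grid_cell`, `density_biased_sample`, `cell_width`, and `exponent` are all invented for this example.

```python
import random
from collections import Counter


def grid_cell(point, cell_width):
    """Map a point to the integer coordinates of its grid cell."""
    return tuple(int(c // cell_width) for c in point)


def density_biased_sample(points, sample_size, exponent, cell_width=1.0, seed=0):
    """Sample points with probability proportional to (local density) ** exponent.

    ASSUMPTION: local density is the number of points sharing a grid cell.
    This histogram estimate is a simplification chosen for the sketch.

    exponent > 0 favours dense regions, exponent < 0 favours sparse regions,
    and exponent == 0 reduces to uniform sampling.
    Returns (sample, probs), where probs[i] is the inclusion probability of points[i].
    """
    rng = random.Random(seed)
    counts = Counter(grid_cell(p, cell_width) for p in points)
    weights = [counts[grid_cell(p, cell_width)] ** exponent for p in points]
    total = sum(weights)
    # Scale so the expected sample size is roughly sample_size; cap each probability at 1.
    probs = [min(1.0, sample_size * w / total) for w in weights]
    # Include each point independently with its own probability.
    sample = [p for p, pr in zip(points, probs) if rng.random() < pr]
    return sample, probs


if __name__ == "__main__":
    # Toy data: one dense cluster of 90 points plus 10 isolated points.
    dense = [(0.1 + 0.001 * i, 0.2) for i in range(90)]
    sparse = [(10.0 + 5 * i, 50.0) for i in range(10)]
    data = dense + sparse
    for a in (1, 0, -1):
        sample, _ = density_biased_sample(data, sample_size=20, exponent=a)
        kept_sparse = sum(1 for p in sample if p in sparse)
        print(f"exponent={a}: sample size {len(sample)}, isolated points kept {kept_sparse}")
```

In this sketch a negative exponent oversamples sparse regions, which may suit outlier detection, while a positive exponent oversamples dense regions, which may suit cluster discovery. Whether and how the paper exposes such a knob cannot be confirmed from the truncated abstract.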