GPU-accelerated incremental correlation clustering of large data with visual feedback

Silicon Valley, CA (2013)

Abstract
Clustering is an important preparation step in big data processing. It can also be used to detect redundant data points and outliers. Eliminating redundant data and duplicates is a viable means of data reduction, and it can also aid sampling. Visual feedback is valuable here because it gives users confidence in the process. Furthermore, big data preprocessing is seldom interactive, which conflicts with users' desire for immediate answers. The best one can do is incremental preprocessing, in which partial and hopefully quite accurate results become available relatively quickly and are then refined over time. We propose a correlation clustering framework that uses multidimensional scaling (MDS) for layout and GPU acceleration to accomplish these goals. Our domain application is the correlation clustering of atmospheric mass spectrum data with 8 million data points of 450 dimensions each.
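The incremental idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a simple leader-style scheme in which each incoming point joins the existing cluster whose center it correlates with most strongly (Pearson correlation at or above a hypothetical `threshold`), and otherwise starts a new cluster. The function names, the threshold value, and the batch interface are all assumptions made for illustration. GPU acceleration and the MDS layout are omitted; plain Python is used for readability.

```python
import math


def standardize(row):
    """Center a vector and scale it to unit length.

    The dot product of two standardized vectors equals their Pearson correlation.
    A constant vector has no defined correlation; it is returned as all zeros.
    """
    mean = sum(row) / len(row)
    centered = [x - mean for x in row]
    norm = math.sqrt(sum(c * c for c in centered))
    if norm == 0.0:
        return centered
    return [c / norm for c in centered]


def incremental_correlation_clustering(batches, threshold=0.9):
    """Yield (labels, cluster_sizes) after each batch, so partial results
    are available early and are refined as more data arrives.

    Assumed scheme (illustrative only): a point joins the best-correlated
    cluster if the correlation is >= threshold, else it founds a new cluster.
    """
    sums = []    # per cluster: running sum of standardized members
    counts = []  # per cluster: number of members
    labels = []  # cluster index of every point seen so far
    for batch in batches:
        for row in batch:
            z = standardize(row)
            best, best_corr = -1, -2.0
            for k, s in enumerate(sums):
                center = standardize(s)
                corr = sum(c * v for c, v in zip(center, z))
                if corr > best_corr:
                    best, best_corr = k, corr
            if best >= 0 and best_corr >= threshold:
                sums[best] = [s + v for s, v in zip(sums[best], z)]
                counts[best] += 1
            else:
                sums.append(z)
                counts.append(1)
                best = len(sums) - 1
            labels.append(best)
        # Snapshot copies, so callers can keep earlier partial results.
        yield list(labels), list(counts)
```

Each yielded snapshot is a partial result that a front end could display while later batches are still being processed. In this sketch, points assigned in earlier batches are never reassigned; only cluster centers and sizes are updated.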
Keywords
data visualisation, graphics processing units, learning (artificial intelligence), pattern clustering, GPU-accelerated incremental correlation clustering, MDS, atmospheric mass spectrum data, big data preprocessing, big data processing, data reduction, graphics processing unit, large data clustering, multidimensional scaling, outliers detection, redundant data points detection, visual feedback, GPU, big data, clustering, correlation, visual analytics, visualization