Data-driven transforms for exploration, visualization and classification of high-dimensional data (2010)

Abstract
Advances in data recording techniques allow the collection of massive amounts of data, often accompanied by external metadata. To gain a full understanding of these datasets, the metadata needs to be incorporated into the analysis. This dissertation focuses on data-driven transforms: analyzing their effects, creating transforms that incorporate metadata directly into the dataset topology, and studying their applications. We study transform effects using a set of new visual methods for analyzing dataset structure. The methods analyze feature distribution, topological structure, and estimates of whether the structure carries significant class information. We apply them to explore both the structure of a dataset and the effects of data-driven transforms. We also propose data-driven transforms that incorporate metadata directly into the dataset topology. One such transform, the force feature space (FFS) transform, modifies the dataset topology based on class metadata to emphasize similarities between points in the same class and enhance class separability. FFS can be tailored to any dataset by changing the force definitions or adjusting the parameters. Combined with a low-dimensional projection, FFS transforms increase the quality of visualizations. When used for classification, FFS offers alternative approaches that increase correctness and reliability. Analysis of attractive and repulsive forces can be used to increase the quality of feature detection. Data-driven transforms provide alternative views of the dataset, revealing properties hidden in the original space. Understanding the effects and potential of data-driven transforms allows for better exploration of the transform space and increases the quality of analysis.
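
The abstract does not give the FFS force definitions, so the following is only a minimal sketch of the general idea of a class-driven force step, not the dissertation's method. It assumes a spring-like attraction between points of the same class and an inverse-distance repulsion between points of different classes; the function name `force_step` and the parameters `attract`, `repel`, and `eps` are hypothetical and introduced purely for illustration.

```python
import numpy as np


def force_step(X, y, attract=0.1, repel=0.1, eps=1e-12):
    """One illustrative force update (assumed forces, not the FFS definitions).

    Same-class points pull each other together; different-class points
    push each other apart. Returns the displaced copy of X.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    # diff[i, j] is the vector from point i to point j.
    diff = X[None, :, :] - X[:, None, :]
    dist = np.linalg.norm(diff, axis=2)

    same = y[:, None] == y[None, :]
    np.fill_diagonal(same, False)  # a point exerts no force on itself
    other = y[:, None] != y[None, :]

    # Assumed attraction: mean displacement toward same-class points.
    n_same = np.maximum(same.sum(axis=1), 1)[:, None]
    f_att = (diff * same[:, :, None]).sum(axis=1) / n_same

    # Assumed repulsion: mean push away from other-class points,
    # with magnitude falling off as 1 / distance.
    n_other = np.maximum(other.sum(axis=1), 1)[:, None]
    weight = other / (dist ** 2 + eps)
    f_rep = -(diff * weight[:, :, None]).sum(axis=1) / n_other

    return X + attract * f_att + repel * f_rep
```

Calling `force_step(X, y)` repeatedly on a feature matrix `X` with class labels `y` would, under these assumed forces, tighten each class and widen the gap between classes before a low-dimensional projection is applied. Changing the two force terms or their weights corresponds loosely to the tailoring of force definitions and parameters that the abstract mentions.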
Keywords
high-dimensional data, class metadata, significant class information, force feature space, class separability, external metadata, feature distribution, feature detection, effect analysis, dataset topology, dataset structure