Independence diagrams: A technique for data visualization

JOURNAL OF ELECTRONIC IMAGING(2000)

Abstract
An important issue in data visualization is the recognition of complex dependencies between attributes. Past techniques for identifying attribute dependence include correlation coefficients, scatterplots, and equi-width histograms. These techniques are sensitive to outliers, and often are not sufficiently informative to identify the kind of attribute dependence present. We propose a new approach, which we call independence diagrams. We divide each attribute into ranges; for each pair of attributes, the combination of these ranges defines a two-dimensional grid. For each cell of this grid, we store the number of data items in it. We display the grid, scaling each attribute axis so that the displayed width of a range is proportional to the total number of data items within that range. The brightness of a cell is proportional to the density of data items in it. As a result, both attributes are independently normalized by frequency, ensuring insensitivity to outliers and skew, and allowing specific focus on attribute dependencies. Furthermore, independence diagrams provide quantitative measures of the interaction between two attributes, and allow formal reasoning about issues such as statistical significance. We have addressed several technical challenges in making independence diagrams work, ranging from the treatment of categorical attributes to visual artifacts of cell-to-pixel mapping. Our experimental evaluation, using both AT&T and synthetic data, shows that independence diagrams allow the easy identification of various kinds of attribute dependence that would be difficult to identify using conventional techniques. (C) 2000 SPIE and IS&T. [S1017-9909(00)01704-9].
Keywords
synthetic data,data visualization,statistical significance
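
The construction described in the abstract (ranges per attribute, a grid of counts, axis widths proportional to range totals, brightness proportional to cell density) can be sketched roughly as follows. This is an illustrative reading of the abstract, not the authors' implementation: the function names, the use of equal-frequency cut points to choose the ranges, and the normalization of brightness to a ratio against the independence expectation are all assumptions made here for the example. It also assumes numeric attributes and a non-empty data set.

```python
from bisect import bisect_right


def equi_depth_cuts(values, bins):
    """Interior cut points giving roughly equal item counts per range.

    Assumption: equal-frequency ranges; the abstract only says each
    attribute is divided into ranges.
    """
    s = sorted(values)
    n = len(s)
    return [s[(k * n) // bins] for k in range(1, bins)]


def independence_grid(xs, ys, bins=4):
    """Count the data items that fall in each cell of the 2-D grid."""
    x_cuts = equi_depth_cuts(xs, bins)
    y_cuts = equi_depth_cuts(ys, bins)
    counts = [[0] * bins for _ in range(bins)]
    for x, y in zip(xs, ys):
        counts[bisect_right(x_cuts, x)][bisect_right(y_cuts, y)] += 1
    return counts


def diagram_layout(counts):
    """Return axis widths, axis heights, and a per-cell density ratio.

    widths[i] and heights[j] are the fractions of items in each range,
    so a range's displayed extent is proportional to its item count.
    ratio[i][j] is the cell count divided by the count expected if the
    two attributes were independent (hypothetical normalization chosen
    for this sketch); brightness would be drawn proportional to it.
    """
    n = sum(map(sum, counts))
    row_tot = [sum(row) for row in counts]
    col_tot = [sum(col) for col in zip(*counts)]
    widths = [t / n for t in row_tot]
    heights = [t / n for t in col_tot]
    ratio = [
        [(c * n / (rt * ct)) if rt and ct else 0.0
         for c, ct in zip(row, col_tot)]
        for row, rt in zip(counts, row_tot)
    ]
    return widths, heights, ratio


if __name__ == "__main__":
    # Perfectly dependent attributes: all mass sits on the diagonal.
    data = list(range(1000))
    widths, heights, ratio = diagram_layout(independence_grid(data, data))
    print(widths)
    for row in ratio:
        print(row)
```

Under this normalization, a pair of independent attributes yields a ratio near 1 in every cell, so a uniformly lit diagram suggests independence and bright or dark regions suggest dependence. Because the cut points depend only on rank order, replacing one value with an extreme outlier leaves the axis widths essentially unchanged, which is the insensitivity to outliers and skew that the abstract refers to.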