Structure-Aware Visualization of Text Corpora.

CHIIR (2017)

Abstract
Comprehending the structure and content of large text corpora can be a daunting and often time-consuming task. In this paper, we introduce a novel tool that exploits the structural properties of a corpus to extract and visualize the underlying topics in a given dataset. To this end, we combine latent topic analysis, discriminative feature selection applied on top of the category structure of the corpus, and various ranking methods to extract the most representative topics. The visual representation used to depict the outcome of these methods can be chosen based on the context. Such visual representations can be useful for depicting trends, identifying "hot" topics, and discovering interesting patterns in the underlying data. As applications, we create example representations for a variety of corpora obtained from conference proceedings, movie summaries, and newsgroup postings. Our user experiments, which use a flower-like visualization inspired by the "wheel of emotion", demonstrate the viability of our approach for generating high-quality representative topics and for unearthing hidden structures and connections in large document corpora.
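
The abstract does not say which feature-selection or ranking method the tool uses, so the following is only an illustrative sketch of the general idea of picking discriminative terms on top of a category structure. It assumes a smoothed log-ratio of in-category to out-of-category term frequency as the score. The function name `discriminative_terms`, the `top_k` parameter, and the toy corpus are all hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def discriminative_terms(docs_by_category, top_k=3):
    """Rank terms per category by a smoothed log-ratio score.

    Assumption: this scoring rule is a stand-in chosen for illustration,
    not the method described in the paper.
    """
    # Term counts per category, using naive whitespace tokenization.
    counts = {
        cat: Counter(w for doc in docs for w in doc.lower().split())
        for cat, docs in docs_by_category.items()
    }

    # Corpus-wide term counts, used to derive out-of-category frequencies.
    total = Counter()
    for cat_counts in counts.values():
        total.update(cat_counts)
    vocab_size = len(total)
    total_tokens = sum(total.values())

    result = {}
    for cat, cat_counts in counts.items():
        n_in = sum(cat_counts.values())
        n_out = total_tokens - n_in
        scores = {}
        for term, freq in cat_counts.items():
            # Add-one smoothing keeps both probabilities strictly positive.
            p_in = (freq + 1) / (n_in + vocab_size)
            p_out = (total[term] - freq + 1) / (n_out + vocab_size)
            scores[term] = math.log(p_in / p_out)
        # Highest score first; ties are broken alphabetically.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        result[cat] = ranked[:top_k]
    return result


# Hypothetical toy corpus with two categories.
corpus = {
    "movies": ["the film has a great plot", "the actor leads the film"],
    "conferences": [
        "the paper presents a retrieval model",
        "the paper evaluates the model",
    ],
}
top = discriminative_terms(corpus, top_k=2)
```

Terms that occur equally often in every category, such as "the", score zero and drop below category-specific terms. A real pipeline would add proper tokenization and stop-word handling and would combine such scores with latent topic analysis, which this sketch omits.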