TopicMine: User-Guided Topic Mining by Category-Oriented Embedding

user-5ebe28d54c775eda72abcdf7(2019)

Abstract
With an ever-increasing volume of textual data coming from news reports, social media, literature articles, and medical records, it has become necessary to distill knowledge from text by categories that reflect users’ interests. For example, given a general news corpus, one user may want to organize articles by country, whereas another may want to browse them by theme. In either case, the user’s interest can be described simply by a set of category names. In this project, we develop a framework, TopicMine, which takes user-provided category names as guidance and mines category-representative phrases to form coherent topics. Specifically, TopicMine first uses a phrase mining tool to extract quality phrases from the text corpus, and then learns an embedding space that best separates the categories specified by the user. Finally, category-representative phrases are retrieved by considering both topic relevance and semantic generality. The mined topics, each identified by its category-representative phrases, support effective and efficient understanding, organizing, searching, and summarizing of textual content according to users’ needs.
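To make the final retrieval step more concrete, the sketch below ranks phrases for each user-provided category name. It is only an illustration under strong assumptions, not the TopicMine implementation: the phrase and category embeddings are hand-made toy vectors rather than learned ones, relevance is approximated by cosine similarity, and the margin score is a stand-in for category separation. Phrase mining, embedding learning, and semantic generality are not modeled. All names (`cosine`, `representative_phrases`, `CATEGORIES`, `PHRASES`) are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def representative_phrases(categories, phrases, top_k=2):
    """Assign each phrase to its most similar category, then rank by margin.

    The margin is the similarity to the best category minus the similarity
    to the runner-up, so phrases that clearly belong to one category rank
    above phrases that sit between categories. This scoring rule is an
    assumption made for illustration.
    """
    ranked = {name: [] for name in categories}
    for phrase, vec in phrases.items():
        sims = {name: cosine(vec, cvec) for name, cvec in categories.items()}
        best = max(sims, key=sims.get)
        others = [s for name, s in sims.items() if name != best]
        margin = sims[best] - (max(others) if others else 0.0)
        ranked[best].append((margin, phrase))
    return {
        name: [p for _, p in sorted(items, reverse=True)[:top_k]]
        for name, items in ranked.items()
    }


# Toy 2-d embeddings, invented for this example only.
CATEGORIES = {"sports": (1.0, 0.0), "politics": (0.0, 1.0)}
PHRASES = {
    "world cup": (0.9, 0.1),
    "penalty kick": (0.8, 0.3),
    "general election": (0.1, 0.95),
    "press conference": (0.5, 0.6),
}

topics = representative_phrases(CATEGORIES, PHRASES)
for name, top in topics.items():
    print(name, top)
```

In this toy setup, "press conference" lies between the two category vectors, so it receives a small margin and ranks below "general election". A fuller version would replace the toy vectors with embeddings learned from the corpus and add a generality term to the score.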