Clustering Search Engine Suggests By Integrating A Topic Model And Word Embeddings

2017 18th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD 2017)

Abstract
This paper addresses the problem of how to obtain an overview of the knowledge associated with a given query keyword. In particular, we focus on the concerns of people who search for Web pages with that keyword. We collect the Web search information needs for a query keyword through search engine suggests. Given a query keyword, we collect up to around 1,000 suggests, many of which are redundant. We cluster these redundant search engine suggests based on a topic model. However, one limitation of topic-model-based clustering is that the granularity of the topics, i.e., the clusters of search engine suggests, is too coarse. To overcome this problem, we further apply a word embedding technique to the Web pages used to train the topic model, in addition to the text of the entire Japanese version of Wikipedia. We then examine the word-embedding-based similarity between search engine suggests and further classify the suggests within a single topic into finer-grained subtopics based on that similarity. Evaluation results show that the proposed approach performs well in the task of subtopic clustering of search engine suggests.
Keywords
Search Engine Suggest, Overview, Clustering, Topic Model, Word Embeddings
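
The subtopic step described in the abstract (splitting the suggests of one topic by word embedding similarity) can be illustrated with a minimal sketch. The abstract does not specify the exact procedure, so everything here is an assumption made for illustration: the averaging of word vectors per suggest, the greedy threshold-based grouping, the threshold value, the function names, and the toy two-dimensional vectors, which stand in for embeddings that would in practice be trained on a large corpus.

```python
import math

# Hypothetical toy embeddings; real vectors would come from a trained model.
TOY_EMBEDDINGS = {
    "camera": [0.5, 0.5],
    "price": [1.0, 0.1],
    "cost": [0.9, 0.2],
    "repair": [0.1, 1.0],
    "fix": [0.2, 0.9],
}

def mean_vector(words, embeddings):
    """Average the vectors of the words that have an embedding."""
    vecs = [embeddings[w] for w in words if w in embeddings]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def split_into_subtopics(suggests, embeddings, threshold=0.9):
    """Greedily group the suggests of one topic into subtopics.

    Assumption: a suggest joins the first subtopic whose representative
    (the vector of its first member) is at least `threshold` similar;
    otherwise it starts a new subtopic.
    """
    clusters = []  # list of (representative vector or None, members)
    for suggest in suggests:
        vec = mean_vector(suggest.split(), embeddings)
        if vec is None:
            # No known words: keep the suggest in its own subtopic.
            clusters.append((None, [suggest]))
            continue
        for rep, members in clusters:
            if rep is not None and cosine(rep, vec) >= threshold:
                members.append(suggest)
                break
        else:
            clusters.append((vec, [suggest]))
    return [members for _, members in clusters]

if __name__ == "__main__":
    topic = ["camera price", "camera cost", "camera repair", "camera fix"]
    for subtopic in split_into_subtopics(topic, TOY_EMBEDDINGS):
        print(subtopic)
```

With these toy vectors, the two price-related suggests fall into one subtopic and the two repair-related suggests into another. A greedy single pass is order-dependent; it is used here only because it keeps the sketch short, not because the paper is known to use it.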