Necessary And Frequent Terms In Queries

SIGIR '14: The 37th International ACM SIGIR Conference on Research and Development in Information Retrieval Gold Coast Queensland Australia July, 2014(2014)

引用 2|浏览29
暂无评分
摘要
Vocabulary mismatch has long been recognized as one of the major issues affecting search effectiveness. Ineffective queries usually fail to incorporate important terms and/or incorrectly include inappropriate keywords. However, in this paper we show another cause of reduced search performance: sometimes users issue reasonable query terms, but systems cannot identify the correct properties of those terms and take advantages of the properties. Specifically, we study two distinct types of terms that exist in all search queries: (1) necessary terms, for which term occurrence alone is indicative of document relevance; and (2) frequent terms, for which the relative term frequency is indicative of document relevance within the set of documents where the term appears. We evaluate these two properties of query terms in a dataset. Results show that only 1/3 of the terms are both necessary and frequent, while another 1/3 only hold one of the properties and the final third do not hold any of the properties. However, existing retrieval models do not clearly distinguish terms with the two properties and consider them differently. We further show the great potential of improving retrieval models by treating terms with distinct properties differently.
更多
查看译文
关键词
Query,term frequency,term occurrence
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要