A New Term Frequency Normalization Model for Probabilistic Information Retrieval.

SIGIR (2018)

Abstract
In probabilistic BM25, term frequency normalization is one of the key components. It is often controlled by the parameters $k_1$ and $b$, which need to be optimized for each given data set. In this paper, we assume and show empirically that term frequency normalization should be specific to query length in order to optimize retrieval performance. Following this intuition, we first propose a new term frequency normalization with query length for probabilistic information retrieval, namely $\text{BM25}_{\text{QL}}$. Then $\text{BM25}_{\text{QL}}$ is incorporated into the state-of-the-art models $\text{CRTER}_2$ and LDA-BM25, denoted as $\text{CRTER}_2^{\text{QL}}$ and $\text{LDA-BM25}_{\text{QL}}$ respectively. A series of experiments show that our proposed approaches $\text{BM25}_{\text{QL}}$, $\text{CRTER}_2^{\text{QL}}$ and $\text{LDA-BM25}_{\text{QL}}$ are comparable to BM25, $\text{CRTER}_2$ and LDA-BM25 with the optimal $b$ setting in terms of MAP on all the data sets.
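For orientation, the role of $k_1$ and $b$ in the baseline can be illustrated with a minimal sketch of the widely used BM25 term frequency component. This is the standard textbook form, not the query-length-specific normalization proposed in the paper, whose formula the abstract does not give. The function name `bm25_tf` and the defaults `k1=1.2` and `b=0.75` are assumptions chosen for illustration.

```python
def bm25_tf(tf, doc_len, avg_doc_len, k1=1.2, b=0.75):
    """Standard BM25 term frequency component (illustrative sketch).

    tf           raw count of the term in the document
    doc_len      length of the document (in tokens)
    avg_doc_len  average document length in the collection
    k1           controls how quickly the term frequency saturates
    b            controls how strongly document length is normalized
                 (b = 0: no length normalization, b = 1: full)
    """
    if tf <= 0:
        return 0.0
    # Length normalization factor: equals 1 for an average-length document.
    length_norm = 1.0 - b + b * (doc_len / avg_doc_len)
    # Saturating term frequency: bounded above by k1 + 1.
    return tf * (k1 + 1.0) / (tf + k1 * length_norm)


if __name__ == "__main__":
    # Same raw tf, different document lengths (hypothetical numbers).
    for doc_len in (50, 100, 200):
        print(doc_len, bm25_tf(tf=3, doc_len=doc_len, avg_doc_len=100))
```

With $b > 0$, the same raw count contributes less in a longer document. With $b = 0$, document length has no effect, which is why $b$ is usually tuned per data set.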
Keywords
Term Frequency Normalization, BM25, Probabilistic Model