Efficient Query Delegation by Detecting Redundant Retrieval Strategies
msra(2007)
Abstract
The task of combining the output of several retrieval strategies into a single relevance prediction per document is known as data fusion. The LETOR dataset provides three corpora with predictions of 25 or 44 strategies (depending on the corpus) per document/query pair. Given such a large number of basic strategies, a point which is equally crucial as optimality of the combination, in our view, is its sparseness: Which strategies should be used in a real application when each strategy consumes resources? We hence focus on the question of "query delegation", a special case of weighting strategies: Which strategies should be weighted greater than zero, i.e., asked in the first place? We propose several similarity measures between strategies like various correlation measures or precision@n. Assuming that similar strategies may not contribute much to each other's results, we perform a clustering based on these similarities and only consider the best representative of each cluster. We show that this fusion strategy performs comparably to other fusion approaches like RankSVM or RankBoost, but only needs to consult a fraction of the available retrieval strategies.
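The selection idea described in the abstract (measure similarity between strategies, cluster similar ones, keep the best representative per cluster) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the strategy names, the toy scores, the use of Pearson correlation as the similarity measure, single-link clustering with a fixed threshold, and precision@n as the criterion for choosing a representative.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant score list carries no ranking signal
    return cov / sqrt(vx * vy)


def cluster_strategies(scores, threshold):
    """Single-link clustering (assumed): merge strategies whose
    correlation is at least `threshold`, using a small union-find."""
    names = list(scores)
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(scores[a], scores[b]) >= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return list(clusters.values())


def precision_at_n(strategy_scores, relevance, n):
    """Fraction of relevant documents among the n highest-scored ones."""
    ranked = sorted(range(len(strategy_scores)),
                    key=lambda i: strategy_scores[i], reverse=True)
    return sum(relevance[i] for i in ranked[:n]) / n


def select_representatives(scores, relevance, n, threshold):
    """Keep only the strategy with the best precision@n in each cluster."""
    clusters = cluster_strategies(scores, threshold)
    return sorted(
        max(c, key=lambda s: precision_at_n(scores[s], relevance, n))
        for c in clusters
    )


# Hypothetical toy data: three strategies scoring four documents of one query.
scores = {
    "bm25": [3.0, 2.0, 1.0, 0.5],
    "bm25_title": [2.1, 2.9, 1.1, 0.4],
    "pagerank": [0.1, 0.9, 0.2, 0.8],
}
relevance = [0, 1, 1, 0]  # binary relevance labels, also hypothetical

selected = select_representatives(scores, relevance, n=1, threshold=0.7)
print(selected)
```

In this toy setting the two BM25 variants are strongly correlated and fall into one cluster, so only one of them would need to be queried. A rank correlation measure could be substituted for `pearson` without changing the rest of the sketch.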
Keywords
fusion,correlation,ranking,clustering,data fusion,rank correlation