Toward Estimating The Rank Correlation Between The Test Collection Results And The True System Performance

SIGIR '16: The 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, Pisa, Italy, July 2016

Abstract
The Kendall tau and AP rank correlation coefficients have become mainstream in Information Retrieval research for comparing the rankings of systems produced by two different evaluation conditions, such as different effectiveness measures or pool depths. In this paper, however, we focus on the expected rank correlation between the mean scores observed with a test collection and the true, unobservable means under the same conditions. In particular, we propose statistical estimators of the tau and AP correlations following both parametric and non-parametric approaches, with special emphasis on small topic sets. Through large-scale simulation with TREC data, we study the error and bias of the estimators. In general, such estimates of expected correlation with the true ranking may accompany the results reported from an evaluation experiment as an easy-to-understand figure of reliability. All the results in this paper are fully reproducible with data and code available online.
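For readers unfamiliar with the two coefficients named in the abstract, the following is a minimal sketch of how the sample Kendall tau and AP correlation can be computed between two sets of mean scores for the same systems. It is not the paper's estimator of the expected correlation with the true ranking; it only illustrates the coefficients themselves. The function names `kendall_tau` and `tau_ap` and the score values are hypothetical, and the sketch assumes there are no tied scores.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall tau between two score lists over the same systems (assumes no ties)."""
    n = len(x)
    s = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        # +1 for a concordant pair, -1 for a discordant pair
        s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)


def tau_ap(observed, reference):
    """AP correlation of the observed ranking with respect to the reference ranking."""
    n = len(observed)
    # System indices sorted by observed score, best first
    order = sorted(range(n), key=lambda k: observed[k], reverse=True)
    total = 0.0
    for i in range(1, n):
        cur = order[i]
        # Systems placed above `cur` by the observed scores that the
        # reference scores also place above it
        correct = sum(1 for k in order[:i] if reference[k] > reference[cur])
        total += correct / i
    return 2 * total / (n - 1) - 1


# Hypothetical mean scores for four systems
observed = [0.30, 0.25, 0.20, 0.10]
swap_top = [0.25, 0.30, 0.20, 0.10]  # reference disagrees on the top two systems
swap_mid = [0.30, 0.20, 0.25, 0.10]  # reference disagrees on the middle two systems

print(kendall_tau(observed, swap_top), tau_ap(observed, swap_top))
print(kendall_tau(observed, swap_mid), tau_ap(observed, swap_mid))
```

In this toy example both swaps produce the same Kendall tau, while the AP correlation is lower for the swap at the top of the ranking, which reflects its heavier weighting of errors among the best systems.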
Keywords
Evaluation, Test Collection, Correlation, Kendall, Average Precision, Estimation