Assessing ranking metrics in top-N recommendation

Information Retrieval Journal(2020)

Abstract
The evaluation of recommender systems is an area with open questions at several levels. Choosing the appropriate evaluation metric is one such important issue. Ranking accuracy is generally identified as a prerequisite for recommendation to be useful. Ranking metrics have been adapted for this purpose from the information retrieval field to the recommendation task. In this article, we undertake a principled analysis of the robustness and the discriminative power of different ranking metrics for the offline evaluation of recommender systems, drawing from previous studies in the information retrieval field. We measure the robustness to different sources of incompleteness that arise from the sparsity and popularity biases in recommendation. Among other results, we find that precision provides high robustness while normalized discounted cumulative gain offers the best discriminative power. In dealing with cold users, we also find that the geometric mean is more robust than the arithmetic mean as an aggregation function over users.
Keywords
Recommender systems, Top-N recommendation, Evaluation, Ranking metrics, Robustness, Discriminative power
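
To make the quantities named in the abstract concrete, the following is a minimal sketch of precision and normalized discounted cumulative gain at a cutoff k, together with the arithmetic and geometric means as aggregation functions over users. It is not the authors' implementation. The function names, the binary-relevance assumption, the log2 discount, and the smoothing constant `eps` in the geometric mean are all assumptions made for illustration.

```python
import math


def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k


def ndcg_at_k(ranked, relevant, k):
    """nDCG@k under binary relevance with a log2 rank discount (assumed form)."""
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(ranked[:k])
        if item in relevant
    )
    # Ideal ranking: all relevant items placed at the top of the list.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def arithmetic_mean(values):
    """Plain average of per-user metric values."""
    return sum(values) / len(values)


def geometric_mean(values, eps=1e-6):
    """Geometric mean of per-user values.

    eps is a hypothetical smoothing constant so that a single zero
    score does not drive the whole aggregate to zero.
    """
    log_sum = sum(math.log(v + eps) for v in values)
    return math.exp(log_sum / len(values)) - eps


# Hypothetical toy data: one ranked list and its set of relevant items.
ranked_items = ["a", "b", "c", "d"]
relevant_items = {"a", "c"}
p = precision_at_k(ranked_items, relevant_items, 4)
n = ndcg_at_k(ranked_items, relevant_items, 4)

# Hypothetical per-user scores, with one low-scoring user.
per_user_scores = [0.9, 0.1]
am = arithmetic_mean(per_user_scores)
gm = geometric_mean(per_user_scores)
```

The geometric mean weights low per-user scores more heavily than the arithmetic mean does, which is why the two aggregates can rank systems differently when some users have very few relevant items.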