The Evaluation of Rating Systems in Online Free-for-All Games

arxiv(2020)

引用 1|浏览18
暂无评分
摘要
Online competitive games have become increasingly popular. To ensure an exciting and competitive environment, these games routinely attempt to match players with similar skill levels. Matching players is often accomplished through a rating system. There has been an increasing amount of research on developing such rating systems. However, less attention has been given to the evaluation metrics of these systems. In this paper, we present an exhaustive analysis of six metrics for evaluating rating systems in online competitive games. We compare traditional metrics such as accuracy. We then introduce other metrics adapted from the field of information retrieval. We evaluate these metrics against several well-known rating systems on a large real-world dataset of over 100,000 free-for-all matches. Our results show stark differences in their utility. Some metrics do not consider deviations between two ranks. Others are inordinately impacted by new players. Many do not capture the importance of distinguishing between errors in higher ranks and lower ranks. Among all metrics studied, we recommend Normalized Discounted Cumulative Gain (NDCG) because not only does it resolve the issues faced by other metrics, but it also offers flexibility to adjust the evaluations based on the goals of the system
更多
查看译文
关键词
Rating systems, Rank prediction, Evaluation, Free-for-all games
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要