How Train–Test Leakage Affects Zero-Shot Retrieval

String Processing and Information Retrieval (2022)

Abstract
Neural retrieval models are often trained on (subsets of) the millions of queries of the MS MARCO/ORCAS datasets and then tested on the 250 Robust04 queries or other TREC benchmarks with often only 50 queries. In such setups, many of the few test queries can be very similar to queries from the huge training data—in fact, 69% of the Robust04 queries have near-duplicates in MS MARCO/ORCAS. We investigate the impact of this unintended train–test leakage by training neural retrieval models on combinations of a fixed number of MS MARCO/ORCAS queries that are very similar to actual test queries and an increasing number of other queries. We find that leakage can improve effectiveness and even change the ranking of systems. However, these effects diminish as the extent of leakage in the training data becomes smaller, and thus more realistic.
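The abstract's core measurement, finding test queries with near-duplicates in the training pool, can be illustrated with a minimal sketch. The paper does not specify its similarity criterion; the string-ratio measure and the 0.9 threshold below are hypothetical stand-ins:

```python
# Hedged sketch: flag test queries that have a near-duplicate in the
# training query pool. Similarity measure (difflib ratio over lowercased
# strings) and the 0.9 threshold are illustrative assumptions, not the
# paper's actual criterion.
from difflib import SequenceMatcher


def near_duplicates(test_queries, train_queries, threshold=0.9):
    """Return the test queries whose best match in train_queries
    reaches the similarity threshold."""
    leaked = []
    for tq in test_queries:
        t = tq.lower().strip()
        for rq in train_queries:
            if SequenceMatcher(None, t, rq.lower().strip()).ratio() >= threshold:
                leaked.append(tq)
                break  # one near-duplicate suffices to count as leakage
    return leaked


train = ["what is information retrieval", "how do neural networks work"]
test = ["What is Information Retrieval?", "effects of caffeine on sleep"]
print(near_duplicates(test, train))  # only the first test query is flagged
```

The fraction `len(leaked) / len(test_queries)` then corresponds to the kind of leakage rate the abstract reports (69% for Robust04 against MS MARCO/ORCAS).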
Keywords
Neural information retrieval, Train–test leakage, BERT, T5