Two-Step SPLADE: Simple, Efficient and Effective Approximation of SPLADE
European Conference on Information Retrieval (2024)
Abstract
Learned sparse models such as SPLADE have successfully shown how to
incorporate the benefits of state-of-the-art neural information retrieval
models into the classical inverted index data structure. Despite their
improvements in effectiveness, learned sparse models are not as efficient as
classical sparse models such as BM25. This problem has been investigated and
addressed by recently developed strategies, such as guided traversal query
processing and static pruning, with different degrees of success on in-domain
and out-of-domain datasets. In this work, we propose a new query processing
strategy for SPLADE based on a two-step cascade. The first step uses a pruned
and reweighted version of the SPLADE sparse vectors, and the second step uses
the original SPLADE vectors to re-score a sample of documents retrieved in the
first step. Our extensive experiments, performed on 30 different in-domain and
out-of-domain datasets, show that our proposed strategy is able to improve mean
and tail response times over the original single-stage SPLADE processing by up
to 30× and 40×, respectively, for in-domain datasets, and by 12× to 25× for
mean response times on out-of-domain datasets, while incurring no statistically
significant difference in effectiveness on 60% of the datasets.
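The two-step cascade described above can be sketched in a few lines. The snippet below is an illustrative toy, not the paper's implementation: sparse vectors are plain term-to-weight dicts, `prune` keeps the top-k heaviest query terms (the paper's actual pruning and reweighting scheme may differ and may also apply to documents), and the second step re-scores a small candidate sample with the original, unpruned vectors.

```python
def prune(vec, k):
    """Keep only the k largest-weight terms of a sparse vector (term -> weight).
    Stand-in for the paper's pruning/reweighting of SPLADE vectors."""
    return dict(sorted(vec.items(), key=lambda kv: -kv[1])[:k])

def score(query, doc):
    """Sparse dot product between a query vector and a document vector."""
    return sum(w * doc[t] for t, w in query.items() if t in doc)

def two_step_search(query, docs, k_prune=2, sample_size=3, top_n=2):
    # Step 1: retrieve a candidate sample with the pruned (cheaper) vectors.
    q_pruned = prune(query, k_prune)
    sample = sorted(docs, key=lambda d: -score(q_pruned, docs[d]))[:sample_size]
    # Step 2: re-score only those candidates with the original full vectors.
    return sorted(sample, key=lambda d: -score(query, docs[d]))[:top_n]

# Toy collection: document ids mapped to sparse term-weight vectors.
docs = {
    "d1": {"a": 1.0, "b": 2.0},
    "d2": {"a": 3.0},
    "d3": {"c": 1.0},
}
query = {"a": 1.0, "b": 0.5, "c": 0.1}
print(two_step_search(query, docs))  # ['d2', 'd1']
```

In a real system, step 1 would run over an inverted index built from the pruned vectors, so the expensive full vectors are only consulted for the small re-scoring sample.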