Formal Constraints for Structured Document Retrieval

International Conference on the Theory of Information Retrieval (ICTIR), 2022

Abstract
The formalization of retrieval constraints for traditional (atomic) retrieval was a major milestone in information retrieval (IR) research. The aim of these constraints was to formalize IR heuristics which most retrieval models rely upon. In a similar fashion, this paper introduces constraints for structured document retrieval (SDR). Out of the many possible constraints, we focus on three that are shown to produce intuitive rankings in simple, but informative retrieval scenarios. It is shown that none of the widely used SDR models (BM25F, MLM, linear score aggregation) satisfy all three constraints. The underlying reason for this is shown to be the failure of existing models to balance between assuming independence of term occurrences across fields and considering the documents as atomic, rather than structured. The constraints introduced in this paper, together with the analysis of how they are satisfied by existing models, can be used to analytically reason about the behaviour of any SDR model in a variety of ranking scenarios.
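
The tension the abstract describes, between treating fields as independent and treating the document as atomic, can be seen in a toy single-term scenario. The sketch below is only an illustration under stated assumptions: it is not one of the paper's constraints, the function names (`saturate`, `linear_aggregation`, `bm25f_like`) are hypothetical, and IDF and length normalization are omitted so that only the aggregation order differs.

```python
def saturate(tf, k1=1.2):
    """BM25-style term-frequency saturation (no length normalization, no IDF)."""
    return tf * (k1 + 1) / (tf + k1)


def linear_aggregation(field_tfs, weights, k1=1.2):
    """Score each field on its own, then take a weighted sum.

    Saturation is applied per field, so occurrences in different fields
    are treated as independent evidence.
    """
    return sum(w * saturate(tf, k1) for tf, w in zip(field_tfs, weights))


def bm25f_like(field_tfs, weights, k1=1.2):
    """Combine weighted term frequencies first, then saturate once.

    After the weighted sum, the document is scored as a single unit,
    so the field in which an occurrence appeared no longer matters
    beyond its weight.
    """
    pseudo_tf = sum(w * tf for tf, w in zip(field_tfs, weights))
    return saturate(pseudo_tf, k1)


# Two documents with two fields (e.g. title, body) and equal field weights.
# Both contain the query term twice in total.
weights = [1.0, 1.0]
concentrated = [2, 0]  # both occurrences in one field
spread = [1, 1]        # one occurrence in each field

print("linear:", linear_aggregation(concentrated, weights),
      linear_aggregation(spread, weights))
print("bm25f-like:", bm25f_like(concentrated, weights),
      bm25f_like(spread, weights))
```

In this toy setting, linear aggregation scores the spread document higher, because each field saturates separately, while the BM25F-like variant cannot distinguish the two documents at all. Which behaviour is preferable in a given ranking scenario is the kind of question formal constraints are meant to settle.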