Optimized Query Execution in Large Search Engines with Global Page Ordering.

VLDB '03: Proceedings of the 29th international conference on Very large data bases - Volume 29 (2003)

Abstract
Large web search engines have to answer thousands of queries per second with interactive response times. A major factor in the cost of executing a query is given by the lengths of the inverted lists for the query terms, which increase with the size of the document collection and are often in the range of many megabytes. To address this issue, IR and database researchers have proposed pruning techniques that compute or approximate term-based ranking functions without scanning over the full inverted lists. Over the last few years, search engines have incorporated new types of ranking techniques that exploit aspects such as the hyperlink structure of the web or the popularity of a page to obtain improved results. We focus on the question of how such techniques can be efficiently integrated into query processing. In particular, we study pruning techniques for query execution in large engines in the case where we have a global ranking of pages, as provided by Pagerank or any other method, in addition to the standard term-based approach. We describe pruning schemes for this case and evaluate their efficiency on an experimental cluster-based search engine with million web pages. Our results show that there is significant potential benefit in such techniques.
Keywords
pruning technique,query execution,query processing,query term,approximate term-based ranking function,experimental cluster-based search engine,global ranking,large web search engine,million web page,pruning scheme,Optimized query execution,global page,large search engine