IQN routing: integrating quality and novelty in P2P querying and ranking

ADVANCES IN DATABASE TECHNOLOGY - EDBT 2006(2006)

引用 42|浏览0
暂无评分
摘要
We consider a collaboration of peers autonomously crawling the Web. A pivotal issue when designing a peer-to-peer (P2P) Web search engine in this environment is query routing: selecting a small subset of (a potentially very large number of relevant) peers to contact to satisfy a keyword query. Existing approaches for query routing work well on disjoint data sets. However, naturally, the peers’ data collections often highly overlap, as popular documents are highly crawled. Techniques for estimating the cardinality of the overlap between sets, designed for and incorporated into information retrieval engines are very much lacking. In this paper we present a comprehensive evaluation of appropriate overlap estimators, showing how they can be incorporated into an efficient, iterative approach to query routing, coined Integrated Quality Novelty (IQN). We propose to further enhance our approach using histograms, combining overlap estimation with the available score/ranking information. Finally, we conduct a performance evaluation in MINERVA, our prototype P2P Web search engine.
更多
查看译文
关键词
web search engine,keyword query,query routing,query routing work,comprehensive evaluation,data collection,disjoint data set,information retrieval engine,iterative approach,performance evaluation,IQN routing,P2P querying
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要