Efficient Search Engine Measurements

TWEB, Article No. 18, 2011.

Keywords:
corpus size, costly rejection, approximate importance, document degree, new importance

Abstract:

We address the problem of externally measuring aggregate functions over documents indexed by search engines, such as corpus size, index freshness, and density of duplicates in the corpus. State-of-the-art estimators for such quantities [Bar-Yossef and Gurevich 2008b; Broder et al. 2006] are biased due to inaccurate approximation of the so ca...
