The BINGO! Focused Crawler: From Bookmarks to Archetypes
ICDE (2002)
Abstract
...above (i.e., viewing the query terms as an initial training document). According to [CBD99a], the key components of a focused crawler are a document classifier to test whether a visited document fits into one of the specified topics of interest, and a distiller to identify the best URLs for the crawl frontier (i.e., those hyperlinks in already visited documents that, when traversed, promise the best results in the continuation of the crawl). Obviously the distiller should be aware of the...
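The interplay of the two components named in the abstract can be illustrated with a toy crawl loop. This is a minimal sketch under stated assumptions, not the BINGO! implementation: the in-memory `TOY_WEB`, the `TOPIC_TERMS` set, the keyword-overlap `classify` function (a stand-in for a trained classifier), and the rule of ranking frontier URLs by the score of the page that links to them (a stand-in for a distiller) are all hypothetical.

```python
import heapq

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "a": ("focused crawler topic classifier", ["b", "c"]),
    "b": ("crawler frontier distiller links", ["d"]),
    "c": ("cooking recipes pasta", ["e"]),
    "d": ("support vector machines classifier training", []),
    "e": ("pasta sauce", []),
}

# Hypothetical topic description used by the stand-in classifier.
TOPIC_TERMS = {"crawler", "classifier", "frontier", "training"}


def classify(text):
    """Stand-in classifier: fraction of topic terms present in the page."""
    words = set(text.split())
    return len(words & TOPIC_TERMS) / len(TOPIC_TERMS)


def focused_crawl(seeds, threshold=0.25, max_pages=10):
    """Visit pages in order of frontier priority and keep on-topic ones."""
    # heapq is a min-heap, so priorities are stored negated.
    frontier = [(-1.0, url) for url in seeds]
    heapq.heapify(frontier)
    seen = set(seeds)
    accepted = []
    visited = 0
    while frontier and visited < max_pages:
        _, url = heapq.heappop(frontier)
        visited += 1
        text, links = TOY_WEB[url]
        score = classify(text)
        if score < threshold:
            continue  # off-topic page: its links are not added to the frontier
        accepted.append(url)
        # Stand-in distiller: a link inherits the score of its source page.
        for link in links:
            if link not in seen and link in TOY_WEB:
                seen.add(link)
                heapq.heappush(frontier, (-score, link))
    return accepted


if __name__ == "__main__":
    print(focused_crawl(["a"]))
```

In this sketch, page `e` is never visited because the only page linking to it (`c`) is rejected by the classifier, which is the pruning effect a focused crawler relies on.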
Keywords
focused crawler, archetypes, support vector machines, training data, classification, search engines, ontologies, world wide web