Rapid and robust ranking of text documents in a dynamically changing corpus

AICCSA '08 Proceedings of the 2008 IEEE/ACS International Conference on Computer Systems and Applications (2008)

Abstract
Ranking documents in a selected corpus plays an important role in information retrieval systems. Despite notable advances in this direction, text documents accumulate continuously, and maintaining an up-to-date ordering among documents in the domains of interest remains a challenging task. Conventional approaches produce an ordering that is valid only within a given corpus. With such approaches, the ordering must therefore be completely redone whenever documents are added to or deleted from the corpus. In this paper, we introduce a corpus-independent framework for rapid ordering of documents in a dynamically changing corpus. As in many practical approaches, our framework uses a similarity measure in some metric space to indicate the degree of relevance of a document to the domain of interest. However, unlike in corpus-dependent approaches, the relevance score of a document remains valid as changes are introduced into the corpus (insertion of new documents, for example), thus allowing rapid ordering within the corpus. This paper details, in particular, a statistical approach to computing such relevance scores.
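
The corpus-independent idea can be illustrated with a minimal sketch. It is not the statistical approach of the paper, which the abstract does not specify. As an assumption, it scores each document by cosine similarity between its term counts and a fixed domain profile; the names `DynamicRanking`, `term_vector`, and `cosine` are hypothetical. Because a score depends only on the document and the profile, an insertion or deletion leaves every other score untouched, and the ordering is maintained by a sorted insert instead of a full re-ranking.

```python
import bisect
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term counts (lowercased, whitespace-tokenized)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class DynamicRanking:
    """Ordering of documents by a score that ignores the rest of the corpus.

    Assumption: the domain of interest is summarized by one fixed profile
    text. This stands in for the paper's statistical relevance score.
    """

    def __init__(self, domain_profile_text):
        self.profile = term_vector(domain_profile_text)
        self._scores = {}    # doc_id -> relevance score
        self._entries = []   # sorted list of (-score, doc_id)

    def insert(self, doc_id, text):
        # The score uses only this document and the fixed profile.
        score = cosine(term_vector(text), self.profile)
        self._scores[doc_id] = score
        bisect.insort(self._entries, (-score, doc_id))
        return score

    def delete(self, doc_id):
        # Locate the entry by its stored score; other entries are untouched.
        key = (-self._scores.pop(doc_id), doc_id)
        index = bisect.bisect_left(self._entries, key)
        del self._entries[index]

    def score_of(self, doc_id):
        return self._scores[doc_id]

    def ranked_ids(self):
        """Document ids from most to least relevant."""
        return [doc_id for _, doc_id in self._entries]


ranking = DynamicRanking("information retrieval ranking documents")
ranking.insert("a", "ranking documents for information retrieval")
ranking.insert("b", "cooking pasta recipes")
ranking.insert("c", "ranking of football teams")
print(ranking.ranked_ids())
ranking.insert("d", "information retrieval")
print(ranking.ranked_ids())
```

By contrast, a corpus-dependent weight such as inverse document frequency changes for existing documents whenever the corpus changes, which is why conventional orderings have to be recomputed.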
Keywords
important role, challenging task, ranking document, new document, text document, relevance score, corpus-dependent approach, robust ranking, conventional approach, selected corpus, corpus-independent framework, metric space, information retrieval systems