Exploiting Disambiguated Thesauri For Information Retrieval In Metadata Catalogs

Current Topics in Artificial Intelligence (2004)

Abstract
Information in Digital Libraries is explicitly organized, described, and managed. The content of their data resources is summarized in short descriptions, usually called metadata, which can be entered manually or generated automatically. In this context, specialized thesauri are frequently used to provide accurate content for subject or keyword metadata elements. However, if a Digital Library aims to provide access to the general public, it is not reasonable to assume that casual users will query with the same terms that appear as keywords in metadata records. As an initial step towards bridging the semantic gap between user queries and metadata records, the authors previously developed a method for the semantic disambiguation of thesauri with respect to an upper-level ontology (WordNet). This paper now presents the integration of that disambiguation into an information retrieval system, in this case by adapting the vector-space retrieval model. Thanks to the disambiguation, both metadata records and queries can be represented homogeneously as collections of WordNet synsets, which makes it possible to compute a similarity value that is used to rank the results.
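The ranking idea described in the abstract can be illustrated with a minimal sketch. It assumes that queries and metadata records have already been mapped to WordNet synsets, and it uses raw synset counts with cosine similarity as the vector-space measure; the abstract does not specify the weighting scheme or similarity function, so both choices are assumptions. The synset identifiers, record names, and function names below are hypothetical placeholders.

```python
import math
from collections import Counter


def synset_vector(synsets):
    """Represent a query or record as a bag of synset identifiers."""
    return Counter(synsets)


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse synset-count vectors."""
    dot = sum(weight * b[synset] for synset, weight in a.items() if synset in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector matches nothing
    return dot / (norm_a * norm_b)


def rank(query_synsets, records):
    """Return (record_id, score) pairs, best match first."""
    query = synset_vector(query_synsets)
    scored = [
        (record_id, cosine_similarity(query, synset_vector(synsets)))
        for record_id, synsets in records.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical data: synset identifiers are illustrative, not taken from the paper.
records = {
    "rec-rivers": ["river.n.01", "basin.n.03"],
    "rec-roads": ["road.n.01", "map.n.01"],
}
ranked = rank(["river.n.01"], records)
for record_id, score in ranked:
    print(record_id, round(score, 3))
```

Because both sides are expressed in the same synset vocabulary, a query term and a thesaurus keyword that differ as strings can still match when they resolve to the same synset.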
Keywords
information retrieval system, vector space, semantic gap, information retrieval, digital library