Classification And Focused Crawling For Semistructured Data
INTELLIGENT SEARCH ON XML DATA: APPLICATIONS, LANGUAGES, MODELS, IMPLEMENTATIONS, AND BENCHMARKS (2003)
Abstract
Despite the great advances in XML data management and querying, the currently prevalent XPath- or XQuery-centric approaches
face severe limitations when applied to XML documents in large intranets, digital libraries, federations of scientific data
repositories, and ultimately the Web. In such environments, data has much more diverse structure and annotations than in a
business-data setting and there is virtually no hope for a common schema or DTD that all the data complies with. Without a
schema, however, database-style querying would often produce either empty result sets, namely, when queries are overly specific,
or way too many results, namely, when search predicates are overly broad, the latter being the result of the user not knowing
enough about the structure and annotations of the data.
Keywords
image processing, pattern recognition, scientific data, data management, classification, digital library, xml document, speech processing