Application-Specific Schema Design for Storing Large RDF Datasets

PSSS(2003)

Abstract
Realizing the vision of the Semantic Web, a semantic model for encoding content on the World Wide Web, requires efficient storage and retrieval of large RDF data sets. A common technique for storing RDF data (graphs) is to use a single relational database table, a triple store, for the graph. However, we believe a single triple store cannot scale to the needs of large-scale applications. Instead, database schemas that can be customized for a particular dataset or application are required. To enable this, some RDF systems offer the ability to store RDF graphs across multiple tables. However, tools are needed to assist users in developing application-specific schemas. In this paper, we describe our approach to developing RDF storage schemas and two tools that assist in schema development. The first is a synthetic data generator that generates large RDF graphs consistent with an underlying ontology, using data distributions and relationships specified by a user. The second tool mines an RDF graph or an RDF query log for frequently occurring patterns. Knowledge of these patterns can be applied to schema design or caching strategies to improve performance. The tools are being developed as part of the Jena Semantic Web programmers' toolkit, but they are generic and can be used with other RDF stores. Preliminary results with these tools on real data sets are also presented.
Keywords
data mining, schema design, sequential pattern mining, RDF, storage tuning, semantic web, synthetic data, relational database, semantic model, world wide web