Static analysis of XML transformation and schema languages (2006)

Abstract
XML (eXtensible Markup Language) has evolved into the standard data exchange format for the World Wide Web. Its main advantages are that it offers an intuitive and standard way of structuring a very wide range of data and that it admits user-defined tags. The latter allows user communities to develop their own XML document formats, each defined by an XML schema. The presence of such a schema improves the efficiency of many tasks, for instance query processing, query optimization, and automatic data integration.

The dissertation is divided into two parts. The first part studies the typechecking problem for XML-to-XML transformations. The typechecking problem asks, given an input schema, an output schema, and a transformation, whether the output of the transformation always conforms to the output schema whenever its input conforms to the input schema. We focus on identifying practical and tractable fragments of this problem. In particular, we exhibit a large tractable class in which transformations may delete parts of the input, but the number of copies they make of certain parts of the input tree is bounded.

The second part studies the expressive power and the complexity of basic decision problems for XML schema languages. We discuss several syntactic and semantic characterizations of the Element Declarations Consistent (EDC) constraint of W3C XML Schema. We argue that cleaner, more expressive, more robust, but equally feasible schema languages can be obtained by replacing EDC with the notion of 1-Pass Preorder Typing (1PPT) or Top-Down Typing (TDT). The former essentially allows a schema to determine the type of an element of a streaming document as soon as its opening tag is met; the latter allows the type of an element to be determined when the element is met while reading the DOM tree in a top-down fashion. In terms of expressive power, each notion is strictly included in the next: EDC in 1PPT, and 1PPT in TDT. We further consider problems such as inclusion, equivalence, intersection non-emptiness, and minimization of such schemas. Surprisingly, the complexity of these decision problems is essentially the same for schemas with the EDC constraint as for their more expressive 1PPT and TDT variants. Finally, we discuss the problem of minimizing schemas in schema languages with the expressive power of unranked regular tree languages.
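The idea of fixing an element's type at its opening tag can be illustrated with a small sketch. It makes a deliberately strong assumption that is not taken from the dissertation: the type of an element depends only on its parent's type and its own label, which is one simple way to guarantee that typing never needs lookahead. The 1PPT notion described above is more general than this. The toy schema, the type names, and the function `assign_types` are all hypothetical, and the sketch assigns types only; it does not check content models.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical toy schema: (parent type, child label) -> child type.
# Because each pair maps to exactly one type, the type of an element
# is known the moment its opening tag is read.
CHILD_TYPE: Dict[Tuple[str, str], str] = {
    ("ROOT", "store"): "Store",
    ("Store", "dvd"): "Dvd",
    ("Store", "discount"): "Discounts",
    ("Discounts", "dvd"): "DiscountDvd",
    ("Dvd", "title"): "Title",
    ("DiscountDvd", "title"): "Title",
    ("DiscountDvd", "price"): "Price",
}


def assign_types(events: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Type a stream of ("open", label) / ("close", label) events in one pass.

    Returns (label, type) pairs in document order. Raises ValueError if a
    label is not allowed under the current parent type or tags are unbalanced.
    """
    stack: List[Tuple[str, str]] = [("", "ROOT")]  # (label, type) of open ancestors
    typed: List[Tuple[str, str]] = []
    for kind, label in events:
        if kind == "open":
            parent_type = stack[-1][1]
            key = (parent_type, label)
            if key not in CHILD_TYPE:
                raise ValueError(f"<{label}> not allowed under type {parent_type}")
            elem_type = CHILD_TYPE[key]  # decided at the opening tag
            stack.append((label, elem_type))
            typed.append((label, elem_type))
        elif kind == "close":
            if len(stack) == 1 or stack[-1][0] != label:
                raise ValueError(f"unexpected closing tag </{label}>")
            stack.pop()
        else:
            raise ValueError(f"unknown event kind: {kind}")
    if len(stack) != 1:
        raise ValueError("document ended with unclosed elements")
    return typed
```

In this toy schema the label `dvd` receives type `Dvd` directly under `store` but `DiscountDvd` under `discount`, so the same tag name is typed differently depending on context, and the decision is still made without reading past the opening tag.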
Keywords
XML transformation, XML schema, feasible schema language, output schema, W3C XML Schema, schema language, typechecking problem, XML schema language, part study, input schema, expressive power, static analysis