Linear time membership in a class of regular expressions with interleaving and counting.

Proceedings of the 17th ACM conference on Information and knowledge management(2008)

引用 16|浏览5
暂无评分
摘要
The extension of Regular Expressions (REs) with an interleaving (shuffle) operator has been proposed in many occasions, since it would be crucial to deal with unordered data. However, interleaving badly affects the complexity of basic operations, and, expecially, makes membership NP-hard [13], which is unacceptable for most uses of REs. REs form the basis of most XML type languages, such as DTDs and XML Schema types, and XDuce types [16, 11]. In this context, the interleaving operator would be a natural addition to the language of REs, as witnessed by the presence of limited forms of interleaving in XSD (the all group), Relax-NG, and SGML, provided that the NP-hardness of membership could be avoided. We present here a restricted class of REs with interleaving and counting which admits a linear membership algorithm, and which is expressive enough to cover the vast majority of real-world XML types. We first present an algorithm for membership of a list of words into a RE with interleaving and counting, based on the translation of the RE into a set of constraints. We generalize the approach in order to check membership of XML trees into a class of EDTDs with interleaving and counting, which models the crucial aspects of DTDs and XSD schemas.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要