Efficient Enumeration Algorithms for Regular Document Spanners
ACM Transactions on Database Systems (TODS), pp. 1-42, 2020.
Information extraction, automata, capture variables, enumeration delay, spanners
Regular expressions and automata models with capture variables are core tools in rule-based information extraction. These formalisms, also called regular document spanners, use regular languages to locate the data that a user wants to extract from a text document and then store this data into variables. Since document spanners can easily ...
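To make the idea of capture variables concrete, here is a minimal sketch in Python that uses named groups of the standard `re` module as stand-ins for the variables. The function name `extract_spans`, the sample document, and the pattern are illustrative assumptions and do not come from the paper. Note also that `re.finditer` only yields non-overlapping leftmost matches, whereas a document spanner in general defines the set of all valid assignments of spans to variables, so this is an approximation of the concept and not the enumeration algorithm the paper studies.

```python
import re


def extract_spans(pattern, document):
    """Return one {variable: (start, end)} mapping per regex match.

    Each named group plays the role of a capture variable, and each
    (start, end) pair is a span: a half-open interval of positions
    in the document.
    """
    regex = re.compile(pattern)
    results = []
    for match in regex.finditer(document):
        # groupindex lists the names of all named groups in the pattern.
        mapping = {name: match.span(name) for name in regex.groupindex}
        results.append(mapping)
    return results


# Hypothetical input: two e-mail-like strings separated by a comma.
doc = "alice@example.org, bob@example.com"
pattern = r"(?P<user>\w+)@(?P<host>[\w.]+)"

mappings = extract_spans(pattern, doc)
for m in mappings:
    # Recover the extracted text from each span.
    print({var: doc[start:end] for var, (start, end) in m.items()})
```

Returning spans (positions) instead of substrings keeps each result tied to a location in the document, which is what distinguishes two occurrences of the same text.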