Window Expressions for Stream Data Processing

arXiv (2022)

Abstract
Traditional ways of storing and querying data do not work well when data is generated continuously and decisions must be made quickly. For example, in hospitals' intensive care units, signals from multiple devices need to be monitored, and any anomaly should raise an alarm immediately. A typical design would take the average over a window of, say, 10 seconds (time-based) or 10 successive readings (count-based) and look for sudden deviations. More sophisticated window definitions may be desired: for example, we may want to select windows in which the maximum value of a field is greater than a fixed threshold. Existing stream processing systems either restrict users to time-based or count-based windows or let them define customized windows in imperative programming languages. Such definitions are subject to the implementers' interpretation of what is desired and are hard for others to understand. We introduce a formalism for specifying windows based on Monadic Second Order logic. It offers several advantages over ad hoc definitions written in imperative languages, and we demonstrate four of them. First, we illustrate how practical streaming data queries can be written easily with precise semantics. Second, we obtain different but expressively equivalent formalisms for defining windows, and we use one of them (regular expressions) to design an end-user-friendly language for defining windows. Third, we use another expressively equivalent formalism (automata) to design a processor that automatically generates windows according to specifications. The fourth advantage is more sophisticated. Some window definitions produce too many windows overlapping with each other, overwhelming the processing engine. Different engines handle this in different ways, but all the options concern what to do when this happens at runtime. We study this as a static...
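To make the two window notions in the abstract concrete, the following is a minimal Python sketch of a count-based sliding window, its per-window average, and a selection of windows whose maximum exceeds a fixed threshold. It is an imperative illustration only, not the paper's Monadic Second Order formalism or its window language; the function names `count_windows` and `windows_with_max_above`, the window size of 3, the threshold of 90, and the sample readings are all assumptions chosen for the example.

```python
from collections import deque
from statistics import mean
from typing import Iterable, Iterator, List


def count_windows(readings: Iterable[float], size: int) -> Iterator[List[float]]:
    """Yield every sliding window of `size` successive readings (count-based)."""
    buf = deque(maxlen=size)  # oldest reading is dropped automatically
    for r in readings:
        buf.append(r)
        if len(buf) == size:
            yield list(buf)


def windows_with_max_above(
    readings: Iterable[float], size: int, threshold: float
) -> List[List[float]]:
    """Keep only the windows whose maximum value is greater than `threshold`."""
    return [w for w in count_windows(readings, size) if max(w) > threshold]


# Hypothetical sample readings, e.g. heart-rate values from one device.
readings = [70, 72, 71, 95, 73, 70]

# Average of each count-based window of 3 successive readings.
averages = [mean(w) for w in count_windows(readings, 3)]

# Windows in which the maximum reading exceeds the assumed threshold of 90.
selected = windows_with_max_above(readings, 3, 90)
```

Note that three of the four windows overlap on the single reading 95 and are all selected, which hints at the overlapping-windows concern the abstract raises.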
Keywords
processing, window, expressions, data