A Sampling-Based Approach To Accelerating Queries In Log Management Systems

SPLASH '16: Conference on Systems, Programming, Languages, and Applications: Software for Humanity, Amsterdam, Netherlands, October 2016

Abstract
Log management systems are common in industry and an essential part of a system administrator's toolkit. Examples include Splunk, ELK, Log Insight, Sexilog, and more. Logs in these systems are characterized by a small number of predefined fields such as timestamp and host, with the bulk of an entry being unstructured text. System administrators query these logs using a combination of range constraints over predefined fields and patterns or regular expressions over the text portion of the message. These queries are both complex and diverse.

We propose a method for maintaining a subset of these logs in a much smaller database known as a sublog. Because queries are issued against a much smaller data set, they run to completion quickly and avoid common scaling bottlenecks. However, the improvement in performance comes at a price: because we only consider a subset of the original data, we can only provide approximate responses. Nonetheless, the reduction in accuracy is minimal, and we are able to produce high-quality, high-performance results.
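
The idea of answering queries from a sampled sublog can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not say how strata are chosen, so stratifying by the `host` field is an assumption, and the names `build_sublog`, `approx_count`, and the entry keys `timestamp`, `host`, `text`, and `weight` are hypothetical. Each sampled entry carries a weight equal to the number of original entries it stands for, so a count over the sublog can be scaled back up to an approximate count over the full log.

```python
import random
import re
from collections import defaultdict


def build_sublog(entries, rate, seed=0):
    """Stratified sample of log entries, one stratum per host (an assumption).

    Keeps roughly `rate` of each host's entries (at least one per host) and
    attaches a weight: how many original entries each kept entry represents.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for entry in entries:
        strata[entry["host"]].append(entry)

    sublog = []
    for group in strata.values():
        k = max(1, round(len(group) * rate))
        weight = len(group) / k
        for entry in rng.sample(group, k):
            sublog.append({**entry, "weight": weight})
    return sublog


def approx_count(sublog, pattern, t_min, t_max):
    """Estimate how many full-log entries match a time range and a regex.

    Combines a range constraint on the predefined `timestamp` field with a
    regular expression over the unstructured text, then sums the weights.
    """
    regex = re.compile(pattern)
    return sum(
        e["weight"]
        for e in sublog
        if t_min <= e["timestamp"] <= t_max and regex.search(e["text"])
    )
```

Sampling within each stratum keeps rarely seen hosts represented in the sublog, which uniform sampling over the whole log would not guarantee. The estimate is exact only when a query matches every entry in a stratum; otherwise it is an approximation whose error depends on the sampling rate.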
Keywords
Stratified Sampling, Log Messages, Log Management Systems