Matrix Profile XIV - Scaling Time Series Motif Discovery with GPUs to Break a Quintillion Pairwise Comparisons a Day and Beyond.

SoCC '19: ACM Symposium on Cloud Computing Santa Cruz CA USA November, 2019(2019)

引用 52|浏览119
暂无评分
摘要
The discovery of conserved (repeated) patterns in time series is arguably the most important primitive in time series data mining. Called time series motifs, these primitive patterns are useful in their own right, and are also used as inputs into classification, clustering, segmentation, visualization, and anomaly detection algorithms. Recently the Matrix Profile has emerged as a promising representation to allow the efficient exact computation of the top-k motifs in a time series. State-of-the-art algorithms for computing the Matrix Profile are fast enough for many tasks. However, in a handful of domains, including astronomy and seismology, there is an insatiable appetite to consider ever larger datasets. In this work we show that with several novel insights we can push the motif discovery envelope using a novel scalable framework in conjunction with a deployment to commercial GPU clusters in the cloud. We demonstrate the utility of our ideas with detailed case studies in seismology, demonstrating that the efficiency of our algorithm allows us to exhaustively consider datasets that are currently only approximately searchable, allowing us to find subtle precursor earthquakes that had previously escaped attention, and other novel seismic regularities.
更多
查看译文
关键词
Time Series, Matrix Profile, SCAMP, Self-Join, AB-Join, Cloud Computing, Spot Instance, GPU, Tiling, Fault-Tolerance, Numerical Optimization, Seismology, Entomology
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要