TOD: Tensor-based Outlier Detection

arXiv (2021)

Abstract
To scale outlier detection (OD) to large, high-dimensional datasets, we propose TOD, a novel system that abstracts OD algorithms into basic tensor operations for efficient GPU acceleration. To make TOD highly efficient in both time and space, we leverage recent advances in deep learning infrastructure in both hardware and software. To deploy large OD applications on GPUs with limited memory, we introduce two key techniques. First, provable quantization accelerates OD computation and reduces the required amount of memory by performing specific OD computations in lower precision while provably guaranteeing no accuracy loss. Second, to exploit the aggregated compute resources and memory capacity of multiple GPUs, we introduce automatic batching, which decomposes OD computations into small batches that can be executed on multiple GPUs in parallel. TOD supports a comprehensive set of OD algorithms and utility functions. Extensive evaluation on both real and synthetic OD datasets shows that TOD is on average 11.9× faster than the state-of-the-art comprehensive OD system PyOD (with a maximum speedup of 38.9×), and takes less than an hour to detect outliers in a million samples. TOD enables straightforward integration of additional OD algorithms and provides a unified framework for combining classical OD algorithms with deep learning methods. These combinations result in an infinite number of OD methods, many of which are novel and can be easily prototyped in TOD.
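The abstract's core idea, expressing a classical OD algorithm as batched tensor operations so the memory-hungry intermediates never have to fit on a single GPU, can be illustrated with a short sketch. The code below is a minimal illustration assuming PyTorch, not TOD's actual API: knn_outlier_scores and its batch_size parameter are hypothetical names, and the loop mimics, in single-device form, the batch decomposition described above.

import torch

def knn_outlier_scores(X: torch.Tensor, k: int = 5, batch_size: int = 4096) -> torch.Tensor:
    """Hypothetical sketch: score each point by the distance to its
    k-th nearest neighbor, a classical kNN outlier criterion.

    Pairwise distances are computed one batch of query rows at a time,
    so the full n-by-n distance matrix never has to fit in GPU memory.
    """
    n = X.shape[0]
    scores = torch.empty(n, device=X.device)
    for start in range(0, n, batch_size):
        batch = X[start:start + batch_size]    # (b, d) query rows
        dists = torch.cdist(batch, X)          # (b, n) Euclidean distances
        # take the k + 1 smallest, since each point's distance to itself is 0
        knn_dists, _ = dists.topk(k + 1, largest=False)
        scores[start:start + batch.shape[0]] = knn_dists[:, -1]
    return scores

# Usage: larger scores indicate more outlying points.
device = "cuda" if torch.cuda.is_available() else "cpu"
X = torch.randn(100_000, 32, device=device)
print(knn_outlier_scores(X).topk(10).indices)  # the 10 strongest outliers

Under this reading, the techniques in the abstract slot naturally into the loop: running torch.cdist in lower precision and recomputing only the entries whose neighbor ranking could change would be the rough shape of provable quantization, and dispatching the batches to different devices would realize the multi-GPU automatic batching.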
Keywords
outlier detection, tensor-based