Bandwidth-efficient distributed k-nearest-neighbor search with dynamic time warping

2015 IEEE International Conference on Big Data (Big Data)(2015)

引用 5|浏览41
暂无评分
摘要
We study the fundamental k-nearest neighbor (kNN) search problem on distributed time series. A server has constantly received various reference time series Q of length X and seeks the exact kNN over a collection of time series distributed across a set of M local sites. When X and M are large, and when the amount of query increases, simply sending each Q to all M sites incurs high communication bandwidth costs, which we would like to avoid. Prior work has presented a communication-efficient kNN algorithm for the Euclidean distance similarity measure. In this paper, we present the first communication-efficient kNN algorithm for the dynamic time warping (DTW) similarity measure, which is generally believed a better measure for time series. To handle the complexities of DTW, we design a new multi-resolution structure for the reference time series, and multi-resolution lower bounds that can effectively prune the search space. We present a new protocol between the server and the local sites that leverages multi-resolution pruning for communication efficiency and cascading lower bounds for computational efficiency. Empirical studies on both real-world and synthetic data sets show that our method reduces communication bandwidth by up to 92%.
更多
查看译文
关键词
multiresolution structure,DTW similarity measure,Euclidean distance similarity measure,reference time series,distributed time series,kNN search problem,dynamic time warping,bandwidth-efficient distributed k-nearest-neighbor search
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要