Distance Measures for Effective Clustering of ARIMA Time-Series

ICDM(2001)

Abstract
Many environmental and socioeconomic time-series data can be adequately modeled using Auto-Regressive Integrated Moving Average (ARIMA) models. We call such time-series ARIMA time-series. We consider the problem of clustering ARIMA time-series. We propose the use of the Linear Predictive Coding (LPC) cepstrum of time-series for clustering ARIMA time-series, by using the Euclidean distance between the LPC cepstra of two time-series as their dissimilarity measure. We demonstrate that LPC cepstral coefficients have the desired features for accurate clustering and efficient indexing of ARIMA time-series. For example, a few LPC cepstral coefficients are sufficient to discriminate between time-series that are modeled by different ARIMA models. In fact, this approach requires fewer coefficients than traditional approaches, such as DFT and DWT. The proposed distance measure can also be used for measuring the similarity between different ARIMA models. We cluster ARIMA time-series using the Partition Around Medoids method with various similarity measures. We present experimental results demonstrating that the proposed measure yields significantly better clusterings of ARIMA time-series data than those obtained with other traditional similarity measures, such as DFT, DWT, and PCA. Experiments were performed on both simulated and real data.
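The abstract does not spell out how the LPC cepstrum is computed, so the following is only a minimal sketch of the dissimilarity measure it describes. It assumes the autoregressive coefficients of each series have already been estimated, and that the model is written as x_t + a_1 x_{t-1} + ... + a_p x_{t-p} = e_t (a sign convention chosen here, not one stated in the abstract). The cepstral coefficients then follow from the standard LPC cepstrum recursion. The function names `lpc_cepstrum` and `cepstral_distance`, and the default of 8 coefficients, are illustrative choices.

```python
import math


def lpc_cepstrum(ar_coeffs, n_coeffs):
    """Return the first n_coeffs LPC cepstral coefficients c_1..c_n.

    ar_coeffs holds a_1..a_p for the (assumed) model
    x_t + a_1*x_{t-1} + ... + a_p*x_{t-p} = e_t.
    """
    p = len(ar_coeffs)
    cepstrum = []
    for n in range(1, n_coeffs + 1):
        # The direct term -a_n is present only while n <= p.
        value = -ar_coeffs[n - 1] if n <= p else 0.0
        # Recursive term: sum over m = 1 .. min(n - 1, p).
        for m in range(1, min(n - 1, p) + 1):
            value -= (1.0 - m / n) * ar_coeffs[m - 1] * cepstrum[n - m - 1]
        cepstrum.append(value)
    return cepstrum


def cepstral_distance(ar_a, ar_b, n_coeffs=8):
    """Euclidean distance between the LPC cepstra of two AR models."""
    ca = lpc_cepstrum(ar_a, n_coeffs)
    cb = lpc_cepstrum(ar_b, n_coeffs)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ca, cb)))


if __name__ == "__main__":
    # Two hypothetical AR(2) models; the coefficients are made up for illustration.
    model_a = [-0.6, 0.2]
    model_b = [-0.5, 0.3]
    print(cepstral_distance(model_a, model_b))
```

A matrix of such pairwise distances could then be passed to a medoid-based clustering routine such as Partition Around Medoids, which only needs dissimilarities between items.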
Keywords
time-series, socioeconomic time-series data, time-series arima time-series, arima time-series, distance measures, cepstral coefficients, arima models, lpc cepstral coefficient, different arima model, arima time-series data, lpc cepstra, clustering arima time-series, cluster arima, similarity measures, accurate clustering, effective clustering, clustering, arima model, cepstrum, autoregressive integrated moving average, auto regressive, linear predictive coding, fourier transforms, time series, moving average, temporal databases, time series data, indexing, data mining, time measurement, euclidean distance, principal component analysis, social sciences