A Fast Semi-Supervised Clustering Framework for Large-Scale Time Series Data

IEEE Transactions on Systems, Man, and Cybernetics: Systems(2021)

引用 17|浏览57
暂无评分
摘要
Semi-supervised clustering algorithms have several limitations: 1) the computation complexity of them is very high, because calculating the similarity distances of pairs of examples is time-consuming; 2) traditional semi-supervised clustering methods have not considered how to make full use of must-link and cannot-link constraints. In the clustering, the contribution of a few pairwise constraints to the clustering performance is very limited, and some may negatively affect the outcome; and 3) these methods are not effective to handle high dimensional data, especially for time series data. Up to now, few work touched semi-supervised clustering on time series data. To efficiently cluster large-scale time series data, we first tackle contract time series clustering to produce the most accurate clustering results under a contracted time. We propose a semi-supervised time series clustering framework (STSC), which integrates a fast similarity measure and a constraint propagation approach. Based on the proposed framework, two valid semi-supervised clustering algorithms including fssK-means and fssDBSCAN are designed. Experiments on 11 datasets show that our proposed method is efficient and effective for clustering large-scale time series data.
更多
查看译文
关键词
Constraint propagation,semi-supervised learning,similarity measure,time series clustering
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要