Deep embedding clustering based on contractive autoencoder.

Neurocomputing (2021)

Abstract
Clustering large, high-dimensional document data has attracted great interest. However, current clustering algorithms lack efficient representation learning, and applying deep learning techniques to document clustering can strengthen the learning process. In this work, we simultaneously address two aspects of the learned-representation problem. On the one hand, we preserve important information from the initial data while pushing the original samples and their augmentations together. On the other hand, we handle cluster locality preservation by pushing neighboring data points together. To that end, we first introduce contractive autoencoders. We then propose a deep embedding clustering framework based on contractive autoencoder (DECCA) to learn document representations. To capture relevant document or word features, we add a Frobenius norm penalty term to the conventional autoencoder framework, which helps the autoencoder perform better. In this way, the contractive autoencoders capture the local manifold structure of the input data and compete with the representations learned by existing methods. Finally, we confirm the superiority of our proposed algorithm over state-of-the-art results on six real-world image and text datasets.
Keywords
Document clustering, Representation learning, Contractive autoencoder, Deep learning, Deep embedding clustering
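
The Frobenius norm penalty mentioned in the abstract can be illustrated with a small sketch. The code below assumes the standard contractive autoencoder formulation, in which the penalty is the squared Frobenius norm of the encoder's Jacobian with respect to the input. The one-layer sigmoid encoder, the function names, and the weight `lam` are all hypothetical choices made for illustration; the abstract does not specify DECCA's architecture or its clustering objective, so this is not the authors' implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def encode(W, b, x):
    """Hypothetical one-layer sigmoid encoder: h = sigmoid(W x + b)."""
    return sigmoid(W @ x + b)

def contractive_penalty(W, b, x):
    """Squared Frobenius norm of the encoder Jacobian dh/dx at x."""
    h = encode(W, b, x)
    # For a sigmoid layer, J = diag(h * (1 - h)) @ W, so
    # ||J||_F^2 = sum_j (h_j * (1 - h_j))^2 * sum_i W_ji^2.
    return float(np.sum((h * (1.0 - h)) ** 2 * np.sum(W ** 2, axis=1)))

def cae_loss(W, b, W_dec, b_dec, x, lam=0.1):
    """Squared reconstruction error plus lam times the contractive penalty.

    lam is an assumed trade-off weight, not a value taken from the paper.
    """
    h = encode(W, b, x)
    x_hat = sigmoid(W_dec @ h + b_dec)
    recon = float(np.sum((x - x_hat) ** 2))
    return recon + lam * contractive_penalty(W, b, x)

# Toy example with random parameters (5 input features, 3 hidden units).
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))
b = rng.normal(size=3)
W_dec = rng.normal(size=(5, 3))
b_dec = rng.normal(size=5)
x = rng.random(5)

print("penalty:", contractive_penalty(W, b, x))
print("loss:   ", cae_loss(W, b, W_dec, b_dec, x))
```

Penalizing the Jacobian norm discourages the hidden code from changing much under small perturbations of the input, which is the sense in which a contractive autoencoder captures local manifold structure.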