GPU accelerated t-distributed stochastic neighbor embedding

Journal of Parallel and Distributed Computing (2019)

Abstract
Modern datasets and models are notoriously difficult to explore and analyze due to their inherent high dimensionality and massive numbers of samples. Existing visualization methods that employ dimensionality reduction to two or three dimensions are often inefficient and/or ineffective for these datasets. This paper introduces t-SNE-CUDA, a GPU-accelerated implementation of t-Distributed Stochastic Neighbor Embedding (t-SNE) for visualizing datasets and models. t-SNE-CUDA significantly outperforms current implementations, with 15-700x speedups on the CIFAR-10 and MNIST datasets. These speedups enable, for the first time, large-scale visualizations of modern computer vision datasets such as ImageNet, as well as larger NLP datasets such as GloVe. From these new visualizations, we can draw a number of interesting conclusions. In addition, the performance on machine learning datasets allows us to compute t-SNE embeddings in close to real time, and we explore the applications of such fast embeddings in the domain of importance sampling for neural network training.
Keywords
t-SNE, Embedding, CUDA, Parallel computing, GPU computing, Applications