Design and Implementation of a Communication-Optimal Classifier for Distributed Kernel Support Vector Machines.

IEEE Trans. Parallel Distrib. Syst. (2017)

Abstract
We consider the problem of how to design and implement communication-efficient versions of parallel kernel support vector machines, a widely used classifier in statistical machine learning, for distributed-memory clusters and supercomputers. The main computational bottleneck is the training phase, in which a statistical model is built from an input data set. Prior to our study, the parallel isoefficiency of a state-of-the-art implementation scaled as $W=\Omega(P^3)$, where $W$ is the problem size and $P$ is the number of processors; this scaling is worse than even a one-dimensional block-row dense matrix-vector multiplication, which has $W=\Omega(P^2)$. This study considers a series of algorithmic refinements, leading ultimately to a Communication-Avoiding SVM method that improves the isoefficiency to nearly $W=\Omega(P)$. We evaluate these methods on 96 to 1,536 processors and show speedups of $3$-$16\times$ ($7\times$ on average) over Dis-SMO, and a 95 percent weak-scaling efficiency on six real-world datasets, with only modest losses in overall classification accuracy. The source code can be downloaded at [1].
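For context, here is a brief worked derivation (a sketch not taken from the paper, assuming an $n \times n$ matrix so that $W = \Theta(n^2)$) of the $W=\Omega(P^2)$ isoefficiency quoted above for a one-dimensional block-row dense matrix-vector multiplication. Each of the $P$ processors holds $n/P$ rows and must gather the full length-$n$ input vector before computing its local rows, so
$$
W = \Theta(n^2), \qquad T_o = P \cdot \Theta(n) = \Theta(Pn), \qquad W = \Theta(T_o) \;\Rightarrow\; n^2 \propto Pn \;\Rightarrow\; n \propto P \;\Rightarrow\; W = \Omega(P^2).
$$
By the same reasoning, the baseline's $W=\Omega(P^3)$ isoefficiency grows a full factor of $P$ faster; closing that gap to nearly $W=\Omega(P)$ is what the paper's communication-avoiding refinements target.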
Keywords
Support vector machines, Training, Kernel, Data models, Program processors, Partitioning algorithms, Optimization