Scalable Training of Sparse Linear SVMs

Data Mining (2012)

Abstract
Sparse linear support vector machines have been widely applied to variable selection in many applications. For large data, controlling the cost of training a sparse model with good prediction performance is an essential problem. In this work, we propose a scalable training algorithm for large-scale data with millions of examples and features. We develop a dual alternating direction method for solving L1-regularized linear SVMs. The learning procedure simply alternates between a quadratic program of the same form as the standard SVM dual and a soft-thresholding operation. The proposed training algorithm has two favorable properties. First, it is decomposable: a large problem can be reduced to a collection of small ones. Second, the sparsity of intermediate solutions is maintained throughout training, since the soft-thresholding step naturally promotes sparsity. We demonstrate experimentally that our method outperforms state-of-the-art approaches on large-scale benchmark data sets. We also show that it is well suited to training large sparse models on a distributed system.
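The abstract describes iterations that combine an SVM-dual-style quadratic subproblem with a soft-thresholding step. As a rough illustration of that pattern only (not the paper's actual dual algorithm), the sketch below applies a primal ADMM splitting to an L1-regularized linear SVM; the function l1_svm_admm_sketch, its subgradient inner solver, and all parameter values are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    # Elementwise soft-thresholding (proximal operator of the L1 norm):
    # shrinks each coordinate of v toward zero by t, promoting sparsity.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def l1_svm_admm_sketch(X, y, lam=0.1, rho=1.0, n_outer=50, n_inner=20, step=0.1):
    """Illustrative ADMM loop for min_w sum_i hinge(1 - y_i <x_i, w>) + lam*||w||_1.

    Split w = z: the w-update minimizes hinge loss + (rho/2)||w - z + u||^2
    (approximated here by a few subgradient steps, standing in for the
    paper's QP subproblem), and the z-update is a soft-thresholding,
    which keeps the iterates sparse.
    """
    n, d = X.shape
    w = np.zeros(d)
    z = np.zeros(d)
    u = np.zeros(d)
    for _ in range(n_outer):
        # Approximate w-subproblem: hinge loss plus quadratic coupling term.
        for _ in range(n_inner):
            margins = y * (X @ w)
            active = margins < 1.0  # examples violating the margin
            grad = -(y[active, None] * X[active]).sum(axis=0) + rho * (w - z + u)
            w -= step / n * grad
        # z-subproblem: exact proximal (soft-thresholding) step.
        z = soft_threshold(w + u, lam / rho)
        # Scaled dual-variable update.
        u += w - z
    return z  # sparse weight vector

# Tiny usage example on synthetic data.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 50))
    w_true = np.zeros(50)
    w_true[:5] = 1.0
    y = np.sign(X @ w_true)
    w_hat = l1_svm_admm_sketch(X, y)
    print("nonzero coefficients:", np.count_nonzero(w_hat))
```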
Keywords
large data, scalable training algorithm, proposed training algorithm, sparse linear support vector, large-scale benchmark data set, decomposable algorithm, sparse linear SVMs, large sparse model, large-scale data, large problem, training process, scalable training, support vector machines, quadratic programming