Scalable Training of Sparse Linear SVMs

Data Mining (2012)

Abstract

Sparse linear support vector machines have been widely applied to variable selection in many applications. For large data, managing the cost of training a sparse model with good prediction performance is an essential topic. In this work, we propose a scalable training algorithm for large-scale data with millions of examples and features. We develop a dual alternating direction method for solving L1-regularized linear SVMs. Each iteration of the learning procedure simply involves a quadratic program in the same form as the standard SVM dual, followed by a soft-thresholding operation. The proposed training algorithm possesses two favorable properties. First, it is decomposable, so a large problem can be reduced to small ones. Second, the sparsity of intermediate solutions is maintained throughout the training process, because soft-thresholding naturally promotes sparse solutions. Experiments demonstrate that our method outperforms state-of-the-art approaches on large-scale benchmark data sets. We also show that it is well suited for training large sparse models on a distributed system.
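
The soft-thresholding operation mentioned in the abstract is commonly defined as the elementwise shrinkage S_tau(x) = sign(x) * max(|x| - tau, 0). The following is a minimal sketch under that standard definition, not the paper's implementation; the function name `soft_threshold`, the sample vector, and the threshold `tau` are illustrative assumptions. It shows why the step keeps intermediate solutions sparse: every entry whose magnitude does not exceed `tau` is set exactly to zero.

```python
def soft_threshold(v, tau):
    """Shrink each entry of v toward zero by tau (hypothetical helper).

    Entries with |x| <= tau become exactly 0.0, which is what
    produces sparsity in L1-regularized solvers.
    """
    out = []
    for x in v:
        mag = abs(x) - tau
        if mag <= 0:
            out.append(0.0)          # small entries are zeroed out
        else:
            out.append(mag if x > 0 else -mag)  # large entries shrink by tau
    return out


# Illustrative input: two large weights and two small ones.
w = soft_threshold([1.5, -0.2, 0.05, -3.0], tau=0.5)
nonzero = sum(1 for x in w if x != 0.0)
print(w, nonzero)
```

In an alternating direction scheme of the kind the abstract describes, a step like this would typically follow the quadratic-programming subproblem in each iteration; how the threshold relates to the regularization and penalty parameters is not stated in the abstract and is left open here.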

Keywords

large data, scalable training algorithm, proposed training algorithm, sparse linear support vector, large-scale benchmark data set, decomposable algorithm, sparse linear SVMs, large sparse model, large-scale data, large problem, training process, scalable training, support vector machines, quadratic programming
