Large-scale logistic regression and linear support vector machines using Spark

BigData Conference (2014)

Abstract
Logistic regression and linear SVM are useful methods for large-scale classification. However, their distributed implementations have not been well studied. Recently, because of the inefficiency of the MapReduce framework on iterative algorithms, Spark, an in-memory cluster-computing platform, has been proposed. It has emerged as a popular framework for large-scale data processing and analytics. In this work, we consider a distributed Newton method for solving logistic regression as well as linear SVM and implement it on Spark. We carefully examine many implementation issues that significantly affect the running time and propose our solutions. After conducting thorough empirical investigations, we release an efficient and easy-to-use tool for the Spark community.
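To make the idea of a distributed Newton method concrete, the following is a minimal single-machine sketch, not the authors' implementation. It assumes the standard L2-regularized logistic regression objective f(w) = 0.5 w'w + C * sum_i log(1 + exp(-y_i w'x_i)) and simulates data partitions with plain Python lists instead of Spark RDDs. Each partition computes a local loss, gradient, or Hessian-vector product, and a driver sums the partial results, which is roughly the map-and-reduce pattern a cluster framework would perform. The solver is a plain Newton method with conjugate gradient and a backtracking line search, which may differ from the solver used in the paper. All function names and the parameter C are hypothetical.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def add_scaled(a, b, s=1.0):
    # Returns a + s * b for two equal-length vectors.
    return [p + s * q for p, q in zip(a, b)]

def sum_vectors(vectors, dim):
    # Driver-side "reduce": adds up the partial vectors from all partitions.
    total = [0.0] * dim
    for v in vectors:
        total = add_scaled(total, v)
    return total

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def local_loss(part, w):
    # Partition-side "map": sum of log(1 + exp(-y * w'x)), computed stably.
    total = 0.0
    for x, y in part:
        z = -y * dot(w, x)
        total += max(z, 0.0) + math.log1p(math.exp(-abs(z)))
    return total

def local_grad(part, w):
    g = [0.0] * len(w)
    for x, y in part:
        g = add_scaled(g, x, (sigmoid(y * dot(w, x)) - 1.0) * y)
    return g

def local_hess_vec(part, w, v):
    # Computes X' D X v on one partition without forming the Hessian.
    h = [0.0] * len(w)
    for x, y in part:
        s = sigmoid(y * dot(w, x))
        h = add_scaled(h, x, s * (1.0 - s) * dot(x, v))
    return h

def objective(parts, w, C):
    return 0.5 * dot(w, w) + C * sum(local_loss(p, w) for p in parts)

def gradient(parts, w, C):
    partial = [local_grad(p, w) for p in parts]
    return add_scaled(w, sum_vectors(partial, len(w)), C)

def hess_vec(parts, w, v, C):
    partial = [local_hess_vec(p, w, v) for p in parts]
    return add_scaled(v, sum_vectors(partial, len(w)), C)

def conjugate_gradient(parts, w, g, C, tol=1e-3, max_iter=50):
    # Approximately solves H d = -g using only Hessian-vector products.
    d = [0.0] * len(w)
    r = [-gi for gi in g]
    p = list(r)
    rr = dot(r, r)
    stop = (tol ** 2) * dot(g, g)
    for _ in range(max_iter):
        if rr <= stop:
            break
        hp = hess_vec(parts, w, p, C)
        alpha = rr / dot(p, hp)
        d = add_scaled(d, p, alpha)
        r = add_scaled(r, hp, -alpha)
        rr_new = dot(r, r)
        p = add_scaled(r, p, rr_new / rr)
        rr = rr_new
    return d

def newton(parts, dim, C=1.0, eps=1e-6, max_iter=50):
    w = [0.0] * dim
    for _ in range(max_iter):
        g = gradient(parts, w, C)
        if math.sqrt(dot(g, g)) <= eps:
            break
        d = conjugate_gradient(parts, w, g, C)
        # Backtracking (Armijo) line search along the Newton direction.
        f0, gd, step = objective(parts, w, C), dot(g, d), 1.0
        while step > 1e-10 and objective(parts, add_scaled(w, d, step), C) > f0 + 1e-4 * step * gd:
            step *= 0.5
        w = add_scaled(w, d, step)
    return w

# Toy usage: two "partitions" of (features, label) pairs with labels in {+1, -1}.
toy_parts = [
    [([1.0, 2.0], 1), ([2.0, 1.5], 1)],
    [([-1.0, -1.5], -1), ([-2.0, -0.5], -1)],
]
toy_w = newton(toy_parts, dim=2)
```

In this sketch only vectors of the feature dimension travel between the partitions and the driver in each round, so the Hessian matrix is never formed explicitly. On a real cluster, the per-round communication and the way data is stored in memory would likely dominate the running time, which is the kind of implementation issue the abstract refers to.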
Keywords
large-scale data processing, MapReduce framework, in-memory cluster-computing platform, distributed Newton method, large-scale data analytics, large-scale logistic regression, regression analysis, pattern classification, linear SVM, data analysis, linear support vector machines, Newton method, Spark platform, iterative algorithm, support vector machines, large-scale classification