ClassifyDroid: Large scale Android applications classification using semi-supervised Multinomial Naive Bayes

2016 4th International Conference on Cloud Computing and Intelligence Systems (CCIS) (2016)

Abstract
Rapid advances in the mobile internet have brought mobile applications into the era of "Big Data" (large datasets). Classification of large-scale collections of Android applications has attracted great interest from both researchers and practitioners. However, most existing approaches are supervised learning methods, which need large amounts of labeled data. Their use in practice is often limited by the lack of labeled data, the sheer number of Android applications, or the high cost of manual labeling. In this paper, we present ClassifyDroid, a novel tool for large-scale Android application classification based on the semi-supervised Multinomial Naive Bayes (SMNB) algorithm. Our proposed model exploits the SMNB algorithm, which is widely used in text document analysis. The approach is based on the analysis of characteristic application programming interfaces (APIs), which can be seen as the equivalent of the words and keywords in a text document. That is, each application is represented as a vector of the characteristic APIs it contains, together with their frequencies. We evaluated ClassifyDroid on 15,590 samples chosen from the mobile market (MM) App Store. Our experiments show that ClassifyDroid is both accurate and practical, and that it achieves better classification results than the MNB algorithm when the dataset contains few labeled applications and many unlabeled applications.
Keywords
Semi-supervised,Multinomial Naive Bayes,Android applications,large scale classification
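
The abstract describes each application as a vector of API frequencies and a Multinomial Naive Bayes model that also learns from unlabeled applications. The abstract does not specify which SMNB variant ClassifyDroid uses, so the following is only a minimal sketch of one common formulation, an EM-style loop in which unlabeled applications receive soft class labels that are fed back into the MNB estimates. The function names, the four API columns, the two classes, and all counts are hypothetical and chosen purely for illustration.

```python
import numpy as np

def fit_mnb(X, R, alpha=1.0):
    """Estimate MNB parameters from API counts X (apps x APIs)
    and soft class memberships R (apps x classes)."""
    class_count = R.sum(axis=0)  # expected number of apps per class
    # Laplace-smoothed class priors
    log_prior = np.log((class_count + 1.0) / (class_count.sum() + R.shape[1]))
    # Smoothed expected API counts per class, normalised per class
    feat_count = R.T @ X + alpha
    log_lik = np.log(feat_count / feat_count.sum(axis=1, keepdims=True))
    return log_prior, log_lik

def predict_proba(X, log_prior, log_lik):
    """Posterior class probabilities for each app in X."""
    joint = X @ log_lik.T + log_prior
    joint = joint - joint.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(joint)
    return p / p.sum(axis=1, keepdims=True)

def semi_supervised_mnb(X_lab, y_lab, X_unl, n_classes, n_iter=10, alpha=1.0):
    """EM-style semi-supervised MNB (an assumed variant, not
    necessarily the one used in the paper)."""
    R_lab = np.eye(n_classes)[y_lab]           # hard labels as one-hot rows
    params = fit_mnb(X_lab, R_lab, alpha)      # start from labeled apps only
    X_all = np.vstack([X_lab, X_unl])
    for _ in range(n_iter):
        R_unl = predict_proba(X_unl, *params)  # E-step: soft-label unlabeled apps
        params = fit_mnb(X_all, np.vstack([R_lab, R_unl]), alpha)  # M-step
    return params

# Hypothetical API columns:
# [sendTextMessage, getLastKnownLocation, Camera.open, openConnection]
X_lab = np.array([[5, 0, 0, 1],   # labeled as class 0 (e.g. messaging)
                  [0, 6, 1, 1]])  # labeled as class 1 (e.g. navigation)
y_lab = np.array([0, 1])
X_unl = np.array([[4, 1, 0, 2],
                  [0, 5, 0, 1],
                  [3, 0, 1, 0],
                  [1, 7, 0, 2]])  # unlabeled apps

params = semi_supervised_mnb(X_lab, y_lab, X_unl, n_classes=2)
X_new = np.array([[6, 0, 0, 1],
                  [0, 4, 0, 1]])
proba = predict_proba(X_new, *params)
pred = proba.argmax(axis=1)
print(pred)
```

Setting `n_iter=0` reduces the sketch to plain supervised MNB trained on the labeled applications alone, which is the baseline the abstract compares against.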