Spam comments detection with self-extensible dictionary and text-based features

2017 IEEE Symposium on Computers and Communications (ISCC), 2017

Abstract
Social media have become popular channels for spreading information, allowing online users to publish the latest events and personal opinions. However, the massive volume of spam comments seriously degrades users' reading experience. To detect spam comments in Chinese social media, we use semantic analysis to build a self-extensible dictionary that automatically updates and extends itself with new cyber words. The semantic analysis also contributes extra semantic features that help in text classification. Based on a statistical analysis of microblogging comments, we select four text-based features that capture the basic characteristics of Chinese spam comments. We then use the spam dictionary and the text-based features to construct classifiers for detecting spam comments. Our approach achieves an average detection accuracy of 93.6%, which compares favorably with existing spam comment detection methods. Experimental results demonstrate that the method can effectively detect spam comments in Chinese microblogs.
Keywords
spam comments, spam dictionary, text-based features
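
The pipeline outlined in the abstract (a dictionary that extends itself, plus a handful of text-based features fed to a classifier) can be illustrated with a minimal sketch. The abstract names neither the semantic analysis technique nor the four features, so everything below is an assumption made for illustration: Jaccard co-occurrence stands in for the semantic analysis, the features in `extract_features` are hypothetical placeholders, and the function names, threshold, and sample comments are invented. Comments are assumed to be already segmented into words.

```python
import re
from collections import defaultdict


def extend_dictionary(seed_words, tokenized_comments, threshold=0.6):
    """Grow a spam dictionary with words that co-occur with known spam words.

    Assumption: Jaccard overlap of the comments containing two words stands in
    for the paper's semantic analysis, which the abstract does not specify.
    """
    # Map each word to the set of comment indices that contain it.
    occurrences = defaultdict(set)
    for idx, tokens in enumerate(tokenized_comments):
        for token in tokens:
            occurrences[token].add(idx)

    dictionary = set(seed_words)
    changed = True
    while changed:  # repeat until a full pass adds no new word
        changed = False
        for word, docs in occurrences.items():
            if word in dictionary:
                continue
            for known in list(dictionary):
                known_docs = occurrences.get(known, set())
                union = docs | known_docs
                if union and len(docs & known_docs) / len(union) >= threshold:
                    dictionary.add(word)
                    changed = True
                    break
    return dictionary


def extract_features(tokens, dictionary):
    """Build a feature vector for one comment.

    The features are hypothetical placeholders, not the four features
    selected in the paper.
    """
    text = "".join(tokens)
    n_tokens = max(len(tokens), 1)
    n_chars = max(len(text), 1)
    return [
        sum(t in dictionary for t in tokens) / n_tokens,       # share of dictionary words
        float(len(text)),                                      # comment length in characters
        1.0 if re.search(r"https?://|www\.", text) else 0.0,   # contains a link
        sum(ch.isdigit() for ch in text) / n_chars,            # share of digit characters
    ]


# Invented sample data: pre-segmented comments and a one-word seed dictionary.
comments = [
    ["免费", "加微信", "领红包"],
    ["加微信", "领红包", "免费"],
    ["今天", "天气", "不错"],
    ["免费", "加微信"],
]
spam_dictionary = extend_dictionary({"免费"}, comments)
features = extract_features(["免费", "加微信", "www.example.com"], spam_dictionary)
```

The resulting feature vectors could then be passed to any standard classifier. The fixed-point loop is one simple way to let newly added words pull in further words, which is the sense in which the dictionary here is self-extending.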