Collusive Spam Detection from Chinese Community Question Answering Sites: A Collective Classification Framework

Information Sciences (2024)

Abstract
As Community Question Answering (CQA) sites have grown into popular knowledge-sharing platforms, they have also become ideal places for spammers to spread fake or promotional information. Recently, with the rapid development of crowdsourcing systems, many malicious users have launched organized spam campaigns, operating numerous spam accounts to carry out collusive spamming on CQA sites. In these campaigns, the spammers do not act independently but post deceptive questions and answers (Q&As) collaboratively, so the Q&As are closely related to each other while their individual spam clues are even less visible. Consequently, most existing spam detection methods may fail to detect such carefully organized collusive CQA spam. In this paper, taking Baidu Zhidao, a popular Chinese CQA platform, as the object of study, we propose a Collective Classification framework for community Question Answering spam detection (CCQA), which collectively identifies collusive CQA spam using Q&A features and the correlations among Q&As. First, we define the Deceptive Pattern of Q&As, based on which the real Q&A groups are extracted. Then, we extract several highly discriminative Q&A features at both the individual and group levels, and propose several types of correlations that link Q&As likely to share the same label. After uniformly modeling the Q&As, features, and correlations in an Attributed Heterogeneous Information Network (AHIN), we propose a semi-supervised collective classification algorithm to detect the collusive Q&A spam. Experimental results on a real-life dataset demonstrate that CCQA accurately detects collusive CQA spam and outperforms a number of competitive baselines.
Keywords
Collusive CQA Spam Detection, Deceptive Pattern, Question and Answer Correlation, Attributed Heterogeneous Information Network, Collective Classification
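
The abstract does not spell out the collective classification algorithm, so the following is only a generic illustration of the idea it describes: combining per-Q&A feature evidence with correlations among Q&As in a semi-supervised way. It is a minimal sketch, not the CCQA method. It assumes a plain weighted graph rather than an AHIN, and the function name `collective_classify`, the mixing weight `alpha`, the node names, and all numbers are hypothetical.

```python
from collections import defaultdict


def collective_classify(priors, edges, labels, alpha=0.5, iterations=20):
    """Generic iterative collective classification (illustrative only).

    priors: dict mapping each Q&A id to a spam probability obtained from
            its own features (assumed to come from some local classifier).
    edges:  list of (u, v, weight) correlations between Q&As; every
            endpoint is assumed to appear in `priors`.
    labels: dict mapping the labeled Q&A ids to 1 (spam) or 0 (non-spam).
    alpha:  hypothetical weight given to neighbor evidence.
    """
    neighbors = defaultdict(list)
    for u, v, w in edges:
        neighbors[u].append((v, w))
        neighbors[v].append((u, w))

    # Start from known labels where available, feature-based priors otherwise.
    scores = {n: float(labels.get(n, p)) for n, p in priors.items()}

    for _ in range(iterations):
        updated = {}
        for node, prior in priors.items():
            if node in labels:
                # Labeled nodes stay clamped to their known label.
                updated[node] = float(labels[node])
                continue
            total = sum(w for _, w in neighbors[node])
            if total == 0:
                # No correlations: fall back to the feature-based prior.
                updated[node] = prior
                continue
            neighbor_avg = sum(scores[v] * w for v, w in neighbors[node]) / total
            # Blend local evidence with the weighted neighbor average.
            updated[node] = (1 - alpha) * prior + alpha * neighbor_avg
        scores = updated
    return scores


# Toy data: two labeled Q&As and two unlabeled ones with uninformative priors.
priors = {"qa1": 0.9, "qa2": 0.5, "qa3": 0.5, "qa4": 0.1}
labels = {"qa1": 1, "qa4": 0}
edges = [("qa1", "qa2", 1.0), ("qa3", "qa4", 1.0)]
scores = collective_classify(priors, edges, labels)
```

In this toy setup the unlabeled `qa2` is pulled toward spam because it is correlated with a labeled spam Q&A, while `qa3` is pulled toward non-spam, even though their own features are equally uninformative. That is the intuition behind using correlations when individual spam clues are weak.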