Detecting Predatory Behaviour from Online Textual Chats
Communications in Computer and Information Science (2012)
Abstract
This paper presents a novel methodology for learning the behavioural profiles of sexual predators using state-of-the-art machine learning and computational linguistics methods. The methodology aims to distinguish between predatory and non-predatory conversations and is evaluated on real-world data. Not all text fragments within a malicious chat are predatory in nature, so it is necessary to distinguish the predatory fragments from the non-predatory ones. This distinction is made using n-grams, which capture predatory sequences in conversations. The paper uses both content words and stylistic features of the conversations as features. The content words are weighted using the tf-idf measure. Experiments show that content words alone are not enough to distinguish between predatory and non-predatory chats; adding various stylistic features, however, improves the performance of the system.
Keywords
natural language processing, SVM, text classification, offensive chats
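
The feature pipeline described in the abstract (word n-grams, tf-idf weighting of content words, and stylistic features) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the particular tf-idf variant (relative frequency times log(N/df)), the toy chat lines, and the three stylistic features are all assumptions chosen for illustration, since the abstract does not specify them. The classifier itself (the keywords mention SVM) is not shown.

```python
import math
from collections import Counter


def word_ngrams(tokens, n):
    """Return the contiguous word n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n=1):
    """Weight the word n-grams of each tokenised document with tf-idf.

    Assumed variant: tf = count / number of n-grams in the document,
    idf = log(number of documents / document frequency).
    """
    grams_per_doc = [Counter(word_ngrams(doc, n)) for doc in docs]

    # Document frequency: in how many documents each n-gram occurs.
    df = Counter()
    for grams in grams_per_doc:
        df.update(grams.keys())

    n_docs = len(docs)
    weights = []
    for grams in grams_per_doc:
        total = sum(grams.values()) or 1  # avoid division by zero
        weights.append({
            gram: (count / total) * math.log(n_docs / df[gram])
            for gram, count in grams.items()
        })
    return weights


def stylistic_features(text):
    """Hypothetical stylistic features; the abstract does not list the real ones."""
    tokens = text.split()
    n_tokens = len(tokens) or 1
    return {
        "avg_token_len": sum(len(t) for t in tokens) / n_tokens,
        "question_marks": text.count("?"),
        "upper_ratio": sum(ch.isupper() for ch in text) / (len(text) or 1),
    }


# Toy, invented chat lines used only to exercise the functions.
chats = ["hi how are you", "how old are you", "see you at school"]
docs = [chat.split() for chat in chats]

unigram_weights = tfidf(docs, n=1)
bigram_weights = tfidf(docs, n=2)
style = [stylistic_features(chat) for chat in chats]
```

In this sketch a word that occurs in every chat line (here "you") receives a tf-idf weight of zero, which gives some intuition for why content words alone may carry limited signal and why the stylistic features are concatenated as additional inputs.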