dictNN: A Dictionary-Enhanced CNN Approach for Classifying Hate Speech on Twitter

Maximilian Kupi, Michael Bodnar, Nikolas Schmidt, Carlos Eduardo Posada

arXiv (2021)

Abstract
Hate speech on social media is a growing concern, and automated methods have so far been sub-par at reliably detecting it. A major challenge lies in the potentially evasive nature of hate speech, which stems from the ambiguity and fast evolution of natural language. To tackle this, we introduce a vectorisation based on a crowd-sourced and continuously updated dictionary of hate words and propose fusing it with standard word embeddings to improve the classification performance of a CNN model. To train and test our model, we use a merged corpus of two established datasets (110,748 tweets in total). By adding the dictionary-enhanced input, we increase the CNN model's predictive power, raising the macro F1 score by seven percentage points.
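The abstract does not specify how the dictionary vectorisation is fused with the word embeddings. The sketch below shows one plausible reading, assuming the fusion is a per-token concatenation of an embedding vector with a binary dictionary flag. The lexicon entries, the toy embeddings, and the names `HATE_LEXICON`, `EMBEDDINGS`, `dictionary_feature`, and `fuse` are hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# Hypothetical placeholder lexicon. The paper relies on a crowd-sourced,
# continuously updated dictionary of hate words; these entries are stand-ins.
HATE_LEXICON = {"slur_a", "slur_b"}

# Toy 3-dimensional embeddings invented for illustration only.
EMBED_DIM = 3
EMBEDDINGS: Dict[str, List[float]] = {
    "you": [0.1, 0.3, -0.2],
    "are": [0.0, 0.2, 0.1],
    "slur_a": [-0.4, 0.5, 0.9],
}
UNKNOWN = [0.0] * EMBED_DIM  # fallback for out-of-vocabulary tokens


def tokenize(text: str) -> List[str]:
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def dictionary_feature(token: str) -> float:
    """Return 1.0 if the token appears in the lexicon, else 0.0."""
    return 1.0 if token in HATE_LEXICON else 0.0


def fuse(text: str, max_len: int = 5) -> List[List[float]]:
    """Build a (max_len x (EMBED_DIM + 1)) matrix for one tweet.

    Each row is the token embedding with the dictionary flag appended.
    Assumption: fusion by concatenation; the paper may fuse differently.
    """
    rows = []
    for token in tokenize(text)[:max_len]:
        embedding = EMBEDDINGS.get(token, UNKNOWN)
        rows.append(list(embedding) + [dictionary_feature(token)])
    # Zero-pad short tweets so every input has the same shape.
    while len(rows) < max_len:
        rows.append([0.0] * (EMBED_DIM + 1))
    return rows


matrix = fuse("You are slur_a")
```

A matrix of this shape could then be passed to a one-dimensional convolution over the token axis, with the extra column giving the filters a direct signal for dictionary hits alongside the learned embedding dimensions.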
Keywords
classifying hate speech, dictNN, CNN, dictionary-enhanced