Class-based Prediction Errors to Detect Hate Speech with Out-of-vocabulary Words
Meeting of the Association for Computational Linguistics (2017)
Abstract
Common approaches to text categorization essentially rely either on n-gram counts or on word embeddings. This presents important difficulties in highly dynamic or quickly-interacting environments, where the appearance of new words and/or varied misspellings is the norm. A paradigmatic example of this situation is abusive online behavior, with social networks and media platforms struggling to effectively combat uncommon or non-blacklisted hate words. To better deal with these issues in those fast-paced environments, we propose using the error signal of class-based language models as input to text classification algorithms. In particular, we train a next-character prediction model for any given class, and then exploit the error of such class-based models to inform a neural network classifier. This way, we shift from the ability to describe seen documents to the ability to predict unseen content. Preliminary studies using out-of-vocabulary splits from abusive tweet data show promising results, outperforming competitive text categorization strategies by 4–11%.
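The idea of feeding class-based prediction errors to a classifier can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the next-character model, so the sketch assumes a simple add-one smoothed character n-gram model as a stand-in. The names `CharNgramLM`, `mean_error`, and `error_features` are hypothetical. One model is fit per class, and each document is then described by how badly each class model predicts its characters.

```python
import math
from collections import Counter, defaultdict


class CharNgramLM:
    """Add-one smoothed character n-gram model.

    Assumption: a simple stand-in for the next-character prediction
    model mentioned in the abstract, not the authors' architecture.
    """

    def __init__(self, order=3):
        self.order = order
        self.context_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, texts):
        for text in texts:
            padded = "^" * self.order + text  # "^" pads the start of each text
            for i in range(self.order, len(padded)):
                context = padded[i - self.order:i]
                self.context_counts[context][padded[i]] += 1
                self.vocab.add(padded[i])
        return self

    def mean_error(self, text):
        """Average negative log-probability per character (higher means worse prediction)."""
        if not text:
            return 0.0
        padded = "^" * self.order + text
        vocab_size = len(self.vocab) + 1  # +1 reserves mass for unseen characters
        total = 0.0
        for i in range(self.order, len(padded)):
            context = padded[i - self.order:i]
            counts = self.context_counts.get(context, Counter())
            prob = (counts[padded[i]] + 1) / (sum(counts.values()) + vocab_size)
            total -= math.log(prob)
        return total / len(text)


def error_features(text, models):
    """One error value per class model, in sorted label order (hypothetical helper)."""
    return [models[label].mean_error(text) for label in sorted(models)]


# Toy placeholder data with hypothetical labels; real use would need labeled tweets.
train = {
    "class_a": ["aaaa abab", "abab aaaa"],
    "class_b": ["zzzz zyzy", "zyzy zzzz"],
}
models = {label: CharNgramLM(order=2).fit(texts) for label, texts in train.items()}
features = error_features("abaa", models)  # [error under class_a, error under class_b]
```

The resulting feature vector would then be passed to a downstream classifier (a neural network in the paper). Because the models operate on characters, a misspelled or previously unseen word still yields a usable error signal instead of an out-of-vocabulary token.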