Combating Hate Speech on Q&A Forums with Machine Learning

2023 World Conference on Communication & Computing (WCONF), 2023

Abstract
Online forums are becoming increasingly popular, which raises the demand for efficient ways to moderate user-generated content. To identify hate speech in such content, this project presents a question-and-answer forum that employs machine learning (ML) algorithms and natural language processing (NLP) techniques. The forum is built with the MERN stack: MongoDB, Express, React, and Node.js. Several NLP strategies are implemented in the project, including feature vectorization using the count vectorizer, the TF-IDF vectorizer, and word2vec. Logistic regression (LR), support vector machines (SVM), decision trees (DT), and naive Bayes (NB) are the ML methods used to detect hate speech. Their performance is evaluated using the classification report. The models are trained on the publicly available “Hate Speech and Offensive Language Dataset” hosted on Kaggle. The experiments show that the LR model with the count vectorizer outperforms the other models, reaching a maximum accuracy of 90%.
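The best-performing combination named in the abstract, a count vectorizer feeding logistic regression and scored with a classification report, can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the abstract gives no hyperparameters or preprocessing steps, so the toy sentences, the label names, the `build_classifier` helper, and the `max_iter` setting are all assumptions made for the example.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.pipeline import make_pipeline

# Hypothetical toy data standing in for the Kaggle dataset (1 = flagged, 0 = acceptable).
train_texts = [
    "you are awful and i hate you",
    "i hate people like you",
    "those people are awful",
    "hate hate awful",
    "thanks for the helpful answer",
    "great question, here is a helpful link",
    "thanks, this answer is great",
    "helpful and great explanation",
]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]


def build_classifier():
    """Bag-of-words counts followed by logistic regression (assumed settings)."""
    return make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))


model = build_classifier()
model.fit(train_texts, train_labels)

# Held-out posts, as a forum backend might submit them for screening.
test_texts = ["i hate this awful thread", "thanks for the great answer"]
test_labels = [1, 0]
predictions = model.predict(test_texts).tolist()

# Per-class precision, recall, and F1, plus overall accuracy.
report = classification_report(
    test_labels,
    predictions,
    target_names=["acceptable", "flagged"],
    zero_division=0,
)
print(report)
```

Swapping `CountVectorizer` for `TfidfVectorizer`, or `LogisticRegression` for another scikit-learn classifier, would give the other combinations the abstract compares, since every stage shares the same pipeline interface.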
Keywords
Natural Language Processing, Hate Speech Detection, Sentiment Analysis, MERN Stack