Efficient spam filtering through intelligent text modification detection using machine learning

N. Mageshkumar, A. Vijayaraj, N. Arunpriya, A. Sangeetha

Materials Today: Proceedings (2022)

Abstract
Spam emails have long been a concern in computer security. They are costly in both monetary and technological terms, and they can be extremely harmful to computers and networks. Despite the rise of social networks and other Internet-based venues for exchanging information, email has become increasingly important over time, which makes better spam filters an urgent need. Although various spam filters have been developed to keep spam out of a user's mailbox, there has been little research into text modifications. Because of its simplicity and efficiency, Naive Bayes is currently one of the most widely used methods of spam classification. However, Naive Bayes cannot correctly categorize emails that contain leetspeak or diacritics. In this proposal, we therefore present a novel method that improves the accuracy of the Naive Bayes spam filter by detecting text alterations, so that emails are correctly classified as spam or ham. Our Python approach combines semantic, keyword-based, and machine learning algorithms to improve Naive Bayes accuracy relative to SpamAssassin. Furthermore, we identified a link between email length and spam score, indicating that Bayesian poisoning, a contentious concept, is an actual technique used by spammers. Copyright (C) 2022 Elsevier Ltd. All rights reserved.
Keywords
Bayesian poisoning, Diacritics, Leetspeak, Naive Bayes, Spam filters, Spammer
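
The abstract does not spell out how text alterations are detected, so the following is only a minimal sketch of one plausible preprocessing step: undoing diacritics and common leetspeak substitutions before tokens reach a Naive Bayes classifier. The substitution table `LEET_MAP` and the helper names `strip_diacritics`, `normalize_token`, and `normalize_text` are hypothetical and are not taken from the paper.

```python
import unicodedata

# Hypothetical leetspeak table (an assumption, not the paper's mapping).
# Some digits are ambiguous in practice: "1" may stand for "i" or "l".
LEET_MAP = str.maketrans({
    "0": "o", "1": "i", "3": "e", "4": "a",
    "5": "s", "7": "t", "@": "a", "$": "s",
})


def strip_diacritics(text: str) -> str:
    """Decompose accented characters and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize_token(token: str) -> str:
    """Lowercase a token, remove diacritics, and undo leetspeak."""
    lowered = strip_diacritics(token).lower()
    # Leave tokens without any letter (e.g. "2022") untouched so that
    # genuine numbers are not turned into pseudo-words.
    if any(ch.isalpha() for ch in lowered):
        return lowered.translate(LEET_MAP)
    return lowered


def normalize_text(text: str) -> str:
    """Normalize every whitespace-separated token in a message."""
    return " ".join(normalize_token(tok) for tok in text.split())


if __name__ == "__main__":
    print(normalize_text("Fr3e V1àgra n0w"))  # prints "free viagra now"
```

With a step like this, an obfuscated token such as "Fr3e" maps back to "free", a word the classifier has likely seen in training, instead of being treated as an unknown token. A real filter would need a richer table and a way to resolve ambiguous substitutions.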