Ensemble Approach to Classify Spam SMS from Bengali Text

Abdullah Al Maruf, Abdullah Al Numan, Md. Mahmudul Haque, Tasmia Tahmida Jidney, Zeyar Aung

ICACDS(2023)

Abstract
The Short Message Service (SMS) is a popular communication tool, but it has security weaknesses, notably the influx of spam messages from cybercriminals. While several studies have addressed filtering and categorizing spam messages in various languages, including English, little research has been done on detecting spam in Bengali (endonym Bangla) text. This study aims to fill this gap by classifying Bengali SMS messages as either spam or ham (legitimate messages). To accomplish this, the study used machine learning algorithms, including support vector machine (SVM) with a linear kernel, decision tree (DT), logistic regression (LR), and random forest (RF) with various parameters, as baseline models. Ensemble approaches, namely bagging, boosting, and stacking, were then used to improve on the baseline performance. The results show that the ensemble approach successfully identified spam messages in Bengali text, with XGBoost producing the most favorable outcome. The contribution of this study lies in its focus on Bengali text and its demonstration of the ensemble method’s performance on a small dataset. The tool developed in this study can help provide a secure and efficient SMS service by reducing the burden of spam messages and improving the overall user experience. It can also be marketed as a value-added service for customers who are concerned about the security of their personal and financial information. Overall, this study highlights the importance of machine learning algorithms, specifically ensemble methods, in detecting spam messages in Bengali text.
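To make the ensemble idea concrete, the following is a minimal sketch of bagging only, one of the three ensemble families the abstract names. It is not the authors' pipeline: the base learner (a per-token spam/ham vote count), the function names, and the toy English placeholder messages are all assumptions made for illustration, and the abstract's actual baselines (SVM, DT, LR, RF) and XGBoost are not reproduced here.

```python
import random
from collections import Counter

# Made-up placeholder messages, NOT the paper's Bengali dataset.
TRAIN = [
    ("free offer win cash now", "spam"),
    ("win free recharge bonus today", "spam"),
    ("claim your cash prize now", "spam"),
    ("bonus offer call now free", "spam"),
    ("are you coming home today", "ham"),
    ("call me when you reach home", "ham"),
    ("see you at lunch tomorrow", "ham"),
    ("please send the class notes", "ham"),
]


def train_word_scores(samples):
    """Hypothetical base learner: +1 per spam message containing a token,
    -1 per ham message containing it."""
    scores = Counter()
    for text, label in samples:
        for token in set(text.split()):
            scores[token] += 1 if label == "spam" else -1
    return scores


def predict_one(scores, text):
    """Label a message 'spam' if its tokens sum to a positive score."""
    total = sum(scores[token] for token in text.split())  # unseen tokens count 0
    return "spam" if total > 0 else "ham"


def train_bagging(samples, n_models=15, seed=0):
    """Bagging: fit each base learner on a bootstrap resample of the data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        bootstrap = [rng.choice(samples) for _ in samples]  # sample with replacement
        models.append(train_word_scores(bootstrap))
    return models


def predict_bagging(models, text):
    """Aggregate the base learners by majority vote."""
    votes = Counter(predict_one(model, text) for model in models)
    return votes.most_common(1)[0][0]


models = train_bagging(TRAIN)
print(predict_bagging(models, "win free cash"))
print(predict_bagging(models, "see you at home"))
```

Boosting and stacking differ in how the base learners are combined: boosting fits learners sequentially so that each one focuses on the errors of its predecessors, while stacking trains a second-level model on the base learners' predictions. A real system for Bengali text would also need a tokenizer and feature extraction suited to the script, which this sketch leaves out.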
Keywords
spam SMS, ensemble approach, text