Bangla Topic Classification Using Supervised Learning

Computational Intelligence in Pattern Recognition, Advances in Intelligent Systems and Computing (2021)

Abstract
We live in a world driven by information technology. Data is generated at every moment, much of it through the Internet of Things. To extract information from this data, it must be classified into categories. Text classification involves extracting features from the text and then classifying them. Categorizing such large volumes of data is cumbersome and time-consuming and, in some cases, impossible without machine assistance. This has created a need for machine-based classification algorithms, and the recent surge in machine learning and deep learning has drawn more attention to this research domain. The literature contains a significant body of work on topic classification, but most of it deals only with English. Very little topic classification research addresses Bangla, owing to the scarcity of Bangla topic databases and other linguistic constraints. Of the existing Bangla topic classification work, only a few studies involve machine learning or deep learning. In this article, we present a Bangla topic classification methodology based on a supervised learning model. We apply several word embedding algorithms to embed the text of Bangla newspaper datasets and several machine learning algorithms to classify the embedded text. We then select the best embedding and classification algorithm pair based on the performance metrics.
Keywords
Topic classification, Bangla NLP, Supervised learning, Feature extraction, Bangla corpus
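
The selection step described in the abstract (try each embedding with each classifier, then keep the best-scoring pair) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the embedding or classification algorithms, so the sketch substitutes two simple feature extractors (raw term counts and TF-IDF) and two toy classifiers (nearest centroid and 1-nearest neighbor, both using cosine similarity). The corpus is a small made-up English example rather than the Bangla newspaper data, accuracy stands in for the unspecified performance metrics, and all function and variable names are hypothetical.

```python
import math
from collections import Counter
from itertools import product

# Hypothetical toy corpus (NOT the paper's Bangla newspaper datasets).
TRAIN = [
    ("the team won the cricket match", "sports"),
    ("the striker scored a late goal", "sports"),
    ("the captain praised the team after the match", "sports"),
    ("parliament passed the budget bill", "politics"),
    ("the minister announced a new election date", "politics"),
    ("the party leader addressed parliament", "politics"),
]
TEST = [
    ("the team scored in the final match", "sports"),
    ("the minister presented the budget in parliament", "politics"),
]

def count_features(docs):
    """Feature extractor 1: raw term counts (fitting needs no statistics)."""
    def embed(text):
        return dict(Counter(text.split()))
    return embed

def tfidf_features(docs):
    """Feature extractor 2: term counts weighted by a smoothed IDF."""
    n = len(docs)
    doc_freq = Counter(w for d in docs for w in set(d.split()))
    idf = {w: math.log((1 + n) / (1 + c)) + 1 for w, c in doc_freq.items()}
    def embed(text):
        # Words unseen during fitting are dropped.
        return {w: c * idf[w] for w, c in Counter(text.split()).items() if w in idf}
    return embed

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def nearest_centroid(train_vecs, train_labels):
    """Classifier 1: pick the label whose summed class vector is most similar."""
    sums = {}
    for vec, label in zip(train_vecs, train_labels):
        acc = sums.setdefault(label, {})
        for w, v in vec.items():
            acc[w] = acc.get(w, 0.0) + v
    def predict(vec):
        return max(sums, key=lambda label: cosine(vec, sums[label]))
    return predict

def one_nearest_neighbor(train_vecs, train_labels):
    """Classifier 2: copy the label of the most similar training document."""
    def predict(vec):
        best = max(range(len(train_vecs)), key=lambda i: cosine(vec, train_vecs[i]))
        return train_labels[best]
    return predict

def evaluate(make_embedder, make_classifier):
    """Fit one (embedding, classifier) pair on TRAIN; return accuracy on TEST."""
    train_texts = [text for text, _ in TRAIN]
    train_labels = [label for _, label in TRAIN]
    embed = make_embedder(train_texts)
    predict = make_classifier([embed(t) for t in train_texts], train_labels)
    correct = sum(predict(embed(text)) == label for text, label in TEST)
    return correct / len(TEST)

EMBEDDERS = {"counts": count_features, "tfidf": tfidf_features}
CLASSIFIERS = {"nearest_centroid": nearest_centroid, "1nn": one_nearest_neighbor}

# Score every pair, then keep the one with the highest accuracy.
scores = {
    (e, c): evaluate(EMBEDDERS[e], CLASSIFIERS[c])
    for e, c in product(EMBEDDERS, CLASSIFIERS)
}
best_pair = max(scores, key=scores.get)
print(best_pair, scores[best_pair])
```

In a realistic setting the score would come from a held-out split or cross-validation on a much larger corpus, and more than one metric (for example precision, recall, and F1) would typically inform the choice of pair.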