The Impact of Data Augmentation on Sentiment Analysis of Translated Textual Data

2023 International Conference on IT Innovation and Knowledge Discovery (ITIKD) (2023)

Abstract
Sentiment analysis is an application of natural language processing that requires large amounts of data, which are not always available. Data augmentation addresses this shortage by creating synthetic training examples from the existing data, without collecting new data. It boosts model performance, especially for deep learning models. Despite this, it has attracted very little attention from researchers in the Arabic NLP community, even though language resources for Arabic and its dialects are scarce. In this study, an augmentation technique called random swap was applied with an LSTM deep learning model to classify three parallel datasets: Bahraini dialects, Modern Standard Arabic, and English. The results show that applying the augmentation technique improves the LSTM model by 14.06%, 12.57%, and 11.04% on the Bahraini dialects, Modern Standard Arabic, and English datasets, respectively, compared with no augmentation.
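The abstract names random swap but does not describe how it was implemented. As a minimal sketch, assuming the common formulation in which two randomly chosen word positions in a sentence are exchanged a fixed number of times, the technique might look like the following. The function name `random_swap`, the `n_swaps` and `rng` parameters, and the whitespace tokenization are illustrative assumptions, not details taken from the paper.

```python
import random


def random_swap(tokens, n_swaps=1, rng=None):
    """Return a copy of `tokens` with `n_swaps` random pairwise swaps applied.

    Hypothetical helper: the paper's actual implementation is not specified.
    """
    if rng is None:
        rng = random.Random()
    out = list(tokens)  # copy so the original sentence is left untouched
    if len(out) < 2:
        return out  # nothing to swap in a sentence of zero or one word
    for _ in range(n_swaps):
        # Pick two distinct positions and exchange the words at them.
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


# Example: build one synthetic sentence that keeps the original label.
sentence = "the service was slow but the food was great"
augmented = " ".join(random_swap(sentence.split(), n_swaps=2, rng=random.Random(0)))
```

Because the swap only reorders words, the augmented sentence contains exactly the same vocabulary as the original, so it would normally inherit the original sentiment label. Naive whitespace splitting is a simplification; Arabic and dialectal text may call for a language-aware tokenizer.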
Keywords
Data augmentation, LSTM, translation-based, Modern Standard Arabic, Bahraini dialects