On finetuning Adapter-based Transformer models for classifying Abusive Social Media Tamil Comments

SSRN Electronic Journal (2023)

Abstract

Abusive speech is a form of verbal abuse that targets individuals or groups on the basis of their membership in a particular social group, distinguished by traits such as culture, gender, sexual orientation, or religious affiliation. The dissemination of hateful and offensive content on social media has increased sharply in recent years. Abusive language on the internet has been linked to an increase in violence against minorities around the world, including mass shootings, murders, and ethnic cleansing. Social media users in regions where English is not the main language often write in a code-mixed form of text. This makes abusive text harder to detect, and the scarcity of resources for languages like Tamil makes the task significantly more challenging. This work uses the abusive Tamil comments released by the workshop “Tamil DravidianLangTech@ACL 2022” and develops adapter-based multilingual transformer models, namely MuRIL, XLM-RoBERTa, and mBERT, to classify the abusive comments. Each transformer is evaluated in two settings: conventional fine-tuning and adapter-based tuning. The study shows that, for a low-resource language like Tamil, adapter-based strategies work better than fine-tuned models. In addition, we use Optuna, a hyperparameter optimization framework, to find the hyperparameter values that lead to better classification. Of all the proposed models, MuRIL (Large) gives the best result, 74.7%, which is better than other models proposed for the same dataset.
Keywords: transformer models, social media, adapter-based
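
The abstract contrasts adapter-based tuning with conventional fine-tuning. One commonly cited reason adapters suit low-resource settings is that they train only a small set of added weights while the pretrained encoder stays frozen. The sketch below is a rough illustration of that difference in trainable parameter counts. Every size in it (hidden width 768, feed-forward width 3072, 12 layers, bottleneck 64, one adapter per layer) is an assumption chosen to resemble a BERT-base-like encoder, not the configuration reported in the paper, and the count ignores embeddings and the classification head.

```python
"""Rough parameter-count comparison: full fine-tuning vs. bottleneck adapters.

All sizes used here are illustrative assumptions (a BERT-base-like encoder),
not the configuration reported in the paper.
"""


def encoder_layer_params(hidden: int, ffn: int) -> int:
    """Weights and biases in one standard transformer encoder layer."""
    attention = 4 * (hidden * hidden + hidden)  # Q, K, V and output projections
    feed_forward = (hidden * ffn + ffn) + (ffn * hidden + hidden)
    layer_norms = 2 * (2 * hidden)  # two LayerNorms, scale and shift each
    return attention + feed_forward + layer_norms


def adapter_params(hidden: int, bottleneck: int) -> int:
    """Down-projection plus up-projection of one bottleneck adapter, with biases."""
    down = hidden * bottleneck + bottleneck
    up = bottleneck * hidden + hidden
    return down + up


def trainable_fraction(layers: int, hidden: int, ffn: int,
                       bottleneck: int, adapters_per_layer: int = 1) -> float:
    """Adapter weights as a fraction of the (frozen) encoder-layer weights."""
    full = layers * encoder_layer_params(hidden, ffn)
    adapters = layers * adapters_per_layer * adapter_params(hidden, bottleneck)
    return adapters / full


if __name__ == "__main__":
    # Hypothetical sizes; adjust to the encoder actually being tuned.
    frac = trainable_fraction(layers=12, hidden=768, ffn=3072, bottleneck=64)
    print(f"Adapter weights relative to encoder weights: {frac:.2%}")
```

Under these assumed sizes the adapters amount to roughly one to two percent of the encoder-layer weights, so far fewer parameters need to be estimated from a small labelled dataset. Whether that translates into better accuracy is an empirical question, which is what the study evaluates.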