vnNLI - VLSP 2021: Vietnamese and English-Vietnamese Textual Entailment Based on Pre-trained Multilingual Language Models

Ngan Nguyen Luu Thuy, Đặng Văn Thìn, Hoàng Xuân Vũ, Nguyễn Văn Tài, Khoa Thi-Kim Phan

VNU Journal of Science: Computer Science and Communication Engineering (2022)

Abstract
Natural Language Inference (NLI) is a high-level semantic task in Natural Language Processing (NLP), and it poses further challenges in the cross-lingual scenario. In recent years, pre-trained multilingual language models (e.g., mBERT, XLM-R, InfoXLM) have contributed greatly to addressing these challenges. Motivated by these achievements, this paper describes our approach, based on fine-tuning pre-trained multilingual language models (XLM-R, InfoXLM), to the shared task "Vietnamese and English-Vietnamese Textual Entailment" at the 8th International Workshop on Vietnamese Language and Speech Processing (VLSP 2021, https://vlsp.org.vn/vlsp2021). We also investigate several techniques to improve performance: cross-validation, pseudo-labeling (PL), learning rate adjustment, and POS tagging. Experimental results show that our approach based on the InfoXLM model achieved competitive results, ranking 2nd in the VLSP 2021 task evaluation with an F1-score of 0.89 on the private test set.
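The abstract names cross-validation and pseudo-labeling without giving details. As a rough illustration of how the two are commonly combined, the sketch below averages class probabilities predicted by several fold models and keeps only the unlabeled examples whose top class clears a confidence threshold. The function names, the 0.9 threshold, and the use of integer class indices are assumptions made for this example and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def average_fold_probs(
    fold_probs: Sequence[Sequence[Sequence[float]]],
) -> List[List[float]]:
    """Average class probabilities over folds.

    fold_probs[f][i][c] is the probability that fold model f assigns
    class c to unlabeled example i (hypothetical layout).
    """
    n_folds = len(fold_probs)
    n_examples = len(fold_probs[0])
    n_classes = len(fold_probs[0][0])
    return [
        [
            sum(fold_probs[f][i][c] for f in range(n_folds)) / n_folds
            for c in range(n_classes)
        ]
        for i in range(n_examples)
    ]


def select_pseudo_labels(
    probs: Sequence[Sequence[float]],
    threshold: float = 0.9,  # assumed value, not from the paper
) -> List[Tuple[int, int]]:
    """Return (example_index, predicted_class) pairs for confident predictions."""
    selected = []
    for idx, p in enumerate(probs):
        # Index of the most probable class for this example.
        label = max(range(len(p)), key=lambda c: p[c])
        if p[label] >= threshold:
            selected.append((idx, label))
    return selected


# Two fold models, two unlabeled premise-hypothesis pairs, three classes.
fold_probs = [
    [[0.90, 0.05, 0.05], [0.4, 0.3, 0.3]],
    [[1.00, 0.00, 0.00], [0.2, 0.5, 0.3]],
]
avg_probs = average_fold_probs(fold_probs)
pseudo_labeled = select_pseudo_labels(avg_probs, threshold=0.9)
```

In a setup like this, the selected pairs would be added to the training set with their predicted classes before another round of fine-tuning; the threshold trades the number of extra examples against label noise.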
Keywords
language,english-vietnamese,pre-trained