Improving Vietnamese-English Medical Machine Translation
arXiv (2024)
Abstract
Machine translation for Vietnamese-English in the medical domain is still an
under-explored research area. In this paper, we introduce MedEV – a
high-quality Vietnamese-English parallel dataset constructed specifically for
the medical domain, comprising approximately 360K sentence pairs. We conduct
extensive experiments comparing Google Translate, ChatGPT (gpt-3.5-turbo),
state-of-the-art Vietnamese-English neural machine translation models and
pre-trained bilingual/multilingual sequence-to-sequence models on our new MedEV
dataset. Experimental results show that the best performance is achieved by
fine-tuning "vinai-translate" for each translation direction. We publicly
release our dataset to promote further research.