Multi-view fusion for universal translation quality estimation

Information Fusion (2024)

Abstract
Machine translation quality estimation (QE) aims to assess the quality of a translation without access to a reference translation. Despite recent progress, state-of-the-art QE models have been shown to be biased. More specifically, they over-rely on spurious statistical features while ignoring bilingual semantic adequacy, which degrades performance. In addition, existing approaches require large amounts of annotated data, which restricts their application to new domains and languages. In this work, we propose a universal framework for quality estimation based on multi-view fusion. We first introduce noise into the target side of each parallel sentence pair, using either a pre-trained language model or a large language model. Then, treating the clean parallel pairs and the noised pairs as different views, we train the QE model to distinguish the clean pairs from the noised ones. Our method improves accuracy and generalizability in the supervised scenario, and can also perform estimation on its own in the zero-shot scenario. We conduct experiments on the WMT QE evaluation datasets under different scenarios, verifying the effectiveness of our method. We also present an in-depth investigation of the bias of QE models.
Keywords
Translation quality estimation, Machine translation, Pre-trained model, Large language model
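
The data construction described in the abstract (noise the target side, then pair clean and noised views with labels the model must tell apart) can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract says noise comes from a pre-trained or large language model, whereas here a random token substitution from a small hypothetical vocabulary stands in for that step. The function names `make_noised_target` and `build_views`, the `noise_ratio` parameter, and the binary labels (1 for clean, 0 for noised) are all assumptions introduced for illustration.

```python
import random


def make_noised_target(target_tokens, vocab, noise_ratio=0.3, rng=None):
    """Return a copy of target_tokens with some tokens replaced.

    ASSUMPTION: random substitution from `vocab` is a stand-in for the
    language-model-based noising mentioned in the abstract.
    """
    rng = rng or random.Random()
    tokens = list(target_tokens)
    if not tokens:
        return tokens
    # Corrupt at least one position, and never more than the sentence length.
    n_noised = min(len(tokens), max(1, round(len(tokens) * noise_ratio)))
    for pos in rng.sample(range(len(tokens)), n_noised):
        # Only pick replacements that differ from the original token.
        candidates = [w for w in vocab if w != tokens[pos]]
        if candidates:
            tokens[pos] = rng.choice(candidates)
    return tokens


def build_views(parallel_pairs, vocab, noise_ratio=0.3, seed=0):
    """Build (source, target, label) examples from clean parallel pairs.

    Each pair yields two views: the clean pair (label 1) and a noised
    pair (label 0). ASSUMPTION: the binary labels are hypothetical; the
    abstract does not specify the training objective.
    """
    rng = random.Random(seed)
    examples = []
    for source, target in parallel_pairs:
        clean_tokens = target.split()
        noised_tokens = make_noised_target(clean_tokens, vocab, noise_ratio, rng)
        examples.append((source, " ".join(clean_tokens), 1))
        examples.append((source, " ".join(noised_tokens), 0))
    return examples


if __name__ == "__main__":
    pairs = [("Das ist ein Test .", "This is a test .")]
    vocab = ["cat", "blue", "runs", "quickly"]
    for example in build_views(pairs, vocab):
        print(example)
```

A QE model trained on such examples would see the same source sentence paired with both an adequate and a corrupted target, so surface statistics of the source alone cannot separate the two labels.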