Measuring Comparability of Documents in Non-Parallel Corpora for Efficient Extraction of (Semi-)Parallel Translation Equivalents.

EACL 2012: Proceedings of the Joint Workshop on Exploiting Synergies between Information Retrieval and Machine Translation (ESIRMT) and Hybrid Approaches to Machine Translation (HyTra)(2012)

引用 10|浏览10
暂无评分
摘要
In this paper we present and evaluate three approaches to measure comparability of documents in non-parallel corpora. We develop a task-oriented definition of comparability, based on the performance of automatic extraction of translation equivalents from the documents aligned by the proposed metrics, which formalises intuitive definitions of comparability for machine translation research. We demonstrate application of our metrics for the task of automatic extraction of parallel and semiparallel translation equivalents and discuss how these resources can be used in the frameworks of statistical and rule-based machine translation.
更多
查看译文
关键词
automatic extraction,machine translation research,rule-based machine translation,semiparallel translation equivalent,translation equivalent,proposed metrics,intuitive definition,non-parallel corpus,task-oriented definition,efficient extraction
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要