Syntactic alignment models for large-scale statistical machine translation (2012)

Abstract
Word alignment, the process of inferring the implicit links between words across two languages, is an integral part of learning linguistic translation knowledge. It enables us to acquire automatically from data the rules that govern the transformation of words, phrases, and syntactic structures from one language to another. Word alignment is used in many tasks in Natural Language Processing, such as bilingual dictionary induction, cross-lingual information retrieval, and distilling parallel text from noisy data. In this dissertation, we focus on word alignment for statistical machine translation. We advance the state of the art in search, modeling, and learning of alignments and show empirically that, taken together, these contributions significantly improve the output quality of large-scale statistical machine translation, outperforming existing methods. We show results for Arabic-English and Chinese-English translation. The work we describe may be used for any language pair and supports arbitrary, overlapping features from varied sources. Finally, our features are learned automatically without any human intervention, facilitating rapid deployment for new language pairs.
Keywords
statistical machine translation, linguistic translation knowledge, cross-lingual information retrieval, bilingual dictionary induction, noisy data, Chinese-English translation, large-scale statistical machine translation, Natural Language Processing, syntactic alignment model, word alignment