Complexity of alignment and decoding problems: restrictions and approximations

Machine Translation(2015)

引用 0|浏览51
暂无评分
摘要
We study the computational complexity of the Viterbi alignment and relaxed decoding problems for IBM model 3, focusing on the problem of finding a solution which has significant overlap with an optimal. That is, an approximate solution is considered good if it looks like some optimal solution with a few mistakes, where mistakes can be wrong values (such as a word aligned incorrectly or a wrong word in decoding), as well as insertions and deletions (spurious/missing words in decoding). In this setting, we show that it is computationally hard to find a solution which is correct on more than half (plus an inverse polynomial fraction) of the words. More precisely, if there is a polynomial-time algorithm computing an alignment for IBM model 3 which agrees with some Viterbi alignment on l/2+l^ϵ words, where l is the length of the English sentence, or producing a decoding with l/2+l^ϵ correct words, then P = NP. We also present a similar structure inapproximability result for phrase-based alignment. As these strong lower bounds are for the general definitions of the Viterbi alignment and decoding problems, we also consider, from a parameterized complexity perspective, which properties of the input make these problems intractable. As a first step in this direction, we show that Viterbi alignment has a fixed-parameter tractable algorithm with respect to limiting the range of words in the target sentence to which a source word can be aligned. We note that by comparison, limiting maximal fertility—even to three—does not affect NP-hardness of the result.
更多
查看译文
关键词
Viterbi alignment,Decoding,IBM Model 3,Phrase alignment,Approximation,Lower bounds,Edit distance
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要