The Unreasonable Effectiveness of Random Target Embeddings for Continuous-Output Neural Machine Translation
arXiv (2023)
Abstract
Continuous-output neural machine translation (CoNMT) replaces the discrete
next-word prediction problem with an embedding prediction. The semantic
structure of the target embedding space (i.e., closeness of related words) is
intuitively believed to be crucial. We challenge this assumption and show that
completely random output embeddings can outperform laboriously pretrained ones,
especially on larger datasets. Further investigation shows this surprising
effect is strongest for rare words, due to the geometry of their embeddings. We
shed further light on this finding by designing a mixed strategy that combines
random and pre-trained embeddings for different tokens.
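The core setup described above — predicting a continuous vector and decoding it to the nearest token embedding — can be sketched in a few lines. The vocabulary, dimension, and `decode` helper below are illustrative assumptions, not the paper's implementation; the only essential points are that each token's target embedding is drawn randomly once and never trained, and that decoding is a nearest-neighbor lookup (cosine similarity here).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy vocabulary and embedding dimension (illustrative only).
vocab = ["the", "cat", "sat", "on", "mat"]
dim = 8

# Random target embeddings: each token gets a fixed random vector,
# drawn once and left untrained (the "random embeddings" condition).
target_emb = rng.standard_normal((len(vocab), dim))
target_emb /= np.linalg.norm(target_emb, axis=1, keepdims=True)

def decode(predicted: np.ndarray) -> str:
    """Map a predicted continuous vector to the closest token
    by cosine similarity (nearest-neighbor decoding)."""
    predicted = predicted / np.linalg.norm(predicted)
    scores = target_emb @ predicted  # cosine similarity to every token
    return vocab[int(np.argmax(scores))]

# A trained model would emit the continuous vector; here we perturb
# one token's embedding to simulate a near-miss prediction.
noisy = target_emb[1] + 0.1 * rng.standard_normal(dim)
print(decode(noisy))
```

Because random high-dimensional unit vectors are nearly orthogonal with high probability, even untrained embeddings give well-separated decoding targets, which is consistent with the paper's observation that semantic structure in the target space is not strictly necessary.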