Unsupervised Paraphrase Generation via Dynamic Blocking
arxiv(2021)
摘要
We propose Dynamic Blocking, a decoding algorithm which enables large-scale pretrained autoregressive models (such as BART, T5, GPT-2 and XLNet) to generate high-quality paraphrases in an unsupervised setting. In order to obtain an alternative surface form, whenever the language model emits a token that is present in the source sequence, we prevent the model from generating the subsequent source token for the next time step. We show that our approach achieves state-of-the-art results on benchmark datasets when compared to previous unsupervised approaches, and is even comparable with strong supervised, in-domain models. We also propose a new automatic metric based on self-BLEU and BERTscore which not only discourages the model from copying the input through, but also evaluates text similarity based on distributed representations, hence avoiding reliance on exact keyword matching. In addition, we demonstrate that our model generalizes across languages without any additional training.
更多查看译文
关键词
dynamic blocking,generation
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络