TEncDM: Understanding the Properties of Diffusion Model in the Space of Language Model Encodings
CoRR (2024)
Abstract
Drawing inspiration from the success of diffusion models in various domains,
numerous studies have proposed methods for adapting them to text data.
Despite these efforts, none has matched the quality of large language models.
In this paper, we conduct a comprehensive analysis of the key components of
text diffusion models and introduce a novel approach
named Text Encoding Diffusion Model (TEncDM). Instead of the commonly used
token embedding space, we train our model in the space of the language model
encodings. Additionally, we propose to use a Transformer-based decoder that
utilizes contextual information for text reconstruction. We also analyse
self-conditioning and find that it increases the magnitude of the model
outputs, which allows the number of denoising steps at inference to be
reduced. Evaluation of TEncDM on two downstream text generation tasks,
QQP and XSum, demonstrates its superiority over existing non-autoregressive
models.
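The self-conditioning mechanism mentioned in the abstract — feeding the model's previous clean-sample estimate back in as an extra input at each denoising step — can be sketched as follows. This is a minimal illustration, not the paper's architecture: `toy_model` is a hypothetical stand-in denoiser, and the latent update is deliberately simplified.

```python
import numpy as np


def toy_model(inp, t):
    """Hypothetical stand-in denoiser: averages the noisy latent and the
    previous x0 estimate (the real model is a learned Transformer)."""
    d = inp.shape[-1] // 2
    return 0.5 * inp[..., :d] + 0.5 * inp[..., d:]


def sample_with_self_conditioning(model, z_T, num_steps):
    """Sketch of self-conditioned sampling: at every step the previous
    estimate of the clean latent x0 is concatenated to the noisy latent
    (zeros at the first step, where no estimate exists yet)."""
    z_t = z_T
    x0_hat = np.zeros_like(z_T)  # no previous estimate at the first step
    for t in reversed(range(num_steps)):
        # Condition the denoiser on both z_t and the last x0 estimate.
        x0_hat = model(np.concatenate([z_t, x0_hat], axis=-1), t)
        z_t = x0_hat  # simplified update; a real sampler would re-noise via q(z_{t-1} | z_t, x0)
    return x0_hat


rng = np.random.default_rng(0)
z_T = rng.normal(size=(4, 8))  # batch of 4 latents, dimension 8
out = sample_with_self_conditioning(toy_model, z_T, num_steps=5)
print(out.shape)  # (4, 8)
```

The key design point is that conditioning on the running x0 estimate gives the denoiser a progressively better starting point, which is why (per the abstract) fewer denoising steps suffice at inference.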