Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models
CoRR (2024)
Abstract
Diffusion models have gained attention in text processing, offering many
potential advantages over traditional autoregressive models. This work explores
the integration of diffusion models and Chain-of-Thought (CoT), a
well-established technique for improving the reasoning ability of autoregressive
language models. We propose Diffusion-of-Thought (DoT), allowing reasoning
steps to diffuse over time through the diffusion process. In contrast to
traditional autoregressive language models that make decisions in a
left-to-right, token-by-token manner, DoT offers more flexibility in the
trade-off between computation and reasoning performance. Our experimental
results demonstrate the effectiveness of DoT in multi-digit multiplication and
grade school math problems. Additionally, DoT showcases promising
self-correction abilities and benefits from existing reasoning-enhancing
techniques like self-consistency decoding. Our findings contribute to the
understanding and development of reasoning capabilities in diffusion language
models.