Learning to Scale Logits for Temperature-Conditional GFlowNets
CoRR(2023)
Abstract
GFlowNets are probabilistic models that sequentially generate compositional
structures through a stochastic policy. Among GFlowNets,
temperature-conditional GFlowNets can introduce temperature-based
controllability for exploration and exploitation. We propose
Logit-scaling GFlowNets (Logit-GFN), a novel architectural design that
greatly accelerates the training of temperature-conditional GFlowNets. It is
based on the idea that previously proposed approaches introduced numerical
challenges in deep network training, since different temperatures may give
rise to very different gradient profiles as well as magnitudes of the policy's
logits. We find that the challenge is greatly reduced if a learned function of
the temperature is used to scale the policy's logits directly. Logit-GFN
also improves GFlowNets' generalization in offline learning and their mode
discovery in online learning, which we verify empirically on various
biological and chemical tasks. Our code is available at
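
The core idea in the abstract, scaling the policy's logits by a learned function of the temperature, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the linear-plus-softplus form of the scale, and the parameters `w` and `b` are all hypothetical stand-ins for whatever network and trained weights the paper actually uses.

```python
import math


def softplus(x):
    # Numerically stable log(1 + exp(x)); keeps the scale strictly positive.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def softmax(logits):
    # Subtract the maximum before exponentiating to avoid overflow.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logit_scale(temperature, w, b):
    # Hypothetical learned scalar function of the temperature.
    # w and b stand in for trained parameters (an assumption of this sketch).
    return softplus(w * temperature + b)


def scaled_policy(logits, temperature, w, b):
    # The logits themselves do not depend on the temperature here;
    # only a single scalar multiplier does.
    s = logit_scale(temperature, w, b)
    return softmax([s * z for z in logits])


if __name__ == "__main__":
    base_logits = [2.0, 1.0, 0.0]
    for t in (0.1, 1.0, 5.0):
        print(t, scaled_policy(base_logits, t, w=1.0, b=0.0))
```

In this sketch the temperature enters the policy only through one scalar, so the network producing the logits sees the same inputs at every temperature. With the placeholder parameters above, a larger scale yields a sharper action distribution; in a trained model the mapping from temperature to scale would be learned.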
Keywords
scale logits,temperature-conditional