Augmenting Replay in World Models for Continual Reinforcement Learning
CoRR (2024)
Abstract
In continual RL, the environment of a reinforcement learning (RL) agent
undergoes change. A successful system should appropriately balance the
conflicting requirements of retaining agent performance on already learned
tasks (stability) whilst learning new tasks (plasticity). The first-in-first-out
buffer is commonly used to enhance learning in such settings but requires
significant memory. We explore an augmentation to this buffer that alleviates
its memory requirements, and pair it with a world-model-based reinforcement
learning algorithm to evaluate its effectiveness in facilitating continual
learning. We test our method on the Procgen and Atari RL benchmarks and show
that the distribution-matching augmentation to the replay buffer, used in the
context of latent world models,
can successfully prevent catastrophic forgetting with significantly reduced
computational overhead. Yet we also find that this solution is not infallible:
the opposite failure mode, a lack of plasticity that prevents learning a new
task, remains a potential limitation of continual learning systems.
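
The abstract does not detail the mechanics of the augmentation itself; one
standard distribution-matching replacement policy for a fixed-size replay
buffer is reservoir sampling, which keeps an approximately uniform sample over
every transition ever seen rather than only the most recent ones. The Python
sketch below is a minimal illustration of that idea only; ReservoirReplayBuffer
and its interface are hypothetical names and are not drawn from the paper.

import random

class ReservoirReplayBuffer:
    """Fixed-size buffer using reservoir sampling: it retains an
    approximately uniform sample over every transition offered, so
    early-task data is never fully evicted as it is under FIFO."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.buffer = []
        self.seen = 0                      # transitions offered so far
        self.rng = random.Random(seed)

    def add(self, transition):
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(transition)
        else:
            # An arriving item replaces a random slot with probability
            # capacity / seen, keeping the stored sample matched to the
            # distribution of the full experience stream.
            slot = self.rng.randrange(self.seen)
            if slot < self.capacity:
                self.buffer[slot] = transition

    def sample(self, batch_size):
        # Uniform minibatch, e.g. for world-model or policy updates.
        return self.rng.sample(self.buffer, min(batch_size, len(self.buffer)))

Under this policy each transition in the stream survives in the buffer with
probability capacity / seen, so a buffer much smaller than a FIFO buffer
spanning all tasks can still represent data from every task encountered.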