Boosting Long-Delayed Reinforcement Learning with Auxiliary Short-Delayed Task
CoRR (2024)
Abstract
Reinforcement learning is challenging in delayed scenarios, a common
real-world situation where observations and interactions occur with delays.
State-of-the-art (SOTA) state-augmentation techniques suffer either from a
state-space explosion as the number of delayed steps grows or from performance
degeneration in stochastic environments. To address these challenges, our novel
Auxiliary-Delayed Reinforcement Learning (AD-RL) leverages an auxiliary
short-delayed task to accelerate the learning on a long-delayed task without
compromising performance in stochastic environments. Specifically, AD-RL
learns the value function in the short-delayed task and then uses it for
bootstrapping and policy improvement in the long-delayed task.
We theoretically show that this can greatly reduce the sample complexity
compared to directly learning on the original long-delayed task. On
deterministic and stochastic benchmarks, our method substantially outperforms
SOTA methods in both sample efficiency and policy performance.
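
To make the two central ideas concrete, the following is a minimal tabular sketch, not the authors' implementation. It assumes the standard augmented state for a delay of d steps, (s_{t-d}, a_{t-d}, ..., a_{t-1}), whose count grows as |S| * |A|^d, and a Q-learning-style target that bootstraps the long-delayed Q-table from an auxiliary short-delayed Q-table. The function names, the max-based target, and the hyperparameters are all illustrative assumptions.

```python
from collections import defaultdict


def augmented_state_count(num_states: int, num_actions: int, delay: int) -> int:
    """Number of augmented states (s_{t-delay}, a_{t-delay}, ..., a_{t-1})."""
    return num_states * num_actions ** delay


def truncate_to_short_delay(state_short, actions_long, short_delay):
    """Build the auxiliary short-delayed augmented state (hypothetical helper).

    `state_short` is the environment state from `short_delay` steps ago, which
    is only known in hindsight during training. Only the most recent
    `short_delay` actions of the long action history are kept.
    """
    if short_delay == 0:
        return (state_short, ())
    return (state_short, tuple(actions_long[-short_delay:]))


def auxiliary_bootstrap_update(q_long, q_short, x_long, action, reward,
                               x_short_next, actions, alpha=0.1, gamma=0.99):
    """One TD update of the long-delayed Q-table.

    The target bootstraps from the short-delayed Q-table (assumed to be
    learned separately) rather than from the long-delayed table itself.
    """
    target = reward + gamma * max(q_short[(x_short_next, a)] for a in actions)
    key = (x_long, action)
    q_long[key] += alpha * (target - q_long[key])
    return q_long[key]


# Usage: 10 states and 4 actions; compare a delay of 2 with a delay of 10.
small = augmented_state_count(10, 4, 2)
large = augmented_state_count(10, 4, 10)

q_long, q_short = defaultdict(float), defaultdict(float)
x_long = (0, (1, 0, 1, 1))                                # state seen 4 steps ago + 4 actions
x_short_next = truncate_to_short_delay(3, [0, 1, 1, 0], 1)  # short delay of 1 step
q_short[(x_short_next, 1)] = 2.0
updated = auxiliary_bootstrap_update(q_long, q_short, x_long, 0, 1.0,
                                     x_short_next, actions=[0, 1],
                                     alpha=0.5, gamma=0.5)
```

Because the short-delayed table is indexed by far fewer augmented states, its values can be estimated from fewer samples, which is the intuition behind using it as the bootstrap target in this sketch.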