Divide-and-Conquer Reinforcement Learning
ICLR, 2018. arXiv:1711.09874.
Standard model-free deep reinforcement learning (RL) algorithms sample a new initial state for each trial, allowing them to optimize policies that can perform well even in highly stochastic environments. However, problems that exhibit considerable initial state variation typically produce high-variance gradient estimates for model-free ...