Temporal Difference Uncertainties as a Signal for Exploration
Abstract:
An effective approach to exploration in reinforcement learning is to rely on an agent's uncertainty over the optimal policy, which can yield near-optimal exploration strategies in tabular settings. However, in non-tabular settings that involve function approximators, obtaining accurate uncertainty estimates is almost as challenging a pr...