Planning in Entropy-Regularized Markov Decision Processes and Games

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), pp. 12383-12392, 2019.

Keywords:
generative model

Abstract:

We propose SmoothCruiser, a new planning algorithm for estimating the value function in entropy-regularized Markov decision processes and two-player games, given a generative model of the environment. SmoothCruiser makes use of the smoothness of the Bellman operator promoted by the regularization to achieve problem-independent sample comp...
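
To illustrate the smoothness the abstract refers to, here is a minimal sketch of the standard entropy-regularized backup, in which the hard maximum over actions is replaced by a temperature-scaled log-sum-exp. It is not the SmoothCruiser algorithm itself. The function names `soft_max_value` and `softmax_policy` and the temperature parameter `lam` are assumptions made for this example and do not come from the listing.

```python
import math


def soft_max_value(q_values, lam):
    """Smooth maximum: lam * log(sum(exp(q / lam))) over the action values.

    `lam` (hypothetical name) is the regularization temperature, lam > 0.
    The maximum is subtracted first for numerical stability.
    """
    m = max(q_values)
    return m + lam * math.log(sum(math.exp((q - m) / lam) for q in q_values))


def softmax_policy(q_values, lam):
    """Gradient of soft_max_value with respect to q_values.

    This is the softmax distribution over actions at temperature lam.
    """
    m = max(q_values)
    weights = [math.exp((q - m) / lam) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    q = [1.0, 2.0, 3.0]
    for lam in (0.01, 1.0, 10.0):
        print(lam, soft_max_value(q, lam), softmax_policy(q, lam))
```

As `lam` shrinks, the smooth maximum approaches the ordinary maximum and the policy concentrates on the best action. For any `lam > 0` the value lies between `max(q)` and `max(q) + lam * log(n)` for `n` actions, and it is differentiable in the action values, which is the kind of smoothness a planner can exploit.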
