Boosting Trust Region Policy Optimization by Normalizing Flows Policy

arXiv: Artificial Intelligence, Volume abs/1809.10326, 2018.


Abstract:

We propose to improve trust region policy search with normalizing flows policy. We illustrate that when the trust region is constructed by KL divergence constraint, normalizing flows policy can generate samples far from the 'center' of the previous policy iterate, which potentially enables better exploration and helps avoid bad lo...
