Memory Augmented Policy Optimization for Program Synthesis with Generalization

    arXiv preprint arXiv:1807.02322, 2018.

    Abstract:

    This paper presents Memory Augmented Policy Optimization (MAPO): a novel policy optimization formulation that incorporates a memory buffer of promising trajectories to reduce the variance of policy gradient estimates for deterministic environments with discrete actions. The formulation expresses the expected return objective as a weighted...
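
    To make the weighted-sum idea concrete, below is a minimal, hypothetical Python/PyTorch sketch (not the authors' released code) of a policy-gradient surrogate split into the two terms the abstract describes: an expectation over high-reward trajectories kept in a memory buffer, computed by enumeration, and a Monte Carlo expectation over trajectories sampled outside the buffer. All names here (`mapo_surrogate_loss`, `log_prob_fn`, the reward arguments) are illustrative assumptions, not identifiers from the paper.

import torch

def mapo_surrogate_loss(log_prob_fn, buffer_trajs, buffer_rewards,
                        sampled_trajs, sampled_rewards):
    """Surrogate loss whose gradient follows the two-term, weighted-sum objective.

    log_prob_fn(traj) -> differentiable scalar log pi_theta(traj)
    buffer_*  : promising (high-reward) trajectories stored in the memory buffer
    sampled_* : fresh trajectories drawn from the policy outside the buffer
    """
    # Probability mass the current policy assigns to the buffered trajectories.
    buffer_logps = torch.stack([log_prob_fn(t) for t in buffer_trajs])
    pi_buffer = buffer_logps.exp().sum().clamp(max=1.0)

    # Term 1: expectation over the buffer, computed by enumeration. Using the
    # trajectory probabilities directly yields the exact gradient of this term.
    buf_rewards = torch.as_tensor(buffer_rewards, dtype=torch.float32)
    inside = (buffer_logps.exp() * buf_rewards).sum()

    # Term 2: REINFORCE-style estimate of the expectation over trajectories
    # outside the buffer, weighted by the remaining mass (1 - pi_buffer).
    sampled_logps = torch.stack([log_prob_fn(t) for t in sampled_trajs])
    out_rewards = torch.as_tensor(sampled_rewards, dtype=torch.float32)
    outside = (1.0 - pi_buffer).detach() * (sampled_logps * out_rewards).mean()

    # Gradient ascent on the objective == gradient descent on its negation.
    return -(inside + outside)

    This is only a sketch of the decomposition named in the abstract; details of the paper's full estimator (how trajectories outside the buffer are sampled, how the buffer weight is handled during training) should be taken from the paper itself.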
