Memory Augmented Policy Optimization for Program Synthesis and Semantic Parsing

    NeurIPS, pp. 10015-10027, 2018.

    Keywords:
    program synthesis, natural language, weak supervision, semantic parsing, weighted sum

    Abstract:

    This paper presents MAPO: a novel policy optimization formulation that incorporates a memory buffer of promising trajectories to reduce the variance of policy gradient estimates for deterministic environments with discrete actions. The formulation expresses the expected return objective as a weighted sum of two terms: an expectation over ...
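
    The weighted-sum idea can be illustrated with a small numerical sketch. The abstract is cut off before it names the two terms, so the split used here (trajectories inside versus outside the memory buffer) is an assumption, as are the toy rewards, logits, buffer contents, and all function names. The sketch only checks the identity that an expected return equals a buffer-probability-weighted sum of two conditional expectations; it is not the paper's training procedure.

```python
import math


def softmax(logits):
    """Turn unnormalized scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_return(probs, rewards):
    """Plain expected return: sum over all trajectories of p(a) * R(a)."""
    return sum(p * r for p, r in zip(probs, rewards))


def weighted_sum_return(probs, rewards, buffer_idx):
    """Expected return written as a weighted sum of two expectations.

    Assumed split: trajectories in the buffer versus all others. The weight
    pi_b is the total policy probability of the buffered trajectories.
    """
    inside = [i for i in range(len(probs)) if i in buffer_idx]
    outside = [i for i in range(len(probs)) if i not in buffer_idx]
    pi_b = sum(probs[i] for i in inside)

    # Conditional expectations under the renormalized policy on each side.
    e_in = sum(probs[i] / pi_b * rewards[i] for i in inside) if pi_b > 0 else 0.0
    e_out = (
        sum(probs[i] / (1.0 - pi_b) * rewards[i] for i in outside)
        if pi_b < 1
        else 0.0
    )
    return pi_b * e_in + (1.0 - pi_b) * e_out


# Hypothetical toy setup: four complete trajectories with deterministic rewards.
rewards = [1.0, 0.0, 1.0, 0.0]
probs = softmax([0.5, -1.0, 0.2, 0.3])
buffer_idx = {0, 2}  # assumed "promising" trajectories kept in memory

full = expected_return(probs, rewards)
split = weighted_sum_return(probs, rewards, buffer_idx)
print(abs(full - split) < 1e-12)
```

    Because the buffered trajectories are enumerated exactly in this sketch, the inside term involves no sampling; any sampling noise would come only from the outside term, which is one plausible reading of how such a split could reduce gradient variance.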
