Learning Compositional Neural Programs with Recursive Tree Search and Planning
Advances in Neural Information Processing Systems 32 (NIPS 2019), pp. 14646-14656, 2019.
Keywords:
reinforcement learning, sample complexity, Tower of Hanoi
Abstract:
We propose a novel reinforcement learning algorithm, AlphaNPI, that incorporates the strengths of Neural Programmer-Interpreters (NPI) and AlphaZero. NPI contributes structural biases in the form of modularity, hierarchy and recursion, which help reduce sample complexity, improve generalization and increase interpretability. Alp...