Blazing the trails before beating the path: Sample-efficient Monte-Carlo planning
Advances in Neural Information Processing Systems 29 (NIPS 2016), pp. 4673-4681, 2016.
You are a robot and you live in a Markov decision process (MDP) with a finite or an infinite number of transitions from state-action to next states. You got brains and so you plan before you act. Luckily, your roboparents equipped you with a generative model to do some Monte-Carlo planning. The world is waiting for you and you have no tim...