Low Precision Policy Distillation with Application to Low-Power, Real-time Sensation-Cognition-Action Loop with Neuromorphic Computing

arXiv: Learning (2018)

Cited by 24
Abstract
Low precision networks in the reinforcement learning (RL) setting are relatively unexplored because of the limitations of binary activations for function approximation. Here, in the discrete action ATARI domain, we demonstrate, for the first time, that low precision policy distillation from a high precision network provides a principled, practical way to train an RL agent. As an application, on 10 different ATARI games, we demonstrate real-time end-to-end game playing on low-power neuromorphic hardware by converting a sequence of game frames into discrete actions.
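Below is a minimal, hedged sketch of what a low precision policy distillation setup of this kind might look like in PyTorch. The DQN-style convolutional architecture, the straight-through sign binarization, the KL-divergence distillation loss, and all hyperparameters are illustrative assumptions for exposition, not the paper's exact configuration or code.

```python
# Illustrative sketch: distilling a high precision teacher's action logits into a
# student policy with binary activations. Architecture and hyperparameters are
# assumptions, not the paper's reported setup.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BinarizeSTE(torch.autograd.Function):
    """Sign-binarize activations in the forward pass; pass gradients through
    unchanged in the backward pass (straight-through estimator)."""

    @staticmethod
    def forward(ctx, x):
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output


class LowPrecisionPolicy(nn.Module):
    """Student policy with binary activations over stacked ATARI frames."""

    def __init__(self, n_actions: int, in_channels: int = 4):
        super().__init__()
        # DQN-style convolutional trunk (assumed, for illustration).
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=8, stride=4),
            nn.Conv2d(32, 64, kernel_size=4, stride=2),
            nn.Conv2d(64, 64, kernel_size=3, stride=1),
        )
        self.head = nn.Linear(64 * 7 * 7, n_actions)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        x = frames
        for layer in self.conv:
            x = BinarizeSTE.apply(layer(x))  # binary activations between layers
        return self.head(x.flatten(1))       # action logits


def distillation_loss(student_logits, teacher_logits, temperature: float = 1.0):
    """KL divergence between softened teacher and student action distributions."""
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean")


if __name__ == "__main__":
    # Toy usage: distill stand-in "teacher" logits into the low precision student.
    student = LowPrecisionPolicy(n_actions=18)      # full ATARI action set
    opt = torch.optim.Adam(student.parameters(), lr=1e-4)
    frames = torch.randn(32, 4, 84, 84)             # batch of stacked game frames
    teacher_logits = torch.randn(32, 18)            # placeholder for a trained DQN teacher
    loss = distillation_loss(student(frames), teacher_logits)
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(f"distillation loss: {loss.item():.4f}")
```

In this kind of setup the teacher is trained at full precision with a standard RL algorithm, and only the student is constrained to low precision, which sidesteps the difficulty of propagating RL gradients through binary activations directly.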