Negative reward-prediction errors of climbing fiber inputs for cerebellar reinforcement learning algorithm

biorxiv(2023)

Abstract
Although the cerebellum is widely associated with supervised learning algorithms, abundant reward-related representations have been found there. We ask whether the cerebellum also implements a reinforcement learning algorithm, and in particular the essential reward-prediction error. Using tensor component analysis of two-photon Ca2+ imaging data, we recently demonstrated that a component of climbing fiber inputs in the lateral zones of mouse cerebellum Crus II represents cognitive error signals in a Go/No-go auditory discrimination task. Here, we applied a Q-learning model to quantitatively reproduce Go/No-go learning behaviors and to compute reinforcement learning variables, including the reward, the predicted reward, and the reward-prediction error, within each learning trial. Climbing fiber inputs to the cognitive-error component were strongly correlated with the negative reward-prediction error and decreased as learning progressed. Assuming parallel fiber-Purkinje cell synaptic plasticity, Purkinje cells of this component could acquire the necessary motor commands based on the negative reward-prediction error conveyed by their climbing fiber inputs, thus providing an actor for reinforcement learning.

Competing Interest Statement: The authors have declared no competing interest.
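The abstract's central quantity can be illustrated with a minimal sketch. The following is a hypothetical reduction, not the authors' implementation: a tabular Q-learner for a Go/No-go task in which the learning rate, the initial reward expectation, and the assumption that the animal licks on every trial are all illustrative choices. On No-go trials the omitted reward yields a negative reward-prediction error whose magnitude shrinks as the Q-value is learned, paralleling the decline in climbing-fiber responses described above.

```python
# Hypothetical sketch of a Q-learning model for a Go/No-go task.
# Parameter values and the task reduction are assumptions for illustration.
import random

ALPHA = 0.1  # learning rate (assumed value)

def run_trials(n_trials, seed=0):
    rng = random.Random(seed)
    # Predicted reward for licking after each cue; the nonzero initial
    # value is an assumed prior expectation of reward.
    q = {"go": 0.5, "nogo": 0.5}
    rpes = []  # (cue, reward-prediction error) per trial
    for _ in range(n_trials):
        cue = rng.choice(["go", "nogo"])
        # Assume the animal licks on every trial early in learning:
        # licking is rewarded on Go cues and unrewarded on No-go cues.
        reward = 1.0 if cue == "go" else 0.0
        rpe = reward - q[cue]      # reward-prediction error
        q[cue] += ALPHA * rpe      # Q-learning update
        rpes.append((cue, rpe))
    return q, rpes

q, rpes = run_trials(200)
nogo_rpes = [r for c, r in rpes if c == "nogo"]
# The first No-go trial carries a negative RPE (reward omitted despite a
# prior expectation); its magnitude decays toward zero across trials.
```

In this toy setting the negative RPE on No-go trials is largest early in learning and vanishes as Q("nogo") converges to zero, which is the trial-by-trial signal the study correlates with climbing fiber activity.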
Keywords: reinforcement learning, climbing fiber inputs, reward-prediction error