Negative reward-prediction errors of climbing fiber inputs for cerebellar reinforcement learning algorithm

biorxiv(2023)

Abstract
Although the cerebellum is widely associated with supervised learning algorithms, abundant reward-related representations have been found there. We ask whether the cerebellum also implements a reinforcement learning algorithm, and in particular the essential reward-prediction error. Using tensor component analysis of two-photon Ca2+ imaging data, we recently demonstrated that a component of climbing fiber inputs in the lateral zones of mouse cerebellum Crus II represents cognitive error signals in a Go/No-go auditory discrimination task. Here, we applied a Q-learning model to quantitatively reproduce Go/No-go learning behaviors and to compute reinforcement learning variables, including the reward, the predicted reward, and the reward-prediction error, within each learning trial. Climbing fiber inputs to the cognitive-error component were strongly correlated with the negative reward-prediction error and decreased as learning progressed. Assuming parallel fiber-Purkinje cell synaptic plasticity, Purkinje cells of this component could acquire the necessary motor commands based on the negative reward-prediction error conveyed by their climbing fiber inputs, thus providing an actor for reinforcement learning.

Competing Interest Statement: The authors have declared no competing interest.
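The abstract's central quantity can be illustrated with a minimal sketch. The following is a hypothetical reduction, not the authors' implementation: a tabular Q-learner for a Go/No-go task in which the learning rate, the initial reward expectation, and the assumption that the animal licks on every trial are all illustrative choices. On No-go trials the omitted reward yields a negative reward-prediction error whose magnitude shrinks as the Q-value is learned, paralleling the decline in climbing-fiber responses described above.

```python
# Hypothetical sketch of a Q-learning model for a Go/No-go task.
# Parameter values and the task reduction are assumptions for illustration.
import random

ALPHA = 0.1  # learning rate (assumed value)

def run_trials(n_trials, seed=0):
    rng = random.Random(seed)
    # Predicted reward for licking after each cue; the nonzero initial
    # value is an assumed prior expectation of reward.
    q = {"go": 0.5, "nogo": 0.5}
    rpes = []  # (cue, reward-prediction error) per trial
    for _ in range(n_trials):
        cue = rng.choice(["go", "nogo"])
        # Assume the animal licks on every trial early in learning:
        # licking is rewarded on Go cues and unrewarded on No-go cues.
        reward = 1.0 if cue == "go" else 0.0
        rpe = reward - q[cue]      # reward-prediction error
        q[cue] += ALPHA * rpe      # Q-learning update
        rpes.append((cue, rpe))
    return q, rpes

q, rpes = run_trials(200)
nogo_rpes = [r for c, r in rpes if c == "nogo"]
# The first No-go trial carries a negative RPE (reward omitted despite a
# prior expectation); its magnitude decays toward zero across trials.
```

In this toy setting the negative RPE on No-go trials is largest early in learning and vanishes as Q("nogo") converges to zero, which is the trial-by-trial signal the study correlates with climbing fiber activity.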
Keywords: reinforcement learning, climbing fiber inputs, reward-prediction error