Complementary Learning System Based Intrinsic Reward in Reinforcement Learning

Zijian Gao,Kele Xu,Hongda Jia,Tianjiao Wan,Bo Ding,Dawei Feng,Xinjun Mao,Huaimin Wang

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)（2023）

引用 0|浏览13

暂无评分

摘要

Deep reinforcement learning has achieved encouraging performance in many realms. However, one of its primary challenges is the sparsity of extrinsic rewards, which is still far from solved. Complementary learning system theory suggests that effective human learning relies on two complementary learning systems utilizing short-term and long-term memories. Inspired by the fact that humans evaluate curiosity by comparing current observations with historical information, we propose a novel intrinsic reward, namely CLS-IR, which aims to address the problems caused by sparse extrinsic rewards. Specifically, we train a self-supervised predictive model with short-term and long-term memories via exponential moving averages. We employ the information gain between the two memories as the intrinsic reward, which does not incur additional training costs but leads to better exploration. To investigate the effectiveness of CLS-IR, we conduct extensive experimental evaluations; the results demonstrate that CLS-IR can achieve state-of-the-art performance on Atari games and DeepMind Control Suite.

查看译文

关键词

Deep reinforcement learning,exploration,intrinsic reward,complementary learning system

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要