Overfitting-avoiding goal-guided exploration for hard-exploration multi-goal reinforcement learning.

Neurocomputing (2023)

Abstract
In hard-exploration multi-goal reinforcement learning tasks, the agent must achieve a series of distant goals under sparse rewards. Directly pursuing these hard goals rarely succeeds, because the agent obtains no learning signal applicable to them. To progressively build up the agent's ability and promote exploration, goal-guided exploration methods generate easier auxiliary goals that gradually approach the original hard goals for the agent to pursue. However, because previous methods neglect the growth of the agent's generalizability, their goal-generation region is limited, which causes overfitting and traps exploration short of the further goals. In this paper, after modeling multi-goal RL as a distribution-matching process, we propose an overfitting-avoiding goal-guided exploration method (OGE), in which auxiliary goals are generated along the Wasserstein-distance-based optimal-transport geodesic, within a generalizability margin delimited by the Lipschitz constant. We compare OGE with state-of-the-art methods on hard-exploration multi-goal robotic manipulation tasks. Besides showing the highest learning efficiency, in tasks where all prior methods overfit and fail, our method still successfully guides the agent to achieve the hard goals.
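The abstract describes the mechanism only verbally. The following minimal Python sketch illustrates the general idea of placing auxiliary goals on a Wasserstein optimal-transport geodesic between achieved goals and hard target goals, with the step bounded by a margin. It is an assumption-laden illustration, not the authors' OGE implementation: it assumes point goals in Euclidean space, an empirical optimal coupling computed with the Hungarian algorithm, and a hypothetical scalar `eps` standing in for the Lipschitz-constant-delimited generalizability margin.

```python
# Illustrative sketch only (not the paper's method): auxiliary goals are placed on the
# W2 optimal-transport geodesic between currently achievable goals and the hard target
# goals, stepping no further than an assumed generalizability margin `eps`.
import numpy as np
from scipy.optimize import linear_sum_assignment

def auxiliary_goals(achieved, targets, eps):
    """achieved, targets: (N, d) arrays of point goals; eps: hypothetical margin."""
    # Optimal coupling between the two empirical goal sets (Hungarian algorithm).
    cost = np.linalg.norm(achieved[:, None, :] - targets[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)
    src, dst, dist = achieved[rows], targets[cols], cost[rows, cols]
    # Displacement interpolation along each transport ray, truncated at the margin:
    # t = min(1, eps / distance), so easy goals stay within the agent's reach.
    t = np.minimum(1.0, eps / np.maximum(dist, 1e-8))[:, None]
    return (1.0 - t) * src + t * dst

if __name__ == "__main__":
    # Pull 4 achieved goals toward 4 distant hard goals by at most eps = 0.1.
    rng = np.random.default_rng(0)
    aux = auxiliary_goals(rng.normal(size=(4, 3)), rng.normal(size=(4, 3)) + 5.0, eps=0.1)
    print(aux)
```

In this toy setting the W2 geodesic between matched point goals reduces to straight-line interpolation under the optimal assignment; the paper's contribution lies in how the generation region (here the placeholder `eps`) is tied to the agent's generalizability via its Lipschitz constant.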
Keywords
reinforcement learning, overfitting-avoiding, goal-guided, hard-exploration, multi-goal