Safe Model-Based Reinforcement Learning With an Uncertainty-Aware Reachability Certificate

IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING(2023)

引用 0|浏览4
暂无评分
摘要
Safe reinforcement learning (RL) that solves constraint-satisfactory policies provides a promising way to the broader safety-critical applications of RL in real-world problems such as robotics. Among all safe RL approaches, model-based methods reduce training time violations further due to their high sample efficiency. However, lacking safety robustness against the model uncertainties remains an issue in safe model-based RL, especially in training time safety. In this paper, we propose a distributional reachability certificate (DRC) and its Bellman equation to address model uncertainties and characterize robust persistently safe states. Furthermore, we build a safe RL framework to resolve constraints required by the DRC and its corresponding shield policy. We also devise a line search method to maintain safety and reach higher returns simultaneously while leveraging the shield policy. Comprehensive experiments on classical benchmarks such as constrained tracking and navigation indicate that the proposed algorithm achieves comparable returns with much fewer constraint violations during training. Our code is available at https://github.com/ManUtdMoon/Distributional-Reachability-Policy-Optimization.
更多
查看译文
关键词
Safe reinforcement learning (RL),reachability analysis,model-based RL,robot learning
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要