Local 2-D Path Planning of Unmanned Underwater Vehicles in Continuous Action Space Based on the Twin-Delayed Deep Deterministic Policy Gradient

IEEE Transactions on Systems, Man, and Cybernetics: Systems (2024)

Abstract
This article studies local two-dimensional (2-D) path planning for an unmanned underwater vehicle (UUV) in a continuous action space and proposes an improved algorithm based on the twin-delayed deep deterministic policy gradient (TD3). A mean function is added to the policy gradient to pull the algorithm's output actions toward the mean of the action space, which suppresses TD3's tendency to output a large number of boundary actions. An action storage, built on the experience replay buffer, automatically adjusts the weight coefficient, reducing the additional hyperparameter tuning that the change in algorithm structure would otherwise require. Real-time sonar variables are added to the environment variables and reward functions so that the algorithm model better matches actual underwater navigation conditions. A simulation environment built on ROS is used to verify the path planning performance of the proposed algorithm.
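The abstract does not give the exact form of the mean function or of the weight adjustment, so the following is only a minimal sketch under stated assumptions. It assumes a scalar action bounded by `low` and `high`, models the mean term as a squared distance from the midpoint of the action space, and sets the weight to the share of boundary actions found in a stored action list. The names `boundary_fraction` and `mean_regularized_actor_loss`, the tolerance `tol`, and both formulas are hypothetical illustrations, not the authors' method.

```python
from statistics import fmean


def boundary_fraction(stored_actions, low, high, tol=1e-3):
    """Share of stored actions lying on (or within tol of) an action bound.

    Assumption: this share serves as the automatically adjusted weight.
    """
    if not stored_actions:
        return 0.0
    hits = sum(1 for a in stored_actions if a - low <= tol or high - a <= tol)
    return hits / len(stored_actions)


def mean_regularized_actor_loss(q_values, actions, low, high, weight):
    """TD3-style actor loss (-mean Q) plus a pull toward the action-space midpoint.

    Assumption: the pull is a squared, range-normalized distance from the midpoint.
    """
    mid = 0.5 * (low + high)
    half_range = 0.5 * (high - low)
    q_term = -fmean(q_values)
    mean_term = fmean([((a - mid) / half_range) ** 2 for a in actions])
    return q_term + weight * mean_term


# Hypothetical values: three of the four stored actions sit on a bound.
stored = [-1.0, 1.0, 0.2, 1.0]
weight = boundary_fraction(stored, low=-1.0, high=1.0)
loss = mean_regularized_actor_loss([1.0, 2.0], [1.0, 0.0], -1.0, 1.0, weight)
print(weight, loss)
```

In this sketch the penalty grows as more stored actions saturate at the bounds, so the pull toward the midpoint strengthens only when boundary actions dominate, and no extra coefficient has to be tuned by hand.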
Keywords
Continuous action space,deep reinforcement learning,path planning,sonar,UUVs