Learning End-to-End Visual Servoing Using an Improved Soft Actor-Critic Approach With Centralized Novelty Measurement.

IEEE Trans. Instrum. Meas. (2023)

Abstract
End-to-end visual servoing (VS) based on reinforcement learning (RL) can simplify the design of features and control laws and, combined with neural networks, offers strong scalability. However, RL-based VS tasks are challenging in continuous state or action spaces because of the difficulty of space exploration and slow training convergence. Hence, this article presents a novelty measurement method based on centralized features extracted by a neural network, which calculates the novelty of each visited state to encourage exploration by the RL agent. Moreover, we propose a hybrid probability sampling method that improves prioritized experience replay (PER) based on the temporal-difference (TD) error by integrating the intrinsic and external rewards; these signals represent the novelty and quality, respectively, of transitions in the replay buffer, promoting convergence during training. Finally, we develop an end-to-end VS scheme based on the maximum-entropy RL algorithm soft actor-critic (SAC). Several simulated experiments for end-to-end VS are designed in CoppeliaSim, where target detection information is the agent's input. The results highlight that our method's reward value and completion rate are 0.35% and 8.0% higher, respectively, than those of the SAC VS baseline. We also conduct experiments to verify the effectiveness of the proposed algorithm.
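To make the hybrid sampling idea concrete, the sketch below shows one plausible way to blend a TD-error-based PER priority with an intrinsic novelty reward. The weights `w_td` and `w_nov`, the normalization, and the function names are illustrative assumptions, not the paper's actual formulation:

```python
import numpy as np

def hybrid_priorities(td_errors, intrinsic_rewards, w_td=0.7, w_nov=0.3, eps=1e-6):
    """Blend transition quality (|TD error|) with novelty (intrinsic reward).

    Hypothetical sketch: the paper combines the two signals, but the exact
    weighting scheme here is an assumption for illustration.
    """
    td = np.abs(np.asarray(td_errors, dtype=float))
    nov = np.asarray(intrinsic_rewards, dtype=float)
    # Normalize each signal so neither dominates purely by scale.
    td = td / (td.max() + eps)
    nov = nov / (nov.max() + eps)
    return w_td * td + w_nov * nov + eps

def sampling_probs(priorities, alpha=0.6):
    # Standard PER exponentiation: alpha=0 -> uniform, alpha=1 -> proportional.
    p = np.asarray(priorities, dtype=float) ** alpha
    return p / p.sum()

# Example: transition 1 has the largest TD error, transition 2 the most novelty.
probs = sampling_probs(hybrid_priorities([0.5, -1.2, 0.1], [0.2, 0.05, 0.9]))
```

Transitions that are either surprising to the critic (large TD error) or novel to the agent (large intrinsic reward) are replayed more often, which is the convergence-promoting effect the abstract describes.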
Keywords
Intrinsic reward, novelty measure, reinforcement learning (RL), sampling optimization, visual servoing (VS)