Ada-NAV: Adaptive Trajectory Length-Based Sample Efficient Policy Learning for Robotic Navigation
arXiv (2023)
Abstract
Trajectory length stands as a crucial hyperparameter within reinforcement
learning (RL) algorithms, significantly contributing to the sample inefficiency
in robotics applications. Motivated by the pivotal role trajectory length plays
in the training process, we introduce Ada-NAV, a novel adaptive trajectory
length scheme designed to enhance the training sample efficiency of RL
algorithms in robotic navigation tasks. Unlike traditional approaches that
treat trajectory length as a fixed hyperparameter, we propose to dynamically
adjust it based on the entropy of the underlying navigation policy.
Interestingly, Ada-NAV can be applied to both existing on-policy and off-policy
RL methods, which we demonstrate by empirically validating its efficacy on
three popular RL methods: REINFORCE, Proximal Policy Optimization (PPO), and
Soft Actor-Critic (SAC). We demonstrate through simulated and real-world
robotic experiments that Ada-NAV outperforms conventional methods that employ
constant or randomly sampled trajectory lengths. Specifically, for a fixed
sample budget, Ada-NAV achieves an 18% increase in navigation success rate, a
20-38% reduction in navigation path length, and a 9.32% decrease in elevation
costs. Furthermore, we showcase the versatility of Ada-NAV by integrating it
with the Clearpath Husky robot, illustrating its applicability in complex
outdoor environments.
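
The abstract says that trajectory length is adjusted from the entropy of the navigation policy, but it does not give the update rule. As a rough illustration only, the sketch below assumes a discrete action distribution and a linear mapping in which lower entropy yields a longer trajectory. The function names, the bounds `min_len` and `max_len`, and the direction of the mapping are all assumptions made for this example; they are not taken from the paper.

```python
import math


def policy_entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def adaptive_trajectory_length(probs, min_len=16, max_len=256):
    """Hypothetical rule: map normalized policy entropy to a trajectory length.

    Assumption (not from the source): a low-entropy, near-deterministic policy
    gets a long trajectory; a high-entropy, near-uniform policy gets a short one.
    """
    max_entropy = math.log(len(probs))  # entropy of the uniform policy
    normalized = policy_entropy(probs) / max_entropy if max_entropy > 0 else 0.0
    normalized = min(max(normalized, 0.0), 1.0)  # guard against rounding error
    return round(max_len - normalized * (max_len - min_len))


if __name__ == "__main__":
    # Print the length chosen for a uniform, a mixed, and a deterministic policy.
    for probs in ([0.25] * 4, [0.7, 0.1, 0.1, 0.1], [1.0, 0.0, 0.0, 0.0]):
        print(probs, adaptive_trajectory_length(probs))
```

In a training loop, a rule of this kind would be re-evaluated before each rollout so that the number of environment steps collected per update follows the current policy rather than a fixed hyperparameter.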