Ada-NAV: Adaptive Trajectory-Based Sample Efficient Policy Learning for Robotic Navigation

Bhrij Patel, Kasun Weerakoon, Wesley Suttle, Alec Koppel, Brian M. Sadler, Amrit Singh Bedi, Dinesh Manocha

arXiv (Cornell University) (2023)

Abstract

Reinforcement learning has gained significant traction in the field of robotic navigation. However, a persistent challenge is its sample inefficiency, primarily due to the inherent complexities of encouraging exploration. During training, the mobile agent must explore as much as possible to efficiently learn optimal behaviors. We introduce Ada-NAV, a novel adaptive trajectory length scheme designed to enhance the training sample efficiency of reinforcement learning algorithms in robotic navigation tasks. Unlike traditional approaches that treat trajectory length as a fixed hyperparameter, Ada-NAV dynamically adjusts it based on the entropy of the underlying navigation policy. We empirically validate the efficacy of Ada-NAV using two popular policy gradient methods: REINFORCE and Proximal Policy Optimization (PPO). We demonstrate through both simulated and real-world robotic experiments that Ada-NAV outperforms conventional methods that employ constant or randomly sampled trajectory lengths. Specifically, for a fixed sample budget, Ada-NAV achieves an 18% increase in navigation success rate, a 20-38% reduction in navigation path length, and a 9.32% decrease in elevation costs. Furthermore, we showcase the versatility of Ada-NAV by integrating it with the Clearpath Husky robot, illustrating its applicability in complex, outdoor environments.
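
The abstract states only that trajectory length is adjusted from the policy's entropy; it does not give the mapping. As a rough illustration of the idea, the sketch below computes the Shannon entropy of a discrete action distribution and converts it to a trajectory length by linear interpolation. Everything specific here is an assumption for illustration and not taken from the paper: the function names `policy_entropy` and `adaptive_trajectory_length`, the linear form, the direction (lower entropy gives a longer trajectory), and the bounds `min_len` and `max_len`.

```python
import math


def policy_entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def adaptive_trajectory_length(entropy, max_entropy, min_len=16, max_len=256):
    """Map policy entropy to a trajectory length (hypothetical rule).

    Assumption, not from the abstract: length is interpolated linearly,
    with the shortest trajectory at maximum entropy and the longest
    trajectory at zero entropy.
    """
    # Normalize entropy to [0, 1] and clamp against rounding error.
    frac = min(max(entropy / max_entropy, 0.0), 1.0)
    return round(max_len - frac * (max_len - min_len))


# Example with 4 discrete actions; the maximum entropy is log(4).
max_h = math.log(4)
uniform = [0.25, 0.25, 0.25, 0.25]   # highly exploratory policy
peaked = [0.7, 0.1, 0.1, 0.1]        # partially converged policy
greedy = [1.0, 0.0, 0.0, 0.0]        # deterministic policy

for name, probs in [("uniform", uniform), ("peaked", peaked), ("greedy", greedy)]:
    h = policy_entropy(probs)
    print(name, adaptive_trajectory_length(h, max_h))
```

In a training loop, a rule of this kind would be re-evaluated before each rollout, so the number of environment steps collected per policy update changes as the policy's entropy changes, in contrast to a constant or randomly sampled length.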

Key words

robotic navigation, sample efficient policy learning, Ada-NAV, trajectory-based
