An Efficient Generalizable Framework for Visuomotor Policies via Control-aware Augmentation and Privilege-guided Distillation
CoRR(2024)
Abstract
Visuomotor policies, which learn control mechanisms directly from
high-dimensional visual observations, confront challenges in adapting to new
environments with intricate visual variations. Data augmentation emerges as a
promising method for bridging these generalization gaps by enriching data
variety. However, naively augmenting the entire observation can
impose an excessive burden on policy learning and may even degrade
performance. In this paper, we propose to improve the generalization ability of
visuomotor policies while preserving training stability, in two ways: 1)
We learn a control-aware mask through a self-supervised reconstruction task
with three auxiliary losses and then apply strong augmentation only to those
control-irrelevant regions based on the mask to reduce the generalization gaps.
2) To address training instability issues prevalent in visual reinforcement
learning (RL), we distill knowledge from a pretrained RL expert that operates on
low-level environment states into the student visuomotor policy. The policy is
subsequently deployed to unseen environments without any further finetuning. We
conducted comparison and ablation studies across various benchmarks: the
DMControl Generalization Benchmark (DMC-GB), the enhanced Robot Manipulation
Distraction Benchmark (RMDB), and a specialized long-horizon drawer-opening
robotic task. Extensive experimental results demonstrate the
effectiveness of our method, e.g., showing a 17% improvement over previous
methods in the video-hard setting of DMC-GB.
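
The two ideas in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the noise-overlay augmentation, and the mean-squared-error distillation objective are all assumptions chosen for illustration, since the abstract does not specify the augmentation, the three auxiliary losses, or the distillation loss. The mask is taken as given here rather than learned.

```python
import numpy as np


def random_overlay(obs, rng, alpha=0.5):
    """Hypothetical strong augmentation: blend the image with uniform noise."""
    noise = rng.random(obs.shape)
    return (1.0 - alpha) * obs + alpha * noise


def masked_augment(obs, mask, augment):
    """Augment only the regions the mask marks as control-irrelevant.

    obs:  (H, W, C) float image with values in [0, 1]
    mask: (H, W) float array, 1 = control-relevant, 0 = control-irrelevant
    """
    m = mask[..., None]  # add a channel axis so the mask broadcasts over C
    return m * obs + (1.0 - m) * augment(obs)


def distillation_loss(student_action, expert_action):
    """Assumed objective: MSE between student and state-based expert actions."""
    diff = np.asarray(student_action) - np.asarray(expert_action)
    return float(np.mean(diff ** 2))
```

Under these assumptions, a training step would pass `masked_augment(obs, mask, ...)` to the student policy and minimize `distillation_loss` against the action the expert produces from the corresponding low-level state, so pixels under the mask reach the student unchanged while the rest of the image is perturbed.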