Safety-Aware Unsupervised Skill Discovery


Programming manipulation behaviors becomes increasingly difficult as the number and complexity of manipulation tasks grow, particularly in dynamic and unstructured environments. Recent unsupervised skill discovery algorithms have shown great promise in learning an extensive collection of behaviors without extrinsic supervision. Safety, however, is one of the most critical factors for real-world robot applications. Because skill discovery methods typically encourage exploratory and dynamic behaviors, a large portion of the learned skills is often unsafe. In this paper, we introduce the novel problem of Safety-Aware Skill Discovery, which aims to learn, in a task-agnostic fashion, a repertoire of reusable skills that are inherently safe to compose for solving downstream tasks. We present a computationally tractable algorithm that learns a latent-conditioned skill policy by maximizing intrinsic rewards regularized with a safety-critic, which can model any user-defined safety constraint. Using the pretrained safe skill repertoire, hierarchical reinforcement learning can solve multiple downstream tasks without explicit consideration of safety during training and testing. We evaluate our algorithm on a collection of force-controlled robotic manipulation tasks in simulation and show promising downstream task performance while satisfying safety constraints.
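As a rough illustration of what "intrinsic rewards regularized with a safety-critic" could look like, the sketch below combines a skill-discriminator reward with a penalty from a safety estimate. The abstract does not specify the intrinsic reward, the form of the regularizer, or any names, so everything here is an assumption: the discriminator-based reward log q(z|s) - log p(z) with a uniform skill prior, the scalar `failure_prob` standing in for the safety-critic's output, the subtractive `penalty` weight, and both function names are hypothetical.

```python
import numpy as np


def intrinsic_reward(disc_logits, skill):
    """Assumed discriminator-style reward: log q(z|s) - log p(z).

    disc_logits: unnormalized discriminator scores over skills for one state.
    skill: index of the skill z that the policy was conditioned on.
    The skill prior p(z) is assumed uniform (not stated in the abstract).
    """
    num_skills = disc_logits.shape[0]
    # Numerically stable log-softmax over the skill dimension.
    log_q = disc_logits - np.logaddexp.reduce(disc_logits)
    return log_q[skill] + np.log(num_skills)


def safe_skill_reward(disc_logits, skill, failure_prob, penalty=1.0):
    """Assumed regularized reward: intrinsic reward minus a safety penalty.

    failure_prob: hypothetical safety-critic estimate in [0, 1] of violating
    a user-defined constraint from the current state-action pair.
    penalty: hypothetical trade-off weight between diversity and safety.
    """
    return intrinsic_reward(disc_logits, skill) - penalty * failure_prob


if __name__ == "__main__":
    logits = np.array([2.0, 0.5, 0.1, -1.0])
    print(safe_skill_reward(logits, skill=0, failure_prob=0.1, penalty=5.0))
```

With this shape, a skill that is easy to identify from the visited states still earns a low reward when the safety estimate is high, which is one way a policy could be steered toward skills that are both distinguishable and safe.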
Key words
computationally tractable algorithm, dynamic behaviors, dynamic environment, extensive collection, extrinsic supervision, force-controlled robotic manipulation tasks, hierarchical reinforcement learning, latent-conditioned skill policy, learned skills, manipulation behaviors, multiple downstream tasks, pretrained safe skill repertoire, real-world robot applications, reusable skills, Safety-Aware Skill Discovery, Safety-Aware Unsupervised Skill Discovery, safety-critic, skill discovery methods, task-agnostic fashion, unsupervised skill discovery algorithms, user-defined safety constraints