Below are the title and abstract of the talk.

Title: Safe Skill Discovery in Unsupervised Reinforcement Learning and Alignment in Code LLMs.

This presentation addresses the question of safety and alignment in statistical sequential decision models. The first part of the talk will focus on safety-centric skill discovery using unsupervised reinforcement learning and its application to robotic manipulation, as presented at ICRA'23. We introduce the novel problem of Safety-Aware Skill Discovery, which aims to learn, in a task-agnostic fashion, a repertoire of reusable skills that are inherently safe to compose into solutions for downstream tasks. We present a computationally tractable algorithm that learns a latent-conditioned skill policy maximizing intrinsic rewards, regulated by a safety critic capable of modeling any user-defined safety constraint. Using the pretrained safe skill repertoire, hierarchical reinforcement learning can solve multiple downstream tasks without explicit consideration of safety during training and testing. We evaluate our algorithm on a collection of force-controlled robotic manipulation tasks in simulation, demonstrating promising performance in downstream tasks while satisfying safety constraints.

To open a broader discussion, the second part of the talk will introduce recent progress in code LLMs and the alignment requirements that are emerging with them. This part is intended as a discussion, emphasizing the need for alignment in code large language models (LLMs) and the safety considerations inherent in this domain.

