Hello,

I am pleased to announce that the next seminar of the ML axis will be given by Julien Perez, who joined the LRE and the AI team in early November.

The title and abstract of the talk are given below.

Please fill in the following Framadate poll by Friday, November 24 at the latest, so that we can quickly settle on a time slot for this event: https://framadate.org/ctE9bGR4XuGYzRoy

This presentation addresses the question of safety and alignment in statistical sequential decision models. The first part of the talk will focus on safety-centric skill discovery using unsupervised reinforcement learning and its application to robotic manipulation, as presented at ICRA'23. We introduce the novel problem of Safety-Aware Skill Discovery, which aims to learn, in a task-agnostic fashion, a repertoire of reusable skills that are inherently safe for composing solutions to downstream tasks. We present a computationally tractable algorithm that learns a latent-conditioned skill policy maximizing intrinsic rewards, regulated by a safety-critic capable of modeling any user-defined safety constraints. Utilizing the pretrained safe skill repertoire, hierarchical reinforcement learning can solve multiple downstream tasks without explicit consideration of safety during training and testing. We evaluate our algorithm on a collection of force-controlled robotic manipulation tasks in simulation, demonstrating promising performance in downstream tasks while satisfying safety constraints.

In the second part of the talk, as an outlook, I will introduce recent progress in large language models (LLMs) for code and the emerging alignment requirements. This part is intended as a discussion, emphasizing the need for alignment in code LLMs and the safety considerations inherent in this domain.

Have a good rest of the day,