Hello,

I am pleased to announce that the next seminar of the ML axis will be given by Julien Perez, who joined the LRE and the AI team in early November.

The title and abstract of the talk are given below.

Please fill in the following Framadate poll: https://framadate.org/ctE9bGR4XuGYzRoy
by Friday, November 24 at the latest, so that we can quickly settle on a time slot for this event.

Title: Safe Skill Discovery in Unsupervised Reinforcement Learning and Alignment in Code LLMs.

Abstract:
This presentation addresses the question of safety and alignment in statistical sequential decision models. The first part of the talk will focus on safety-centric skill discovery using unsupervised reinforcement learning and its application to robotic manipulation, as presented at ICRA'23. We introduce the novel problem of Safety-Aware Skill Discovery, which aims to learn, in a task-agnostic fashion, a repertoire of reusable skills that are inherently safe for composing solutions to downstream tasks. We present a computationally tractable algorithm that learns a latent-conditioned skill policy maximizing intrinsic rewards, regulated by a safety-critic capable of modeling any user-defined safety constraints. Utilizing the pretrained safe skill repertoire, hierarchical reinforcement learning can solve multiple downstream tasks without explicit consideration of safety during training and testing. We evaluate our algorithm on a collection of force-controlled robotic manipulation tasks in simulation, demonstrating promising performance in downstream tasks while satisfying safety constraints.

To open up the discussion, in the second part of the talk, I will introduce recent progress in Code LLMs and the emerging alignment requirements. This section serves as a discussion, emphasizing the necessity of alignment in code-based Large Language Models (LLMs) and the safety considerations inherent in this domain.


Have a good rest of your day,



Idir Benouaret

Lecturer-Researcher

+33 4 28 29 37 63