Hello,
Following the poll, I am pleased to announce that Julien Perez's seminar will take place on
Friday, December 8, from 10:00 to 11:00.
Have a good weekend.
Idir Benouaret
Lecturer and Researcher
+33 4 28 29 37 63
________________________________
From: Idir Benouaret
Sent: Wednesday, November 22, 2023 15:04
To: current(a)ml.lre.epita.fr <current(a)ml.lre.epita.fr>
Cc: Nicolas Boutry <nicolas.boutry(a)epita.fr>; Julien Perez <julien.perez(a)epita.fr>
Subject: Séminaire de Julien Perez [Axe ML] (Julien Perez's seminar [ML axis])
Hello,
I am pleased to announce that the next seminar of the ML axis will be given by Julien Perez,
who joined the LRE and the AI team in early November.
The title and abstract of the talk are given below.
Please fill in the following Framadate poll:
https://framadate.org/ctE9bGR4XuGYzRoy
by Friday, November 24 at the latest, so that we can quickly settle on a time slot for this
event.
Title: Safe Skill Discovery in Unsupervised Reinforcement Learning and Alignment in Code
LLMs.
Abstract:
This presentation addresses the question of safety and alignment in statistical sequential
decision models. The first part of the talk will focus on safety-centric skill discovery
using unsupervised reinforcement learning and its application to robotic manipulation, as
presented at ICRA'23. We introduce the novel problem of Safety-Aware Skill Discovery,
which aims to learn, in a task-agnostic fashion, a repertoire of reusable skills that are
inherently safe for composing solutions to downstream tasks. We present a computationally
tractable algorithm that learns a latent-conditioned skill policy maximizing intrinsic
rewards, regulated by a safety-critic capable of modeling any user-defined safety
constraints. Utilizing the pretrained safe skill repertoire, hierarchical reinforcement
learning can solve multiple downstream tasks without explicit consideration of safety
during training and testing. We evaluate our algorithm on a collection of force-controlled
robotic manipulation tasks in simulation, demonstrating promising performance in
downstream tasks while satisfying safety constraints.
As an opening, in the second part of the talk, I will introduce recent progress in
Code LLMs and the emerging alignment requirements. This section serves as a discussion,
emphasizing the necessity of alignment in code-oriented Large Language Models (LLMs) and the
essential safety considerations inherent in this domain.
Have a good rest of the day,
Idir Benouaret
Lecturer and Researcher
+33 4 28 29 37 63
______________________
Current mailing list -- current(a)ml.lre.epita.fr
https://lists.lrde.epita.fr/postorius/lists/current.ml.lre.epita.fr//
______________________
Permanents mailing list -- permanents(a)ml.lre.epita.fr
https://lists.lrde.epita.fr/postorius/lists/permanents.ml.lre.epita.fr//
______________________
Perms mailing list -- perms(a)ml.lre.epita.fr
https://lists.lrde.epita.fr/postorius/lists/perms.ml.lre.epita.fr//
______________________
Perms.paris mailing list -- perms.paris(a)ml.lre.epita.fr
https://lists.lrde.epita.fr/postorius/lists/perms.paris.ml.lre.epita.fr//