Robot air hockey via hierarchical reinforcement learning
Bonenfant, Thomas Jean Bernard
2023/2024
Abstract
Air Hockey is a competitive, dynamic, and complex game. Using robotic systems to play it presents various challenges, including physical limitations and constraints that must be respected to ensure safety and prevent damage to the components. This thesis explores the application of Reinforcement Learning (RL) techniques in this context. Specifically, since Air Hockey comprises several sub-problems, such as hitting the puck to score a point or defending one's goal, a hierarchical optimization problem is formulated to develop an agent capable of playing entire matches while adhering to physical constraints. This agent is organized into two levels. The lower level comprises several specialized policies, trained using Deep RL and Rule-based RL algorithms, while the higher level selects which specialized policy to use through a parametric state machine, optimized using Rule-based RL. Additionally, the thesis describes our participation in the Robot Air Hockey Challenge, a competition designed to address the sim-to-real gap, that is, the drop in performance and safety observed when a real robotic system is controlled by a policy trained in a simulated environment.
File | Description | Access | Size | Format |
---|---|---|---|---|
2024_07_Bonenfant_Tesi.pdf | Thesis | Openly accessible online | 1.96 MB | Adobe PDF |
2024_07_Bonenfant_Executive_Summary.pdf | Executive Summary | Openly accessible online | 776.27 kB | Adobe PDF |
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/223615