Adaptive MPC controller through deep reinforcement learning for proximity operations in HEO
Cavalletti, Pietro
2024/2025
Abstract
Autonomous spacecraft proximity operations in Highly Elliptical Orbits (HEO) are critical for emerging on-orbit servicing and debris removal missions, yet they pose significant guidance and control (G&C) challenges due to complex orbital dynamics and stringent safety constraints. Traditional G&C methods often struggle with the trade-off between performance and the high computational load on resource-constrained flight hardware. This thesis introduces a novel hybrid G&C framework that synergistically integrates Deep Reinforcement Learning (DRL) with Model Predictive Control (MPC) to enhance autonomy and efficiency. The core contribution is a DRL agent trained to dynamically tune key MPC parameters in real-time, specifically the prediction horizon, control timestep, and the state and control weighting matrices. This approach leverages MPC’s predictive, constraint-aware nature while using DRL’s adaptive learning to mitigate its high computational cost and reliance on manual tuning. The findings, validated through extensive high-fidelity Monte Carlo simulations, demonstrate that the DRL-tuned MPC significantly improves performance over statically configured controllers. The hybrid system achieves substantial optimization in both fuel efficiency and computational performance, with fuel savings ranging from 10% to 40% and a reduction in relative computational time by approximately 65%. Crucially, these gains are obtained without compromising the safety guarantees inherent to the MPC framework; the DRL agent acts as an expert tuner, but the underlying MPC layer remains the final arbiter responsible for enforcing all system dynamics and operational constraints. The developed system also exhibits superior robustness to unmodeled disturbances and parameter uncertainties, maintaining a perfect mission success rate across a wide range of operational conditions. 
This work provides definitive evidence that this hybrid strategy is a viable, robust, and efficient solution for real-time, on-board implementation in autonomous spacecraft guidance.

| File | Description | Access | Size | Format |
|---|---|---|---|---|
| 2025_10_Cavalletti_Tesi.pdf | MSc Thesis | Not accessible | 2.72 MB | Adobe PDF |
| 2025_10_Cavalletti_ Executive Summary.pdf | MSc Thesis Executive Summary | Openly accessible online | 740.17 kB | Adobe PDF |
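The tuning loop described in the abstract, in which an agent selects the MPC's horizon and weighting matrices online while the MPC remains responsible for computing the actual control, can be sketched as follows. This is a minimal illustration, not the thesis code: the dynamics are a toy double integrator rather than HEO relative motion, `tuner` is a hand-written heuristic standing in for the trained DRL policy, and the constraint handling central to the real controller is omitted.

```python
# Hypothetical sketch of a DRL-tuned MPC loop (NOT the thesis implementation).
# The "agent" is a heuristic stand-in for a trained deep RL policy; the
# dynamics are a toy double integrator, and constraints are omitted.
import numpy as np

DT = 0.5
A = np.array([[1.0, DT], [0.0, 1.0]])     # discrete double-integrator model
B = np.array([[0.5 * DT**2], [DT]])

def tuner(x):
    """Stand-in for the DRL agent: map the state to (horizon, Q, R)."""
    far = np.linalg.norm(x) > 1.0
    N = 20 if far else 8                   # longer horizon far from the target
    Q = np.diag([1.0, 0.1])                # state weights (position, velocity)
    R = np.array([[0.01 if far else 1.0]]) # penalise thrust near the target
    return N, Q, R

def mpc_first_input(x0, N, Q, R):
    """Unconstrained finite-horizon MPC via the batch least-squares form."""
    n, m = A.shape[0], B.shape[1]
    # Stacked prediction: X = Sx @ x0 + Su @ U over the horizon N.
    Sx = np.vstack([np.linalg.matrix_power(A, k + 1) for k in range(N)])
    Su = np.zeros((N * n, N * m))
    for i in range(N):
        for j in range(i + 1):
            Su[i*n:(i+1)*n, j*m:(j+1)*m] = np.linalg.matrix_power(A, i - j) @ B
    Qs = np.kron(np.eye(N), np.sqrt(Q))    # square roots of diagonal weights
    Rs = np.kron(np.eye(N), np.sqrt(R))
    # Minimise ||Qs (Sx x0 + Su U)||^2 + ||Rs U||^2 as one least-squares solve.
    M = np.vstack([Qs @ Su, Rs])
    b = np.concatenate([-Qs @ Sx @ x0, np.zeros(N * m)])
    U, *_ = np.linalg.lstsq(M, b, rcond=None)
    return U[:m]                           # receding horizon: apply first input

x = np.array([5.0, 0.0])                   # start 5 units from the target
for _ in range(60):
    N, Q, R = tuner(x)                     # agent retunes the MPC each step
    u = mpc_first_input(x, N, Q, R)        # MPC still computes the control
    x = A @ x + B @ u
print(np.linalg.norm(x))                   # should be driven near zero
```

The division of labour mirrors the abstract's architecture: the learned component only adjusts parameters, so the control input always comes from the MPC solve, which is where dynamics and operational constraints would be enforced in a full implementation.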
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/243740