Residual Meta-RL with MPC for building temperature control with model parametric uncertainties
PALMIERI, SERENA
2024/2025
Abstract
This thesis presents an advanced control architecture for thermal management in buildings, with the aim of ensuring both comfort and energy efficiency in the presence of significant structural uncertainty. The proposed system integrates Model Predictive Control (MPC) with Meta-Reinforcement Learning (Meta-RL), leveraging the predictive capabilities of model-based control and the adaptive flexibility of reinforcement learning. The MPC module generates optimal temperature trajectories based on a nominal building model. However, due to the sensitivity of MPC to model accuracy, a corrective Meta-RL controller is introduced, trained to identify and compensate for unmodelled dynamics through simulated interactions. Specifically, the architecture incorporates a neural network capable of inferring latent environmental features directly from observations, enabling real-time adaptation without explicit knowledge of uncertain parameters. Experimental validation was carried out using a model of Power FlexHouse 03, a residential test building located at the Technical University of Denmark. Results show that, compared to a conservative MPC approach, the Meta-RL controller achieves an average energy saving of 6% while maintaining comparable comfort levels. Even under strong model mismatch conditions, the system maintains stable operation, demonstrating effective generalisation and robustness. Thus, this architecture emerges as a promising solution for autonomous and intelligent HVAC management, applicable even to buildings with partial data or complex structures.

File | Size | Format | |
---|---|---|---|
2025_07_Palmieri.pdf (not accessible). Description: Master's thesis | 7.04 MB | Adobe PDF | View/Open |
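The residual structure described in the abstract (a nominal model-based controller plus a learned correction driven by context inferred from observations) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the thesis: the first-order room model, all parameter values, the one-step model inversion standing in for the MPC, and the hand-written `infer_context` and `residual_action` functions standing in for the trained Meta-RL encoder and policy.

```python
from dataclasses import dataclass


@dataclass
class RoomModel:
    """Hypothetical first-order thermal model (not the FlexHouse model)."""
    R: float          # thermal resistance to outdoors [K/kW]
    C: float          # thermal capacitance [kWh/K]
    dt: float = 0.25  # time step [h]

    def step(self, T, T_out, u):
        # Energy balance: envelope losses plus heating power u [kW].
        return T + self.dt / self.C * ((T_out - T) / self.R + u)


def clip(u, u_max):
    return min(max(u, 0.0), u_max)


def nominal_control(model, T, T_out, T_ref, u_max):
    # Stand-in for the MPC: invert the nominal model so that its
    # one-step prediction hits the setpoint, then apply actuator limits.
    u = (T_ref - T) * model.C / model.dt - (T_out - T) / model.R
    return clip(u, u_max)


def infer_context(history, model):
    # Stand-in for the learned latent encoder: mean one-step prediction
    # error of the nominal model over recent (T, T_out, u, T_next) samples.
    if not history:
        return 0.0
    errors = [T_next - model.step(T, T_out, u)
              for (T, T_out, u, T_next) in history]
    return sum(errors) / len(errors)


def residual_action(context, model):
    # Stand-in for the Meta-RL policy: heating power [kW] that cancels
    # the estimated per-step temperature error.
    return -context * model.C / model.dt


def simulate(plant, nominal, use_residual, steps=96, T0=18.0,
             T_out=5.0, T_ref=21.0, u_max=10.0, window=8):
    T, history, temps = T0, [], []
    for _ in range(steps):
        u = nominal_control(nominal, T, T_out, T_ref, u_max)
        if use_residual:
            context = infer_context(history[-window:], nominal)
            u = clip(u + residual_action(context, nominal), u_max)
        T_next = plant.step(T, T_out, u)
        history.append((T, T_out, u, T_next))
        T = T_next
        temps.append(T)
    return temps


# Model mismatch: the real envelope is leakier than the nominal model.
nominal = RoomModel(R=5.0, C=2.0)
plant = RoomModel(R=3.0, C=2.0)

nominal_only = simulate(plant, nominal, use_residual=False)
with_residual = simulate(plant, nominal, use_residual=True)
print("final temperature, nominal only: ", round(nominal_only[-1], 3))
print("final temperature, with residual:", round(with_residual[-1], 3))
```

In this toy setting the nominal controller alone settles below the setpoint because it underestimates heat losses, while the residual term removes that offset without ever being told the true resistance. The sketch only illustrates the division of labour between the two components; it does not reproduce the energy-saving results reported in the abstract.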
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/240474