Residential prosumer electricity demand exhibits high short-term variability driven by occupant behaviour, discrete subsystem switching, photovoltaic (PV) generation regimes, and weather-dependent dynamics. This variability complicates the construction of the reliable operational baselines required for operation-phase digital twin analytics and residual-based anomaly detection. This thesis develops a traceable telemetry-to-analytics pipeline for a monitored residential case study in Italy. The pipeline integrates whole-building electrical demand, PV electrical power, pool-circuit electrical power, and outdoor temperature within a clearly defined operational boundary. Heterogeneous IoT streams are aligned through UTC hourly resampling, and preprocessing rules are formalised to ensure leakage-safe evaluation and missingness-provenance tracking. A 1-hour-ahead load forecasting task is formulated using demand history, subsystem telemetry, temperature context, temporal encodings, and lag features. A Random Forest (RF) regressor is adopted as the primary learner, while benchmark 1 (persistence) and benchmark 2 (Multiple Linear Regression, MLR) provide a minimum performance floor. Missing-data handling is evaluated under a deployment-realistic, causal keep-imputed policy and under an offline interpolation upper bound used for diagnostic comparison. Performance is assessed using MAE and RMSE under a chronological holdout and an expanding rolling-week protocol. Robustness is evaluated through condition-sliced reporting across temperature bins, PV regimes, and missingness strata. Slicing by pool switching state is reported where pool_on and pool_switch vary within the evaluated window. Residual-based anomaly detection is implemented using a Median Absolute Deviation (MAD) threshold rule, with the threshold multiplier K swept over the range 3 to 15, and results are reported at both hour and event levels. Monthly grid-import energy reconstructed from telemetry is compared with utility billing totals to quantify completeness and to identify months where low coverage compromises scientific evaluation. Together, these elements constitute a traceable and leakage-safe telemetry-to-analytics workflow for operation-phase monitoring in a residential prosumer setting (Hadri et al., 2025).
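The MAD threshold rule and K-sweep mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the synthetic residuals, the unscaled MAD (no 1.4826 consistency factor), and the definition of an event as a run of consecutive flagged hours are all assumptions made here for illustration.

```python
import numpy as np


def mad_flags(residuals, k):
    """Flag hours whose residual lies more than k * MAD from the median residual."""
    r = np.asarray(residuals, dtype=float)
    med = np.median(r)
    mad = np.median(np.abs(r - med))  # unscaled MAD (assumption)
    if mad == 0:
        # Degenerate case: no spread to scale by, so nothing is flagged.
        return np.zeros(r.shape, dtype=bool)
    return np.abs(r - med) > k * mad


def count_events(flags):
    """Count runs of consecutive flagged hours; each run is treated as one event."""
    f = np.asarray(flags, dtype=int)
    # A run starts wherever the flag goes from 0 to 1 (a leading 0 is prepended).
    starts = np.diff(np.concatenate(([0], f))) == 1
    return int(starts.sum())


def k_sweep(residuals, ks=range(3, 16)):
    """Return (K, flagged hours, events) for each K in the sweep (3 to 15 by default)."""
    results = []
    for k in ks:
        flags = mad_flags(residuals, k)
        results.append((k, int(flags.sum()), count_events(flags)))
    return results


# Synthetic hourly residuals (hypothetical data, in arbitrary units).
rng = np.random.default_rng(0)
residuals = rng.normal(0.0, 0.1, size=500)
residuals[100:103] += 5.0  # injected three-hour anomaly
residuals[300] -= 4.0      # injected single-hour anomaly

sweep = k_sweep(residuals)
for k, hours, events in sweep:
    print(f"K={k:2d}  flagged hours={hours:3d}  events={events:3d}")
```

Reporting both counts per K shows how a stricter threshold trades hour-level sensitivity against the number of distinct events raised, which is one way to read a sweep of this kind.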
A framework for data-driven prediction of residential prosumer energy profiles using digital twins
Rahmani, Amin; Moradian, Mahsa
2024/2025
| File | Description | Access | Size | Format |
|---|---|---|---|---|
| 2026_03_Rahmani_Moradian_Executive Summary_02.pdf | Summary_A Framework for Data-Driven Prediction of Residential Prosumer Energy Profiles Using Digital Twins | Accessible online to everyone | 1.19 MB | Adobe PDF |
| 2026_03_Rahmani_Moradian_Thesis_01.pdf | A Framework for Data-Driven Prediction of Residential Prosumer Energy Profiles Using Digital Twins | Accessible online to authorised users only | 7.81 MB | Adobe PDF |
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/252459