Water consumption forecasting is one of the most useful tools for rethinking urban environments in a sustainable and technologically advanced way. However, the main obstacles to this analysis are (i) collecting and standardizing information from providers, and (ii) selecting methodologies suited to inferring nonlinear relationships. The dataset selected for this study differs from others because (i) it collects city-level information for the entire United States, and (ii) its data have widely varying time resolutions, covering the period from 2005 to 2017. The objective of this work is therefore inspired by the dataset: to analyze whether, and to what extent, information related to human activity can be decoupled from climate information. The aim is to provide an estimate of water consumption based only on a city's demographic and socio-economic information, independent of atmospheric conditions. As a result, (i) these estimates can be valid for cities with similar socio-demographic characteristics, and (ii) they could be further refined by adding climate and meteorological data. The methodological objective, in turn, is to explore the potential of the graph structure in combination with convolutional and recurrent neural networks. As a preparatory step, cities are grouped into 4 clusters based on socio-demographic similarity. In the graph, each node represents a city, and two nodes are connected by an edge when the value encoding their similarity meets the threshold. Once this descriptive model is built, a "Graph Convolutional Neural Network" is designed and tested to compute hydrological predictions based only on demographic and socio-economic data. Its variant with a recurrent "Long Short-Term Memory" (LSTM) layer allows time series features to be processed: this makes it possible to compare the quality of predictions based only on average data with predictions obtained by combining socio-demographic and meteorological data at different temporal resolutions.
Results suggest that the model is able to produce an average estimate representative of a group of cities. However, the accuracy of node-level prediction has to be improved; this could be done by providing inputs that are more meaningful for the network. This appears clearly from the performance of linear regression, which is more accurate than that of the neural networks and of the decision-tree algorithm.
Predicting water consumption is one of the most useful tools for planning systems for the conservation and efficient allocation of resources, as well as for preparing for possible emergencies. However, the main obstacles lie in (i) collecting and standardizing the information, and (ii) selecting methodologies suited to inferring the influence of exogenous factors on water consumption. The dataset used collects city-level information for the entire United States of America, but with different temporal resolutions: it was therefore necessary to integrate the average socio-demographic data. Starting from this dataset, this work aims to analyze whether, and to what extent, information related to human activity can be decoupled from climate information, and to provide water consumption estimates based only on socio-demographic characteristics, so that these estimates (i) can be valid for cities with similar socio-demographic characteristics, and (ii) can be further refined by adding climate data. The methodology first explores the graph data structure combined with convolutional neural networks operating on average socio-demographic data; it then makes use of LightGBM on the integrated dataset. As a preliminary step, the cities are grouped into 4 clusters based on socio-demographic similarity. In the graph, each node represents a city, and nodes are connected by edges when the similarity value reaches a given threshold. Following the descriptive model, a Graph Convolutional Neural Network (GCNN) was designed and tested to generate predictions based only on socio-demographic data.
Next, a recurrent "Long Short-Term Memory" (LSTM) layer was added to process the time series of past water consumption and the climate features. The results of the two models were compared: both are able to produce an average estimate of water consumption for a group of cities; however, the accuracy of prediction at the level of the individual city can be improved by also providing time series for the socio-demographic data. Consistent with data availability, some of the socio-demographic features were integrated and, finally, the LightGBM algorithm was applied, yielding much more accurate predictions.
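The graph construction described in the abstract (one node per city, an edge when similarity meets a threshold) and a single graph-convolution step can be illustrated with a minimal NumPy sketch. This is not the thesis implementation: the abstract does not state which similarity measure, threshold, or normalization was used, so cosine similarity, the symmetric degree normalization with self-loops, the toy feature values, and the function names `build_similarity_graph` and `gcn_layer` are all assumptions made for illustration.

```python
import numpy as np


def build_similarity_graph(features, threshold):
    """Return a 0/1 adjacency matrix linking cities whose cosine
    similarity is at least `threshold` (cosine similarity is an assumption)."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)  # avoid division by zero
    similarity = unit @ unit.T
    adjacency = (similarity >= threshold).astype(float)
    np.fill_diagonal(adjacency, 0.0)  # no self-edges in the descriptive graph
    return adjacency


def gcn_layer(adjacency, x, weight):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 X W).
    The normalization with self-loops is an assumed, commonly used choice."""
    a_hat = adjacency + np.eye(adjacency.shape[0])  # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ x @ weight, 0.0)


# Hypothetical socio-demographic features for four cities (toy values).
features = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.1, 0.9],
])
adj = build_similarity_graph(features, threshold=0.8)

# Random weights stand in for trained parameters.
rng = np.random.default_rng(0)
hidden = gcn_layer(adj, features, rng.normal(size=(2, 3)))
print(adj)
print(hidden.shape)
```

With these toy values, cities 0 and 1 form one connected pair and cities 2 and 3 another, so each node's output mixes only its own features with those of its similar neighbour. A trained model would learn `weight` from consumption data instead of sampling it at random.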
Graph convolutional neural network for water consumption forecast: an explorative study
Ussano, Maria
2022/2023
File | Size | Format |
---|---|---|
2023_Ussano_Maria.pdf (authorized users only, from 17/04/2026) | 9.06 MB | Adobe PDF |
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/204564