Managing hydropower system network

The management of hydropower production networks has been studied for many years. The objective in operating a multi-reservoir system is to find an optimal decision policy that determines the release of water from each dam so as to maximize the expected value of the reward function over the desired time horizon. The reward function must be designed carefully and depends strongly on the case study. It usually consists of a trade-off between electric power production and the value of the water stored in the reservoirs. The control problem is made more complex by characteristics such as unregulated inflows, hydrologic parameters, system demands and economic parameters, which should be treated as random variables. The challenge is then how to solve this large-scale multi-reservoir problem in the presence of several sources of uncertainty. In this thesis, Reinforcement Learning is presented as an efficient optimization framework for the management of large-scale hydroelectric power systems. Reinforcement Learning is a branch of artificial intelligence that can offer several key benefits in treating problems that are too large to be handled by traditional dynamic programming techniques. In this study, we tailor the major concepts and computational aspects of Reinforcement Learning to the multi-reservoir management problem. The Reinforcement Learning optimization model is then implemented on the Hydro-Québec multi-reservoir complex located on the Rivière Romaine in Québec. The model is initially used to obtain release policies for the short-term management of this reservoir complex; more precisely, the optimization is carried out over time horizons of two weeks and one month.
Three different algorithms are developed and tested: the first is a standard lookup-table version, whereas the other two are integrated with continuous function approximation techniques to reduce the model complexity and the computing time. Owing to their reduced computational time and their solution quality, the last two algorithms are then used to extend the model to the long-term management of the reservoirs, covering a whole year of release policies. The results show that the Reinforcement Learning model is effective and reliable in solving the considered large-scale reservoir operation problem.
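
To make the lookup-table idea concrete, the following is a minimal sketch of tabular Q-learning on a toy single-reservoir problem. It is not the model developed in the thesis: the discretized storage levels, the three release decisions, the uniform random inflow, the reward equal to the released volume, the linear terminal water value, and all names (`train_q_table`, `greedy_release`, `terminal_value`, `feasible_releases`) are assumptions made purely for illustration.

```python
import random

def terminal_value(storage, water_value=0.5):
    """Assumed linear value of the water left in storage at the end of the horizon."""
    return water_value * storage

def feasible_releases(storage, n_releases):
    """A release cannot exceed the water currently stored."""
    return [a for a in range(n_releases) if a <= storage]

def train_q_table(n_levels=5, n_releases=3, horizon=14, episodes=2000,
                  alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical single-reservoir model."""
    rng = random.Random(seed)
    # q[t][s][a]: estimated value of releasing a units at stage t with storage s.
    q = [[[0.0] * n_releases for _ in range(n_levels)] for _ in range(horizon)]
    for _ in range(episodes):
        s = rng.randrange(n_levels)  # random initial storage level
        for t in range(horizon):
            actions = feasible_releases(s, n_releases)
            if rng.random() < epsilon:
                a = rng.choice(actions)  # explore
            else:
                a = max(actions, key=lambda x: q[t][s][x])  # exploit
            inflow = rng.choice([0, 1, 2])  # assumed random unregulated inflow
            s_next = min(s - a + inflow, n_levels - 1)  # excess water is spilled
            reward = float(a)  # assumed: energy proportional to the release
            if t == horizon - 1:
                future = terminal_value(s_next)
            else:
                next_actions = feasible_releases(s_next, n_releases)
                future = max(q[t + 1][s_next][b] for b in next_actions)
            # Standard Q-learning update toward the one-step target.
            q[t][s][a] += alpha * (reward + gamma * future - q[t][s][a])
            s = s_next
    return q

def greedy_release(q, t, s):
    """Release decision read off the learned table for stage t and storage s."""
    actions = feasible_releases(s, len(q[t][s]))
    return max(actions, key=lambda a: q[t][s][a])
```

Calling `q = train_q_table()` and then `greedy_release(q, 0, 4)` returns the release the learned table suggests on the first day with a full reservoir. The table holds one entry per stage, storage level and release, so its size grows multiplicatively with every additional reservoir; this is the complexity that function approximation is meant to reduce.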

The regulation of hydroelectric power production has been the subject of study for many years. The objective of operations in a system with many dams is to find an optimal water release strategy that regulates the release of water from each dam so as to maximize the objective function over the time interval considered. The objective function must be evaluated carefully and depends strongly on the case study. It usually consists of a trade-off between electric power production and the value of the water remaining at the end of the time horizon. The complexity of the control system is increased by numerous characteristics, such as unregulated inflow, hydrologic parameters, system demands and economic parameters, which must be treated as random variables. The problem then becomes how to solve the optimization of a system of dams in the presence of numerous sources of uncertainty. In this thesis, the Reinforcement Learning method is presented as an efficient optimization method for the management of a system of dams. Reinforcement Learning is a branch of artificial intelligence that can offer numerous benefits in treating problems that are too large to be handled by traditional dynamic programming techniques. In this study, we present the main computational aspects of the Reinforcement Learning algorithm for solving the optimization of a system of dams. The Reinforcement Learning model is applied to the Hydro-Québec complex of dams located on the Romaine river in Québec. The model is initially used to obtain the release strategy for the short term. More precisely, the optimization covers time horizons of two weeks and one month.
Three different algorithms are developed and tested: the first is the standard lookup-table version, while the other two are integrated with function approximation techniques to reduce the complexity of the model. Thanks to their reduced computational time and the quality of their solutions, the last two algorithms are used to extend the time horizon to a whole year of operations. The results obtained show that Reinforcement Learning can be applied successfully to solve the water management problem in a complex of dams.