Distributed reinforcement learning for large-scale networks
Cartechini, Giacomo
2023/2024
Abstract
Real-time, efficient management of network-based systems is a critical challenge in many domains. Optimization-based approaches have shown promising results, as they can provide optimal solutions by planning ahead, but this comes at the cost of recomputing the solution every time an unpredictable event disrupts the planned schedule, which becomes impractical under real-time constraints and at the scale of very large networks. In this work, we show how to build a Markov Decision Process formulation that is tightly coupled to the natural graph-based structure of this class of problems, with a special focus on the Train Dispatching Problem; we also briefly mention other domains where the same approach could be applied. We then propose a distributed Multi-Agent Reinforcement Learning algorithm based on Q-Learning, in which agents are deployed to the nodes of the graph, and we show, in a simulated railway network, that they are able to learn empirically optimal policies by acting independently and exchanging minimal information with their neighbors, even when unpredictable events such as train malfunctions occur.
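To make the approach concrete, here is a minimal sketch of what one Q-Learning agent per graph node, coordinating only through scalar value estimates exchanged with its neighbors, could look like. Everything in it (the `NodeAgent` class, the toy states and actions, the per-step reward) is an illustrative assumption, not the implementation developed in the thesis.

```python
# Minimal sketch of distributed Q-Learning on a graph: one agent per node,
# coordinating only through scalar value estimates exchanged with neighbors.
# Class names, states, actions and rewards below are illustrative assumptions.
import random
from collections import defaultdict

class NodeAgent:
    """One agent per graph node: it owns a local Q-table and never sees
    the global state; the only message it exchanges with a neighbor is a
    single scalar, the neighbor's value estimate of its own local state."""

    def __init__(self, node_id, actions, alpha=0.1, gamma=0.95, eps=0.1):
        self.node_id = node_id
        self.actions = actions        # e.g. which outgoing track to assign a train
        self.q = defaultdict(float)   # Q[(local_state, action)], zero-initialized
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def act(self, state):
        """Epsilon-greedy action selection over local actions only."""
        if random.random() < self.eps:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def value(self, state):
        """max_a Q(state, a): the one scalar shared with upstream neighbors."""
        return max(self.q[(state, a)] for a in self.actions)

    def update(self, state, action, reward, neighbor_value):
        """Standard Q-Learning backup, except the bootstrap term comes from
        the neighboring agent that handles the train next."""
        target = reward + self.gamma * neighbor_value
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])

# Hypothetical one-step interaction between two adjacent station agents:
a = NodeAgent("station_A", actions=["track_0", "track_1"])
b = NodeAgent("station_B", actions=["track_0", "track_1"])
s_a, s_b = "train42_waiting", "platform_free"   # toy local observations
action = a.act(s_a)
reward = -1.0                                   # e.g. a per-timestep delay penalty
a.update(s_a, action, reward, b.value(s_b))     # only b's scalar crosses the link
```

The design point the sketch illustrates is that the only quantity crossing a graph edge is the neighbor's value estimate, so each agent can act and learn from purely local knowledge.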
| File | Description | Size | Format | Access |
|---|---|---|---|---|
| 2024_12_Cartechini_Tesi.pdf | Thesis | 4.53 MB | Adobe PDF | Open access |
| 2024_12_Cartechini_Executive_Summary.pdf | Executive Summary | 3.09 MB | Adobe PDF | Open access |
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/230913