Clustering and forecasting for improving deep reinforcement learning offline training in 5G Open-RAN
Colombo, Manuel
2024/2025
Abstract
In recent years, artificial intelligence, and in particular Deep Reinforcement Learning (DRL), has found increasingly relevant applications in telecommunication networks, contributing to resource optimization and performance improvement. This trend is supported by the O-RAN (Open Radio Access Network) paradigm, which promotes openness and intelligence in radio access networks, favoring the use of machine learning techniques for the control and dynamic adaptation of the network. In this context, a DRL agent has been developed within WinesLab to optimize network conditions emulated by the Colosseum wireless network emulator. However, its offline training, based on randomly selected state-action pairs, presents limitations in terms of temporal coherence and causal dynamics. To overcome these issues, this thesis proposes a forecasting system able to estimate the most plausible next state given an initial state and an action, improving the quality and stability of learning. The work involved the analysis of the dataset, the use of clustering techniques, and the construction of a predictive model based on transition probabilities. The results, obtained by applying the developed methodology to a realistic dataset, confirm the effectiveness of the forecaster as a useful component for future developments and integrations of intelligent models in O-RAN systems.
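To make the approach concrete, below is a minimal sketch, in Python, of how a clustering-plus-transition-probability forecaster of this kind could be assembled. It is an illustrative assumption, not the thesis implementation: states are grouped with k-means, an empirical table of P(next cluster | cluster, action) is counted from logged transitions, and the most plausible next state is returned as the centroid of the highest-probability next cluster. The class name `TransitionForecaster`, the number of clusters, and the action-space size are all hypothetical.

```python
# Illustrative sketch (assumption, not the thesis code) of a forecaster that
# predicts the most plausible next state from (state, action) pairs using
# k-means clustering and empirical transition probabilities.
import numpy as np
from sklearn.cluster import KMeans

class TransitionForecaster:
    def __init__(self, n_clusters=8, n_actions=4, seed=0):
        self.kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
        # counts[c, a, c'] = how often action a led from cluster c to cluster c'
        self.counts = np.zeros((n_clusters, n_actions, n_clusters))

    def fit(self, states, actions, next_states):
        # Cluster current and next states in a common feature space.
        labels = self.kmeans.fit_predict(np.vstack([states, next_states]))
        cur, nxt = labels[: len(states)], labels[len(states):]
        for c, a, n in zip(cur, actions, nxt):
            self.counts[c, a, n] += 1
        return self

    def predict(self, state, action):
        # Most plausible next cluster given the current cluster and the action.
        c = self.kmeans.predict(np.asarray(state, dtype=float).reshape(1, -1))[0]
        row = self.counts[c, action]
        if row.sum() == 0:      # (cluster, action) pair never observed
            return None
        probs = row / row.sum()
        next_cluster = int(np.argmax(probs))
        # Return the centroid of the predicted cluster as the forecast state.
        return self.kmeans.cluster_centers_[next_cluster], probs[next_cluster]

# Usage example on synthetic data with 3-dimensional KPI-like state vectors.
rng = np.random.default_rng(0)
S = rng.normal(size=(500, 3))
A = rng.integers(0, 4, size=500)
S_next = S + rng.normal(scale=0.1, size=S.shape)
forecaster = TransitionForecaster().fit(S, A, S_next)
print(forecaster.predict(S[0], A[0]))
```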
| File | Size | Format | |
|---|---|---|---|
| Clustering_and_Forecasting_for_improving_Deep_Reinforcement_Learning_Offline_Training_in_5G_Open_RAN.pdf (accessible online only to authorized users) | 6.67 MB | Adobe PDF | View/Open |
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/240407