Reinforcement learning based traffic engineering in SD-WAN

The demand for reliable and efficient Wide Area Networks (WANs) is continuously increasing: enterprises use WANs to transmit critical data between multiple business branches and cloud data centers. To optimize performance, provide new services, increase speed and security, and reduce costs, many WAN solutions have been proposed over the years, such as leased lines, Frame Relay and Multi-Protocol Label Switching (MPLS). Today, the emerging WAN technology is Software-Defined Wide Area Networking (SD-WAN), which extends the Software-Defined Networking (SDN) paradigm to enterprise networks. SD-WAN exploits the main advantages of SDN: network abstraction and programmability. The resulting availability of information paves the way for applications that exploit real-time network measurements, such as those based on Machine Learning (ML), which learn dynamically from data. In this thesis, we study and develop ML-based optimization algorithms capable of increasing the overall network availability and guaranteeing network protection and restoration in SD-WAN. To do this, we exploit a subfield of Machine Learning named Reinforcement Learning (RL), a set of techniques that aim to find the optimal actions in a given environment in order to reach a defined goal. This thesis focuses on a basic SD-WAN network in which an enterprise needs to connect two branch offices through two different networks. The goal is to improve network availability by dynamically "switching" the traffic flows between the two networks. The "switching" is driven by the status of the networks, measured in terms of delay, jitter and packet loss rate. The aim of this thesis is to develop intelligent algorithms that trigger the switching taking into account the past history of the network status.
Moreover, the proposed algorithms are capable not only of reacting to network degradation but also of predicting traffic patterns, enhancing the overall network availability. Different RL algorithms, together with baseline algorithms, are implemented and tested.

The demand for reliable and efficient WANs (Wide Area Networks) is continuously growing; companies use WANs to transmit critical data between multiple branches and cloud data centers. With the aim of optimizing performance, providing new services, increasing speed and security, and reducing costs, many solutions have been proposed over the years, such as leased lines, Frame Relay or MPLS (Multi-Protocol Label Switching). Today the emerging technology for WANs is Software-Defined Wide Area Networking (SD-WAN), which extends the SDN (Software-Defined Networking) paradigm to enterprise networks. SD-WAN exploits the main advantages of SDN, namely a centralized network view and programmability. The resulting availability of information paves the way for applications that can exploit real-time network measurements, such as those based on Machine Learning techniques, which learn dynamically from data. In this thesis we study and develop ML-based optimization algorithms capable of increasing the overall network availability and guaranteeing protection and restoration capabilities in SD-WAN. To do so, we exploit a subfield of Machine Learning called Reinforcement Learning (RL), a set of techniques that aim to find the optimal actions in a given environment in order to reach a fixed goal. This thesis focuses on a simple SD-WAN network used to connect two company sites through two different networks. The goal is to improve network availability by dynamically "routing" the traffic flows between the two networks. The "routing" is based on the status of the networks, defined by delay, jitter and packet loss rate.
The objective of this thesis is to develop intelligent algorithms that trigger the channel switch taking into account the recent history of the network status. Moreover, the proposed algorithms are capable not only of reacting to channel conditions but also of predicting traffic patterns, improving the overall network availability. Different RL algorithms, together with comparison algorithms, are implemented and tested.
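
To make the link-switching idea above concrete, the following is a minimal sketch of a tabular Q-learning agent that chooses between two links. It is an illustration under stated assumptions, not the algorithms developed in the thesis: the SLA thresholds, the `TwoLinkEnv` toy environment, its failure and recovery probabilities, the reward values and all hyperparameters are hypothetical. For simplicity the state holds only the latest good/bad status of each link, whereas the thesis considers the past history of the network status.

```python
import random

# Hypothetical SLA thresholds (assumptions, not taken from the thesis).
MAX_DELAY_MS = 150.0
MAX_JITTER_MS = 30.0
MAX_LOSS_RATE = 0.01


def is_degraded(delay_ms, jitter_ms, loss_rate):
    """Collapse the three measured metrics into a binary link status."""
    return (delay_ms > MAX_DELAY_MS or jitter_ms > MAX_JITTER_MS
            or loss_rate > MAX_LOSS_RATE)


class TwoLinkEnv:
    """Toy environment: each link flips between good and bad as a Markov chain."""

    def __init__(self, p_fail=(0.05, 0.1), p_recover=(0.2, 0.2), seed=0):
        self.rng = random.Random(seed)
        self.p_fail = p_fail          # per-step probability good -> bad
        self.p_recover = p_recover    # per-step probability bad -> good
        self.bad = [False, False]     # both links start in the good state

    def state(self):
        return (self.bad[0], self.bad[1])

    def step(self, action):
        """Advance both links one step, then reward the chosen link."""
        for i in range(2):
            p = self.p_recover[i] if self.bad[i] else self.p_fail[i]
            if self.rng.random() < p:
                self.bad[i] = not self.bad[i]
        reward = -1.0 if self.bad[action] else 1.0
        return self.state(), reward


def train(steps=50000, alpha=0.05, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a Q-table mapping (link0_bad, link1_bad) to per-link values."""
    env = TwoLinkEnv(seed=seed)
    rng = random.Random(seed + 1)
    q = {(a, b): [0.0, 0.0] for a in (False, True) for b in (False, True)}
    s = env.state()
    for _ in range(steps):
        # Epsilon-greedy choice of the link that carries the traffic.
        if rng.random() < epsilon:
            a = rng.randrange(2)
        else:
            a = 0 if q[s][0] >= q[s][1] else 1
        s_next, r = env.step(a)
        # Standard Q-learning update.
        q[s][a] += alpha * (r + gamma * max(q[s_next]) - q[s][a])
        s = s_next
    return q


def greedy_policy(q):
    """Pick, for every state, the link with the highest learned value."""
    return {s: (0 if v[0] >= v[1] else 1) for s, v in q.items()}
```

Calling `greedy_policy(train())` returns a dictionary from link status to the preferred link. In a real deployment the binary status would come from measurements, for example through `is_degraded`, and a richer state (such as a window of recent measurements) would be needed for the agent to anticipate degradation rather than only react to it.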