Over the past decade, mobile data traffic has grown dramatically, mainly because of the proliferation of mobile devices. To cope with the corresponding growth in internet subscribers, Internet Service Providers (ISPs) have invested in increasing the capacity of existing mobile network infrastructures. To support strategic decisions, mobile ISPs can exploit the huge volumes of data generated by their networks, i.e. Big Data. These data are usually collected at the access of the cellular network, where cellular Base Stations (BSs) retain the history of the connections established with user terminals over time. Extracting knowledge from these data requires considerable effort and typically focuses on the search for regular patterns in mobile traffic, which in turn can be exploited to forecast quantities of interest. In this work, traffic patterns are identified in a massive dataset provided by one of the major ISPs operating in Italy. First, these patterns are studied in detail, in particular the busy hour of the downlink traffic. Second, forecasting of aggregate downlink traffic is implemented with both short- and long-term prediction horizons, the latter being of primary importance for mobile ISPs when planning network expansions or technology upgrades. The approach to mobile traffic forecasting is novel in that i) the quantity considered is the busy-hour downlink traffic, ii) it aims to determine the best training set size and studies different prediction horizons, and iii) clustering is exploited for forecasting. In terms of forecasting performance, Seasonal ARIMA yielded the lowest forecasting error, with a MAPE of 3.60%, against 4.72% and 4.87% for the Naïve and Neural Network models respectively.
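As a rough illustration of the quantities named in the abstract, the following minimal Python sketch extracts a daily busy-hour value, builds a naive seasonal forecast, and scores it with MAPE. Everything here is an assumption made for illustration: the function names, the `(day, hour, volume)` sample layout, the definition of the busy hour as the hour with the largest volume in a day, and the choice of a seasonal naive baseline. The abstract does not state which definitions the thesis actually uses.

```python
def busy_hour_traffic(samples):
    """Return {day: (hour, volume)} for the highest-volume hour of each day.

    `samples` is an iterable of (day, hour, volume) tuples (hypothetical layout).
    Assumes the busy hour is simply the hour with the largest volume that day.
    """
    busiest = {}
    for day, hour, volume in samples:
        if day not in busiest or volume > busiest[day][1]:
            busiest[day] = (hour, volume)
    return busiest


def seasonal_naive(history, season, horizon):
    """Forecast by repeating the last full season of `history`.

    One common form of naive baseline; not necessarily the thesis's Naive model.
    """
    if len(history) < season:
        raise ValueError("history must cover at least one full season")
    start = len(history) - season
    return [history[start + (h % season)] for h in range(horizon)]


def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent. Requires non-zero actuals."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)
```

With a weekly season (`season=7`) applied to a series of daily busy-hour volumes, `seasonal_naive` predicts each day as the value observed on the same weekday one week earlier, and `mape` then expresses the average relative error of that prediction as a percentage, the same metric the abstract reports.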
Cluster-based forecasting of busy-hour downlink traffic in cellular networks
Di Giusto, Federico
2019/2020
File | Size | Format
---|---|---
2021_04_Di_Giusto.pdf (thesis text; accessible online only to authorized users) | 8.44 MB | Adobe PDF
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/175860