The Multi-armed Bandit (MAB) framework has been applied successfully in many application fields, like medical treatments or online recommender systems for multimedia contents. In the last years, the use of active approaches to tackle the non-stationary MAB setting, i.e., algorithms capable of detecting changes in the environment and re-configuring automatically to the change, has been widening the areas of application of MAB techniques. However, such approaches have the drawback of not reusing information in those settings where the same environment conditions recur over time. This work presents a framework to integrate past information in the abruptly changing non-stationary setting, which allows the active MAB approaches to recover from changes quickly. The proposed framework, named Rec-NS-MAB, is based on the definition of recurring concepts specifically introduced for the MAB setting to reuse information from recurring MAB states, and optionally accompanied with a break-point prediction techniques (BPP). Also, this framework does not change the order of the regret suffered by the active approaches commonly used in the bandit field. Finally, we provide extensive experimental analysis on both synthetically generated and real-world data, showing the improvement provided by the framework.
Il modello Multi-armed Bandit (MAB) è stato utilizzato con successo in molti campi di applicazione, come trattamenti medici o sistemi di raccomandazione online per contenuti multimediali. Negli ultimi anni, l'uso di approcci attivi per affrontare un MAB non stazionario, ovvero algoritmi in grado di rilevare i cambiamenti nell'ambiente e riconfigurarsi automaticamente al cambiamento, ha ampliato le aree di applicazione di tali tecniche. Tuttavia, tali approcci non sono in grado di riutilizzare le informazioni in quei contesti in cui le stesse condizioni dell'ambiente si ripresentano nel tempo. Questo lavoro di tesi presenta una metodologia per integrare le informazioni del passato in un contesto non stazionario che cambia bruscamente (abruptly changing), che consente agli approcci attivi per MAB non stazionari di adattarsi rapidamente dopo i cambiamenti. Il framework proposto, chiamato Rec-NS-MAB, è basato sulla definizione dei concetti ricorrenti, specificatamente utilizzati per la formulazione di un algoritmo per riutilizzare le informazioni da stati ricorrenti del MAB stesso. Inoltre, questo metodo non cambia l'ordine del regret accumulato dagli approcci attivi comunemente usati nel campo dei MAB. Infine, forniamo un'ampia analisi sperimentale su dati generati sinteticamente e da dati reali, mostrando il miglioramento apportato dal framework proposto.
REC-NS-MAB : an algorithm for recurrent concepts in non-stationary multi-armed bandits
RE, GERLANDO
2020/2021
Abstract
The Multi-armed Bandit (MAB) framework has been applied successfully in many application fields, like medical treatments or online recommender systems for multimedia contents. In the last years, the use of active approaches to tackle the non-stationary MAB setting, i.e., algorithms capable of detecting changes in the environment and re-configuring automatically to the change, has been widening the areas of application of MAB techniques. However, such approaches have the drawback of not reusing information in those settings where the same environment conditions recur over time. This work presents a framework to integrate past information in the abruptly changing non-stationary setting, which allows the active MAB approaches to recover from changes quickly. The proposed framework, named Rec-NS-MAB, is based on the definition of recurring concepts specifically introduced for the MAB setting to reuse information from recurring MAB states, and optionally accompanied with a break-point prediction techniques (BPP). Also, this framework does not change the order of the regret suffered by the active approaches commonly used in the bandit field. Finally, we provide extensive experimental analysis on both synthetically generated and real-world data, showing the improvement provided by the framework.File | Dimensione | Formato | |
---|---|---|---|
2021_06_Re.pdf
accessibile in internet solo dagli utenti autorizzati
Descrizione: Final submission
Dimensione
14.48 MB
Formato
Adobe PDF
|
14.48 MB | Adobe PDF | Visualizza/Apri |
I documenti in POLITesi sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.
https://hdl.handle.net/10589/175997