REC-NS-MAB : an algorithm for recurrent concepts in non-stationary multi-armed bandits

Biblioteche e Archivi
POLITesi - Archivio digitale delle tesi di laurea e di dottorato

The Multi-armed Bandit (MAB) framework has been applied successfully in many application fields, like medical treatments or online recommender systems for multimedia contents. In the last years, the use of active approaches to tackle the non-stationary MAB setting, i.e., algorithms capable of detecting changes in the environment and re-configuring automatically to the change, has been widening the areas of application of MAB techniques. However, such approaches have the drawback of not reusing information in those settings where the same environment conditions recur over time. This work presents a framework to integrate past information in the abruptly changing non-stationary setting, which allows the active MAB approaches to recover from changes quickly. The proposed framework, named Rec-NS-MAB, is based on the definition of recurring concepts specifically introduced for the MAB setting to reuse information from recurring MAB states, and optionally accompanied with a break-point prediction techniques (BPP). Also, this framework does not change the order of the regret suffered by the active approaches commonly used in the bandit field. Finally, we provide extensive experimental analysis on both synthetically generated and real-world data, showing the improvement provided by the framework.

Il modello Multi-armed Bandit (MAB) è stato utilizzato con successo in molti campi di applicazione, come trattamenti medici o sistemi di raccomandazione online per contenuti multimediali. Negli ultimi anni, l'uso di approcci attivi per affrontare un MAB non stazionario, ovvero algoritmi in grado di rilevare i cambiamenti nell'ambiente e riconfigurarsi automaticamente al cambiamento, ha ampliato le aree di applicazione di tali tecniche. Tuttavia, tali approcci non sono in grado di riutilizzare le informazioni in quei contesti in cui le stesse condizioni dell'ambiente si ripresentano nel tempo. Questo lavoro di tesi presenta una metodologia per integrare le informazioni del passato in un contesto non stazionario che cambia bruscamente (abruptly changing), che consente agli approcci attivi per MAB non stazionari di adattarsi rapidamente dopo i cambiamenti. Il framework proposto, chiamato Rec-NS-MAB, è basato sulla definizione dei concetti ricorrenti, specificatamente utilizzati per la formulazione di un algoritmo per riutilizzare le informazioni da stati ricorrenti del MAB stesso. Inoltre, questo metodo non cambia l'ordine del regret accumulato dagli approcci attivi comunemente usati nel campo dei MAB. Infine, forniamo un'ampia analisi sperimentale su dati generati sinteticamente e da dati reali, mostrando il miglioramento apportato dal framework proposto.

REC-NS-MAB : an algorithm for recurrent concepts in non-stationary multi-armed bandits

RE, GERLANDO

2020/2021

Abstract

The Multi-armed Bandit (MAB) framework has been applied successfully in many application fields, like medical treatments or online recommender systems for multimedia contents. In the last years, the use of active approaches to tackle the non-stationary MAB setting, i.e., algorithms capable of detecting changes in the environment and re-configuring automatically to the change, has been widening the areas of application of MAB techniques. However, such approaches have the drawback of not reusing information in those settings where the same environment conditions recur over time. This work presents a framework to integrate past information in the abruptly changing non-stationary setting, which allows the active MAB approaches to recover from changes quickly. The proposed framework, named Rec-NS-MAB, is based on the definition of recurring concepts specifically introduced for the MAB setting to reuse information from recurring MAB states, and optionally accompanied with a break-point prediction techniques (BPP). Also, this framework does not change the order of the regret suffered by the active approaches commonly used in the bandit field. Finally, we provide extensive experimental analysis on both synthetically generated and real-world data, showing the improvement provided by the framework.

Scheda breve

Scheda completa

	Relatore
	
				TROVO', FRANCESCO
			
	Correlatore/i
	
				BORACCHI, GIACOMO
RESTELLI, MARCELLO
			
	Scuola / Dip.
	
				ING  - Scuola di Ingegneria Industriale e dell'Informazione
			
	Data
	
				9-giu-2021
			
	Anno accademico
	
				2020/2021
			
	Abstract in italiano
	
				Il modello Multi-armed Bandit (MAB) è stato utilizzato con successo in molti campi di applicazione, come trattamenti medici o sistemi di raccomandazione online per contenuti multimediali. Negli ultimi anni, l'uso di approcci attivi per affrontare un MAB non stazionario, ovvero algoritmi in grado di rilevare i cambiamenti nell'ambiente e riconfigurarsi automaticamente al cambiamento, ha ampliato le aree di applicazione di tali tecniche.
Tuttavia, tali approcci non sono in grado di riutilizzare le informazioni in quei contesti in cui le stesse condizioni dell'ambiente si ripresentano nel tempo. Questo lavoro di tesi presenta una metodologia per integrare le informazioni del passato in un contesto non stazionario che cambia bruscamente (abruptly changing), che consente agli approcci attivi per MAB non stazionari di adattarsi rapidamente dopo i cambiamenti.
Il framework proposto, chiamato Rec-NS-MAB, è basato sulla definizione dei concetti ricorrenti, specificatamente utilizzati per la formulazione di un algoritmo per riutilizzare le informazioni da stati ricorrenti del MAB stesso. Inoltre, questo metodo non cambia l'ordine del regret accumulato dagli approcci attivi comunemente usati nel campo dei MAB.
Infine, forniamo un'ampia analisi sperimentale su dati generati sinteticamente e da dati reali, mostrando il miglioramento apportato dal framework proposto.
			
	Appare nelle tipologie:
	
				Tesi di laurea Magistrale

File allegati

File	Dimensione	Formato
2021_06_Re.pdf accessibile in internet solo dagli utenti autorizzati Descrizione: Final submission Dimensione 14.48 MB Formato Adobe PDF Visualizza/Apri	14.48 MB	Adobe PDF	Visualizza/Apri

I documenti in POLITesi sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/10589/175997