Analysis of combinatorial bandits under adversarial attacks
TONELLI, LORENZO
2022/2023
Abstract
Combinatorial Multi-Armed Bandit (CMAB) frameworks offer an approach to complex decision-making problems in which multiple options can be selected simultaneously, optimizing outcomes based on the aggregated feedback from these choices. Adversarial attacks, on the other hand, pose a significant challenge in machine learning: malicious entities intentionally manipulate data or feedback to deceive learning algorithms, undermining their performance and reliability. While extensive research has been conducted on CMABs and on adversarial attacks as separate subjects, the behavior of CMAB systems under adversarial attack remains notably underinvestigated in the literature. In this thesis, we provide a theoretical study of the success and cost of adversarial attacks on CMABs in three settings, which differ in the assumption on the arms' reward distributions: we consider bounded rewards, unbounded rewards with positive means, and unbounded rewards. For each setting, we provide two attack strategies: an ideal case with an omniscient attacker, and a realistic case in which the attacker has the same knowledge as the learner. We show that our attack strategies are successful except in some degenerate cases, where success is defined as the attacker's ability to induce the learner to choose a predetermined target superarm at least T-o(T) times over a horizon of T rounds. For every attack, we also show that its cost is sublinear in T, in line with the attacker's interest in minimizing the expense of the attack. Finally, we provide numerical experiments to validate our theoretical conclusions.
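The attack model described in the abstract can be illustrated with a minimal, self-contained sketch. This is not the thesis's actual algorithm: the Bernoulli arm means, the CUCB-style learner, the exploration constant, and the reward-zeroing strategy are all illustrative assumptions. An omniscient attacker zeroes out the observed rewards of every base arm outside a fixed (suboptimal) target superarm, so the learner's estimates steer it toward the target.

```python
import math
import random

def cmab_attack_sketch(T=5000, seed=0):
    """Toy simulation: omniscient attacker vs. a CUCB-style CMAB learner."""
    rng = random.Random(seed)
    means = [0.9, 0.8, 0.3, 0.2]            # true Bernoulli means (illustrative)
    target = frozenset({2, 3})              # attacker's suboptimal target superarm
    # superarms: all pairs of the 4 base arms
    superarms = [frozenset({i, j}) for i in range(4) for j in range(i + 1, 4)]
    counts = [0] * 4                        # pulls per base arm
    sums = [0.0] * 4                        # observed (post-attack) reward sums
    target_plays = 0
    attack_cost = 0.0                       # total perturbation paid by the attacker

    for t in range(1, T + 1):
        def ucb(i):
            if counts[i] == 0:
                return float("inf")         # force initial exploration
            return sums[i] / counts[i] + math.sqrt(1.5 * math.log(t) / counts[i])

        # oracle step: pick the superarm maximizing the sum of UCB indices
        chosen = max(superarms, key=lambda S: sum(ucb(i) for i in S))
        if chosen == target:
            target_plays += 1
        for i in chosen:
            reward = 1.0 if rng.random() < means[i] else 0.0
            if i not in target:
                attack_cost += reward       # cost = magnitude of the perturbation
                reward = 0.0                # attacker zeroes non-target feedback
            counts[i] += 1
            sums[i] += reward

    return target_plays / T, attack_cost
```

In such a run the non-target arms receive only the O(log T) pulls forced by the UCB exploration bonus, so the target superarm is chosen in the vast majority of rounds while the cumulative attack cost stays far below T, mirroring, at a purely illustrative level, the T-o(T) success and sublinear-cost claims.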
File | Description | Size | Format | Access
---|---|---|---|---
2024_04_Tonelli.pdf | Thesis | 1.6 MB | Adobe PDF | available online to authorized users only
2024_04_Tonelli_Executive.pdf | Executive Summary | 688.27 kB | Adobe PDF | available online to authorized users only
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/219760