Dealing with partial information in follower's behavior identification
MAFFIOLI, MATTIA
2018/2019
Abstract
In this thesis, we study security games, a class of leadership games in which one agent, called the defender, acts as the leader and faces another agent, called the attacker, acting as the follower. The follower's behavior is not known a priori by the leader; it is one among a set of possible behavioral profiles. Behavioral profiles are introduced because, in real-world applications, followers may violate the game-theoretic assumption of perfect rationality. Such a model is particularly well suited to applications concerning the defense of physical systems, e.g., patrolling against thieves and poachers. We consider situations in which the leader has partial information about the attacker's behavior. Specifically, we tackle the problem of identifying the follower's behavioral profile in two cases: when the set of possible profiles is not completely known to the leader, and when the feedback the leader receives about the follower's moves is partial. In the first case, the set of possible attackers may contain profiles whose parameters are unknown to the leader and must therefore be estimated; this situation arises in many real-world applications, where the parameters of some behavioral profiles are typically unknown. To solve this problem, we propose two algorithms, built on the state of the art, that estimate the parameters unknown to the leader. In the second case, the leader receives feedback only about the target she chose to defend and thus has only a partial observation of the follower's moves; this setting is motivated by real-world applications in which attackers may leave no trace of their crimes. To solve this problem, we modify state-of-the-art algorithms to cope with the reduced amount of information.
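As a purely illustrative sketch of the first problem (not the algorithms developed in the thesis), estimating a behavioral profile's unknown parameter from observed attacks can be pictured as maximum-likelihood estimation over a quantal-response attacker. The target values, the rationality parameter `lam`, and the grid search below are all hypothetical choices made for the example.

```python
import math
import random

random.seed(0)

# Toy setup: three targets with hypothetical attacker valuations.
values = [1.0, 2.0, 4.0]

def quantal_response(lam):
    """Attack distribution of a quantal-response attacker with rationality lam."""
    weights = [math.exp(lam * v) for v in values]
    total = sum(weights)
    return [w / total for w in weights]

def sample(dist):
    """Draw one target index from a discrete distribution."""
    r, acc = random.random(), 0.0
    for i, p in enumerate(dist):
        acc += p
        if r < acc:
            return i
    return len(dist) - 1

# Simulate attacks from an attacker whose true rationality (1.5) is unknown to the leader.
true_dist = quantal_response(1.5)
attacks = [sample(true_dist) for _ in range(500)]

# Maximum likelihood over a grid of candidate parameters.
def log_likelihood(lam):
    d = quantal_response(lam)
    return sum(math.log(d[a]) for a in attacks)

grid = [0.1 * k for k in range(1, 41)]
lam_hat = max(grid, key=log_likelihood)
print(lam_hat)
```

With 500 observed attacks, the grid maximizer lands close to the true parameter; the thesis's actual algorithms interleave this kind of estimation with the leader's strategy choice.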
For both scenarios, we provide theoretical guarantees on the loss incurred by the developed algorithms, showing that it is bounded by a constant with respect to the time horizon over which the interaction takes place. Furthermore, we present an extensive experimental evaluation that confirms the theoretical results we derived.
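The second problem can likewise be sketched with a toy simulation: the defender observes only whether the target she covered was attacked (binary feedback), yet can still identify which candidate behavioral profile generated the attacks. The two candidate profiles, the round-robin coverage schedule, and the L1-distance matching rule are illustrative assumptions, not the thesis's method.

```python
import random

random.seed(1)

n_targets = 3
# Two hypothetical candidate behavioral profiles (attack distributions over targets).
profiles = {
    "uniform": [1 / 3, 1 / 3, 1 / 3],
    "greedy":  [0.1, 0.2, 0.7],
}
true_profile = profiles["greedy"]

def sample(dist):
    """Draw one target index from a discrete distribution."""
    r, acc = random.random(), 0.0
    for i, p in enumerate(dist):
        acc += p
        if r < acc:
            return i
    return len(dist) - 1

# Round-robin coverage: each round the defender learns only whether HER target was hit.
hits = [0] * n_targets
plays = [0] * n_targets
for t in range(3000):
    covered = t % n_targets
    attacked = sample(true_profile)
    plays[covered] += 1
    hits[covered] += (attacked == covered)  # binary partial feedback

# Empirical attack frequency per target, reconstructed from partial observations.
freq = [h / p for h, p in zip(hits, plays)]

# Identify the candidate profile closest to the observed frequencies (L1 distance).
def l1_distance(profile):
    return sum(abs(f - q) for f, q in zip(freq, profile))

best = min(profiles, key=lambda name: l1_distance(profiles[name]))
print(best)
```

Even though each round reveals a single bit, repeatedly covering every target lets the defender reconstruct the attack distribution and single out the true profile; handling this reduced feedback efficiently, with a loss bound constant in the horizon, is the subject of the second part of the thesis.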
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/152238