Digital banking has completely changed the way bank accounts are used and has become the main operating channel for many banks and financial institutions. It has also made it possible to transfer money all over the world, quickly and at the click of a button. The ease of use of digital banking, combined with the anonymity of the person behind the screen who actually performs the transactions, has attracted fraudsters, now also cyber-criminals, who continue to exploit the defensive measures put in place by banks to their own advantage. Many of these systems are based on Machine Learning, but they are generally vulnerable to meticulously crafted attacks that, by exploiting flaws in the algorithms, easily deceive them and reduce their predictive power. Adversarial Machine Learning is a research field of Machine Learning that studies these attacks and the corresponding defensive measures; in this thesis we select and implement some of the best state-of-the-art defence techniques in the hope of making Fraud Detection Systems robust against malicious transactions, and we propose a new analysis, both theoretical and experimental, to compare and evaluate their benefits and side effects. Our results show that the robust versions of the models achieve better performance than the standard models, particularly in the case of Random Forests. Indeed, this classifier recognised all fraud campaigns on the same day they were launched, obtained the lowest evasion rates (24% on average, ahead of the second-best classifier by 4% to 30%), and incurred the lowest monetary losses for the bank, preventing the transfer of fraudulent funds of up to more than €245,000. Gradient Boosting Decision Trees, by contrast, are severely harmed by the defensive technique we introduce.
An analysis of defence mechanisms against evasion attacks in the fraud detection domain
Benati, Filippo Maria
2020/2021
Abstract
Digital banking has completely changed the way we operate our bank accounts and is now the leading channel for many banks and financial institutions, making it possible to transfer money worldwide quickly, with a click. The ease of use of digital banking, combined with the anonymity of the real person behind the screen performing transactions, has attracted fraudsters, now turned cyber-criminals, who continue to exploit, for their own gain, the many different defensive measures adopted by banks. Many of these measures are based on Machine Learning, but they are generally vulnerable to meticulously crafted attacks that can easily deceive these algorithms by exploiting their blind spots. Adversarial Machine Learning is the research field of Machine Learning that studies these attacks and the corresponding defensive measures. In this thesis, we select and implement some of the best state-of-the-art hardening techniques in the hope of making Fraud Detection Systems more robust against malicious transactions, and we propose a novel analysis to benchmark and evaluate their benefits and side effects. Our results show that the hardened versions of the models outperform the standard models, particularly in the case of Random Forests. This classifier caught all fraud campaigns on the same day they started, obtained the lowest evasion rates (24% on average, with gaps from 4% to 30% with respect to the second-best classifier), and best prevented the bank from losing money, preventing up to more than €245,000 in losses. Gradient Boosting Decision Trees, by contrast, are severely harmed by the defensive technique we introduce.

File | Size | Format | |
---|---|---|---|
2022_04_Benati.pdf (openly accessible online to everyone). Description: thesis text | 2.5 MB | Adobe PDF | View/Open |
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/187577