Enhancing fraud detection through interpretable machine learning

Automated Fraud Detection refers to the set of automated activities (i.e. carried out by machines) performed to detect illegitimate uses of services and/or products. It is a growing field in which more and more use cases benefit from modern artificial intelligence techniques. At the same time, several fraud detection applications cannot be fully automated, for reasons such as the impossibility of being completely certain that a use is illegitimate or the need for a human to make the final decision. In this light, the importance of the interaction between the automated part of the detection process (e.g. the machine learning algorithm) and the humans who work with the tools becomes clear. Therefore, in this thesis we propose and design a fraud detection business process, applied to the domain of online marketplaces, which combines modern machine learning algorithms with the interpretation of their predictions so that the interaction between humans and machines is as smooth and efficient as possible. First, a state-of-the-art machine learning classifier was implemented to discriminate between legitimate and illegitimate uses. On top of this, four different machine learning explainability methods were implemented and evaluated on real tasks. Among these methods, a novel approach to interpretable machine learning was designed and proposed. This method, named EVADE, uses an optimization procedure based on Genetic Algorithms to generate machine learning explanations, which proved to achieve state-of-the-art performance.

The term Automated Fraud Detection denotes the set of automated actions (i.e. carried out by machines) performed in order to identify illegitimate uses of products and/or services. It is an emerging and expanding field in which an ever-growing number of use cases benefit from modern artificial intelligence techniques. At the same time, many Fraud Detection applications cannot be fully automated, for reasons such as the impossibility of being completely certain that a use is illegitimate or the need for the final decision to be made by a human. In this light, the importance of the interaction between the automated part of the identification process (e.g. an artificial intelligence algorithm) and the humans who interact with it becomes clear. This thesis proposes a fraud identification business process applied to the domain of online marketplaces. By combining modern machine learning techniques with Machine Learning Interpretability, this process enables an interaction between humans and machines that is both efficient and effective. First, a classification algorithm based on the state of the art in artificial intelligence was implemented to distinguish between legitimate and fraudulent uses. In addition, four different Machine Learning Explainability methods were implemented and evaluated on real applications. Among these, a new approach to the interpretation of machine learning models, named EVADE, was designed and proposed. Thanks to an optimization procedure based on Genetic Algorithms, this method achieved results comparable to the state of the art.