Bitcoin, a decentralized cryptocurrency introduced in 2008, has transformed the financial landscape and emerged as a point of interest for forensic analysis. Its pseudonymous nature, coupled with its potential for concealing illicit activities, underscores the need for reliable investigative techniques. Address clustering is the category of deanonymization techniques that group aliases sharing ownership, mainly through heuristics, and it is an essential complement to any deanonymization attempt. This thesis develops a framework for applying and evaluating such heuristics. Existing methodologies leave several challenges open: there is no tool for applying and evaluating heuristics efficiently, no way to aggregate their results, and no effective method for handling the supercluster problem. Our framework tackles these issues by storing the output of heuristic application in graphs, which lets researchers represent complex relationships between aliases rather than only their aggregation. This representation makes it straightforward to integrate results from multiple heuristics or from external sources. The graph representation also supports a set of evaluation metrics that provide a standardized way to analyze the effectiveness of the heuristics involved. Finally, the framework provides a method for extracting clusters from these graphs. We provide two implementations of the framework, each suited to different needs: the first relies on a relational database to store the relevant blockchain data, while the second opts for more streamlined data handling. To validate our approach, we conduct a large-scale evaluation of eleven deanonymization heuristics, analyzing their applicability and efficacy. We study different combinations of heuristics, aiming to improve blockchain coverage and to increase reliability by ensuring that results are validated by multiple heuristics.
Bitcoin, a decentralized cryptocurrency that first appeared in 2008, has revolutionized the financial landscape and has become the focus of many forensic investigations. Its pseudonymous nature and its resulting potential for concealing illicit activities underscore the need for reliable investigative methodologies. Address clustering is a category of deanonymization techniques whose goal is to group aliases belonging to the same entity, mainly through the use of heuristics, which makes it a fundamental step in complementing any deanonymization attempt. This thesis covers the development of a framework for applying and evaluating heuristics. Existing methodologies highlight several challenges: the lack of a tool to apply and evaluate heuristics efficiently, of a way to aggregate results, and of a method to handle the supercluster problem. Our framework addresses these limitations by using graphs to manage the results obtained by applying the heuristics, making it possible to represent complex relationships between addresses that are not limited to their aggregation. This representation supports and simplifies the integration of results derived from multiple heuristics or from external sources. Moreover, it offers a great deal of information and, through new evaluation metrics, provides a systematic method for analyzing the effectiveness of the heuristics studied. Finally, our framework provides a method for extracting clusters from these graphs. We present two implementations of our method, each suited to different needs. The first uses a relational database to hold the most relevant elements of the blockchain, while the second opts for a more optimized use of the data. To validate our approach, we conduct a large-scale study of eleven heuristics, analyzing their applicability and effectiveness. We study different combinations of these heuristics with the aim of increasing blockchain coverage and improving the reliability of the results by ensuring that they are validated by several heuristics.
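As an illustration only, the idea of keeping heuristic results in a graph and extracting clusters from it could be sketched as follows. This is not the thesis's implementation: the function names, the heuristic labels (`common_input`, `change_address`), the `min_support` threshold, and the use of connected components as the extraction step are all assumptions made for this sketch. Each edge records which heuristics linked two addresses, so results from several heuristics can be merged and then filtered by how many of them agree.

```python
from collections import defaultdict


def build_graph(links):
    """Build an undirected graph from (address_a, address_b, heuristic) links.

    Each edge stores the set of heuristic names that proposed it
    (hypothetical labels; any strings work).
    """
    graph = defaultdict(lambda: defaultdict(set))
    for a, b, heuristic in links:
        graph[a][b].add(heuristic)
        graph[b][a].add(heuristic)
    return graph


def extract_clusters(graph, min_support=1):
    """Return connected components, following only edges that are
    backed by at least `min_support` distinct heuristics."""
    seen = set()
    clusters = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour, heuristics in graph[node].items():
                if len(heuristics) >= min_support and neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        clusters.append(component)
    return clusters


# Toy data: addresses are placeholders, not real Bitcoin addresses.
links = [
    ("A", "B", "common_input"),
    ("A", "B", "change_address"),
    ("B", "C", "change_address"),
    ("D", "E", "common_input"),
]
graph = build_graph(links)
loose = extract_clusters(graph, min_support=1)   # any single heuristic suffices
strict = extract_clusters(graph, min_support=2)  # require two agreeing heuristics
```

With the toy data, the loose setting merges A, B and C into one cluster, while the strict setting keeps only the A-B link, which two heuristics agree on. Raising the threshold trades coverage for reliability, which is the trade-off the abstract describes when combining heuristics.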
A graph-based approach for the application and evaluation of address clustering heuristics on the Bitcoin blockchain
Dragonetti, Giovanni
2022/2023
File | Description | Size | Format | Access |
---|---|---|---|---|
Master_Thesis.pdf | Thesis | 3.32 MB | Adobe PDF | Accessible online only to authorized users |
Executive_Summary.pdf | Executive Summary | 712.75 kB | Adobe PDF | Accessible online only to authorized users |
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/214391