Automated malware behavioral analysis




POLINO, MARIO

Abstract

Malicious programs are a constant threat to everyone. To defend ourselves against this menace, we need up-to-date tools capable of analyzing those programs quickly and efficiently. Our research focused on the development of such tools. First, we approached the task of automated behavioral malware analysis by developing an unsupervised system to identify common behavioral patterns in malware binaries. The behaviors that our system extracts also carry static information in the form of control-flow-graph-based fingerprints. Our system, Jackdaw, then associates semantic information with the behaviors to create a descriptive summary that helps analysts, especially inexperienced ones. All the information produced can be easily browsed through a visualization tool that we implemented. The approach presented in this thesis exploits machine learning techniques to identify similar malware samples and graph mining to extract the aforementioned behaviors. We tested our system on a dataset of 2136 distinct binaries, including both malicious and benign libraries and executables. We compared the automatically extracted behaviors against a ground truth of 44 behaviors created manually by expert analysts, finding 77.3% of them. To perform this analysis, the tool needs access to the unobfuscated malware binary. However, most malware nowadays implements some sort of packing technique. This led us to develop a generic unpacker that is able to unpack the binary for 63% of randomly collected samples. We also explored the possibility of developing a dynamic protection framework that can be used to defend PIN, one of the most widely used and supported dynamic binary instrumentation (DBI) frameworks, against anti-instrumentation attacks. Starting from the techniques found in the literature, we classified them and implemented a set of countermeasures, as generic as possible, to defeat them. The framework was tested with three main test cases: eXait, a tool that aims to detect DBI by exploiting different techniques; Obsidium, a very complete packer known to employ anti-instrumentation attacks; and PEspin, another packer, which employs self-modifying code that could crash the DBI framework. In every case, our tool was able to prevent PIN from being detected, permitting the analysis of the original protected program.



	Advisor
	
				ZANERO, STEFANO
			
	Coordinator
	
				BONARINI, ANDREA
			
	Tutor
	
				BONARINI, ANDREA
			
	Date
	
				17 Feb 2017
			
	Abstract in Italian (English translation)
	
				Malicious programs are a constant threat to everyone. To defend
ourselves against this threat, we need tools that allow us to analyze these
programs quickly and effectively. Our research focused on the development of
such tools.

First, we approached the problem of automating the behavioral analysis of
malware. We developed an unsupervised system to identify behavioral patterns
in malicious binaries. The behaviors that our system is able to extract also
contain information derived from static analysis. Jackdaw is thus able to
associate semantically meaningful information that can help analysts,
particularly inexperienced ones, in their analysis work.

The techniques presented in this thesis use machine learning algorithms to
identify similar malware and graph mining algorithms to extract the behaviors.

We tested our system on a set of 2136 distinct binaries, comprising both
malicious and benign binaries. We compared the behaviors produced by our
analysis with 44 behaviors generated by expert analysts. Our system was able
to correctly model 77.3% of these behaviors.

This whole analysis process requires access to the unobfuscated assembly code
of the binary under analysis. However, most modern malware implements some
form of code obfuscation. To solve this problem, we developed a generic
unpacker that is able to deobfuscate 63% of randomly chosen binaries.

Finally, we developed a dynamic framework to protect PIN from
anti-instrumentation techniques. Based on the techniques found in the
literature, we created a taxonomy and developed generic countermeasures.
			
	Document type
	
				Doctoral thesis
			
	Appears in collections:
	
				Doctoral thesis

Attached files

File	Size	Format
thesis.pdf (not accessible; description: Thesis)	3.25 MB	Adobe PDF

Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10589/132106