Extracting common malicious temporal dependent behaviors from malware

MASSETTI, ALESSIO
2014/2015

Abstract

Malicious software (hereafter malware) is any software that harms a computer system. According to PandaLabs, 75 million new malware samples were observed during the last year, out of a total of 350 million known specimens. This translates to roughly 200,000 new samples every day. The malware industry is growing into a real underground economy that generates huge illegal profits by stealing bank accounts, abusing credit card numbers, or breaking into email accounts. Malware samples are regularly sold on the market, and they can reach high prices. For this reason, automatic malware analysis tools are strongly needed to reduce the time required to analyze new samples and to understand their malicious behaviors. Jackdaw, an automatic behavior extractor and semantic tagger, was built to address this need. It analyzes malware samples using static and dynamic analysis procedures. Unfortunately, Jackdaw models are created and saved as logical formulas, which limits their ability to preserve information about the number of API calls and the taint dependencies between them. Jackdaw creates a basic model that cannot, for example, track whether the analyzed behavior uses several files or several system resources. This thesis focuses on improving Jackdaw's model generation so that the new models carry more information than the previous ones. Using static and dynamic analysis techniques, we generate taint dependencies between system calls and represent them in a graph. Our main goal is to extract common behavioral models from the clusters of malware created by Jackdaw, by extracting the API call sequences that the samples in a cluster share. The new behavioral models are graphs in which nodes represent API calls and edges represent the dependencies between them.
The work presented in this thesis led to the identification of 607 malicious behavior models from a large dataset of malware samples. To demonstrate the improvement in the number of behaviors we can distinguish, these behaviors were also divided according to the old modeling system, which yielded 37 groups of indistinguishable behaviors. Thanks to the model introduced in this thesis, the granularity of the malware behaviors distinguishable in our population increased by 85%.
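The graph model described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the trace format, the function names `taint_edges` and `common_behavior`, and the sample data are all hypothetical, and the sketch identifies nodes by API name only. It builds a multiset of taint-dependency edges per sample and keeps the edges, with their counts, that every sample in a cluster shares.

```python
from collections import Counter
from functools import reduce


def taint_edges(trace):
    """Build a multiset of taint-dependency edges from one API-call trace.

    Assumed (hypothetical) trace format: a list of (api_name, inputs, outputs)
    tuples, where inputs and outputs are sets of opaque value labels.
    An edge (a, b) means a value produced by call a is later consumed by call b.
    """
    edges = Counter()
    producers = {}  # value label -> API call that most recently produced it
    for api, inputs, outputs in trace:
        for value in inputs:
            if value in producers:
                edges[(producers[value], api)] += 1
        for value in outputs:
            producers[value] = api
    return edges


def common_behavior(traces):
    """Return the edges, with counts, that are present in every trace."""
    graphs = [taint_edges(trace) for trace in traces]
    # Counter intersection keeps the minimum count of each shared edge.
    return reduce(lambda a, b: a & b, graphs) if graphs else Counter()


# Two made-up samples that both write to two distinct files.
sample_a = [
    ("CreateFileW", set(), {"h1"}),
    ("CreateFileW", set(), {"h2"}),
    ("WriteFile", {"h1"}, set()),
    ("WriteFile", {"h2"}, set()),
    ("CloseHandle", {"h1"}, set()),
]
sample_b = [
    ("CreateFileW", set(), {"x"}),
    ("WriteFile", {"x"}, set()),
    ("CreateFileW", set(), {"y"}),
    ("WriteFile", {"y"}, set()),
]

shared = common_behavior([sample_a, sample_b])
```

Here `shared` keeps the `CreateFileW` to `WriteFile` dependency with a count of 2, so the fact that two files are in use survives in the common model, while the `CloseHandle` edge that only one sample exhibits is dropped.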
MAGGI, FEDERICO
POLINO, MARIO
ING - Scuola di Ingegneria Industriale e dell'Informazione
28-Jul-2015
2014/2015
Master's thesis (Tesi di laurea Magistrale)
Attached files
File: 2015_07_Massetti.pdf (accessible online to everyone)
Size: 1.15 MB
Format: Adobe PDF

Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10589/108696