Toward live memory forensics for malware identification

Malware is software that is specifically designed to disrupt, damage, or gain unauthorized access to computer systems. Several protection mechanisms have been implemented to protect computer systems against this threat, but they all have some issues and none is infallible. The major flaw of many of these defense mechanisms is that they depend directly on the computer system they are trying to protect. For example, antimalware software that runs within the system it protects relies on functionalities that the system provides. Therefore, malware could interfere with the correct functioning of antimalware software by altering the behavior of components of the system. The same holds true for anything that depends directly on the system. Memory forensics solves this dependency problem by allowing the analysis to be carried out on a separate computer system. However, the content of the memory of the computer system to be studied must first be extracted and transferred. This operation can be challenging, and it becomes more problematic as the memory of the system grows in size. We present an efficient and effective way to study the memory of computer systems without depending directly on them. Instead of obtaining the entire content of the memory at once, we access only those regions relevant for the analysis. We also read the content of the memory without relying on software running inside the system, thus ensuring the integrity of the information obtained. This method of analysis makes it possible to keep a computer system under analysis as it runs and to perform continuous verification. We evaluated our approach with both virtual machines, accessing their memory using a modified hypervisor, and physical machines, accessing their memory using an I/O peripheral with Direct Memory Access. The analysis of physical machines can be performed by any device capable of accessing the memory of other computer systems through Direct Memory Access, including embedded portable devices.
We also tested our approach by implementing a malware detector capable of autonomously finding potential threats by continuously inspecting specific data structures. We tested our detector on three different setups, two using virtual machines and one using a physical machine, executing 2,050 different samples on each in a fully automated way. We collected our samples from different sources and created three different sets. For two of the sets, which comprised both malicious and non-malicious software, our detector found suspicious activity for 50% and 60% of the samples respectively. For the other set, which consisted only of malware, our detector found suspicious activity for 90% of the samples.
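The idea of reading only the relevant regions of memory and verifying them continuously can be illustrated with a minimal sketch. It is not the detector described in this work: the `RegionMonitor` class, the region names, the addresses, and the `make_fake_reader` helper are all hypothetical, and a plain `bytearray` stands in for the memory that a modified hypervisor or a Direct Memory Access peripheral would expose. The sketch only flags regions whose content changed between two polls, which is an assumed, deliberately simple check.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

# Hypothetical reader signature: (address, length) -> bytes.
# In a real setup this would be backed by a hypervisor or a DMA device.
ReadFn = Callable[[int, int], bytes]


class RegionMonitor:
    """Polls a fixed set of memory regions and reports which ones changed."""

    def __init__(self, read: ReadFn, regions: Dict[str, Tuple[int, int]]):
        self.read = read
        self.regions = regions              # name -> (address, length)
        self.baseline: Dict[str, str] = {}  # name -> last seen digest

    def poll(self) -> List[str]:
        """Read only the watched regions; return names whose content changed."""
        changed = []
        for name, (addr, length) in self.regions.items():
            digest = hashlib.sha256(self.read(addr, length)).hexdigest()
            previous = self.baseline.get(name)
            if previous is not None and previous != digest:
                changed.append(name)
            # The new digest becomes the baseline, so a change is reported once.
            self.baseline[name] = digest
        return changed


def make_fake_reader(memory: bytearray) -> ReadFn:
    """Stand-in for an out-of-band reader; nothing is stored persistently."""
    def read(addr: int, length: int) -> bytes:
        return bytes(memory[addr:addr + length])
    return read


# Simulated target memory and two watched regions (addresses are made up).
ram = bytearray(4096)
monitor = RegionMonitor(
    make_fake_reader(ram),
    {"table_a": (0x100, 64), "table_b": (0x800, 128)},
)

first = monitor.poll()   # first pass only records the baseline
ram[0x110] = 0xFF        # simulate a modification inside table_a
second = monitor.poll()
ram[0x400] = 0x01        # modification outside every watched region
third = monitor.poll()
```

Because the monitor never touches memory outside the watched regions, the cost of each poll depends on the size of those regions rather than on the total size of the memory, which is the property the approach above relies on.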

Malware is software built to disrupt the execution of a computer system, damage it, or gain unauthorized access to it. Malware remains a serious threat to computer systems today, and for this reason several protection mechanisms have been implemented. However, each of these protection mechanisms has some issue that keeps it from being infallible. One of the main problems of these defense mechanisms is their direct dependence on the computer system they try to protect. For example, antimalware software running inside the system to be protected relies on components and functionalities provided by that system. Malware could therefore interfere with its correct functioning by altering the behavior of the components and functionalities on which the antimalware software relies. Memory forensics solves this dependency problem by making it possible to analyze a system in a secure environment separate from that of the system under study. However, the content of the memory of the computer system under study must first be extracted and transferred elsewhere. This operation can be costly and slow, especially as the size of the memory of the system under study grows. Single extractions can also prove insufficient in situations that require rapid and continuous verification of the state of the system. There are also problems related to the method used to extract the data. Several solutions rely on software running in the system to be analyzed; however, these solutions show the same problem as antimalware software, namely that the reliability of the information obtained is not guaranteed.
Other extraction methods instead acquire memory through hardware, for example through peripherals with Direct Memory Access, thereby preventing any malware from interfering. This latter extraction method is therefore reliable, but acquisition by itself is not sufficient to build a system that allows real-time analysis. With our work we present a secure, efficient, and effective method for studying a computer system. Instead of obtaining the entire memory to analyze it at a later time, we access it directly through hardware and extract information without needing to persistently store the data read. We also read the memory without relying on software running inside the system, thus ensuring the integrity of the information obtained. The analysis method makes it possible to keep a running computer system under observation and to perform continuous verification by reading only the parts of memory relevant for the analysis. Because in our approach analysis takes place at the same time as extraction, suspicious activity can be detected as soon as it occurs. We evaluated our approach both with virtual machines, accessing their memory through a hypervisor that we modified, and with physical machines, accessing their memory through I/O peripherals with Direct Memory Access. Our approach can be put into practice with any device capable of accessing the memory of computer systems through Direct Memory Access, allowing the construction of embedded devices built specifically to monitor computer systems. To evaluate our approach, we implemented a malware detector capable of autonomously identifying the presence of potential threats by continuously analyzing specific data structures.
We tested the detector on a collection of 2,050 malware samples obtained from different sources. The tests were run on three different systems, two based on virtual machines and one on physical machines, in a fully autonomous way. The detector was able to identify suspicious activity for 50% and 60% of the samples of two collections consisting of both malicious and non-malicious software, and for 90% of the samples of a collection consisting exclusively of malicious software.