OpenST: feasibility study and prototype of a low-cost, hardware-based system call tracer

Researchers need suitable analysis approaches to understand what malware, or generic unknown programs, do on a target system. Hardware-based malware-analysis sandboxes have recently been proposed to replace emulator-based sandboxes, thanks to their transparency and resilience to emulator-detection attacks. A core part of any sandbox is its capability of "tracing" a (malicious) running program, so that the actions it performs on the system (e.g., instructions, operating system calls) can be observed. In state-of-the-art emulator-based sandboxes, tracing relies on so-called virtual machine introspection (VMI) techniques, which consist of tracing instructions from outside the virtual CPU to reconstruct high-level events such as system calls. In hardware-based sandboxes, tracing is still an open problem, as it is highly dependent on the debugging capabilities of the CPU. Interestingly, we observe that the vast majority of mobile devices (which are among the targets of malware authors) are based on the ARM architecture, which natively supports machine-level debugging through hardware interfaces. In this work we assess the feasibility of implementing a system call tracer for the Android/Linux operating system running on ARM-based computers. We propose OpenST, an open-source tool that leverages the JTAG interface to perform the equivalent of VMI, but in hardware. More precisely, our tool uses hardware breakpoints to track the occurrence of software interrupts and inspects the CPU registers to reconstruct system calls. OpenST also inspects the running process's memory to reconstruct the value of each argument passed to the system function, performing pointer dereferencing and data unmarshalling as needed. OpenST is portable across Linux versions because it derives the system call prototypes from the kernel binary image, from which it automatically generates argument-unmarshalling procedures.
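The register-inspection step described above can be illustrated with a minimal Python sketch. It is not OpenST's implementation: it assumes the ARM EABI Linux convention (system call number in `r7`, arguments in `r0` onwards), a hand-written `PROTOTYPES` table standing in for the prototypes that OpenST derives from the kernel image, and a hypothetical `read_mem(addr, size)` callback standing in for a memory read performed through the JTAG adapter while the CPU is halted.

```python
from typing import Callable, Dict

# Hypothetical prototype table: syscall number -> (name, [(argument, kind)]).
# The numbers follow the ARM EABI Linux table; the table itself is an
# illustration, not the one OpenST extracts from the kernel binary image.
PROTOTYPES = {
    3: ("read", [("fd", "int"), ("buf", "ptr"), ("count", "int")]),
    4: ("write", [("fd", "int"), ("buf", "ptr"), ("count", "int")]),
    5: ("open", [("pathname", "cstr"), ("flags", "int"), ("mode", "int")]),
}

# Assumed debugger callback: returns `size` bytes of target memory at `addr`.
ReadMem = Callable[[int, int], bytes]


def read_cstring(read_mem: ReadMem, addr: int, limit: int = 256) -> str:
    """Dereference a char pointer, reading one byte at a time up to NUL."""
    out = bytearray()
    while len(out) < limit:
        byte = read_mem(addr + len(out), 1)
        if not byte or byte == b"\x00":
            break
        out += byte
    return out.decode("utf-8", errors="replace")


def decode_syscall(regs: Dict[str, int], read_mem: ReadMem) -> str:
    """Rebuild a readable system call from a register snapshot."""
    number = regs["r7"]  # EABI: syscall number lives in r7
    name, params = PROTOTYPES.get(number, (f"sys_{number}", []))
    rendered = []
    for index, (arg_name, kind) in enumerate(params):
        raw = regs[f"r{index}"]  # arguments are passed in r0, r1, ...
        if kind == "cstr":
            value = repr(read_cstring(read_mem, raw))
        elif kind == "ptr":
            value = hex(raw)
        else:
            value = str(raw)
        rendered.append(f"{arg_name}={value}")
    return f"{name}({', '.join(rendered)})"


# Usage with a fake memory region instead of a real JTAG target.
BASE = 0x1000
BLOB = b"/etc/hosts\x00"


def fake_read(addr: int, size: int) -> bytes:
    offset = addr - BASE
    return BLOB[offset:offset + size]


snapshot = {"r0": BASE, "r1": 0, "r2": 0, "r7": 5}
print(decode_syscall(snapshot, fake_read))
```

In this sketch every string argument costs several memory reads, which hints at why reconstructing arguments over a slow debug link is expensive.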
We implemented OpenST and evaluated its correctness against a test Linux application that invokes known system calls. Moreover, we performed micro- and macro-benchmarks on three real-world applications. Our micro-benchmarks show that the need to pause and resume the CPU in order to inspect memory and reconstruct the argument values imposes a substantial overhead, around 180 ms, whereas a system call takes 500 to 2000 ns on average. In comparison, a state-of-the-art emulator-based sandbox imposes an overhead of a fraction of a millisecond. Our macro-benchmarks show that this overhead slows down the overall execution time by 70x on average. In practice, our tests with Android applications showed that this slowdown makes the user interface unusable. We measured that the overhead depends on the speed of the JTAG adapter, so, in principle, it could be reduced by using faster hardware. In conclusion, we believe that our approach is promising yet unfeasible with current low-cost hardware, which is a requirement for large-scale malware analysis.
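To put the reported figures side by side, the following Python sketch works through the arithmetic. It uses only the numbers quoted above (about 180 ms of overhead, 500 to 2000 ns per system call, 70x average slowdown) and adds two simplifying assumptions of its own: that the 180 ms applies to each traced system call, and that the tracing overhead simply adds to the untraced run time. The function names are illustrative.

```python
# Figures quoted in the abstract.
OVERHEAD_S = 180e-3           # tracing overhead, about 180 ms
NATIVE_FAST_S = 500e-9        # fastest average system call, 500 ns
NATIVE_SLOW_S = 2000e-9       # slowest average system call, 2000 ns
MACRO_SLOWDOWN = 70.0         # average slowdown in the macro-benchmarks


def per_call_ratio(native_s: float) -> float:
    """How many times longer the overhead is than the call itself."""
    return OVERHEAD_S / native_s


def implied_syscall_rate(slowdown: float) -> float:
    """System calls per second of untraced run time that would yield the
    given slowdown, assuming traced_time = native_time + calls * overhead."""
    return (slowdown - 1.0) / OVERHEAD_S


print(f"overhead vs 500 ns call:  {per_call_ratio(NATIVE_FAST_S):,.0f}x")
print(f"overhead vs 2000 ns call: {per_call_ratio(NATIVE_SLOW_S):,.0f}x")
print(f"implied syscall rate: {implied_syscall_rate(MACRO_SLOWDOWN):.0f} per second")
```

Under these assumptions the overhead is roughly five orders of magnitude larger than the call it traces, and a 70x overall slowdown corresponds to a few hundred system calls per second of untraced execution.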

New approaches are needed that allow security experts to analyze and understand the behavior of malware, or unknown programs, on the analyzed system. Hardware-based sandboxes for malware analysis have been proposed in the literature to replace emulation-based ones, because of their greater transparency. One of the fundamental features of a sandbox is its ability to trace the operations that a program performs on the system (e.g., machine instructions, system calls). In the state of the art, emulation-based sandboxes use virtual machine introspection (VMI) techniques, which consist of tracing instructions from outside the virtual machine to reconstruct high-level events such as system calls. Tracing on hardware-based sandboxes is still an open problem, as it strongly depends on the debugging capabilities of the CPU. Interestingly, most mobile devices (which are among the targets of malware authors) are based on the ARM architecture and therefore natively support machine-level debugging. In this work we study the feasibility of implementing a system call tracer for Android/Linux running on ARM processors. OpenST is an open-source tool that exploits the JTAG interface to implement the equivalent of VMI in hardware. More precisely, our tool uses hardware breakpoints to monitor software interrupts (the SWI instruction) and reads the CPU registers to reconstruct the system calls. OpenST also inspects the memory of the running process to reconstruct the value of the arguments passed to the system function, performing pointer dereferencing and data unmarshalling as needed.
OpenST is portable across different Linux versions because it reconstructs the system call prototypes from the kernel binary image, from which we automatically generate the unmarshalling procedures. We implemented OpenST and evaluated its correctness with a test application that invokes a number of system calls. In addition, we ran micro- and macro-benchmarks on three commonly used applications. The micro-benchmark results show that the need to pause the CPU in order to read memory and reconstruct the argument values imposes a significant overhead, around 180 ms, whereas a system call takes about 500 to 2000 ns. In the current state of the art, emulation-based sandboxes impose an overhead of a fraction of a millisecond. Our macro-benchmarks show that this overhead has an impact of 70x, on average, on the overall execution time. In practice, our tests with Android applications showed that this slowdown makes the user interface unusable. We measured that the overhead depends on the speed of the JTAG adapter, so, in principle, it can be reduced by using faster hardware. In conclusion, we believe that our approach is promising but unfeasible with current low-cost hardware, which is a requirement for large-scale use in malware analysis.