Exploring execution variability for problem-space evasion of ML-based behavioral malware detectors

Machine learning algorithms are effective at classifying and detecting malware from its dynamic behavior, but they are vulnerable to adversarial attacks. Existing attacks often focus on the feature space and neglect the problem space. The intrinsically nondeterministic nature of malware should not be overlooked, because accounting for it can considerably increase the attack's probability of success. Since different executions of the same malware can lead to different behaviors, perturbations tailored to a single specific behavior can be ineffective for others. In this research, we developed an attack on sequential data that considers multiple possible behaviors of the same malware to compute perturbations that are highly effective in more than a single execution. We find the most influential positions inside the sequence and the most influential API calls to inject, using knowledge of the internal architecture of the ML model that we use as an oracle. Subsequently, we aggregate the results to maximize the impact of the perturbation on the detector's decision across all behaviors. Our analyses, conducted in a controlled environment, show that our approach effectively deceives the two Recurrent Neural Network (RNN) malware detectors employed in the experiments, in both the white-box and black-box scenarios. The experiments show that the attack achieves an evasion rate of up to 89% in the problem space. Our approach outperforms the current state of the art both in the effectiveness of the attack and in the total number of injected API calls necessary to achieve misclassification.

Machine learning algorithms are used effectively to classify and detect malware by analyzing its dynamic behavior, but they are vulnerable to adversarial attacks. Existing attacks often concentrate on the feature space and neglect the problem space. This aspect should not be underestimated, however, because taking the intrinsically nondeterministic nature of malware into account can considerably increase the probability of evasion. Since different executions of the same malware can result in different behaviors, perturbations created specifically for one behavior can be ineffective for others. In this research we developed an attack on discrete sequences that takes into account the multiple behaviors a malware sample might exhibit, in order to create perturbations that are highly effective in more than a single execution. In our approach we find the most influential position inside the sequence and the most impactful API calls to inject, using knowledge of the internal structure of the model that we use as an oracle. We then aggregate the results to maximize the effectiveness of the final perturbation across all possible behaviors of the malware. Our analyses, conducted in a controlled environment, show that our approach effectively deceives the two Recurrent Neural Networks used in the experiments, in both the white-box and black-box scenarios. The experimental results show that the attack reaches a success rate of up to 89%. Our approach performed better than the current state of the art both in the effectiveness of the attack and in the average number of injected API calls needed to deceive the machine learning model.
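
The aggregation idea described above, choosing one injection that works across several executions rather than for a single trace, can be illustrated with a small sketch. This is not the authors' algorithm: the abstract says the real attack uses knowledge of the model's internal architecture, whereas this sketch simply queries a scoring function exhaustively. The names `inject`, `best_shared_injection`, and `toy_score`, the use of relative positions to align traces of different lengths, and the toy detector itself are all assumptions made for illustration.

```python
from statistics import mean
from typing import Callable, Sequence

Trace = list[str]  # one execution, as an ordered list of API call names


def inject(trace: Trace, rel_pos: float, api: str) -> Trace:
    """Return a copy of `trace` with `api` inserted at a relative position in [0, 1]."""
    idx = round(rel_pos * len(trace))
    return trace[:idx] + [api] + trace[idx:]


def best_shared_injection(
    traces: Sequence[Trace],
    candidate_apis: Sequence[str],
    rel_positions: Sequence[float],
    score: Callable[[Trace], float],
) -> tuple[float, float, str]:
    """Pick the (position, API call) pair with the lowest mean score over all traces.

    `score` is a hypothetical stand-in for the detector's maliciousness output.
    Averaging over traces is one simple way to aggregate across behaviors.
    """
    best = None
    for rel_pos in rel_positions:
        for api in candidate_apis:
            avg = mean(score(inject(t, rel_pos, api)) for t in traces)
            if best is None or avg < best[0]:
                best = (avg, rel_pos, api)
    return best


def toy_score(trace: Trace) -> float:
    """Toy detector (assumption): fraction of calls that look suspicious."""
    suspicious = {"WriteProcessMemory", "CreateRemoteThread"}
    return sum(call in suspicious for call in trace) / len(trace)


# Two executions of the same hypothetical sample that behave differently.
traces = [
    ["OpenProcess", "WriteProcessMemory", "CreateRemoteThread"],
    ["OpenProcess", "Sleep", "WriteProcessMemory", "CreateRemoteThread", "CloseHandle"],
]
avg, rel_pos, api = best_shared_injection(
    traces, ["GetTickCount", "WriteProcessMemory"], [0.0, 0.5, 1.0], toy_score
)
print(avg, rel_pos, api)
```

The toy detector ignores where a call is inserted, so the sketch only shows how candidates are scored against every trace and aggregated. With a sequence model as the scorer, the position would also change the result.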