vETH: first steps towards virtualising multi-channel Ethernet NICs on safety-focused automotive MCUs

The increasing complexity of functions in modern cars has outgrown traditional distributed Electronic/Electrical (E/E) architectures and communication protocols, prompting the development of the zone architecture and of Automotive Ethernet, with virtualisation as one of the key enablers. Existing literature shows that virtualised environments deployed in embedded mixed-criticality systems can meet the performance requirements of the industry. Most work on virtualisation has been done on powerful application-level processors like ARM's Cortex-A53, which supports a mature ecosystem of rich OSs such as Linux. However, little work has been done on more safety-oriented processors like ARM's Cortex-R52+, which offer only a limited set of features. Although these MCUs support only simple real-time operating systems (RTOSs), they are accompanied by powerful Network Interface Controllers (NICs) which would be underutilised if not efficiently shared among guests. In this thesis, we propose that NICs with multiple DMA channels enable efficient virtualisation on MCUs. To do so, we first devised an architecture that splits interface handling between two parties: (1) a backend VM, referred to as the "broker", which interacts directly with the hardware and exposes the various DMA channels as virtual interfaces; (2) multiple frontend VMs, which consume the interfaces exposed by the broker through hypervisor-provided Inter-Processor-Communication (IPC) services. This approach has two major drawbacks. The first is the security risk inherent in directly exposing DMA to untrusted guests. This issue is mitigated by the widespread presence, on the platforms targeted by this architecture, of specialised security hardware designed specifically for such cases. The second is performance-related and stems from the fact that the broker introduces a bottleneck. Performance analysis showed that this is not an issue for the workloads we considered.
We implemented a proof-of-concept using the Bao hypervisor on STMicroelectronics' Stellar P7 safety-focused MCU. We evaluated its performance under synthetic workloads, obtaining good latency results but running into issues with the throughput tests due to outstanding bugs which we were not able to fix within the timespan of this thesis work. Finally, we profiled our architecture to identify potential performance issues and found the hypervisor's IPC mechanism to be the dominating factor in packet processing times, in line with results established by previous works.

The growing complexity of functions in the latest generation of cars has outstripped the capabilities of traditional distributed Electronic/Electrical (E/E) architectures and of classic communication protocols, spurring the development of the zone architecture and of Automotive Ethernet, with virtualisation at their core. Existing literature shows that the performance required by the industry can be achieved using virtualised environments on embedded mixed-criticality systems. Most of the existing work on virtualisation in this field has been carried out on application-level processors such as the Cortex-A53, which support a vast ecosystem of mature operating systems such as Linux. Unfortunately, more safety-centred processors such as the Cortex-R52+, which offer only a limited number of features, have received little attention. Although these processors support only simple real-time operating systems, they are often paired with very powerful Network Interface Controllers (NICs) that would be wasted if not shared efficiently among different applications. In this thesis, we attempt to show that NICs with multiple DMA channels can be used to build efficient virtualisation architectures on microcontrollers. To do so, we first designed a methodology that splits the handling of the interface between two entities: (1) a "backend" VM, which we call the "broker", that interacts directly with the hardware and exposes the interface's DMA channels as virtual interfaces; (2) multiple "frontend" VMs that consume the virtual interfaces exposed by the broker through Inter-Processor-Communication (IPC) primitives provided by the hypervisor. This approach has two significant problems. The first concerns the risk that comes with exposing DMA to untrusted VMs. This problem is mitigated by the widespread presence, on the platforms for which this architecture was designed, of hardware specialised in handling precisely this situation. The second problem concerns the bottleneck introduced by the use of the broker. The performance analysis we carried out revealed that this second point is not relevant for the workloads considered.
We built a demonstration implementation of this architecture using the Bao hypervisor for STMicroelectronics' Stellar P7 microcontroller. We evaluated the performance of this implementation by subjecting it to various synthetic benchmarks, obtaining good results in terms of latency but encountering problems with the throughput tests because of bugs that we were unable to fix within the time frame of this work. Finally, we analysed the behaviour of our architecture in detail to identify potential inefficiencies, finding that the hypervisor's IPC mechanism is the dominant factor in packet processing time, an observation in line with the results in the literature.
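
To make the broker/frontend split more concrete, the following is a minimal sketch in C of how a broker might map each DMA channel to one frontend VM and hand received frames over through a per-channel ring in shared memory. It is an illustration under stated assumptions, not the implementation evaluated in this work: every name (`veth_broker`, `veth_assign`, `veth_broker_deliver`, `veth_frontend_recv`), the channel count, and the ring size are hypothetical, and the hypervisor's IPC notification, memory barriers, and the actual NIC driver are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* All constants and names below are hypothetical, chosen for illustration. */
#define VETH_NUM_CHANNELS 4u    /* assumed number of NIC DMA channels */
#define VETH_RING_SLOTS   8u    /* slots per ring, must be a power of two */
#define VETH_MAX_FRAME    1522u /* maximum Ethernet frame size with VLAN tag */

static_assert((VETH_RING_SLOTS & (VETH_RING_SLOTS - 1u)) == 0u,
              "ring size must be a power of two");

struct veth_slot {
    uint16_t len;
    uint8_t  data[VETH_MAX_FRAME];
};

/* Single-producer/single-consumer ring, assumed to live in memory shared
 * between the broker and one frontend. A real system would also need
 * memory barriers or atomics around head and tail. */
struct veth_ring {
    uint32_t head; /* advanced only by the broker (producer) */
    uint32_t tail; /* advanced only by the frontend (consumer) */
    struct veth_slot slots[VETH_RING_SLOTS];
};

struct veth_broker {
    struct veth_ring rx[VETH_NUM_CHANNELS];
    int owner_vm[VETH_NUM_CHANNELS]; /* -1 means the channel is unassigned */
};

void veth_broker_init(struct veth_broker *b)
{
    for (uint32_t c = 0; c < VETH_NUM_CHANNELS; c++) {
        b->rx[c].head = 0u;
        b->rx[c].tail = 0u;
        b->owner_vm[c] = -1;
    }
}

/* Expose DMA channel `chan` as a virtual interface owned by frontend `vm`. */
bool veth_assign(struct veth_broker *b, uint32_t chan, int vm)
{
    if (chan >= VETH_NUM_CHANNELS || b->owner_vm[chan] != -1)
        return false;
    b->owner_vm[chan] = vm;
    return true;
}

/* Broker side: called when a frame has been received on `chan`.
 * Returns false (frame dropped) if the channel is unassigned, the frame
 * is too long, or the ring is full. */
bool veth_broker_deliver(struct veth_broker *b, uint32_t chan,
                         const uint8_t *frame, uint16_t len)
{
    if (chan >= VETH_NUM_CHANNELS || b->owner_vm[chan] == -1 ||
        len > VETH_MAX_FRAME)
        return false;
    struct veth_ring *r = &b->rx[chan];
    if (r->head - r->tail == VETH_RING_SLOTS)
        return false;
    struct veth_slot *s = &r->slots[r->head & (VETH_RING_SLOTS - 1u)];
    memcpy(s->data, frame, len);
    s->len = len;
    r->head = r->head + 1u; /* publish; the IPC notification would go here */
    return true;
}

/* Frontend side: copy the next frame into `buf`. Returns the frame length,
 * or -1 if `vm` does not own `chan`, the ring is empty, or `buf` is too
 * small. */
int veth_frontend_recv(struct veth_broker *b, uint32_t chan, int vm,
                       uint8_t *buf, uint16_t cap)
{
    if (chan >= VETH_NUM_CHANNELS || b->owner_vm[chan] != vm)
        return -1;
    struct veth_ring *r = &b->rx[chan];
    if (r->head == r->tail)
        return -1;
    const struct veth_slot *s = &r->slots[r->tail & (VETH_RING_SLOTS - 1u)];
    if (s->len > cap)
        return -1;
    memcpy(buf, s->data, s->len);
    int len = (int)s->len;
    r->tail = r->tail + 1u; /* release the slot back to the broker */
    return len;
}
```

In this sketch the ownership check in `veth_frontend_recv` stands in for the isolation that the hypervisor and the platform's security hardware would have to enforce in practice, and each delivery implies one IPC notification to the frontend, which is the per-packet cost the abstract identifies as dominant.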