Privacy-preserving machine and deep learning solutions will enable new and exciting breakthroughs in many application fields in the next few years. Their ability to process encrypted input data through machine and deep learning models guarantees user privacy during processing, which makes it possible to comply with increasingly strict legislation and recommendations on data protection and user privacy. For this reason, research interest in this field is steadily growing, although relevant results have so far been obtained only in specific application fields. In this work, for the first time in the literature, we introduce a privacy-preserving natural language processing solution, called HErBERT, that performs text classification on encrypted data. The proposed solution, which relies on Homomorphic Encryption to process encrypted data, is inspired by the well-known BERT architecture and introduces privacy-preserving Transformers. A computationally efficient inference procedure for HErBERT has been designed, developed, and made available to the scientific community. Experimental results on two real-world text classification benchmarks show the effectiveness of the proposed solution.
HErBERT: a privacy-preserving natural language processing solution for text classification
Comi, Daniele
2020/2021
File | Description | Size | Format | Access
---|---|---|---|---
Thesis.pdf | HErBERT: a privacy-preserving natural language processing solution for text classification | 3.71 MB | Adobe PDF | Openly accessible on the internet
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/180135