Vigilance estimation through analysis of EEG and EOG signals and machine learning models
CERRUTI, LIVIA
2019/2020
Abstract
Given the important role that sleepiness detection plays in today's world, the objective of this work is to find patterns related to drowsiness in physiological signals and to determine the subjects' level of alertness through feature extraction and Machine Learning models. To this end, analyses were performed on EOG and EEG, considered the gold-standard signals in this field of research. Three datasets were used: an external dataset (SEED-VIG), which consists of electroencephalographic and oculographic signals and a vigilance ground-truth variable called PERCLOS, acquired by its authors during driving simulations with 23 subjects; the LIFE dataset, which includes EOG signals previously acquired by L.I.F.E. Italia S.r.l from 10 subjects during a free-blinking task; and the VIGILANCE dataset, acquired by the author of this project, which comprises EOG and EEG recordings made on 18 subjects with a wearable headset prototype (called Balaclava), designed and engineered by L.I.F.E. Italia S.r.l, together with a vigilance target variable collected through visual reaction tasks. In the VIGILANCE dataset, the ground-truth variable, used both to train the models and to compute their prediction accuracy, derives from a combined approach that links perceptual, motor, and cognitive skills. The reaction task was modelled on the standard 10-minute Psychomotor Vigilance Task (PVT), with some modifications with respect to the conventional version, e.g., in its duration and in the insertion of batteries of simple and complex exercises. From this experiment, reaction times, lapses, and the percentage of errors were extracted. In addition, a subjective variable was recorded through a Karolinska Sleepiness Scale (KSS) questionnaire, administered to the subjects every 3 minutes during the test.
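As a rough illustration of how the three objective quantities named above (reaction times, lapses, and percentage of errors) could be computed from a list of trials, consider the following minimal sketch. The trial layout, the function name `summarize_trials`, and the 500 ms lapse threshold are assumptions made for this example (500 ms is a commonly used PVT convention); the thesis abstract does not state the threshold or the data format actually used.

```python
from statistics import mean

# Assumption: a lapse is a response slower than or equal to 500 ms.
# This is a common PVT convention, not a value taken from the thesis.
LAPSE_THRESHOLD_MS = 500


def summarize_trials(trials, lapse_threshold_ms=LAPSE_THRESHOLD_MS):
    """Summarize reaction-task trials.

    trials: list of (reaction_time_ms, correct) tuples (hypothetical layout).
    Returns the mean reaction time, the number of lapses, and the
    percentage of incorrect responses.
    """
    if not trials:
        raise ValueError("at least one trial is required")
    reaction_times = [rt for rt, _ in trials]
    lapses = sum(1 for rt in reaction_times if rt >= lapse_threshold_ms)
    errors = sum(1 for _, correct in trials if not correct)
    return {
        "mean_rt_ms": mean(reaction_times),
        "lapses": lapses,
        "error_pct": 100.0 * errors / len(trials),
    }


# Four made-up trials: one slow response (a lapse) and one wrong answer.
example_trials = [(250, True), (310, True), (620, True), (280, False)]
summary = summarize_trials(example_trials)
print(summary)
```

Metrics of this kind could then be combined into a single target value; how the thesis performs that combination is not specified in the abstract.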
Several algorithms were implemented to identify the characteristic elements of the EEG and EOG signals that can accurately discriminate between wakefulness and sleepiness. The EOG algorithm was designed to accurately identify blinks and double blinks, ocular components often considered among the most important for drowsiness detection. Since these elements are affected by substantial intra- and inter-subject differences, an innovative process was developed to overcome the resulting limitations. The procedure involves Dynamic Time Warping (DTW), a similarity measure that applies a non-linear transformation to the compared sequences, stretching them to minimize and quantify their differences. DTW was used to guide two K-medoid clustering approaches and the training and testing of two feed-forward neural networks (FFNNs). The blink and double-blink identification algorithms were validated on the LIFE dataset, where they achieved a detection sensitivity of 87%. The EEG algorithm, instead, is mainly based on frequency-domain analysis: after the signals undergo artefact removal (ICA), relevant characteristics related to the theta, alpha, and beta bands are extracted from the power spectral densities (PSDs). The resulting EOG and EEG features were used as inputs for supervised models, which share a regression approach that computes the vigilance level on a continuous scale between 0 and 1. Among the chosen models, SVR with a linear kernel, Random Forest, and Gradient Boosting performed better than the others. Moreover, inputs combining EOG and EEG features reached the best prediction accuracies and the best correlations with the target in the SEED-VIG dataset.
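To make the Dynamic Time Warping idea mentioned above more concrete, the following is a minimal sketch of the textbook dynamic-programming formulation, using the absolute difference as local cost. It is not the implementation used in the thesis: the function name `dtw_distance` and the toy sequences are assumptions chosen for illustration only.

```python
from math import inf


def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences.

    cost[i][j] holds the minimal cumulative cost of aligning the first i
    samples of `a` with the first j samples of `b`.
    """
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor
            # alignments (repeat a sample of b, repeat a sample of a,
            # or advance both sequences).
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


# Toy sequences (made up): a blink-like template, a slower version of the
# same shape, and a flat segment containing no blink.
template = [0, 1, 3, 1, 0]
slow_blink = [0, 0, 1, 1, 3, 3, 1, 0]
flat = [0, 0, 0, 0, 0]

print(dtw_distance(template, slow_blink))  # time-stretched copy of the template
print(dtw_distance(template, flat))        # different shape
```

In this toy case the slower blink is a time-stretched copy of the template, so the warping absorbs the difference in duration, while the flat segment stays at a clearly larger distance. A pairwise distance of this kind is what a K-medoid clustering step would typically consume.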
In particular, the Random Forest model outperformed the others in most cases: the best accuracy for SEED-VIG is 89.41%, with the model trained on both EEG and EOG features, and 98.02% for the VIGILANCE dataset, with the model trained only on the EEG features extracted from the first 10 minutes of the test.

| File | Description | Size | Format |
|---|---|---|---|
| 2021_04_CERRUTI.pdf (accessible online only to authorized users) | Thesis text | 8.03 MB | Adobe PDF |
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/174134