EEG encoding of speech acoustic and articulatory features: a Partial Information Decomposition study
Corsini, Alessandro
2021/2022
Abstract
Speech processing relies on a complex interplay of bottom-up and top-down information-gathering strategies. Remarkably, the motor system appears to be the origin of the top-down signals that selectively tune the listener's attention during speech perception. Indeed, during speech listening, the brain infers and reconstructs the movements of the invisible articulatory features associated with speech production. Here we investigated how this reconstructive process affects speech comprehension and the perception of acoustic stimuli, addressing the following research question: "How does top-down inference on the articulatory features associated with speech production affect speech perception?". To answer this question, we performed an EEG study in which 23 participants listened to 50 sentences for which the articulatory trajectories of the lips, jaw, and tongue had been recorded via electromagnetic articulography (EMA) and subsequently reduced in dimensionality through Principal Component Analysis (PCA). To encourage participants to listen attentively to the sentences, a speech rhyming task was performed after the presentation of each stimulus. We split the trials a posteriori according to the difficulty of the presented acoustic stimulus (Easy vs Hard stimuli) and according to subject-specific speech comprehension performance (Good vs Bad performance). We first employed the Gaussian Copula Mutual Information (GCMI) estimator to evaluate the coupling between speech-related cues and neural activity and, most importantly, Partial Information Decomposition (PID) to disentangle the unique, synergistic, and redundant contributions of the acoustic and motor signals. The results were then evaluated statistically, one condition against the other, using nonparametric cluster-based statistics that account for the Multiple Comparison Problem (MCP). No significant effect was found in the performance-based split; however, the results showed that only motor features, specifically tongue movement, lip protrusion, and mouth opening/closing, carry significant unique information about the difficulty of the presented stimulus. In the delta band, more information is encoded in the Easy-stimuli condition, whereas in the theta band the information favors the Hard stimuli. This study thus allowed us to shed light on a frequency dissociation in the encoding of invisible articulatory movements, suggesting distinct roles for these bands in the top-down reconstruction of speech during the perception of acoustic stimuli of variable difficulty.
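To make the analysis pipeline concrete, the sketch below illustrates in plain Python the two estimators named in the abstract: GCMI and a two-source PID. This is a minimal illustration under stated assumptions, not the code used in the study: the redundancy measure shown is the minimum-mutual-information (MMI) redundancy of Barrett (2015), one common choice for Gaussian systems (the thesis may have used a different PID definition); small-sample bias corrections are omitted; and variable names such as `envelope`, `tongue_pc1`, and `eeg_delta` are hypothetical placeholders.

```python
# Minimal sketch (not the thesis code) of GCMI (Ince et al., 2017) and a
# two-source PID with minimum-mutual-information (MMI) redundancy (Barrett, 2015).
import numpy as np
from scipy.special import ndtri


def copnorm(x):
    """Rank-based copula normalisation: map each row of x to standard-normal
    quantiles, keeping rank order but discarding the marginal distributions."""
    x = np.atleast_2d(x)
    ranks = np.argsort(np.argsort(x, axis=1), axis=1)
    return ndtri((ranks + 1.0) / (x.shape[1] + 1.0))


def gauss_entropy_bits(z):
    """Differential entropy (bits) of samples z (variables x samples),
    assuming a Gaussian distribution."""
    c = np.atleast_2d(np.cov(z))
    d = c.shape[0]
    return 0.5 * (np.linalg.slogdet(c)[1] + d * np.log(2 * np.pi * np.e)) / np.log(2)


def gcmi(x, y):
    """Lower-bound estimate of I(X;Y) in bits via the Gaussian copula.
    Note: omits the small-sample bias correction used in practice."""
    cx, cy = copnorm(x), copnorm(y)
    return (gauss_entropy_bits(cx) + gauss_entropy_bits(cy)
            - gauss_entropy_bits(np.vstack([cx, cy])))


def pid_mmi(x1, x2, y):
    """Two-source PID with MMI redundancy: redundant, unique, and synergistic
    information (bits) that sources x1 and x2 carry about target y."""
    i1, i2 = gcmi(x1, y), gcmi(x2, y)
    i12 = gcmi(np.vstack([np.atleast_2d(x1), np.atleast_2d(x2)]), y)
    red = min(i1, i2)                        # redundancy shared by both sources
    return {"redundant": red,
            "unique_x1": i1 - red,           # information only in source 1
            "unique_x2": i2 - red,           # information only in source 2
            "synergy": i12 - i1 - i2 + red}  # information requiring both sources


# Illustrative use: an acoustic envelope and one articulatory PC as sources,
# band-limited EEG as the target (all hypothetical, simulated 1-D time series).
rng = np.random.default_rng(0)
envelope = rng.standard_normal(2000)
tongue_pc1 = 0.6 * envelope + 0.8 * rng.standard_normal(2000)
eeg_delta = 0.5 * envelope + 0.5 * tongue_pc1 + rng.standard_normal(2000)
print(pid_mmi(envelope, tongue_pc1, eeg_delta))
```

The copula step makes the estimator robust to the marginal distributions of the signals: only the rank relationships survive, so GCMI is a lower bound on the true mutual information regardless of how the EEG and articulatory signals are individually distributed.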
File | Size | Format | Access
---|---|---|---
EEG encoding of speech acoustic and articulatory features - A Partial Information Decomposition study.pdf | 3.27 MB | Adobe PDF | Openly accessible online
EEG encoding of speech acoustic and articulatory features- A Partial Information Decomposition study_Executive summary.pdf | 862.12 kB | Adobe PDF | Openly accessible online
Documents in POLITesi are protected by copyright, and all rights are reserved unless otherwise indicated.
https://hdl.handle.net/10589/195240