PESINet 2. A convolutional network for prosodic analysis

POLITesi - Digital archive of degree and doctoral theses

ALONGI, MARTINA

2019/2020

Abstract

Prosody is the study of the intonation, tone, stress, and rhythm of spoken language. It can convey characteristics of the speaker, such as their emotional state or attitude, as well as the form of the utterance. This thesis aims to create a Deep Neural Network able to recognize what defines a question and how its structure differs from that of other types of phrases, using both a set of complex prosodic features and the fundamental frequency on its own, and comparing the results. Deep Neural Networks are complex architectures belonging to the field of deep learning. With the recent popularity of artificial intelligence and machine learning, this kind of architecture has been employed to solve problems of increasing difficulty that could not be solved using other algorithms. The “deep” part of the name comes from the notion that deep learning uses an unbounded number of layers of bounded size, which can be heterogeneous and can also deviate from connectionist models for the sake of a more efficient and understandable network. The goal of this thesis is a question classifier built on both a set of simple data and a set of complex data, in order to compare the two and understand which approach is best.
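As an illustration of the idea in the abstract, the following is a minimal sketch of a convolutional forward pass over a fundamental-frequency (F0) contour. It is not the PESINet 2 architecture: the function names, the rise-detecting kernel, and the weight and bias values are all hypothetical and hand-picked rather than learned, and the input is assumed to contain voiced frames only (F0 > 0).

```python
import math


def conv1d(signal, kernel):
    """Valid-mode 1D cross-correlation, the operation used in convolutional layers."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """Keep positive activations, zero out the rest."""
    return [max(0.0, v) for v in values]


def question_score(f0_hz, kernel=(-1.0, 0.0, 1.0), weight=1.0, bias=-2.0):
    """Toy forward pass: conv -> ReLU -> global max pooling -> sigmoid.

    Assumption: kernel, weight and bias are illustrative constants, not trained
    parameters. f0_hz must hold strictly positive values (voiced frames only).
    """
    # Express F0 in semitones relative to the utterance mean, so that values
    # are comparable across speakers with different pitch ranges.
    mean_f0 = sum(f0_hz) / len(f0_hz)
    semitones = [12.0 * math.log2(f / mean_f0) for f in f0_hz]
    # The kernel (-1, 0, 1) responds to a local pitch rise.
    features = relu(conv1d(semitones, kernel))
    # Global max pooling keeps the strongest rise found anywhere in the contour.
    pooled = max(features)
    return 1.0 / (1.0 + math.exp(-(weight * pooled + bias)))


rising = [180.0, 178.0, 176.0, 175.0, 180.0, 200.0, 230.0]
falling = [220.0, 210.0, 200.0, 190.0, 180.0, 170.0, 160.0]
print(question_score(rising), question_score(falling))
```

With these hand-set values the rising contour scores higher than the falling one. A trained network would learn its kernels from data, and because the pooling here is global, this toy reacts to a rise anywhere in the utterance rather than only at its end.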

	Advisor
	
				SBATTELLA, LICIA
			
	Co-advisor(s)
	
				SCOTTI, VINCENZO
TEDESCO, ROBERTO
			
	School / Dept.
	
				ING  - Scuola di Ingegneria Industriale e dell'Informazione
			
	Date
	
				15 Dec 2020
			
	Academic year
	
				2019/2020
			
	Abstract in Italian (translated)
	
				Prosody is the study of the intonation, tone, stress, and rhythm of spoken language, which can convey not only characteristics of the speaker, such as their emotional state or attitude, but also the form of the utterance.
This thesis aims to create a Deep Neural Network able to recognize what defines a question and how its structure differs from that of other types of phrases, using both a set of complex prosodic features and the fundamental frequency on its own, and comparing the results.
Deep Neural Networks are complex architectures belonging to the field of deep learning. With the recent popularity of artificial intelligence and machine learning, this kind of architecture has been used to solve problems of increasing difficulty that cannot be solved using other algorithms. Neural networks typically imitate the inner workings of the human brain, and their results are comparable to, if not better than, those of a human expert.
The "deep" part of the name comes from the notion that deep learning uses an unbounded number of layers of bounded size, which can be heterogeneous and can also deviate from connectionist models for the sake of a more efficient and understandable network.
The goal of this thesis is a question classifier built on both a set of simple data and a set of complex data, in order to compare the two and understand which approach is best.
			
	Appears in types:
	
				Master's thesis (Tesi di laurea Magistrale)

Attached files

File	Size	Format
PESINet 2 — A Convolutional Network for Prosodic Analysis.pdf (thesis; accessible online to authorized users only)	2.94 MB	Adobe PDF

Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10589/170268