A Bayesian approach for network estimation with noisy data

Although most current techniques for network analysis assume that the available network data are reliable, many real-world data sets contain measurement errors. This thesis tackles the problem of estimating a network from multiple noisy observations, in which the edges of the true underlying network are recorded with both false positives and false negatives. The aim is to infer the structure and properties of the true, unobserved network using a Bayesian approach. We derive an Expectation-Maximization method and a Markov chain Monte Carlo algorithm to sample from the posterior distribution of the parameters given the data. To test the proposed statistical methods, we consider both simulated data and a real financial data set consisting of networks of connections between European financial institutions over the period 2003-2013. All algorithms are implemented in R, and the igraph package is used to analyse and represent the networks.
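
To make the setting in the abstract concrete, the following is a minimal sketch of an Expectation-Maximization estimate for a network observed several times with false positives and false negatives. It is not the thesis's own model or code (the thesis is implemented in R and its exact formulation is not given here). The sketch assumes a simple independent-edge model: each node pair is a true edge with prior probability `rho`, a true edge is reported in each observation with probability `alpha` (true-positive rate), and a non-edge is reported with probability `beta` (false-positive rate). All function and parameter names are hypothetical.

```python
import random


def simulate_counts(n_pairs, n_obs, rho, alpha, beta, seed=0):
    """Simulate noisy repeated observations of a hidden network.

    Returns the hidden edge indicators and, for each node pair, the
    number of observations (out of n_obs) that reported an edge.
    """
    rng = random.Random(seed)
    truth, counts = [], []
    for _ in range(n_pairs):
        is_edge = 1 if rng.random() < rho else 0
        p_report = alpha if is_edge else beta
        truth.append(is_edge)
        counts.append(sum(rng.random() < p_report for _ in range(n_obs)))
    return truth, counts


def _clamp(x, eps=1e-9):
    """Keep a probability strictly inside (0, 1) to avoid degenerate updates."""
    return min(max(x, eps), 1.0 - eps)


def em_noisy_edges(counts, n_obs, n_iter=200, init=(0.2, 0.7, 0.2)):
    """EM for the assumed independent-edge model.

    counts[k] is the number of observations reporting an edge for pair k.
    Returns (rho, alpha, beta, q), where q[k] is the posterior probability
    that pair k is a true edge, computed in the last E-step.
    """
    rho, alpha, beta = init
    n_pairs = len(counts)
    q = []
    for _ in range(n_iter):
        # E-step: posterior edge probability for each pair given current parameters.
        q = []
        for e in counts:
            w_edge = rho * alpha**e * (1 - alpha) ** (n_obs - e)
            w_none = (1 - rho) * beta**e * (1 - beta) ** (n_obs - e)
            q.append(w_edge / (w_edge + w_none))
        # M-step: closed-form updates that maximize the expected log-likelihood.
        sum_q = sum(q)
        hits_edge = sum(qk * e for qk, e in zip(q, counts))
        hits_none = sum((1 - qk) * e for qk, e in zip(q, counts))
        rho = _clamp(sum_q / n_pairs)
        alpha = _clamp(hits_edge / (n_obs * sum_q))
        beta = _clamp(hits_none / (n_obs * (n_pairs - sum_q)))
    return rho, alpha, beta, q


if __name__ == "__main__":
    truth, counts = simulate_counts(2000, 5, rho=0.1, alpha=0.8, beta=0.05, seed=1)
    rho_hat, alpha_hat, beta_hat, q = em_noisy_edges(counts, n_obs=5)
    print("rho, alpha, beta estimates:", rho_hat, alpha_hat, beta_hat)
```

Thresholding `q` (for example at 0.5) gives a point estimate of the hidden network. A fully Bayesian treatment, as described in the abstract, would instead place priors on the parameters and sample from the posterior with a Markov chain Monte Carlo algorithm.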

BORSANI, LUCA

2018/2019

	Supervisor
	
				BASSETTI, FEDERICO
			
	Co-supervisor(s)
	
				EPIFANI, ILENIA
			
	School / Dept.
	
				ING  - Scuola di Ingegneria Industriale e dell'Informazione
			
	Date
	
				16-Apr-2019
			
	Academic year
	
				2018/2019
			
	Document type
	
				Master's thesis (Tesi di laurea Magistrale)
			
	Appears in types:
	
				Master's thesis (Tesi di laurea Magistrale)

Attached files

File	Size	Format
2019_04_Borsani.pdf (thesis text; accessible online only to authorized users)	6.84 MB	Adobe PDF

Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10589/146085