Unravelling the omics cascade through systems biology: methodological strategies and integration approaches in neurological diseases
MORABITO, AURELIA
2024/2025
Abstract
Molecular interactions and biological systems are inherently complex, and deciphering them requires advanced methods and innovative approaches. Systems biology is a growing field that addresses this complexity by combining experimental data from multiple molecular layers with computational methods to explore interactions across the omics cascade, the path of genetic information from genes to transcripts, proteins, and metabolites. A central strategy in systems biology is multi-omics integration, the use of multiple omics platforms to capture the dynamics of a system as a whole. This thesis provides an extensive review of omics integration methods, detailing the primary techniques employed in recent literature and emphasizing the main challenges and limitations of these approaches. The subsequent chapters investigate omics integration strategies in Neurodegenerative Diseases (NDs). NDs are the leading cause of disability and the second leading cause of death worldwide, and as the population ages they are becoming an increasingly pressing concern. These disorders are characterized by significant clinical and molecular heterogeneity, making them particularly suitable for omics integration strategies. This thesis explores the role of multi-omics integration in NDs, with a focus on methodological innovation and on data-driven, network-based approaches to integrating transcriptomics, epigenomics, proteomics, and metabolomics.
The thesis is organized around three research studies, each corresponding to a specific aim and thesis chapter. The first objective is to identify disease-specific co-regulatory patterns through correlation networks. The second objective is to characterize the influence of covariates on molecular variation in motor neuron models derived from patients with Amyotrophic Lateral Sclerosis (ALS). The third objective is to integrate metabolomics and proteomics data to uncover the metabolic underpinnings of clinical phenotypes in ALS. In the first study, we analyzed proteomics data from peripheral blood mononuclear cells (PBMCs) of patients with Multiple System Atrophy, cerebellar type (MSA-C), patients with Spinocerebellar Ataxia type 2 (SCA2), and healthy controls using Gaussian graphical models and the graphical lasso. This analysis revealed distinct protein co-regulation networks, which outperformed traditional multivariate techniques by identifying disease-specific interactions and highlighting potential diagnostic markers. The second study integrates transcriptomic, epigenomic, and proteomic data from induced pluripotent stem cell (iPSC)-derived motor neurons to assess the influence of covariates identified in previous single-omics studies in a population of ALS patients. Finally, a subset of lipids from a plasma metabolomics panel from the same ALS patients was investigated in relation to clinical outcomes. These data were then integrated with proteomics data, revealing connections between lipid metabolism and ALS-relevant pathways.
In addition to these three studies, the thesis presents EASY-FIA, a user-friendly, standalone software tool designed to preprocess data from high-resolution mass spectrometry-based flow injection analysis (FIA-HRMS) experiments. Given the absence of accessible, automated tools in the FIA domain, EASY-FIA is a valuable resource for metabolomics researchers, as it streamlines the conversion of raw data into meaningful metabolite profiles. The tool's accuracy and usability were validated through case studies in real-world scenarios. In conclusion, this thesis provides a comprehensive exploration of multi-omics integration strategies applied specifically to NDs. Network-based strategies emerged as particularly effective in capturing the complex interplay among omics layers.
Taken together, these findings show that integrating multiple omics layers is not just a methodological improvement, but a necessary approach to better understand and address diseases in the context of modern precision medicine.

File | Size | Format
---|---|---
AureliaMorabito_PhD_Thesis.pdf (Description: Thesis; openly accessible online from 04/07/2026) | 17.3 MB | Adobe PDF
Documents in POLITesi are protected by copyright, and all rights are reserved unless otherwise indicated.
https://hdl.handle.net/10589/241297