Deep learning analytics and interpretability of molecular level through system level biophotonics data
Sorrentino, Salvatore
2024/2025
Abstract
Raman spectroscopy offers a powerful, non-invasive approach to probing the molecular composition of biological and chemical systems. However, challenges in interpreting Raman data, such as spectral sparsity, noise interference, and the complexity of vibrational modes, limit its widespread application in high-throughput and real-time analysis. Traditional computational methods often fail to capture the complex relationships between molecular structures and their Raman spectra, impeding advances in molecular characterization and diagnostics. This thesis introduces a framework that leverages deep learning to bridge this gap, enhancing the predictive capabilities and interpretability of Raman spectral data. The first part of the thesis describes Mol2Raman, a graph-based neural network designed to predict spontaneous Raman spectra directly from molecular structures. By integrating a Graph Isomorphism Network with edge features (GINE), the model encodes atom-level and bond-level information, enabling accurate prediction of both peak positions and intensities. Benchmarked against existing models, it achieves higher performance across multiple evaluation metrics. The model also generalizes well, maintaining high accuracy when predicting spectra for structurally diverse and previously unseen molecules. To address persistent issues in Coherent Anti-Stokes Raman Scattering (CARS) microscopy, the thesis further develops deep learning models for the removal of the Non-Resonant Background (NRB). These models improve the clarity and accuracy of CARS images, preserving essential spectral information while reducing background interference, which allows more reliable and detailed molecular imaging.

In addition, this work explores the application of deep learning to classifying cellular senescence using multimodal nonlinear optical (NLO) microscopy data. By combining hyperspectral information with morphological features, the developed models achieve high accuracy in distinguishing senescent from proliferative cells, offering insights into aging and cancer progression. Furthermore, the thesis investigates the morpho-molecular dynamics of embryonic stem cells during early differentiation, employing Raman spectroscopy and tomographic phase microscopy to reveal critical transitions during pluripotency exit. The thesis therefore demonstrates the potential of integrating deep learning with biophotonic techniques to overcome some limitations in molecular analysis. The developed models not only improve the accuracy and efficiency of Raman spectral prediction but also enhance molecular imaging and cell state classification. These contributions hold promise for advancing biomedical research, clinical diagnostics, and materials science, providing new tools for exploring complex biological systems in a non-invasive and highly informative manner.

File | Size | Format | |
---|---|---|---|
Salvatore_Sorrentino_PhD_Thesis.pdf (openly accessible online from 28/02/2026) | 28.31 MB | Adobe PDF | View/Open |
Documents in POLITesi are protected by copyright, and all rights are reserved unless otherwise indicated.
https://hdl.handle.net/10589/234334