This thesis, a collaboration between KTH Royal Institute of Technology in Stockholm and Politecnico di Milano, explores the development and evaluation of a multimodal system for survival prediction in clinical settings, a complex real-life scenario, leveraging both structured electronic health records and unstructured clinical notes. The core objective is to improve the accuracy and reliability of survival predictions in Intensive Care Units (ICUs) by integrating diverse data types through advanced machine learning models. The system combines Tabular Transformers, a novel architecture adapted here to process structured data such as patient demographics, medical history, and diagnoses, with Multi-Layer Perceptrons that process text embeddings obtained from BioClinicalBERT, a model specialized for clinical narratives. Integrating these models is intended to capture the complex, multifaceted nature of patient profiles and thereby improve prediction performance. The final system, a Linear Regression model that aggregates information from the Tabular Transformer, BioClinicalBERT, and the Multi-Layer Perceptron, demonstrates superior performance on the evaluation metrics, highlighting its ability to accurately identify high-risk patients. Comprehensive benchmarking against decision trees and various alternative configurations underscored the robustness of the proposed system. This research highlights the transformative potential of multimodal data integration in medical predictive modeling and the unexplored capabilities of Tabular Transformers in the field, paving the way for more comprehensive and accurate clinical decision support tools that can significantly impact patient care and outcomes. By predicting the survival of ICU patients, medical professionals can guide their efforts and prioritize patient care, enabling a more efficient and targeted allocation of time and medical resources during triage.
This thesis, the result of a collaboration between KTH Royal Institute of Technology in Stockholm and Politecnico di Milano, explores the development and evaluation of a multimodal system for survival prediction in clinical settings, a realistic and complex scenario, drawing on the structured information in electronic health records and the free-text notes written by physicians. The main objective is to improve the accuracy and reliability of survival predictions in Intensive Care wards by integrating different types of data through advanced machine learning models. The system combines Tabular Transformers, adapted to process structured data such as patient demographics, medical history, and diagnoses, with Multi-Layer Perceptrons that process text embeddings obtained from BioClinicalBERT, a model specialized for clinical text. The integration of these models aims to capture the complex and multifaceted nature of patient profiles, thereby improving predictive performance. The final system aggregates the results obtained from the innovative Tabular Transformer architecture, the BioClinicalBERT model, and a Multi-Layer Perceptron, which show superior performance on the evaluation metrics, highlighting the system’s ability to accurately identify high-risk patients. Comprehensive benchmarking against decision trees and various analogous configurations underscored the robustness of the proposed framework. This research highlights the transformative potential of multimodal data integration in medical predictive modeling, and the unexplored capabilities of Tabular Transformers in the field, paving the way for more complete and accurate clinical decision support tools.
These tools can have a significant impact on patient care: by predicting the survival of patients admitted to Intensive Care, medical professionals can guide their efforts and the prioritization of care, allowing a more efficient and targeted allocation of time and resources during triage.
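The abstract describes a late-fusion design in which a linear regression aggregates the outputs of the structured-data branch and the clinical-text branch. The following is a minimal sketch of that aggregation step only, assuming each branch emits a single score per patient. The scores are synthetic stand-ins generated at random, the outcome labels are invented, and the helper names (`fit_linear_regression`, `predict`, `mse`) are hypothetical; none of this is taken from the thesis code.

```python
import random

def fit_linear_regression(X, y):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y.
    X holds feature rows without an intercept; the returned weights
    list the feature coefficients first and the intercept last."""
    rows = [list(r) + [1.0] for r in X]  # append the intercept column
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(A[k][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for k in range(n):
            if k != col:
                f = A[k][col]
                A[k] = [a - f * b for a, b in zip(A[k], A[col])]
    return [A[i][n] for i in range(n)]

def predict(weights, row):
    """Apply the fitted weights to one feature row (intercept added here)."""
    return sum(w * x for w, x in zip(weights, list(row) + [1.0]))

def mse(preds, targets):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)

def clip(v):
    """Keep a synthetic score inside [0, 1]."""
    return min(1.0, max(0.0, v))

# Synthetic data (assumption): 200 patients with a binary outcome label,
# plus one noisy score per branch standing in for real model outputs.
random.seed(0)
labels = [1.0 if random.random() < 0.3 else 0.0 for _ in range(200)]
tab_scores = [clip(0.2 + 0.6 * y + random.gauss(0, 0.20)) for y in labels]
text_scores = [clip(0.3 + 0.4 * y + random.gauss(0, 0.25)) for y in labels]

# Late fusion: regress the label on the two branch scores.
features = list(zip(tab_scores, text_scores))
weights = fit_linear_regression(features, labels)
fused = [predict(weights, row) for row in features]

tab_mse = mse(tab_scores, labels)
text_mse = mse(text_scores, labels)
fused_mse = mse(fused, labels)
print("weights (tabular, text, intercept):", weights)
print("MSE tabular / text / fused:", tab_mse, text_mse, fused_mse)
```

On the data it was fitted to, the least-squares combination cannot have a higher squared error than either branch score used alone, because each branch score is itself one of the linear functions the regression can choose. That guarantee does not carry over to unseen patients, so a real comparison would need a held-out evaluation set.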
Enhancing multimodal systems for survival prediction with tabular transformers
Insalata, Beatrice
2023/2024
File | Description | Access | Size | Format
---|---|---|---|---
2024_7_Insalata.pdf | Thesis | Openly accessible online | 2.44 MB | Adobe PDF
2024_7_Insalata_Executive_Summary.pdf | Executive Summary | Openly accessible online | 562.28 kB | Adobe PDF
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/222515