Graph-based representation of clinical images metadata: a Neo4j implementation
Auletta, Lorenzo
2024/2025
Abstract
Medical imaging is a cornerstone of modern clinical practice, and medical images carry not only pixel data but also heterogeneous metadata describing technical, clinical, administrative, and acquisition-protocol information. The Digital Imaging and Communications in Medicine (DICOM) standard has become the reference standard for medical imaging, enabling interoperability across heterogeneous systems and the reliable exchange of diagnostic data. Nevertheless, the file-centric architecture of DICOM imposes structural limitations on large-scale metadata management, hindering data discovery, semantic integration, and advanced analytics. This thesis investigates the representation of DICOM metadata in a graph-based database model, using Neo4j as the graph database management system (GDBMS). The proposed approach shows how graph structures facilitate flexible metadata organization, efficient query execution, and semantic interoperability, thereby enhancing data governance and supporting the FAIR (Findable, Accessible, Interoperable, Reusable) principles. The study also highlights the benefits of this paradigm for data catalog construction, the integration of multi-modal datasets, and the enablement of artificial intelligence applications in clinical domains. Experimental results on real-world datasets confirm the feasibility and scalability of the model, offering new perspectives for the design of healthcare data platforms.

| File | Description | Size | Format | Access |
|---|---|---|---|---|
| 2025_10_Auletta_01.pdf | Thesis | 4.81 MB | Adobe PDF | Accessible online to authorized users only |
| 2025_10_Auletta_02.pdf | Executive Summary | 516 kB | Adobe PDF | Accessible online to everyone |
Documents in POLITesi are protected by copyright, and all rights are reserved unless otherwise indicated.
https://hdl.handle.net/10589/243925