The increased availability of health data has opened new possibilities to study treatments in everyday life. The thesis examines how medical treatments are implemented and how effective they are in the real world by integrating population-scale electronic health records (EHRs) and biobank resources. It advances two tightly linked themes: treatment dynamics (initiation, persistence, adherence) and treatment response, arguing that real-world effectiveness depends as much on the continuity and implementation of care as on biological efficacy. To support credible inference from routine data, the work emphasises scalable, harmonised, and interpretable analyses that bridge heterogeneous health systems and national contexts. Methodologically, the thesis uses time-varying survival models for evolving, schedule-based exposures and functional data analysis to represent long-horizon medication trajectories, and augments these with genetic epidemiology (genome-wide association studies and polygenic scores) when biobank data are available. Empirically, the thesis spans complementary data infrastructures and questions. Using Lombardy’s linked administrative EHRs, it quantifies the landscape and consequences of COVID-19 undervaccination with a dynamic exposure definition. In Finnish nationwide registers and linked biobanks (with replication in Estonia), it characterises medication adherence across common drug classes, separating persistence from adherence, and tests whether genetic predisposition adds explanatory or predictive value beyond social and clinical factors. Furthermore, it defines and investigates socioeconomic, health, and polytherapy determinants in more than one million individuals across Finland and Italy.
Finally, with FinnGen’s laboratory and genetic data, it integrates short- and long-term LDL-cholesterol responses to statins with adherence, dosage, and pharmacogenomics, showing when genetics informs baseline risk and when treatment implementation dominates observed outcomes. Across these applications, the thesis provides practical guidance for extracting and curating EHR information in Italy, reusable representations and tools for dynamic treatments, and cross-cohort comparisons that distinguish between shared regularities and context-specific features. The overarching contribution is a set of population-scale, reproducible approaches that turn routine health data into interpretable evidence on who starts, continues, and benefits from therapy; clarify the limited but targeted roles for genetics in predicting response in treated populations; and inform policies and clinical strategies to support sustained implementation, thereby moving from average efficacy toward dependable effectiveness in everyday practice.
The increased availability of health data has opened new possibilities for studying pharmacological treatments in everyday life. The thesis examines how medical treatments are implemented and how effective they are in the real world, integrating population-scale electronic health records (EHRs) and biobank resources. It addresses two closely connected themes: treatment dynamics (initiation, persistence, adherence) and treatment response, arguing that treatment effectiveness depends as much on the continuity and implementation of care as on biological efficacy. To support credible inference from routine data, the work emphasises scalable, harmonised, and interpretable analyses that bridge heterogeneous health systems and national contexts. Methodologically, the thesis uses survival models for time-varying exposures and functional data analysis to represent long-term treatment trajectories, complementing them with genetics (GWAS, polygenic scores) when biobank data are available. On the applied side, the thesis covers several types of data and treatments. Using electronic health records and administrative data from Lombardy, it quantifies the landscape and consequences of COVID-19 undervaccination with a dynamic exposure definition. Using Finnish nationwide registers and FinnGen (with replication in the Estonian Biobank), it characterises treatment implementation across five drug classes, distinguishing persistence from adherence and testing whether genetic predisposition adds explanatory or predictive value beyond social and clinical factors. Furthermore, it defines and investigates socioeconomic, health, and polytherapy determinants in more than one million individuals in Finland and Italy.
Finally, with laboratory and genetic data, it integrates short- and long-term LDL-cholesterol responses to statins, showing how far genetics informs baseline risk and how far treatment implementation determines an optimal therapeutic outcome. Across these applications, the thesis provides practical guidance for extracting and processing the information contained in EHRs in Italy, as well as reusable tools for working with data on dynamic treatments and for cross-cohort comparisons. The overall contribution is a set of reproducible approaches at scale that turn routine health data into scientific evidence on who starts, continues, and benefits from therapy. It also clarifies the limited but targeted roles of genetics in predicting response in treated populations. Finally, it informs decisions and clinical strategies that support sustainable treatment implementation, thereby moving from average effectiveness in the population to personalised, targeted effectiveness in everyday practice.
Understanding treatment dynamics and response from population electronic health records and Biobank data
Corbetta, Andrea
2025/2026
| File | Size | Format | |
|---|---|---|---|
| Understanding_Medication_Treatment_Dynamics_from_Population_Electronic_Health_Records_and_Biobank_Data_final.pdf (accessible online to everyone) | 10.81 MB | Adobe PDF | View/Open |
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/252910