Liposarcoma (LPS) is the second most frequent subtype of soft tissue sarcoma. It originates from adipose tissue and arises mainly in the limbs, the abdominal cavity and the retroperitoneum. The World Health Organization (WHO) has classified four molecular subtypes: well-differentiated LPS (WDLPS), dedifferentiated LPS (DDLPS), myxoid LPS (MLPS) and pleomorphic LPS (PLPS). Each is characterized by a distinct morphology and by specific molecular features that influence the clinical outcome, the sensitivity to drugs and the clinical treatments. MLPS accounts for more than 30% of LPS cases and is characterized by a reticulated vascularization, by spindle/ovoid cells in a myxoid stroma and by areas with a high density of round cells (RC). An RC fraction above 5% is associated with a worse clinical outcome. The main molecular feature of this tumor is the translocation t(12;16)(q13;p11), which gives rise to the FUS-CHOP fusion gene. The FUS-CHOP chimera plays a central role in the primary oncogenesis of MLPS because its unusual transcriptional activity leads to a block of adipocyte differentiation. Trabectedin is a drug with a complex mechanism of action: it acts as a transcriptional regulator, binds DNA covalently in a way that is unique among alkylating agents, affects not only tumor cells but also the tumor microenvironment, and acts differently in cells with defects in DNA repair pathways. Trabectedin has been approved for the treatment of soft tissue sarcomas by both the European Medicines Agency (EMA) and the U.S. Food and Drug Administration (FDA). Unfortunately, most patients who are initially sensitive to the drug develop resistance after prolonged treatment. This thesis was developed within this framework, as a collaboration between the Politecnico di Milano and the Istituto di Ricerche Farmacologiche Mario Negri, where trabectedin has been studied for years.
The aim is to study the mechanism of drug resistance arising in MLPS by investigating drug-induced changes at the genomic level through massively parallel sequencing techniques. Because MLPS is rare, the availability of patient biopsies is very limited. The Istituto di Ricerche Farmacologiche Mario Negri has therefore developed patient-derived xenograft (PDX) models, which accurately reflect the histological and pharmacological characteristics of the primary tumor and are recognized by the scientific community as among the few existing models of myxoid liposarcoma worldwide. For this study we used a trabectedin-sensitive model called ML017 and a model that was made resistant to the drug, called ML017/ET. Both underwent the same treatment cycle with trabectedin (0.15 mg/kg), once every seven days for three doses, and three time points were analyzed: 24 and 72 hours after the first dose, to study the early responses to the drug, and 15 days after the third dose, to study the late effects. In addition, the models were independently treated with the first-line drug for MLPS, doxorubicin (8 mg/kg), whose effect was analyzed 24 hours after the first dose. Finally, a sample of healthy tissue from the patient from whom the ML017 and ML017/ET PDX models were derived was provided by the IRCCS Fondazione Istituto Nazionale dei Tumori (INT) and used as the reference in the analysis. The study was carried out by means of next-generation sequencing (NGS), which allows the DNA of many samples to be sequenced in parallel and at high depth. Among the many sequencing strategies, such as whole exome sequencing or targeted resequencing, in this work we used a combined approach: a selection of 5971 disease-related genes plus a set of non-coding regions, called the "backbone", spread across the whole genome. Regardless of the specific technique, the analysis of NGS data includes some fundamental steps.
The first is quality control, which allows the exclusion of low-quality samples that could give rise to false positives. The second is alignment, which determines the position of each sequenced fragment, called a read, on a reference genome; here the hg19 version of the human genome was used. Many alignment tools have been developed over the years; one of the most widely used in bioinformatics is the Burrows-Wheeler Aligner (BWA), which was used for this purpose. At this stage the use of PDX models could compromise the data, because the tumors grow in mice and the human reads may be contaminated with murine ones. To avoid this bias, the Disambiguate software was used. Using the BWA algorithm, Disambiguate performs two alignments, one against the human reference genome hg19 and one against the murine reference mm10, and assigns each read to the genome it matches best. Reads assigned to mouse, or whose origin is uncertain, are discarded. From this point onwards the analysis becomes specific to the aim of the study, and here it was divided into two levels: the analysis of single nucleotide variants and the analysis of structural variants. Variant calling identifies loci in the genome where the aligned reads differ from the reference, such as single nucleotide polymorphisms (SNPs) and insertions/deletions (indels) of a few base pairs. First, we analyzed the individual, hereditary variants, called germline variants, which, if classified as pathogenic, could be associated with a predisposition to disease or tumors. To this end, the healthy tissue sample was studied with the GATK-haplotype software, a complex algorithm that allows this type of variant to be selected with high confidence. The analysis then focused on tumor-specific variants, which are called somatic variants.
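The human/mouse read assignment described above (align each read to both genomes and keep it only when the human alignment is clearly better) can be sketched as follows. This is a minimal illustration and not the actual implementation of the tool: the function names, the use of a single alignment score per genome and the example read names are all assumptions made for the sketch.

```python
from typing import Dict, List, Optional, Tuple


def assign_read(human_score: Optional[int], mouse_score: Optional[int]) -> str:
    """Assign one read to 'human', 'mouse' or 'ambiguous'.

    Scores are alignment scores (higher means a better match);
    None means the read did not align to that genome.
    """
    if human_score is None and mouse_score is None:
        return "ambiguous"
    if mouse_score is None:
        return "human"
    if human_score is None:
        return "mouse"
    if human_score > mouse_score:
        return "human"
    if mouse_score > human_score:
        return "mouse"
    return "ambiguous"  # equal scores: origin cannot be decided


def keep_human_reads(scores: Dict[str, Tuple[Optional[int], Optional[int]]]) -> List[str]:
    """Return the names of reads assigned to human; mouse and ambiguous reads are dropped."""
    return [name for name, (h, m) in scores.items() if assign_read(h, m) == "human"]


# Hypothetical reads: name -> (score on hg19, score on mm10)
example = {
    "read_1": (60, 20),    # better on human
    "read_2": (15, 58),    # better on mouse
    "read_3": (40, 40),    # tie
    "read_4": (55, None),  # aligned to human only
}
kept = keep_human_reads(example)
```

Discarding ties as well as mouse reads is the conservative choice: it loses some human reads from regions conserved between the two species, but it prevents murine sequence from being reported as a human variant.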
Variant calling algorithms extract somatic mutations by comparing paired samples (e.g. healthy-tumor). The scientific literature suggests the combined use of different variant callers, so in this work we used MuTect2, which applies stringent filters, paired with VarDict, which is less strict. Many public databases collect information on the clinical significance of variants. ClinVar is one of the most popular and up-to-date; it collects the clinical information associated with the variants reported in scientific studies. In addition, predictive algorithms have been developed, such as CADD (Combined Annotation Dependent Depletion), which assigns a single score to each variant: the higher the score, the greater the predicted pathogenicity. The second level of analysis involved the study of genomic instabilities that act at the structural level, such as large deletions or duplications, translocations and inversions. These events, although less frequent than SNPs, can have a great phenotypic impact. The algorithms that identify these variants analyze the arrangement of the sequenced fragments and build statistical models that allow the variants to be classified in a tumor-healthy paired analysis. Among these, CNVkit is specific for the recognition of amplified or deleted regions, also defined as Copy Number Variation (CNV), while Manta identifies events such as fusion genes or inversions. The germline analysis identified 7172 variants; however, none of them was evaluated as pathogenic according to the clinical databases. Among them, three were in the PIK3CA gene, already associated with MLPS, and one was in the PTEN gene, confirming a variant previously diagnosed at the Istituto dei Tumori di Milano, where the patient was being treated. The absence of pathogenic events allowed us to exclude a specific predisposition affecting the response to the drug.
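As a rough illustration of combining two somatic callers, the sketch below labels each variant by which caller reported it. The text does not state how the MuTect2 and VarDict call sets were merged, so the rule used here (calls made by both callers are treated as higher confidence) is an assumption, as are the function name and the example variants.

```python
from typing import Dict, Set, Tuple

# A variant is identified by (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def label_calls(strict_calls: Set[Variant], permissive_calls: Set[Variant]) -> Dict[Variant, str]:
    """Label every variant by the caller(s) that reported it."""
    labels: Dict[Variant, str] = {}
    for variant in strict_calls | permissive_calls:
        if variant in strict_calls and variant in permissive_calls:
            labels[variant] = "both"
        elif variant in strict_calls:
            labels[variant] = "strict_only"
        else:
            labels[variant] = "permissive_only"
    return labels


# Hypothetical call sets, not data from this study.
strict = {("chr4", 1000, "C", "T")}
permissive = {("chr4", 1000, "C", "T"), ("chr7", 2000, "G", "A")}
labels = label_calls(strict, permissive)
```

Keeping the label rather than discarding single-caller variants leaves the decision to a later step, where annotation or manual review can rescue a real variant that the stricter caller filtered out.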
The somatic variant analysis identified a group of genes, namely MUC7, KIF5B, PLCB4, SEC16B, AGA, FAT4, ABCG1, PTEN, ANK3, PIK3CA, DSC3 and IMPDH2, that were mutated at the same loci in all samples and in all conditions of the sensitive model ML017, with an allelic fraction of around 50%. This evidence, together with the diploidy of these samples, prompted us to hypothesize the presence of a dominant clone that is not subject to the selective pressure of trabectedin. The same mutations were found in ML017/ET, together with six further genes mutated exclusively in this model, namely KMT2C, SLC1A2, SSTR5, NLGN1, NUP155 and UVSSA, again with an allelic fraction of around 50%, with the exception of UVSSA, whose fraction was 100%. This gene is of particular interest because it is involved in the transcription-coupled nucleotide excision repair (TC-NER) pathway. As reported in the literature, cells with defects in this pathway are resistant to trabectedin, so the appearance of the UVSSA mutation only in the ML017/ET model could be associated with drug resistance. We then moved to the study of structural variants. We found and confirmed in both models the hallmark of MLPS, the FUS-CHOP translocation. Interestingly, we also identified a reciprocal, balanced fusion between the sequences of CHOP and FUS that are not involved in the canonical chimera, which we named CHOP-FUS. This fusion has never been described in the literature, so it would be worth investigating its role should it prove to be translated into protein. Apart from these two structural events, no other fusion genes were detected in either model, including after treatment, so it can be hypothesized that trabectedin does not act at this genomic level.
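The allelic fractions quoted above can be read as simple read-count ratios. The sketch below computes the fraction of reads supporting a variant and gives a coarse interpretation for a diploid region; the 0.1 tolerance, the labels and the read counts are illustrative assumptions, not values from this study.

```python
def allelic_fraction(alt_reads: int, ref_reads: int) -> float:
    """Fraction of reads at a locus that carry the variant allele."""
    depth = alt_reads + ref_reads
    if depth == 0:
        raise ValueError("no coverage at this locus")
    return alt_reads / depth


def interpret(vaf: float, tolerance: float = 0.1) -> str:
    """Coarse reading of an allelic fraction in a diploid, pure sample.

    The tolerance is an arbitrary illustrative margin for sampling noise.
    """
    if abs(vaf - 0.5) <= tolerance:
        # One of two copies mutated in essentially every cell.
        return "clonal_heterozygous"
    if vaf >= 1.0 - tolerance:
        # Every copy present at the locus carries the variant.
        return "all_copies"
    return "other"


# Hypothetical read counts.
vaf_a = allelic_fraction(alt_reads=98, ref_reads=102)  # 0.49
vaf_b = allelic_fraction(alt_reads=150, ref_reads=0)   # 1.0
```

A fraction near 50% in a diploid region is what a heterozygous mutation carried by all tumor cells would produce, which is why it supports the idea of a dominant clone. A fraction near 100% means no unmutated copy remains, either because both copies are mutated or because the other copy has been lost.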
The study of CNV in the ML017 model showed that chromosomal instability in the untreated control is limited to a few focal areas and that, after treatment with either trabectedin or doxorubicin, the regions involved in CNV events do not increase substantially; new CNVs were found on chromosomes 2, 3, 6, 8 and 13 after trabectedin treatment. The resistant model ML017/ET showed greater instability than ML017, with a higher number of focal regions across almost all chromosomes and three large altered regions: deletions on chromosomes 1 and 3 and an amplification on chromosome 8. However, this CNV profile remains almost unchanged after the administration of either trabectedin or doxorubicin. To associate a biological and functional impact with the CNV regions, we focused on the genes they contain. This level of investigation identified a group of genes, namely HOXD13, HOXD11, HOXD12, HOXD10, HOXD-AS2, MIR7704, HOXD9, MIR10B, LOC4010201, HOXD1, HOXD3, HOXD4, HAGLR, HAGLROS, EVX2 and HOXD8, which were amplified exclusively after treatment with trabectedin in the sensitive model ML017 and which remain amplified in the resistant model ML017/ET. All of these genes lie in the 2q31.1 cytoband and most of them are members of the HOXD family, which is mainly involved in embryonic development but whose over-expression has been associated with cancer. Given this evidence, this group of genes could represent a clonal characteristic that develops in response to the drug and is maintained in the resistant state. A second group of interest is related to the deletion of the p arm of chromosome 4, found in the resistant model only. This chromosomal region also includes the UVSSA gene. The mutation already mentioned and this deletion of the entire UVSSA gene reinforce the hypothesis of a damaged TC-NER pathway, so UVSSA could be considered the main cause of drug resistance.
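CNV callers that work from read depth typically report, for each region, a log2 ratio of tumor coverage to reference coverage. A minimal sketch of converting such a ratio into an approximate copy number is given below; it assumes a diploid reference and a pure tumor sample, and the function names, labels and rounding rule are illustrative rather than CNVkit's own calling method.

```python
def copies_from_log2(log2_ratio: float, ploidy: int = 2) -> float:
    """Approximate copy number implied by a log2 coverage ratio (pure sample assumed)."""
    return ploidy * 2 ** log2_ratio


def classify_region(log2_ratio: float, ploidy: int = 2) -> str:
    """Round the implied copy number to an integer and name the event."""
    copies = round(copies_from_log2(log2_ratio, ploidy))
    if copies == 0:
        return "homozygous deletion"
    if copies < ploidy:
        return "heterozygous loss"
    if copies == ploidy:
        return "neutral"
    return "gain"


# Hypothetical log2 ratios for four regions.
calls = [classify_region(x) for x in (-3.0, -1.0, 0.0, 0.58)]
```

With two reference copies, a log2 ratio of -1 corresponds to one remaining copy and a ratio of about 0.58 to three copies. A strongly negative ratio rounds to zero copies, which is how a homozygous deletion would appear.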
Independent digital droplet PCR experiments confirmed that neither the mutation nor the deletion in UVSSA is present in the sensitive model ML017; both arose after several cycles of treatment, during the acquisition of resistance. It can therefore be hypothesized that resistance in MLPS is an acquired event associated with the selective pressure of the drug rather than an innate feature. Given the strong evidence that the response to trabectedin depends on defects in DNA repair pathways, and having identified no somatic mutations related to this function other than the one in UVSSA, we investigated the copy number status of genes belonging to the following pathways: TC-NER, Homologous Recombination (HR) and DNA mismatch repair (MMR). Two genes were identified, GTF2H2 and GTF2H4, both belonging to the TFIIH complex, which plays an important role in the TC-NER pathway. GTF2H2 shows a heterozygous deletion in the sensitive model only, which suggests that it is not related to the acquired resistance process, while GTF2H4 is homozygously deleted in both models, a clonal characteristic that is maintained in the tumor; its loss could therefore be a necessary but not sufficient condition for resistance to trabectedin. In conclusion, this work points to UVSSA as the core event in the mechanism of acquired resistance in MLPS and shows that this mechanism is driven by the selective pressure of the drug. The study also has limitations, such as the use of a single patient-derived xenograft and the absence of a patient cohort. However, this was an obligatory choice given the rarity of the disease. Future developments include laboratory experiments to further clarify the role of UVSSA in drug resistance. To this end, experiments on cell lines have already been set up.
Furthermore, transcriptomic investigations through RNA-Seq and epigenomic investigations through ChIP-Seq are planned, which could provide a complete view of the mechanisms underlying the response to the drug. Finally, a small cohort of patient biopsies is available on which these data can be validated.
Liposarcoma (LPS) is the second most common subtype of soft tissue sarcoma; it originates from adipose tissue and arises mainly in the limbs and in the abdominal or retroperitoneal regions. The World Health Organization (WHO) has classified four molecular subtypes of LPS: well-differentiated liposarcoma (WDLPS), dedifferentiated liposarcoma (DDLPS), myxoid liposarcoma (MLPS) and pleomorphic liposarcoma (PLPS). Each is characterized by a distinct morphology and by specific molecular features that affect the clinical course, the sensitivity to drugs and therefore the potential clinical treatments. In particular, MLPS accounts for more than 30% of liposarcoma cases and is characterized by a reticulated vascularization, by spindle/ovoid cells in a myxoid stroma and by areas of high cell density, referred to as round cell (RC) areas. An RC content above 5% is associated with a worse clinical outcome. At the molecular level, the most important genetic feature of this tumor is the translocation t(12;16)(q13;p11), which gives rise to the FUS-CHOP fusion gene. The FUS-CHOP chimera plays a fundamental role in the primary oncogenesis of MLPS because of its unusual transcriptional activity, which leads to a block of adipocyte differentiation. Trabectedin is a drug with a complex mechanism of action: it acts as a transcriptional regulator, forms strong bonds with DNA in a way that is unique among alkylating agents, affects not only tumor cells but also the tumor microenvironment, and acts differently in cells with defects in DNA repair pathways.
Trabectedin has been approved for the treatment of soft tissue sarcomas by both the European Medicines Agency (EMA) and the U.S. Food and Drug Administration (FDA). Unfortunately, most patients who are initially sensitive to the drug stop responding to therapy after prolonged treatment, having developed what is known as drug resistance. This thesis fits into this context and was developed in collaboration between the Politecnico di Milano and the Istituto di Ricerche Farmacologiche Mario Negri, where trabectedin has been studied for years. The aim is to study the mechanism of drug resistance that arises in MLPS by investigating drug-induced changes at the genomic level using massively parallel sequencing techniques. Because MLPS is rare, patient biopsies are very scarce. The Istituto di Ricerche Farmacologiche Mario Negri has therefore developed patient-derived xenograft (PDX) models that faithfully reflect the histological and pharmacological characteristics of the primary tumor and are recognized in the scientific community as among the few existing models of myxoid liposarcoma. This study used the trabectedin-sensitive model ML017 and the model ML017/ET, which was made resistant to the drug. Both models underwent the same treatment cycle with trabectedin (0.15 mg/kg), once every seven days for three doses, and three time points were analyzed: 24 and 72 hours after the first dose, to study the early responses to the drug, and 15 days after the third dose, to study its late effects. In addition, the models were independently treated with the first-line drug for MLPS, doxorubicin (8 mg/kg), whose effect was analyzed 24 hours after the first dose.
Finally, the IRCCS Fondazione Istituto Nazionale dei Tumori (INT) provided a sample of healthy tissue from the patient from whom the ML017 and ML017/ET PDX models were derived, which was used as the reference in the analyses. The study was carried out using next-generation sequencing (NGS), which makes it possible to sequence the DNA of many samples in a single run, in parallel and at high depth. Among the many sequencing strategies, which include for example the analysis of the whole exome or of a small gene panel, this work adopted a combined approach: a selection of 5971 disease-associated genes plus a set of non-coding regions, called the "backbone", spread across the whole genome. Regardless of the technique used, NGS data analysis involves some fundamental steps. The first is quality control, which excludes samples that do not meet the minimum parameters required for processing and that could give rise to false positives. The second is sequence alignment, which determines the position of each sequenced fragment on the reference genome; in this study the hg19 version of the human genome was chosen. Several algorithms for aligning genomic sequences have been developed over the years, and this work applied the Burrows-Wheeler Aligner (BWA), one of the most widely used for this purpose in the scientific community. In this specific case, since the tumor grows in mice, contamination by the murine genome must be expected.
Therefore, to distinguish human from murine sequences, the Disambiguate software was used: using the BWA algorithm, it performs two alignments, one against the human reference genome hg19 and one against the murine reference mm10, and assigns each sequence to the genome it matches best. Sequences assigned to the murine genome, or whose origin is uncertain, are discarded. From this point on, the data analysis becomes specific to the aim pursued, and here it was divided into two levels: the analysis of single nucleotide variants and the analysis of structural variants. Variant calling identifies loci in the genome where the aligned sequences, known as reads, differ from the reference genome, that is, single nucleotide polymorphisms (SNPs) and insertions/deletions (indels) of a few base pairs. First, we analyzed the hereditary mutations characteristic of the individual, called germline variants, which, if classified as pathogenic, may be associated with a predisposition to disease or tumors. To this end, the healthy tissue sample was studied with the GATK-haplotype software, which allows variants of this type to be selected with high confidence. Second, the analysis focused on somatic variants, that is, variants characteristic of the tumor tissue; these are identified with algorithms called variant callers, which extract mutations through comparisons on paired samples of the healthy-tumor type. The scientific literature suggests the combined use of several variant callers, so this work used MuTect2 and VarDict, the former very stringent in its filters and the latter more permissive. Variant databases are constantly updated and make it possible to classify mutations according to their impact on protein translation.
The ClinVar database, which collects the clinical information associated with the variants reported in scientific studies, serves this purpose. In addition, predictive algorithms have been developed, such as CADD (Combined Annotation Dependent Depletion), which assigns a single score to each variant: the higher the score, the greater the associated pathogenicity. The second level of analysis concerned the genomic instabilities that act at the structural level, that is, large deletions or duplications, translocations and chromosomal inversions. Although statistically less frequent in the human genome than SNPs, these events can have a significant phenotypic impact. The algorithms that identify these variants analyze the arrangement of the sequenced fragments and build models that allow the type of structural aberration to be assigned, again through paired analyses of the healthy-tumor type. Among these, CNVkit is specific for recognizing amplified or deleted regions, also defined as copy number variation (CNV) relative to a normal reference, while Manta identifies events such as fusion genes or inversions. The germline analysis identified 7172 variants; however, none of them was evaluated as pathogenic according to the reference clinical databases. Variants were nevertheless found in the PIK3CA gene, already associated with MLPS, and in the PTEN gene, confirming a variant previously diagnosed at the Istituto dei Tumori di Milano, where the patient was being treated. The absence of pathogenic events made it possible to exclude a specific predisposition affecting the response to the drug.
Through the somatic variant analysis we identified a group of genes, namely MUC7, KIF5B, PLCB4, SEC16B, AGA, FAT4, ABCG1, PTEN, ANK3, PIK3CA, DSC3 and IMPDH2, mutated at the same loci in all samples and in all conditions of the sensitive model ML017, with an allelic fraction of around 50%. This information, together with the diploidy of these samples, led us to hypothesize the presence of a dominant clone that is not subject to the selective pressure of trabectedin. The same mutations were also found in ML017/ET, in addition to six further genes mutated exclusively in this model, namely KMT2C, SLC1A2, SSTR5, NLGN1, NUP155 and UVSSA, again with an allelic fraction of around 50%, with the exception of UVSSA, which carries a variant with a fraction of 100%. This last gene is of interest because of its involvement in the DNA repair pathway known as transcription-coupled nucleotide excision repair (TC-NER). Cell systems with defects in this pathway are known to be resistant to trabectedin, so the appearance of the UVSSA mutation only in the ML017/ET model could be associated with drug resistance. From the study of structural variants we found and confirmed in both models the translocation characteristic of MLPS, the FUS-CHOP fusion. We also identified a reciprocal, balanced fusion between the sequences of CHOP and FUS that are not involved in the canonical chimera, which we named CHOP-FUS. This fusion has never been described in the literature, so it would be very interesting to investigate its role should it be confirmed that it can be translated into protein.
Apart from these two structural events, no other fusion genes were identified after treatment with trabectedin in either model, so it can be hypothesized that trabectedin does not act at this genomic level. The study of copy number variations (CNV) in the ML017 model showed that chromosomal instability in the untreated control is limited to a few focal areas and that, after treatment with either trabectedin or doxorubicin, the regions involved in CNV events do not increase substantially, with new CNVs only on chromosomes 2, 3, 6, 8 and 13 in the case of trabectedin. The resistant model ML017/ET shows greater instability than the sensitive model ML017, with a larger number of focal regions affecting almost all chromosomes and three large altered regions, deleted on chromosomes 1 and 3 and amplified on chromosome 8; this profile, however, remains almost constant after the administration of trabectedin and doxorubicin. To associate a biological and functional impact with the CNV regions, we focused on the genes they contain. This level of investigation identified, in the sensitive model ML017, a group of genes amplified exclusively after treatment with trabectedin, which remain in the same condition in the resistant model ML017/ET, pointing to a clonal characteristic that develops in response to the drug and is maintained in the resistant state. This group includes HOXD13, HOXD11, HOXD12, HOXD10, HOXD-AS2, MIR7704, HOXD9, MIR10B, LOC4010201, HOXD1, HOXD3, HOXD4, HAGLR, HAGLROS, EVX2 and HOXD8, all located in the 2q31.1 cytoband. Most of them belong to the HOXD family, which is mainly involved in embryonic development but whose over-expression has been associated with cancer.
A second group of interest is related to the deletion of the p arm of chromosome 4, found exclusively in the resistant model ML017/ET, an alteration that also includes the UVSSA gene. This deletion, which affects the entire UVSSA gene, together with the mutation identified earlier, reinforces the hypothesis that the TC-NER pathway is compromised, so this gene could be the main cause of drug resistance. Independent laboratory experiments using digital droplet PCR confirmed that this mutation is not present in the sensitive model ML017, and the same holds for the deletion. It is therefore possible to speak of a process of resistance acquisition driven by the selective pressure of the drug, rather than of innate resistance. Given the strong dependence of the response to trabectedin on the state of the DNA repair pathways, and having identified no somatic mutation related to this function other than the one in UVSSA, we investigated the copy number status of the genes belonging to the main repair pathways, namely TC-NER, Homologous Recombination (HR) and DNA mismatch repair (MMR). Two genes, GTF2H2 and GTF2H4, belong to the TFIIH complex, which plays an important role in the TC-NER pathway. The first shows a heterozygous deletion in the sensitive model only, indicating a characteristic that cannot be associated with resistance. The second is homozygously deleted in both models, a clonal characteristic that is maintained in the tumor and that could therefore be a necessary but not sufficient condition for the impairment of the pathway and the consequent resistance to trabectedin.
In conclusion, we believe that this work has identified the UVSSA gene as the main cause of the mechanism of resistance to trabectedin in MLPS and has shown that this process derives from the selective pressure of the drug. The study certainly has limitations, such as the use of a single patient-derived xenograft model and the absence of a patient cohort. However, this was an obligatory choice given the rarity of the disease. Future developments include laboratory experiments to further clarify the role of UVSSA in drug resistance. To this end, studies on cell lines have been started. In addition, investigations at the transcriptomic level through RNA-Seq and at the epigenomic level through ChIP-Seq are planned, to provide a complete view of the mechanisms underlying the response to the drug. Finally, a small cohort of patient biopsies is expected to become available, on which these data can be validated.
Study of genetic alterations to characterize the mechanism of resistance to trabectedin in myxoid liposarcoma models using massively parallel DNA sequencing techniques
PACCHIAROTTI, MICHELE
2017/2018
Abstract
Liposarcoma (LPS) is the second most frequent subtype of soft tissue sarcoma, originating from adipose tissue and arising mainly from the limbs and the abdominal cavity or from the retroperitoneal areas. The World Health Organization (WHO) has classified four molecular subtypes: well-differentiated LPS, WDLPS; dedifferentiated LPS, DDLPS; myxoid LPS, MLPS; pleomorphic LPS, PLPS. Each of them is characterized by a distinct morphology and by specific molecular features that influence the clinical outcome, the sensitivity to drugs and the clinical treatments. MLPS represents more than 30% of cases of LPS, it is characterized by a reticulated vascularization, by the presence of spindle/ovoid cells in a myxoid stroma and by the presence of areas with high density of round cells (RC). A fraction of RC more than 5% is associated to a worse clinical outcome. The translocation t(12;16)(q13;p11) gives rise to the FUS-CHOP fusion gene, which is the main molecular feature of this tumor. The FUS-CHOP chimera plays a main role in primary oncogenesis of MLPS due to its unusual transcriptional activity that leads to a block of adipocyte differentiation. Trabectedin is a drug characterized by a complex mechanism of action, it acts as a transcriptional regulator, establishes covalent bonds with the DNA in a way that is unique in the category of alkylating agents, influences not only tumoral cells but also the tumor microenvironment and it also acts differently in cells with deficit in the DNA repair pathway. Trabectedin has been approved for the treatment of soft tissue sarcomas by both the European Medicines Agency (EMA) and the U.S. Food and Drug Administration (FDA). Unfortunately, most patients initially sensitive to the drug develop drug resistance after prolonged treatments. This thesis has been developed in this framework in collaboration between the Politecnico di Milano and the Istituto di Ricerche Farmacologiche Mario Negri where trabectedin has been studied for years. 
The aim is to study the mechanism of drug-resistance arising in MLPS by investigating drug-induced changes at the genomic level through massive parallel sequencing techniques. Due to the rarity of MLPS the availability of patient biopsies is very limited. Thus, the Isti-tuto di Ricerche Farmacologiche Mario Negri has developed patient-derived xenograft (PDX) models, which accurately reflect the histological and pharmacological characteristics of the primary tumor and are recognized by the scientific community as one of the only ex-isting models of myxoid liposarcoma worldwide. For this study we used a model sensitive to trabectedin called ML017 and a model which was made resistant to the drug called ML01/ET. Both of them underwent the same cycle of treatment with trabectedin (0.15 mg/kg) once every seven days for three times and three time points were analyzed, such as 24 and 72 hours after the first dose, to study the early responses to the drug, and 15 days after the third dose, to study the late effects. In addition, the models were also inde-pendently treated with the first-line drug used for MLPS, doxorubicin (8 mg/kg), whose effect was analyzed at 24 hours after the first dose. Finally, a sample of healthy tissue of the patient from which the PDX ML017 and ML017/ET models were derived was provided by the IRCCS Fondazione Istituto Na-zionale dei tumori (INT) and used as reference in the analysis. This study was carried out by means of next-generation sequencing (NGS) which allows the parallel sequencing of the DNA of many samples simultaneously at high depth. Among many sequencing strategies such as whole exome sequencing or targeted resequencing, in this work we used a combined approach through the selection of 5971 disease-related genes and a set of non-coding regions called "backbone" spread all over the genome. Regardless the specific technique, the analysis of the NGS data includes some fundamental steps. 
The first is quality control, which allows the exclusion of low-quality samples that could give false positive results. The second step is alignment, which determines the position of each sequenced fragment, called a read, on a reference genome, here the hg19 version of the human genome. Many alignment tools have been developed over the years; one of the most widely used in bioinformatics is the Burrows-Wheeler Aligner (BWA), which was used for this purpose. At this stage the use of PDX models could invalidate the data because of possible contamination between the human and murine genomes. To avoid this bias, the Disambiguate software was used. Relying on BWA, Disambiguate compares two alignments, one to the human reference genome hg19 and one to the murine mm10, and attributes each sequence to the genome it matches best. Sequences attributed to mouse or of uncertain origin are discarded. From this point onwards, the analysis becomes specific to the purpose of the study, and in this case it was divided into two levels: the analysis of single nucleotide variants and the analysis of structural variants. Variant calling identifies the loci of the genome at which the aligned reads differ from the reference, such as single nucleotide polymorphisms (SNPs) and insertions/deletions (indels) of a few base pairs. First, we analyzed the individual, hereditary variants, called germline variants, which, if classified as pathogenic, could be associated with a predisposition to diseases or tumors. To this aim, the healthy tissue sample was studied using GATK HaplotypeCaller, a complex and articulated algorithm that allows this type of variant to be selected with high confidence. The analysis then focused on tumor-specific variants, which are called somatic variants.
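The human/mouse read assignment described above can be illustrated with a minimal sketch. It is not the Disambiguate implementation: it assumes each read has already been given one alignment score per genome (the names `assign_species` and `keep_human_reads` and the example scores are hypothetical), and it only shows the decision rule of keeping reads that match human better and discarding mouse or uncertain ones.

```python
from typing import Dict, List, Optional, Tuple


def assign_species(human_score: Optional[int], mouse_score: Optional[int]) -> str:
    """Assign a read to 'human', 'mouse' or 'ambiguous'.

    A score of None means the read did not align to that genome.
    Higher scores mean a better alignment (an assumption of this sketch).
    """
    if human_score is None and mouse_score is None:
        return "ambiguous"
    if mouse_score is None:
        return "human"
    if human_score is None:
        return "mouse"
    if human_score > mouse_score:
        return "human"
    if mouse_score > human_score:
        return "mouse"
    return "ambiguous"  # equal scores: origin cannot be decided


def keep_human_reads(scores: Dict[str, Tuple[Optional[int], Optional[int]]]) -> List[str]:
    """Return the names of reads confidently assigned to human, in input order."""
    return [name for name, (h, m) in scores.items() if assign_species(h, m) == "human"]


# Hypothetical per-read scores: (score on hg19, score on mm10)
example_scores = {
    "read_1": (60, 20),    # clearly human
    "read_2": (15, 58),    # clearly mouse, discarded
    "read_3": (40, 40),    # tie, discarded as uncertain
    "read_4": (55, None),  # aligned to human only
}
human_reads = keep_human_reads(example_scores)
```

Discarding ties as well as mouse reads trades a small loss of coverage for a lower risk of calling murine sequence as a human variant.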
Variant calling algorithms extract such mutations through comparisons on paired samples (e.g. healthy-tumor). The scientific literature suggests the combined use of different variant callers; therefore, in this work we used MuTect2, an algorithm that applies hard filters, paired with VarDict, which is less strict. Many public databases collect information on the clinical significance of variants. ClinVar is one of the most popular and up-to-date databases collecting the clinical information associated with the variations reported in scientific studies. Predictive algorithms have also been developed, such as CADD (Combined Annotation Dependent Depletion), which associates a single score with each variant: the higher the score, the more likely the variant is to be deleterious. The second level of analysis involved the study of genomic instabilities that act at the structural level, such as large deletions or duplications, translocations and inversions. These events, although less frequent than SNPs, can have a great phenotypic impact. The algorithms that identify these variants analyze the arrangement of the sequenced fragments and build statistical models that allow the classification of variants in a tumor-healthy paired analysis. Among these, CNVkit is specific for the recognition of amplified or deleted regions, also defined as Copy Number Variations (CNVs), while Manta allows the identification of events such as fusion genes or inversions. Through the analysis of germline variants we identified 7172 variants; however, none of them was evaluated as pathogenic according to the clinical databases. Among them, three belonged to the PIK3CA gene, already associated with MLPS, and one belonged to the PTEN gene; the latter confirmed a variant previously diagnosed at the Istituto dei Tumori di Milano, where the patient was being treated. The absence of pathogenic events made it possible to exclude a specific predisposition affecting the response to the drug.
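The combined use of a strict and a less strict somatic caller can be sketched as a comparison of two call sets. The abstract does not state how the MuTect2 and VarDict calls were merged, so the sketch below only partitions the calls into those shared by both callers and those unique to each; the `Variant` type, the `combine_calls` function and the example coordinates are hypothetical.

```python
from typing import Dict, NamedTuple, Set


class Variant(NamedTuple):
    """A variant identified by position and alleles (illustrative only)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def combine_calls(strict_calls: Set[Variant], lenient_calls: Set[Variant]) -> Dict[str, Set[Variant]]:
    """Partition calls from a strict and a lenient caller.

    'both' holds variants supported by the two callers, which are usually
    treated as the most reliable; the other two groups need closer review.
    """
    return {
        "both": strict_calls & lenient_calls,
        "strict_only": strict_calls - lenient_calls,
        "lenient_only": lenient_calls - strict_calls,
    }


# Made-up call sets; positions do not correspond to real findings.
strict = {Variant("chr4", 100, "A", "T"), Variant("chr10", 200, "G", "C")}
lenient = {Variant("chr4", 100, "A", "T"), Variant("chr12", 300, "C", "G")}
combined = combine_calls(strict, lenient)
```

Keeping the three groups separate, rather than taking only the intersection or the union, leaves the choice between sensitivity and specificity to a later filtering step.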
Through the analysis of somatic variants we identified a group of genes, namely MUC7, KIF5B, PLCB4, SEC16B, AGA, FAT4, ABCG1, PTEN, ANK3, PIK3CA, DSC3 and IMPDH2, mutated at the same loci in all samples and in all conditions in the sensitive model ML017, with an allelic fraction of around 50%. This evidence, together with the diploidy associated with these samples, prompted us to hypothesize the presence of a dominant clone not subjected to the selective pressure of trabectedin. The same mutations were also found in ML017/ET, together with six other genes mutated exclusively in this model, namely KMT2C, SLC1A2, SSTR5, NLGN1, NUP155 and UVSSA, also with an allelic fraction of around 50%, with the exception of UVSSA, whose fraction was 100%. This gene is very interesting since it is involved in the transcription-coupled nucleotide excision repair (TC-NER) pathway. As reported in the literature, cells with deficits in this pathway exhibit resistance to trabectedin, and for this reason the onset of the UVSSA mutation only in the ML017/ET model could be associated with drug resistance. We then moved to the study of structural variants. We found and confirmed in both models the presence of the main feature of MLPS, the FUS-CHOP translocation. Interestingly, we also identified a reciprocal and balanced fusion between the sequences of CHOP and FUS that are not involved in the canonical chimera, which we named CHOP-FUS. This fusion has never been described in the literature; it would therefore be worth investigating its role should it be translated into protein. Apart from these two structural events, no other fusion genes were detected, either in the two models or after treatment, so it could be hypothesized that trabectedin does not act at this genomic level.
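The allelic fractions quoted above (around 50% for most genes, 100% for UVSSA) follow from a simple ratio of read counts. The sketch below is an illustration under stated assumptions, not the procedure used in the thesis: it assumes a pure, diploid sample, and the function names, the tolerance of 0.1 and the labels are hypothetical.

```python
def allelic_fraction(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at a locus that carry the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must be between 0 and total_reads")
    return alt_reads / total_reads


def interpret_fraction(vaf: float, tolerance: float = 0.1) -> str:
    """Rough interpretation for a pure diploid sample (assumed here).

    Around 0.5: the variant sits on one of two copies in essentially all cells.
    Around 1.0: no reference allele remains, e.g. the other copy is lost or mutated.
    """
    if abs(vaf - 0.5) <= tolerance:
        return "heterozygous-like"
    if vaf >= 1.0 - tolerance:
        return "homozygous-like"
    return "other"


# Made-up read counts for illustration.
clonal_het = interpret_fraction(allelic_fraction(48, 100))
no_reference_left = interpret_fraction(allelic_fraction(100, 100))
```

Under these assumptions, a fraction near 100% is what one would expect when a mutation on one copy coincides with the loss of the other copy.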
The study of CNVs in the ML017 model showed that chromosomal instability in the untreated control is limited to a few focal areas and that, after treatment with either trabectedin or doxorubicin, the regions involved in CNV events do not increase substantially, with new CNVs found on chromosomes 2, 3, 6, 8 and 13 after trabectedin treatment. In the resistant model ML017/ET we observed greater instability than in ML017, with a higher number of focal regions across the chromosomes and the presence of three large altered regions: deletions on chromosomes 1 and 3 and an amplification on chromosome 8. However, this CNV profile remains almost unchanged after the administration of either trabectedin or doxorubicin. In order to associate a biological and functional impact with the CNV regions, we focused on the genes located within them. This level of investigation allowed the identification of a group of genes, namely HOXD13, HOXD11, HOXD12, HOXD10, HOXD-AS2, MIR7704, HOXD9, MIR10B, LOC4010201, HOXD1, HOXD3, HOXD4, HAGLR, HAGLROS, EVX2 and HOXD8, which were amplified exclusively after treatment with trabectedin in the sensitive model ML017 and which remain amplified in the resistant model ML017/ET. All these genes lie in the 2q31.1 region and most of them are members of the HOXD family, which is mainly involved in embryonic development but whose over-expression has been associated with cancer. Given this evidence, this group of genes could represent a clonal characteristic that develops in response to the drug and is maintained in the resistant state. A second interesting group is related to the deletion of the p arm of chromosome 4 in the resistant model only. This chromosomal region also includes the UVSSA gene. The already mentioned mutation, together with this deletion spanning the entire UVSSA gene, reinforces the hypothesis of a damaged TC-NER pathway; UVSSA could therefore be considered the main cause of drug resistance.
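Amplified and deleted regions such as those described above are commonly reported as log2 ratios of tumor to reference read depth. As a hedged illustration, assuming a pure diploid sample, the ratio can be converted to an approximate copy number with CN = 2 · 2^(log2 ratio); the function names and labels below are hypothetical and the thesis may have used different thresholds.

```python
import math


def copy_number_from_log2(log2_ratio: float, ploidy: int = 2) -> float:
    """Approximate absolute copy number from a log2 depth ratio.

    Assumes a pure sample with the given baseline ploidy, so a ratio of 0
    corresponds to the baseline copy number.
    """
    return ploidy * 2 ** log2_ratio


def classify_cnv(log2_ratio: float, ploidy: int = 2) -> str:
    """Label a region by its rounded copy number (illustrative labels)."""
    copies = round(copy_number_from_log2(log2_ratio, ploidy))
    if copies == 0:
        return "homozygous deletion"
    if copies < ploidy:
        return "loss"
    if copies == ploidy:
        return "neutral"
    return "gain"


# Made-up log2 ratios for illustration.
single_copy_loss = classify_cnv(-1.0)           # 2 * 2**-1 = 1 copy
one_extra_copy = classify_cnv(math.log2(1.5))   # about 3 copies
no_copies_left = classify_cnv(-5.0)             # close to 0 copies
```

In real tumor samples, contamination by normal cells pulls log2 ratios toward zero, so the rounded copy number is only a first approximation.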
An independent technique, digital droplet PCR, confirmed that neither the mutation nor the deletion in UVSSA is present in the sensitive model ML017, and that both occurred after several cycles of treatment, during the acquisition of resistance. Therefore it could be hypothesized that resistance in MLPS is an acquired event associated with the selective pressure of the drug rather than an innate feature. Given the strong evidence of a different response to trabectedin in relation to deficits in DNA repair pathways, and not having identified any somatic mutation related to this function other than that in UVSSA, we investigated the copy number status of genes belonging to one of the following pathways: TC-NER, Homologous Recombination (HR) and DNA mismatch repair (MMR). Two genes were identified, GTF2H2 and GTF2H4, both belonging to the TFIIH complex, which plays an important role in the TC-NER pathway. GTF2H2 is completely lost in heterozygosity, which suggests that it is not related to the acquired resistance process, while GTF2H4 is deleted in homozygosity in both models, a clonal characteristic that is maintained in the tumor; it could therefore be a necessary but not sufficient condition for resistance to trabectedin. In conclusion, this work revealed the role of UVSSA as the core event in the mechanism of acquired resistance in MLPS and showed that this mechanism is due to the selective pressure of the drug. The work also has limitations, such as the use of a single patient-derived xenograft and the absence of a cohort of patients. However, this was an obligatory choice given the rarity of the disease. Future developments include laboratory experiments to further clarify the role of UVSSA in drug resistance. To this aim, experiments on cell lines have already been set up.
Furthermore, transcriptomic investigations through RNA-Seq and epigenomic investigations through ChIP-Seq are planned, which could provide a complete view of the mechanisms underlying the response to the drug. Finally, a small cohort of patient biopsies is available on which these data can be validated.
https://hdl.handle.net/10589/146118