How do deep networks learn from small medical datasets: an experimental investigation in radiomics for lung cancer

Lung cancer is one of the leading causes of death worldwide. Non-small cell lung cancer (NSCLC) accounts for about 85% of lung cancer cases, and this work focuses on it because of this high incidence. Disease progression and response to treatment in lung cancer vary widely among patients within the same stage. An accurate diagnosis is therefore important to determine the most effective treatment and therapeutic plan for each lung cancer patient (personalized medicine). Radiomics may have important applications in personalized medicine, since it extracts quantitative features (quantitative biomarkers) from medical images that could complement clinical prognostic evaluation. Deep learning automates the quantification and selection of the most robust features and thus requires little to no human input, which makes it a possible answer to the limitations of radiomics. Deep learning models, however, are usually trained on datasets containing a very large number of images. This is a difficulty in the medical field, where datasets are small. To overcome the small size of medical datasets, transfer learning is introduced together with Siamese networks. The aim of this thesis is to mitigate the problems that arise when convolutional neural networks (CNNs) are trained on small medical datasets. The dataset used for the experiments is the CLARO dataset, which includes 121 patients with stage III lung cancer. Each patient has a different number of slices containing the clinical target volume (CTV), manually contoured by expert radiation oncologists. The dataset provides several labels: Adaptive, Non-Adaptive, progression-free survival (PFS) and overall survival (OS). The first experiment is a Classification task, in which each image is classified as Adaptive or Non-Adaptive. This task is accomplished using a custom CNN.
The second experiment is a Regression task, in which the PFS is predicted for each patient. This experiment is carried out with different architectures: a custom CNN, VGG16 and ResNet50. The last two architectures are pre-trained on ImageNet. The results of the Classification and Regression experiments show overfitting. To overcome this issue and improve model performance, we conducted additional experiments in which the network used for the Regression and Classification tasks is pre-trained on a dataset from a domain similar to CLARO. The last experiments are based on the Siamese network approach, for which two different architectures are developed. Overfitting persists in these last experiments as well. Across all the experiments, the ResNet50 network pre-trained on ImageNet produces the best results for the Regression task. In particular, the lowest mean absolute error (MAE), 8.83, is obtained in EXP_5_#1, in which the convolutional layers are frozen. For the Classification task, the best results are achieved by pre-training the custom CNN through the CAE. In this case, experiments EXP_3_#C1 and EXP_3_#C2 reach an accuracy of 61.00%. This suggests that data augmentation and pre-training the network through the CAE improve model performance.
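The two figures of merit quoted in the abstract, MAE for the Regression task and accuracy for the Classification task, can be illustrated with a minimal sketch. It assumes the standard definitions of both metrics; the function names and all numeric values are hypothetical and are not taken from the CLARO dataset or from the thesis code.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between targets and predictions."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy(labels: Sequence[int], predictions: Sequence[int]) -> float:
    """Fraction of predictions that match the reference labels."""
    if len(labels) != len(predictions) or len(labels) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(l == p for l, p in zip(labels, predictions)) / len(labels)


# Hypothetical PFS targets and predictions (illustration only).
pfs_true = [10.0, 24.0, 6.0, 15.0]
pfs_pred = [12.0, 20.0, 9.0, 15.0]
mae = mean_absolute_error(pfs_true, pfs_pred)  # (2 + 4 + 3 + 0) / 4

# Hypothetical labels: 1 = Adaptive, 0 = Non-Adaptive (encoding is an assumption).
adaptive_true = [1, 0, 1, 1, 0]
adaptive_pred = [1, 0, 0, 1, 1]
acc = accuracy(adaptive_true, adaptive_pred)  # 3 correct out of 5

print(f"MAE = {mae:.2f}, accuracy = {acc:.2%}")
```

A lower MAE means predictions closer to the observed PFS, while a higher accuracy means more images assigned to the correct class.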
