Deep learning-based vessel tracking in minimally invasive nephrectomy
CARLINI, CHIARA
2018/2019
Abstract
In recent years, thanks to the diffusion of the da Vinci surgical system, robot-assisted partial nephrectomy (RAPN) has been increasingly adopted in the treatment of renal cancer, offering significant advantages over conventional open and laparoscopic nephrectomy. The da Vinci robot's superior maneuverability is ideally suited to the delicate cutting and stitching required in kidney surgery, combined with the benefits of minimally invasive surgery for the patient, such as faster recovery, less trauma and smaller scars. However, surgeons performing nephrectomies with this system still face many challenges related to the variety of visual scenarios, in terms of light exposure, a field of view potentially occluded by tools, blur, etc. Consequently, injuries to the renal artery, epigastric artery or renal vein, and bleeding after vascular clamp removal, are among the main causes of RAPN-related adverse outcomes. These aspects highlight the need to improve the robustness of the computer vision algorithms that support surgeons during such operations, reducing possible complications while minimizing their mental workload.

This thesis project fits into this context: its aim is to provide real-time segmentation and tracking of the main blood vessel (the renal artery) in nephrectomy laparoscopic videos in order to improve intervention quality, assisting surgeons in avoiding damage to, or resection of, forbidden vascular regions. The most recent studies on medical image segmentation identify deep learning-based methods as the most successful at this task in terms of speed and accuracy. Since this research field is relatively new, there is a general lack of approaches targeting the kidney, despite the clinical importance of intraoperative vessel segmentation. For this reason, the purpose of this thesis is to exploit the potential of the latest deep-learning techniques, such as convolutional neural networks (CNNs), to provide automatic and robust vessel segmentation in nephrectomy images, helping to fill this gap in the state of the art. Specifically, an innovative 3D adversarial FCNN (NephCNN) for automatic blood vessel segmentation in RAPN videos is proposed. The implementation of a 3D network, in which the third dimension refers to time, makes it possible to exploit the temporal information naturally encoded in nephrectomy videos. At the same time, the adversarial training boosts segmentation performance by constraining the vessel shape across consecutive frames.

Due to the limited availability of labelled data, a dataset of 1871 frames was extracted from RAPN videos of 8 different patients and manually annotated. An ad hoc sliding-window algorithm was then implemented to generate the 3D volumes fed to the network, which was trained on ~74% of the whole dataset, validated on ~13% and tested on the remaining ~13%, with on-the-fly data augmentation during training. The performance of the proposed NephCNN was compared with two state-of-the-art architectures: the 2D U-Net and the 3D U-Net. The results showed that NephCNN outperformed both state-of-the-art models, obtaining higher Dice Similarity Coefficient (DSC) values and, therefore, more accurate vessel segmentation. In detail, the median DSC for the 2D U-Net, the 3D U-Net and NephCNN was 59.70%, 66.33% and 71.76%, with an Inter-Quartile Range (IQR) of 7.71%, 9.05% and 9.31%, respectively.
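To illustrate the sliding-window step described above, the following is a minimal sketch of how consecutive frames can be grouped into 3D input volumes. The window depth, stride and frame resolution used here are assumptions for illustration only; the abstract does not report the values used in the thesis.

```python
# Sketch of an ad hoc sliding-window volume generator (assumed parameters).
import numpy as np

def sliding_window_volumes(frames, depth=4, stride=1):
    """Group consecutive video frames into 3D (time x H x W x C) volumes.

    frames: array of shape (T, H, W, C) holding annotated RAPN frames.
    Returns an array of shape (N, depth, H, W, C), one volume per window.
    """
    volumes = [frames[i:i + depth]
               for i in range(0, len(frames) - depth + 1, stride)]
    return np.stack(volumes)

# Toy example: 20 placeholder frames of 64x64x3 (real dataset: 1871 frames).
frames = np.zeros((20, 64, 64, 3), dtype=np.float32)
vols = sliding_window_volumes(frames, depth=4, stride=1)
print(vols.shape)  # (17, 4, 64, 64, 3)
```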
The Wilcoxon signed-rank test (5% significance level) confirmed a statistically significant difference between the tested architectures. These results suggest that the inclusion of temporal information, combined with the shape-constraining adversarial training of the network, can be successfully exploited to increase the blood vessel segmentation capability of the model, overcoming critical issues such as low image quality, light exposure and specularities. Moreover, NephCNN segmented on average 20 input frames per second, a rate suitable for real-time applications. This method may therefore support surgeons with context awareness, since virtual-fixture control in RAPN procedures could rely on the provided segmentation. Although both 3D and adversarial CNNs have already been used for medical tasks, this work represents the first attempt to integrate them for blood vessel segmentation in intraoperative nephrectomy images.
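For reference, the evaluation described above can be reproduced along these lines: a per-frame DSC on binary masks, followed by a Wilcoxon signed-rank test on paired per-frame scores. This is a sketch with placeholder data, not the thesis results.

```python
# Per-frame Dice Similarity Coefficient and Wilcoxon signed-rank test
# at the 5% significance level (placeholder arrays, not the thesis data).
import numpy as np
from scipy.stats import wilcoxon

def dice(pred, gt, eps=1e-7):
    """DSC = 2|P intersect G| / (|P| + |G|) for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    return 2.0 * np.logical_and(pred, gt).sum() / (pred.sum() + gt.sum() + eps)

# Paired per-frame DSC values for two models on the same test frames
# (random placeholders here).
dsc_unet3d = np.random.rand(100)
dsc_nephcnn = np.random.rand(100)

stat, p = wilcoxon(dsc_unet3d, dsc_nephcnn)
if p < 0.05:
    print("Difference between the two models is statistically significant")
```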
https://hdl.handle.net/10589/164802