Deep learning based optical flow for laparoscopic robot-guided surgery support
ADAMS, DAVID MICHAEL
2023/2024
Abstract
Modern medicine is increasingly favoring minimally invasive procedures. These procedures offer numerous advantages, including less trauma to the patient, a reduced risk of infection, shorter recovery times, and, ultimately, economic benefits due to shorter hospital stays. However, these techniques present challenges for surgeons, such as a reduced field of vision, viewing through a single camera, limited operating space, and difficult lighting conditions. Aids are therefore necessary to support surgeons as much as possible. The project this thesis is based on is a laparoscopic robot-guided surgery assistant designed to support surgeons during Robot-Assisted Partial Nephrectomy (RAPN) procedures. The application works in real time to automatically superimpose a preoperative anatomical virtual 3D model of the kidney onto the endoscopic view, assisting surgeons by displaying the locations of hidden structures visible in the model. This thesis aims to improve the existing software by finding a suitable solution for precise object tracking. The methodology is divided into two main steps, where the output of the first step serves as the input for the second: the first step performs semantic segmentation, and the second step leverages computer vision optical flow techniques to track the organ. The specific aim is to explore the use of deep learning in this second step. A comprehensive literature review is conducted to identify state-of-the-art approaches; the most promising one involves using deep neural networks to compute the optical flow between frames. To determine which network delivers the best results, a comparison pipeline is created to test four different state-of-the-art networks. This pipeline computes a segmentation mask and overlays it with the optical flow results to isolate the motion of the region of interest, specifically the kidney.
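The masking step described above — combining the segmentation output with the optical flow field to keep only the kidney's motion — can be sketched as follows. This is a minimal NumPy illustration, not the thesis implementation; the function name `isolate_flow` and the array shapes are assumptions.

```python
import numpy as np

def isolate_flow(flow: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Zero out flow vectors outside the segmented region.

    flow: dense optical flow of shape (H, W, 2), one (u, v) vector per pixel.
    mask: boolean segmentation mask of shape (H, W), True inside the organ.
    """
    # Broadcast the (H, W) mask over the two flow channels.
    return flow * mask[..., None].astype(flow.dtype)

# Toy example: uniform unit flow, with only one pixel segmented as "kidney".
flow = np.ones((3, 3, 2), dtype=np.float32)
mask = np.zeros((3, 3), dtype=bool)
mask[1, 1] = True

masked = isolate_flow(flow, mask)
print(masked.sum())  # 2.0 — only the segmented pixel keeps its (u, v) vector
```

In a real pipeline the mask would come from the semantic segmentation network and the flow from the optical flow network; the masked flow is then the input to the motion (pose) estimation step.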
From the computed flow, the kidney’s six degrees of freedom in 3D space are estimated and applied to the 3D model of the patient’s kidney. The ground truth is established by overlaying a 3D model of the patient’s kidney and creating a mask of the projection. A mask is then generated from the final position of the 3D kidney, and its Intersection over Union (IoU) with the ground truth is computed to evaluate accuracy. Additionally, the time required for the pipeline to process each frame is measured to ensure efficiency and real-time applicability.

| File | Size | Format | |
|---|---|---|---|
| Master_Thesis.pdf (online access restricted to authorized users) | 55.31 MB | Adobe PDF | View/Open |
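As a concrete illustration of the IoU metric used to evaluate the overlay accuracy, the following minimal NumPy sketch computes it between two binary masks. The masks here are toy examples, not thesis data.

```python
import numpy as np

def iou(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Intersection over Union of two boolean masks of the same shape."""
    intersection = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return float(intersection) / float(union) if union > 0 else 0.0

# Two 2x2 squares overlapping in one column: intersection = 2, union = 6.
a = np.zeros((4, 4), dtype=bool)
a[:2, :2] = True          # predicted kidney mask
b = np.zeros((4, 4), dtype=bool)
b[:2, 1:3] = True         # ground-truth mask, shifted by one column

print(round(iou(a, b), 3))  # 0.333
```

An IoU of 1.0 means the projected model mask coincides exactly with the ground-truth mask; values near 0 indicate the tracking has drifted.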
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/222902