Vision-based high-level architecture for intraoperative supervision and behavioural control of a surgical robot

Medical robotics is an application area in continuous evolution. It started from the premise of reaching accuracy, speed and payload beyond human capability; in the last few years, research has aimed at providing more versatile assistance systems. Operating Rooms (OR) equipped with robotic systems are critical and dynamic environments in which the surgical robot must perform high-risk procedures while cohabiting and interacting with members of the surgical team. A robotic system able to autonomously handle a defined set of emergencies that may occur during the procedure can offer the surgical team a more comfortable cohabitation inside the OR, for example by leaving the surgeons free to move around the robot even while it is moving. When the surgical team is free to move in the robot workspace, two kinds of emergency must be taken into account: the loss of data on the robot pose in the intraoperative environment (e.g. because something or someone corrupts the tracking data) and possible collisions with the surgeons. Commercially available robots (e.g. the NeuroMate or the Da Vinci System) do not provide an intraoperative supervision system and leave the management of possible emergencies to the surgeon. In this work we design a high-level software architecture for supervision and behavioural control of a semi-autonomous robot inside a crowded and shared environment such as the OR. The final aim is a more comfortable human-robot cohabitation, with surgeons free to move around the robot even while it is moving, through robust management of sensor faults and emergency situations. Because the surgeons' movement can lead to the loss of intraoperative tracking data (e.g. due to occlusion of the field of view of the tracking sensors), emergencies related to the tracking system are detected and, when possible, autonomously solved.
A redundant multi-sensor approach to robot tracking is adopted, so that if one sensor fails to determine the current 3D pose of the robot during its movement, the system can rely on the other sensors as backup. A set of dedicated tools visible to all the sensors involved has been built, and the sensors are calibrated with the robot and with each other using a classic hand-eye approach. A Fault-Tree Analysis is developed to analyse all the possible causes of a sensor fault. A set of software modules handles communication with the sensors and analyses incoming data to establish, at each moment, whether a particular sensor is correctly tracking the robot.
Because the surgeons' movement around the robot can cause collisions, human detection and tracking are performed inside the surgical workspace, and collision-related emergencies are handled autonomously. Human detection uses a set of RGB-D cameras able to automatically detect and track people (called users) inside their field of view. After a user is detected, a set of reference frames with origins at the user's joints is constructed. Incoming data from all cameras are analysed to find correspondences between users, using a metric based on the Euclidean distance between the users' reference frames. Once the data from the multiple cameras are merged, a simple envelope (i.e. spheres and cylinders) is constructed around each user's skeleton. The behaviour of the robot is updated according to the information coming from these modules, and reactions to emergencies (e.g. stopping the robot) are performed if the modules are no longer able to provide tracking data or if a collision is detected.
Supervision of all the components involved is achieved with three software supervisors (a Supervisor, a Coordinator and a Configurator) that handle, respectively, (1) incoming data from both sensors and robot, (2) task knowledge and (3) reactions to events and behavioural control. The Coordinator is a hierarchical state machine and is the only task-aware component in the architecture. The events that drive the Coordinator are raised by the Supervisor, and the reaction to a transition is performed by the Configurator.
Tests are performed in a simplified scenario, a target-approaching procedure such as the preliminary phase of a stereoelectroencephalography (SEEG) procedure, to evaluate the performance of the architecture. The hardware setup includes a KUKA LWR4+ as the robot, an NDI Polaris Vicra and an NDI Optotrak Certus as the tracking system, and a set of Microsoft Kinects as RGB-D cameras. The adopted workflow is as follows. After the surgeon starts the procedure, the system begins to track the robot. If the best-performing tracker is able to correctly track the robot inside the OR, the movement is started. If the best-performing tracker fails, the second tracker is connected to the architecture and the robot movement continues. If no tracker is available, the robot movement is stopped, and the event must be acknowledged before the brakes are released. Moreover, if a possible collision condition is detected (e.g. a user detected too close to the robot during movement), the robot is stopped and the release of the brakes must be acknowledged by a user. In general, the robot moves at one of two velocities (fast and slow). The slow movement is used when the robot is detected inside a critical area (i.e. a sphere around the provided target). Results show that the robot can still be tracked while switching among all the connected sensors, with sensor-swap latencies in the order of 10 ms and small oscillations in the detected pose due to calibration residuals.
The developed human detection algorithm is able to correctly detect and track the surgeons inside the Operating Room. False positives in human detection appear rarely and only for a limited time (less than 1 s). The reaction to a possible collision shows a latency in the order of 1 ms before the stop of the robot motion is asserted. Braking performance depends strongly on the velocity of motion, with a displacement from the position where the stop command is asserted that can reach 6 mm at 10 cm/s. At 1.0 cm/s, the braking displacement is sub-millimetric. With velocities of 7.5 cm/s for the fast movement and 1.0 cm/s for the slow movement, the developed architecture keeps both the braking displacement and the untracked movement of the robot Tool Center Point sub-millimetric in reaction to emergencies detected inside the critical area. Future work may address a more user-specific envelope (e.g. based on point clouds rather than spheres and cylinders) or a more complex behavioural control of the robot (e.g. allowing hands-on control when particular conditions are fulfilled).
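The workflow described in this abstract (tracker fallback, stop on emergency, acknowledgement before resuming, slow motion inside the critical area) can be illustrated with a minimal sketch. It is not the architecture's actual code: the names `State`, `select_tracker` and `step` are hypothetical, the hierarchy of the real state machine is flattened into four states, and resuming motion directly after acknowledgement is an assumption.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    MOVING_FAST = auto()
    MOVING_SLOW = auto()
    STOPPED = auto()  # brakes engaged, waiting for a user acknowledgement


def select_tracker(trackers_ok):
    """Return the index of the highest-priority valid tracker, or None.

    trackers_ok is ordered from best-performing to worst (assumption).
    """
    for index, ok in enumerate(trackers_ok):
        if ok:
            return index
    return None


def step(state, trackers_ok, collision_risk, in_critical_area,
         start=False, ack=False):
    """One update of a hypothetical coordinator.

    Returns (next_state, active_tracker_index).
    """
    tracker = select_tracker(trackers_ok)
    emergency = tracker is None or collision_risk
    moving = State.MOVING_SLOW if in_critical_area else State.MOVING_FAST

    if state is State.IDLE:
        # Motion starts only if the best tracker (index 0) sees the robot.
        ready = start and tracker == 0 and not collision_risk
        return (moving if ready else State.IDLE), tracker
    if state is State.STOPPED:
        # Stay stopped until a user acknowledges and the emergency is over.
        resume = ack and not emergency
        return (moving if resume else State.STOPPED), tracker
    # Currently moving: stop on any emergency, otherwise keep going on
    # whichever tracker is still available.
    return (State.STOPPED if emergency else moving), tracker
```

In this sketch a fault of the best tracker while moving only changes the returned tracker index, whereas losing every tracker or detecting a collision risk leads to `State.STOPPED`, which is left only when `ack=True` is passed and no emergency remains.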

Medical robotics is a continuously developing field. While the starting point for such applications was the possibility of reaching levels of accuracy, execution speed and heavy-load handling capacity impossible for a human being, in recent years research has increasingly turned to the development of a more versatile robotic assistant. An Operating Room (OR) equipped with a robotic system is a critical environment in which the surgical robot must perform high-risk tasks in a space shared with the surgical team. A system able to autonomously manage a range of emergency situations can give the surgical team a more comfortable cohabitation inside the OR, for example by leaving the clinicians free to move around the robot even while it is moving. When the surgical team is free to move around the robot, emergencies related to the loss of information on the intraoperative pose of the robot (because the tracking data can be corrupted) must be taken into account, as must possible collisions between the robot and the surgeons. Commercially available robots (for example the NeuroMate or the Da Vinci) have no supervision system for the intraoperative phase and therefore rely entirely on the surgeon to recognise possible emergencies. In this work a high-level software architecture is developed that can supervise a semi-autonomous robotic system inside the operating room. The goal is a more comfortable human-robot cohabitation through robust management of possible sensor failures and emergency situations.
Because the movement of the surgical team can lead to the loss of data on the intraoperative tracking of the robot (for example due to occlusion of the sensors' field of view), emergencies related to the tracking system must be detected and resolved autonomously. The tracking system follows a redundant multi-sensor approach to navigation, so that if one sensor fails to determine the current pose of the robot during movement, the system can rely on other backup sensors. A set of Dynamical Reference Frames (rigid bodies with an associated reference frame) visible to all the sensors considered was built. The sensors were then calibrated with each other and with the robot using a classic hand-eye approach. A tree diagram (Fault-Tree Analysis) was drawn up to identify the possible failures of a sensor. Finally, a set of software components was implemented to handle the data coming from the sensors and to report which sensors are currently able to provide the robot pose. Because the clinicians' movement around the robot can cause collisions, detection and tracking of the human figure inside the intraoperative space were implemented, together with the detection and handling of emergencies related to possible collisions. Recognition of the surgical team was developed using RGB-D cameras (which provide an RGB image plus depth information) able to automatically recognise the presence of a person (called a user) within their field of view. After recognition, a set of reference frames with origins at the user's joints is built.
The pose data of these reference frames are analysed by purpose-written software to find correspondences between different users, with a metric based on the Euclidean distance between the reference frames of each user. A simple envelope is then computed for each user, with spheres and cylinders built around the skeleton recognised by the cameras. The behaviour of the robot is continuously updated with the information coming from the modules described above, and reactions to emergencies (for example, emergency braking) are executed if one of the modules cannot track correctly or if a possible collision is detected. Supervision of all the components involved is carried out by three software components (called Supervisor, Coordinator and Configurator) that handle, respectively: (1) the data coming from the sensors and the robot, (2) knowledge of the task to be performed and (3) the actions to take following particular events. The Coordinator is a hierarchical state machine and is the only component of the system with knowledge of the current procedure. The events the Coordinator needs are supplied by the Supervisor, while the actions to perform after a particular state transition are handled by the Configurator. Tests were then run in a simplified scenario (a surgical approach procedure, for example during the preliminary phase of a stereoelectroencephalography) to evaluate the performance of the architecture. The hardware setup comprises a KUKA LWR4+ as the robot, an NDI Polaris Vicra and an NDI Optotrak Certus making up the tracking system, and two Microsoft Kinects as RGB-D cameras. The adopted workflow is as follows: after the surgeon starts the procedure, the system begins tracking the robot.
If the best-performing sensor can correctly follow the movement of the robot, the movement is started. If the best-performing sensor fails, the second sensor is connected to the architecture and the robot movement continues. If no tracker is available, an emergency stop is executed and the release of the robot brakes must be confirmed by a user. In addition, if a possible collision is detected, the robot movement is stopped and the resumption of operations must be confirmed by a user. In general, the robot moves at two different velocities (slow and fast) depending on its position in space. The slow velocity is chosen when the robot is inside a critical area (i.e. a sphere built around the target). The results show that the robot can be tracked even when the sensors are swapped during the procedure, with response latencies in the order of 10 ms and small errors in the detected pose due to calibration residuals. The human-recognition algorithm developed correctly identified the presence of a user inside the operating field in all cases. False positives appear rarely and for short durations (less than 1 s). The reaction to a possible collision shows a latency in the order of 1 ms before a stop command is sent to the robot. Braking performance depends strongly on the velocity of movement, with a displacement of the Tool Center Point from the position where braking was commanded that can reach 6 mm at 10 cm/s. At 1 cm/s, the braking displacement is below one millimetre.
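To put the reported latencies in perspective, the distance the robot covers during a latency interval can be estimated as speed multiplied by time. The short sketch below does this arithmetic for nominal latencies of 10 ms (sensor swap) and 1 ms (stop command); it assumes constant speed during the interval and does not model the braking displacement itself. The function name `travel_mm` is hypothetical.

```python
def travel_mm(speed_cm_s, latency_ms):
    """Distance in millimetres covered at constant speed during a latency.

    1 cm/s equals 10 mm/s, and 1 ms equals 0.001 s.
    """
    return speed_cm_s * 10.0 * latency_ms / 1000.0


for speed in (10.0, 7.5, 1.0):
    swap = travel_mm(speed, 10.0)   # nominal 10 ms sensor-swap latency
    stop = travel_mm(speed, 1.0)    # nominal 1 ms stop-command latency
    print(f"{speed} cm/s: {swap:.2f} mm during swap, {stop:.3f} mm before stop")
```

Under these assumptions, a 10 ms sensor swap corresponds to about 0.75 mm of untracked motion at 7.5 cm/s and about 0.1 mm at 1.0 cm/s.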
The developed system, with velocities of 7.5 cm/s for the fast movement and 1.0 cm/s for the slow movement, can guarantee reactions with sub-millimetric displacement of the robot Tool Center Point for emergencies detected inside the critical area. Future developments may concern the implementation of an operator-dependent envelope (for example based on depth maps rather than spheres and cylinders) or a more complex robot behaviour (for example allowing cooperative control by the operator when particular conditions are met).
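The sphere-and-cylinder envelope used for collision detection can be illustrated with a minimal sketch. It makes several assumptions not stated in the abstract: each bone between two joints is treated as a capsule (a cylinder capped by spheres at the joints) with a single shared radius, the robot is reduced to one point such as the Tool Center Point, coordinates are in metres, and a collision risk is flagged when the clearance drops below a chosen safety margin. All function names are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment joining a and b (3D)."""
    ab = [b[i] - a[i] for i in range(3)]
    ap = [p[i] - a[i] for i in range(3)]
    length_sq = sum(c * c for c in ab)
    if length_sq == 0.0:
        t = 0.0  # degenerate bone: both joints coincide
    else:
        # Projection of p on the bone axis, clamped to the segment ends.
        t = sum(ap[i] * ab[i] for i in range(3)) / length_sq
        t = max(0.0, min(1.0, t))
    closest = [a[i] + t * ab[i] for i in range(3)]
    return math.dist(p, closest)


def clearance(point, bones, radius):
    """Smallest distance from point to any envelope surface.

    bones is a list of (joint_a, joint_b) pairs; the result is negative
    when the point lies inside an envelope.
    """
    return min(point_segment_distance(point, a, b) for a, b in bones) - radius


def collision_risk(point, bones, radius, safety_margin):
    """True when the point is closer to the user than the safety margin."""
    return clearance(point, bones, radius) < safety_margin
```

For example, with a single bone from `(0, 0, 0)` to `(0, 0, 1)` and a radius of 0.1, a point at `(0.5, 0, 0.5)` has a clearance of 0.4, so it raises a collision risk only if the safety margin exceeds that value.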