Monitoring human state in a robotic assistive platform: data acquisition and person detection systems

During the last decade, problems related to a sedentary lifestyle have increased significantly and become one of the greatest burdens on today’s society. Robotic technologies are possibly one of the main resources we have to tackle the problem, thanks to their versatility and adaptability. The AHA (Augmented Human Assistance) project has the final aim of building a robotic assistive platform that can be employed to raise patients’ motivation while monitoring their progress and assessing their health conditions. In this thesis work, the acquisition and Pedestrian Detection modules of the final platform have been designed and developed.

To provide a satisfying human-robot interaction experience, awareness of the position of people is crucial for mobile robotic platforms. Vision is one of the richest and most widely used sensing modalities in today’s robotic systems, owing to the diversity of features that can be extracted and used to analyse the surrounding environment, people included. Omnidirectional cameras, in particular, offer an increased field of view at the price of increased distortion in the acquired images.

This work explores different ways to process omnidirectional images to obtain versions more suitable for use with state-of-the-art pedestrian detectors, and assesses the detectors’ performance using different training and testing combinations. Results showed that better performance can be achieved using unwrapped or rectified images and by training the detector with ad hoc datasets, but there is still a large margin for further improvement.

In addition, given the critical role played by datasets, I present a system to accelerate the labelling task using skeletal data provided by the Microsoft Kinect V2. Finally, I introduce an architecture to acquire, store and visualize the different information obtainable from the sensors and cameras connected to the system.

Over the last decades, the problems caused by a sedentary lifestyle have grown significantly, to the point of becoming one of the major problems of today’s society. Robotic technologies, thanks to their versatility and adaptability, are probably among the main resources at our disposal to counter the problem. The final goal of the AHA (Augmented Human Assistance) project is to build a robotic assistive platform that can be employed to increase patients’ motivation and, at the same time, monitor their physical condition. In this thesis work, the modules for acquisition and for the detection of pedestrians (Pedestrian Detection) have been designed and developed. To provide a satisfying experience during the interaction between human and robot, the system’s knowledge of the position of the people around it is a crucial aspect of mobile robotic platforms. Vision is one of the sensors able to provide the most information and one of the most widely used in modern robotic systems, owing to the variety of details that can be extracted and used to analyse the surrounding environment, people included. Omnidirectional cameras, in particular, give us the possibility of exploiting a wider angle of view at the price of greater distortion in the acquired images. This work explores the possible ways to process omnidirectional images and obtain versions of them better suited to be used together with the best detectors, and evaluates their performance by testing different combinations of training and testing sets. The results showed that better performance can be obtained by using unwrapped or rectified images and by carrying out the training phase with ad hoc datasets, but a large margin for improvement remains.
Moreover, given the important role played by datasets, I presented a system to accelerate dataset labelling using the skeletal data provided by the Microsoft Kinect V2. Finally, I introduced an architecture to acquire, store and visualize the different information obtainable from the sensors and video cameras connected to the system.
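
As an illustration of what unwrapping an omnidirectional image means, the following is a minimal sketch of a polar-to-panoramic remapping with nearest-neighbour sampling. It is not necessarily the procedure used in this work: the function name `unwrap_omnidirectional`, its parameters (`center`, `r_min`, `r_max`, `out_width`) and the choice of placing the inner ring in the first row are all assumptions made for the example, and a real system would also need the camera calibration.

```python
import numpy as np

def unwrap_omnidirectional(image, center, r_min, r_max, out_width=720):
    """Map the ring between r_min and r_max around `center` to a panoramic strip.

    Hypothetical helper: `center` is (x, y) in pixels and is assumed to be known.
    """
    cx, cy = center
    out_height = int(r_max - r_min)
    # One output column per viewing angle, one output row per radius.
    theta = np.linspace(0.0, 2.0 * np.pi, out_width, endpoint=False)
    radius = np.linspace(r_min, r_max, out_height, endpoint=False)
    rr, tt = np.meshgrid(radius, theta, indexing="ij")
    # Nearest-neighbour lookup of the source pixel for each panorama pixel,
    # clipped so that samples falling outside the image reuse the border.
    src_x = np.clip(np.rint(cx + rr * np.cos(tt)).astype(int), 0, image.shape[1] - 1)
    src_y = np.clip(np.rint(cy + rr * np.sin(tt)).astype(int), 0, image.shape[0] - 1)
    return image[src_y, src_x]

# Synthetic 200x200 image whose pixel value equals its x coordinate.
image = np.tile(np.arange(200), (200, 1))
panorama = unwrap_omnidirectional(image, center=(100, 100), r_min=20, r_max=80, out_width=360)
```

Each row of `panorama` corresponds to one radius and each column to one viewing angle, so a person standing upright around the robot appears roughly vertical in the strip, which is closer to the images that conventional pedestrian detectors are trained on. Bilinear interpolation would give smoother results than the nearest-neighbour lookup used here.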