Deep learning-aided tactile-based supervision for dual arm robotic in-hand manipulation
PIZZOLITTO, THOMAS
2024/2025
Abstract
In-hand manipulation is a complex robotic task in which an object must remain securely grasped while controlled movements are performed. In this context, motion is achieved through precise pushes on the object's surface rather than through sequences of regrasping, resulting in a relative displacement of the object within a fixed gripper. Traditional approaches to this type of manipulation rely on vision-based systems, using iterative point cloud analyses to enhance robustness. This work proposes an alternative method that replaces vision with the DIGIT tactile sensor. The sensor provides high-resolution tactile images which, when processed by a multimodal classifier based on RGB tactile images and point cloud data, enable binary classification of surface defects. Given a point cloud of the manipulated object as input, the method functions as a supervisory controller capable of detecting nonlinear behaviors and unexpected object movements, features particularly relevant in human–robot collaboration scenarios involving the ABB IRB 14000 YuMi collaborative robot. The system is designed to identify defects that are not detectable by conventional vision systems and that are absent from the CAD model, conditions that could lead to manipulation failures. Two neural network architectures are evaluated: one relying solely on tactile RGB data, and another combining RGB data with a point cloud generated by a neural network. Experiments on objects of varying shapes and sizes demonstrate that both architectures are effective, providing two equally valid solutions for tactile defect detection in in-hand manipulation.
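To make the described two-branch design concrete, below is a minimal PyTorch sketch of the kind of multimodal classifier the abstract outlines: one branch encoding the DIGIT tactile RGB image, one PointNet-style branch encoding the object point cloud, with the features fused into a single binary defect logit. All layer sizes, the backbone choices, the 320×240 DIGIT frame resolution, and the concatenation-based fusion are illustrative assumptions, not the thesis architecture.

```python
# Illustrative sketch only: every layer size, backbone choice, and fusion
# step below is an assumption, not the architecture from the thesis.
import torch
import torch.nn as nn

class TactileDefectClassifier(nn.Module):
    """Two-branch binary classifier: tactile RGB image + object point cloud."""

    def __init__(self) -> None:
        super().__init__()
        # RGB branch: small CNN over the tactile frame (assumed 3 x 240 x 320).
        self.rgb_branch = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),  # -> (B, 64)
        )
        # Point-cloud branch: PointNet-style shared per-point MLP.
        self.point_mlp = nn.Sequential(
            nn.Conv1d(3, 64, kernel_size=1), nn.ReLU(),
            nn.Conv1d(64, 128, kernel_size=1), nn.ReLU(),
        )
        # Fusion head: concatenated features -> one defect logit.
        self.head = nn.Sequential(
            nn.Linear(64 + 128, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, rgb: torch.Tensor, points: torch.Tensor) -> torch.Tensor:
        # rgb: (B, 3, H, W); points: (B, 3, N)
        f_rgb = self.rgb_branch(rgb)
        # Global max pooling makes the feature invariant to point ordering.
        f_pts = self.point_mlp(points).max(dim=2).values  # -> (B, 128)
        return self.head(torch.cat([f_rgb, f_pts], dim=1))  # raw logit

# Usage: probability that the touched surface patch is defective.
model = TactileDefectClassifier()
rgb = torch.randn(1, 3, 240, 320)  # one tactile frame
pts = torch.randn(1, 3, 1024)      # one sampled object point cloud
p_defect = torch.sigmoid(model(rgb, pts))
```

The RGB-only variant evaluated in the thesis would correspond to keeping just the image branch and classification head; the multimodal variant additionally feeds in the point cloud, which the abstract notes can itself be generated by a neural network.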
| File | Size | Format | |
|---|---|---|---|
| 2025_12_Pizzolitto_Executive_Summary.pdf (available online to everyone from 18/11/2026) | 5.02 MB | Adobe PDF | View/Open |
| 2025_12_Pizzolitto_Thesis.pdf (available online to everyone from 18/11/2026) | 82.56 MB | Adobe PDF | View/Open |
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/247396