As global demand for food rises alongside labour shortages in agriculture, robotic solutions are becoming increasingly vital to sustainable and efficient crop production. This research focuses on automating the delicate task of strawberry harvesting, a process traditionally dependent on human labour because of the complexity of the environment in which the fruits grow and the careful manipulation required to prevent slippage and bruising. To address these challenges, two neural network (NN) models were developed to enable tactile sensing and visual servoing for complex manipulation tasks, such as pushing strawberries in cluttered environments. The first model, the Tactile Prediction Model (TPM), predicts the forces exerted on the surface of a soft sensor (XELA Sensor) along three axes. It generates sequences of predicted force readings, from which the location of the stem on the sensor can be identified during manipulation. The second model, the Video Prediction Model (VPM), predicts sequences of images capturing the position of the strawberry during the pushing action, which facilitates accurate detection and guides the robot towards a target location. This model relies on a masking procedure that makes the strawberry easy to detect while keeping the input images lightweight. These data-driven models feed customized predictive controllers, which use prediction errors to generate tailored trajectories that dynamically adapt the robot's pose. The controllers aim to ensure both feasible movements and sustained contact with the strawberry stem, thereby avoiding slippage and bruising. Experimental results demonstrate that both models learn patterns relevant to physical robot interaction (PRI) in harvesting scenarios and reconstruct masked visual scenes with acceptable accuracy.
However, the experimental setup also reveals that parallel computing is needed to achieve computations fast enough for real-time control when both models run simultaneously. Additionally, the autonomous data collection procedure developed in this research can be readily replicated in other precision agriculture contexts. The NN models presented here can therefore be adapted to other high-value crops through simple adjustments, such as changing the masking approach to target different fruits. The integration of data-driven models with tailored predictive control architectures presented in this thesis provides a robust foundation for developing more sophisticated and versatile robotic systems capable of addressing the growing demands of modern agriculture.
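The masking idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the mask is computed, so the rule below (red channel dominating green and blue) is an assumption. The function names `red_mask` and `mask_centroid` and the parameters `min_red` and `margin` are hypothetical and are not taken from the thesis.

```python
import numpy as np


def red_mask(rgb: np.ndarray, min_red: int = 120, margin: int = 40) -> np.ndarray:
    """Boolean mask of pixels whose red channel clearly dominates green and blue.

    Assumed thresholding rule for illustration only; the thesis may use a
    different masking procedure.
    """
    r = rgb[..., 0].astype(np.int32)
    g = rgb[..., 1].astype(np.int32)
    b = rgb[..., 2].astype(np.int32)
    return (r >= min_red) & (r - g >= margin) & (r - b >= margin)


def mask_centroid(mask: np.ndarray):
    """(row, col) centroid of the masked pixels, or None if the mask is empty."""
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        return None
    return float(rows.mean()), float(cols.mean())


# Synthetic 8x8 RGB frame: green background with a 2x2 red patch.
frame = np.zeros((8, 8, 3), dtype=np.uint8)
frame[..., 1] = 100
frame[2:4, 5:7] = (200, 30, 30)

mask = red_mask(frame)
centroid = mask_centroid(mask)
```

The single-channel mask holds one value per pixel instead of three, which is one way to read "lightweight" input. Under this sketch, targeting a different fruit would mean swapping the thresholding rule.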
With global demand for agricultural production increasing and the agricultural sector facing a concurrent labour shortage, the development of robotic solutions is becoming ever more essential to guarantee sustainable and efficient production. This research aims to automate the delicate process of strawberry harvesting, an activity traditionally carried out by human operators because of the complexity of the environment in which these fruits grow and the careful manipulation needed to prevent loss of contact and further damage to the fruit. To address these challenges, two models based on neural networks (NN) were developed to enable tactile perception and visual servoing for complex manipulation tasks, such as moving strawberries within the clustered environments in which they grow. The first model, the TPM (Tactile Prediction Model), predicts the forces exerted on a sensing surface along three components, generating sequences of force readings to localize the point of contact of the stem on the sensor during manipulation. The second model, the VPM (Video Prediction Model), is designed to predict sequences of images capturing the position of the strawberry during the manipulation action, thereby facilitating its visual detection and allowing the robot to move the strawberry accurately towards a target direction. These data-driven models supply information to predictive controllers, which exploit the prediction errors to generate adaptive trajectories that dynamically modify the robot's configuration. This approach ensures feasible movements and maintains stable contact with the strawberry stem, effectively preventing slipping and bruising.
The experimental results showed that both models are able to learn meaningful patterns related to the robot's physical interaction with the external environment (physical robot interaction, PRI) in harvesting scenarios and to reconstruct masked visual scenes with acceptable accuracy. However, the experimental setup also highlighted the need for hardware capable of parallel computation to obtain processing times fast enough for real-time control applications, so that both models can compute their prediction sequences simultaneously. In addition, the autonomous data collection procedure developed in this study can be easily reproduced in other precision agriculture contexts. The NN models presented here can therefore be easily adapted to other high-value crops simply by modifying the masking approach to detect the different fruits. The integration of data-driven models with purpose-designed predictive control architectures provides a solid basis for developing advanced and versatile robotic systems capable of meeting the growing needs of modern agriculture.
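The localization of the stem contact from three-axis force readings can be sketched as a force-weighted centroid over the sensor's sensing elements. This is an illustrative assumption and not the method confirmed by the thesis. The 4x4 grid layout, the choice of the third force component as the normal axis, the `threshold` value and the function name `contact_centroid` are all hypothetical.

```python
import numpy as np


def contact_centroid(forces: np.ndarray, threshold: float = 0.1):
    """Force-weighted (row, col) contact location on a grid of sensing elements.

    `forces` has shape (rows, cols, 3); the last axis holds the three force
    components, and index 2 is assumed here to be the normal direction.
    Returns None when no element exceeds the threshold.
    """
    normal = np.abs(forces[..., 2])
    weights = np.where(normal >= threshold, normal, 0.0)
    total = weights.sum()
    if total == 0.0:
        return None
    rows, cols = np.indices(weights.shape)
    return (
        float((rows * weights).sum() / total),
        float((cols * weights).sum() / total),
    )


# Hypothetical 4x4 reading: a stem pressing equally on two vertically adjacent elements.
reading = np.zeros((4, 4, 3))
reading[1, 2, 2] = 1.0
reading[2, 2, 2] = 1.0

location = contact_centroid(reading)
```

Applied to each frame of a predicted force sequence, a function like this would give a predicted contact path across the sensor, which a controller could compare against the desired contact location.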
Multi-modal predictive models for a strawberry pushing robot controller
Gemmani, Giuliano
2024/2025
File | Description | Access | Size | Format |
---|---|---|---|---|
2025_07_Gemmani_Executive_Summary.pdf | Executive Summary | authorized users only from 30/06/2026 | 1.48 MB | Adobe PDF |
2025_07_Gemmani_Thesis.pdf | Thesis | authorized users only from 30/06/2026 | 17.21 MB | Adobe PDF |
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/239605