Parcel sorting: an approach based on computer vision and machine learning

This work stems from the need to make the system for recognizing logistics parcels and for double detection as versatile as possible, moving it away from the modus operandi of manually configured expert systems. To achieve this goal and to tackle the problem with machine learning strategies, a sufficiently representative dataset had to be built with care; to this end, a software application was developed that allows the data to be extended and modified as future needs arise. The resulting dataset was then analyzed to select and extract the features, so that the classifiers under consideration have as much information as possible on which to base their decisions. Initially, a strategy taken as the baseline was tested: SVM, k-NN and Random Forest classifiers run in parallel on the full set of available features, and their posterior probabilities are aggregated by a Naive Bayes classifier. Analyzing successive frames turned out to add little value, because the features have almost zero variance from one frame to the next. A new classification strategy was therefore designed: it optimizes the parameters of the classifiers and reaches the classification of the double case more robustly and in less time, selecting at each phase the feature subsets with the greatest discriminating power for the specific class. Finally, the problems that arose during the analysis are examined, the critical issues they entail are highlighted, and alternatives and future developments are proposed to further improve the quality of the results.
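The aggregation step of the baseline can be illustrated with a minimal sketch. It assumes that each of the three base classifiers (SVM, k-NN, Random Forest) has already produced a posterior probability for the double class, and that a Gaussian Naive Bayes model is trained on those probabilities as a meta-classifier. The Gaussian variant, the class name `GaussianNaiveBayes`, the `single` label, the variance floor and all numbers are assumptions made for illustration; the abstract does not specify them.

```python
import math
from collections import defaultdict


class GaussianNaiveBayes:
    """Minimal Gaussian Naive Bayes, used here as a hypothetical aggregator."""

    def fit(self, rows, labels, var_floor=1e-3):
        grouped = defaultdict(list)
        for row, label in zip(rows, labels):
            grouped[label].append(row)
        self.stats = {}
        for label, members in grouped.items():
            n = len(members)
            columns = list(zip(*members))
            means = [sum(col) / n for col in columns]
            # The variance floor keeps the Gaussian well defined for constant columns.
            variances = [
                max(sum((v - m) ** 2 for v in col) / n, var_floor)
                for col, m in zip(columns, means)
            ]
            self.stats[label] = (math.log(n / len(rows)), means, variances)
        return self

    def predict_proba(self, row):
        log_scores = {}
        for label, (log_prior, means, variances) in self.stats.items():
            log_lik = sum(
                -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
                for x, m, var in zip(row, means, variances)
            )
            log_scores[label] = log_prior + log_lik
        # Normalize in log space for numerical stability.
        top = max(log_scores.values())
        exp_scores = {c: math.exp(s - top) for c, s in log_scores.items()}
        total = sum(exp_scores.values())
        return {c: s / total for c, s in exp_scores.items()}

    def predict(self, row):
        proba = self.predict_proba(row)
        return max(proba, key=proba.get)


# Made-up P(double) outputs of the three base classifiers (SVM, k-NN, Random Forest),
# one row per training sample. Illustrative values only, not results from this work.
meta_features = [
    [0.91, 0.80, 0.88],
    [0.85, 1.00, 0.79],
    [0.78, 0.60, 0.93],
    [0.12, 0.20, 0.05],
    [0.20, 0.00, 0.15],
    [0.30, 0.40, 0.10],
]
meta_labels = ["double", "double", "double", "single", "single", "single"]

aggregator = GaussianNaiveBayes().fit(meta_features, meta_labels)
new_sample = [0.70, 0.60, 0.82]  # base-classifier posteriors for an unseen parcel
prediction = aggregator.predict(new_sample)
posterior = aggregator.predict_proba(new_sample)
```

In a full pipeline the rows of `meta_features` would come from the trained base classifiers evaluated on held-out data, so that the aggregator does not learn from overconfident in-sample probabilities.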
