Comparing traditional and Streaming Machine Learning methods for soccer pass detection
MENCONI, STEFANIA
2022/2023
Abstract
Sports analytics has grown significantly thanks to the continuous data streams produced by wearable tracking devices. However, traditional Machine Learning methods struggle to learn incrementally from evolving data, whereas Streaming Machine Learning enables models to learn continuously and adapt to incoming data. This thesis compares the performance of traditional and Streaming Machine Learning approaches in sports analytics on the task of identifying passes during a soccer match. We evaluate the ability of algorithms from both fields to distinguish pass from non-pass actions using only leg-movement features extracted from sensors placed on players’ shoes. Initial balanced and progressively imbalanced datasets are generated, alongside larger rebalanced datasets produced statically via SMOTE and dynamically via C-SMOTE. Traditional and streaming algorithms such as decision tree, bagging, random forest, Hoeffding tree, leveraging bagging, adaptive random forest, and streaming random patches are tested on these datasets using a variety of standard performance metrics. Statistical tests such as the t-test and the Nemenyi test determine whether the differences between the algorithms’ performance are significant. The results demonstrate that Streaming Machine Learning can perform on par with, or surpass, traditional Machine Learning in various contexts. Although traditional techniques hold an advantage initially, streaming algorithms consistently outperform their traditional counterparts on the larger rebalanced datasets. Ensemble methods prove particularly robust across both paradigms; leveraging bagging and streaming random patches emerge as the top performers, while adaptive random forest remains competitive. This work provides compelling evidence that Streaming Machine Learning achieves performance comparable to or better than traditional Machine Learning. The adaptability of streaming techniques extends the possibilities for online sports analytics. Future work will extend the analysis across different matches, modelling player tendencies and evaluating the algorithms’ long-term adaptability on edge devices. Overall, this research underscores the growing relevance of Streaming Machine Learning for analyzing continuous data streams in dynamic domains such as sports.

File | Size | Format | |
---|---|---|---|
Tesi_Stefania_Menconi.pdf (Thesis; openly accessible online) | 1.48 MB | Adobe PDF | |
Executive_Summary_Stefania_Menconi.pdf (Executive Summary; openly accessible online) | 535.6 kB | Adobe PDF | |
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/210996