This thesis explores customer behavior patterns in retail through the integration of the innovative RF3MP (Recency, Frequency, Monetary, Monetary Variability, Monetary Difference, and Periodicity) model and Hidden Markov Models (HMM). Traditional approaches, such as the Recency-Frequency-Monetary (RFM) model, often lack the dynamic capability to accurately reflect evolving customer behaviors over time. The proposed RF3MP model addresses this gap by extending the classic RFM framework with additional longitudinal metrics: Monetary Variability (standard deviation of spending within a period), Monetary Difference (change in spending compared to the previous period), and Periodicity (standard deviation of interpurchase time), providing enhanced predictive capabilities and enabling better sequential tracking of consumer engagement. The study analyzes transactional data from an Italian grocery retailer, encompassing over 11 million transactions from approximately 419,000 customers across a two-and-a-half-year period. Using RF3MP combined with HMM, eight distinct customer states are identified, ranging from inactive and churners to irregular, high-value, and champion customers. Furthermore, two significant churn trajectories ("churn parabolas") are uncovered: a gradual disengagement pathway involving regular and declining customers, and a more pronounced disengagement affecting high-value customers. The qualitative comparison with static supervised learning algorithms (XGBoost and Random Forest) highlights the key advantage of HMM: its ability to model smooth and sequential state transitions, enabling realistic mapping of customer behavior evolution. This sequential modeling capability allows HMM to capture nuanced patterns of behavior shifts, thus providing deeper insights into customer lifecycle dynamics. 
Quantitative validation on churn classification further supports the robustness of the RF3MP-HMM framework, which achieves higher recall (91%) and predictive performance (AUC-ROC: 0.87) than static models such as XGBoost (85% recall, AUC-ROC: 0.84). These results suggest that businesses can identify the large majority of churners, with fewer false negatives than currently adopted models such as XGBoost. Consequently, companies can take timely and targeted retention actions, proactively managing customer churn and improving overall business performance. Future research directions and study limitations are also discussed.
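The six RF3MP metrics named in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `rf3mp_features`, the `(date, amount)` input layout, the use of the population standard deviation, and the treatment of single-purchase periods (periodicity set to 0) are all assumptions made for illustration.

```python
from datetime import date
from statistics import pstdev


def rf3mp_features(purchases, period_end, prev_monetary=0.0):
    """Illustrative RF3MP features for one customer in one period.

    purchases     -- list of (date, amount) tuples inside the period
                     (hypothetical layout, not taken from the thesis)
    period_end    -- last day of the period, used to measure recency
    prev_monetary -- total spending in the previous period
    """
    if not purchases:
        return None  # no activity in the period; handling is left to the caller

    purchases = sorted(purchases)
    dates = [d for d, _ in purchases]
    amounts = [a for _, a in purchases]
    # Days between consecutive purchases (interpurchase times).
    gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
    monetary = sum(amounts)

    return {
        "recency": (period_end - dates[-1]).days,
        "frequency": len(purchases),
        "monetary": monetary,
        # Assumption: population standard deviation of spending in the period.
        "monetary_variability": pstdev(amounts),
        "monetary_difference": monetary - prev_monetary,
        # Assumption: 0.0 when fewer than two purchases give no gap to measure.
        "periodicity": pstdev(gaps) if gaps else 0.0,
    }


# Example with made-up data: three purchases in January 2024.
features = rf3mp_features(
    [(date(2024, 1, 1), 10.0), (date(2024, 1, 11), 30.0), (date(2024, 1, 31), 20.0)],
    period_end=date(2024, 1, 31),
    prev_monetary=50.0,
)
```

Computing one such feature vector per customer and period yields the kind of sequence that a Hidden Markov Model can take as observations, which is the role the abstract assigns to RF3MP.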
Unveiling customer patterns through the RF3MP model and Hidden Markov Models: identifying critical flows and churn risks
Mancini, Anuar
2024/2025
File | Access | Size | Format |
---|---|---|---|
2025_04_Mancini.pdf | Accessible online to authorized users only | 4.52 MB | Adobe PDF |
Documents in POLITesi are protected by copyright, and all rights are reserved unless otherwise indicated.
https://hdl.handle.net/10589/235751