This thesis, a collaboration between Politecnico di Milano and Omega3C, analyses a large, diverse, anonymized dataset covering various aspects of the relationship between a major Italian bank and its customers. The analysis focuses on two key performance indicators (KPIs) that measure customer sentiment toward the bank's services: the Net Promoter Score (NPS), which reflects the likelihood that a customer will recommend the bank, and the Customer Effort Score (CES), which assesses the ease of customer interactions with the bank. The primary objective is to identify significant relationships between the covariates and these two KPIs and to develop a model capable of identifying the most influential predictors within the dataset. The study begins with a thorough literature review of NPS and CES, emphasizing their theoretical underpinnings and practical relevance in various sectors. NPS is recognized for its utility in predicting customer loyalty, with customers categorized as promoters, passives, or detractors based on their scores. The dataset, which contains anonymized responses from multiple customer surveys across digital and physical channels, is then analysed systematically. It includes demographic data, survey response details, and customer segmentation. An initial analysis characterizes the average customer profile and the typical responses across seven different survey types. To achieve the objective, the thesis uses an innovative combination of advanced statistical and machine learning algorithms to identify the most important covariates and to understand how they explain the target variables, NPS and CES, searching for significant causal relationships. The models employed include Linear Regression, Lasso Regression, Logistic Regression, and Random Forest, chosen for their ability to identify key covariates and their meaningful connections to the target indicators.
Additionally, a multivariate correlation analysis explores in depth the relationships among covariates and target variables. The thesis concludes with recommendations to enhance the effectiveness of similar studies, primarily through improved data collection practices and a larger sample size.
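The promoter, passive, and detractor categories mentioned in the abstract can be illustrated with a minimal sketch. It assumes the conventional NPS cut-offs on a 0-10 scale (9-10 promoter, 7-8 passive, 0-6 detractor) and the usual definition of the score as the percentage of promoters minus the percentage of detractors. The abstract does not state which thresholds the thesis applies, and the function names and sample scores below are hypothetical.

```python
from collections import Counter


def nps_category(score: int) -> str:
    """Map a 0-10 recommendation score to an NPS category.

    Assumes the conventional cut-offs; the thesis may use others.
    """
    if not 0 <= score <= 10:
        raise ValueError(f"score must be between 0 and 10, got {score}")
    if score >= 9:
        return "promoter"
    if score >= 7:
        return "passive"
    return "detractor"


def net_promoter_score(scores: list[int]) -> float:
    """Return the percentage of promoters minus the percentage of detractors."""
    if not scores:
        raise ValueError("at least one score is required")
    counts = Counter(nps_category(s) for s in scores)
    return 100.0 * (counts["promoter"] - counts["detractor"]) / len(scores)


# Hypothetical survey responses, not taken from the thesis dataset.
example_scores = [10, 9, 8, 7, 6, 3, 10, 5, 9, 8]
example_nps = net_promoter_score(example_scores)
print(example_nps)
```

With four promoters and three detractors out of ten responses, the sample yields a score of 10. Passives count toward the denominator only, so they dilute the score without shifting its sign.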
This thesis, the result of a collaboration between Politecnico di Milano and Omega3C, analyses a large, anonymized dataset on various aspects of the relationship between one of the main Italian banks and its customers. The analysis focuses on two key performance indicators (KPIs) that measure customer satisfaction with the bank's services: the Net Promoter Score (NPS), which reflects the likelihood that a customer will recommend the bank, and the Customer Effort Score (CES), which assesses the ease of the customer's interactions with the bank. The main objective is to identify the significant relationships between the covariates and these two KPIs, developing a model capable of identifying the most influential predictors within the dataset. The study begins with an in-depth review of the literature on NPS and CES, highlighting their theoretical foundations and their practical relevance in various sectors. NPS is recognized for its usefulness in predicting the loyalty of customers, who are classified as promoters, passives, or detractors based on their scores. The dataset, which contains anonymous responses from several customer surveys across digital and physical channels, is analysed systematically. It includes demographic data, survey response details, and customer segmentations. To achieve the objective, this thesis uses an innovative combination of advanced statistical and machine learning algorithms, aimed at identifying the most important covariates and understanding how they can explain the target variables, NPS and CES, in search of significant causal relationships. In addition, a multivariate correlation analysis explores in depth the relationships between the covariates and the target variables. The thesis concludes with recommendations for improving the effectiveness of similar studies, mainly through better data collection practices and a larger sample.
Exploring large datasets to uncover significant relationships between covariates and target variables: a case study on net promoter score and customer effort score
Salina, Matteo; Messeri, Filippo
2023/2024
File | Description | Access | Size | Format
---|---|---|---|---
Elaborato Tesi.pdf | Full thesis | Accessible online to authorized users only | 1.8 MB | Adobe PDF
Executive_summary.pdf | Executive summary | Accessible online to authorized users only | 540.44 kB | Adobe PDF
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/230150