Statistical analysis of engineering BSc dropout through mixed effects models
FONTANA, LUCA
2017/2018
Abstract
This work was carried out within the Student Profile for Enhancing Tutoring Engineering (SPEET) project, an ERASMUS+ project involving Politecnico di Milano and five other European universities, which aims to open a new perspective on university tutoring systems. The main goal of the work is to analyse the relationship between the probability of successfully obtaining the degree and students' characteristics, grouping observations by the Engineering programme attended. Specifically, the analysis focuses on Bachelor of Science (BSc) Engineering degrees at Politecnico di Milano: the dataset contains detailed information on more than 41,000 students who enrolled in a BSc programme from 2010 to 2016. The collected data include degree details, each student's performance in every subject of the study plan, and additional information about the student. Both parametric and non-parametric methods are applied: generalized linear mixed-effects (GLME) models, as well as established classification algorithms such as classification trees and random forests. Moreover, a new generalized mixed-effects tree method is proposed. It introduces cluster-specific effects on the response variable while yielding models that are as easily interpretable as standard classification trees. Simulation results show that the proposed method provides substantial improvements over standard trees when the random effects are not negligible. Finally, the method is applied to the dataset of interest and its effectiveness in detecting dropout cases is evaluated. The work identifies the student characteristics that most influence performance and detects a significant effect of the study programme attended after adjusting for those characteristics.

| File | Size | Format |
|---|---|---|
| 2018_04_Fontana.pdf (thesis text; accessible online to authorized users only) | 4.88 MB | Adobe PDF |
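As a rough illustration of the cluster-specific effects mentioned in the abstract, the sketch below evaluates a random-intercept logistic model, in which each programme shifts the log-odds of obtaining the degree by its own amount. The abstract does not give the thesis's exact model specification, so the random-intercept form, the function name `success_probability`, the programme labels, and all numeric values are assumptions made only for illustration.

```python
import math


def success_probability(features, coefficients, programme_effect):
    """P(success) = sigmoid(x . beta + b_j) for a student in programme j.

    Assumed random-intercept logistic form; not taken from the thesis.
    """
    linear_predictor = sum(x * b for x, b in zip(features, coefficients))
    return 1.0 / (1.0 + math.exp(-(linear_predictor + programme_effect)))


# Hypothetical values, chosen for illustration only (not estimates from the thesis).
beta = [0.8, -0.5]    # fixed effects, shared by all programmes
student = [1.0, 0.4]  # covariates of a single student
effects = {"programme_A": 0.6, "programme_B": -0.6}  # programme-specific intercepts

# The same student gets a different predicted probability in each programme.
probabilities = {
    name: success_probability(student, beta, effect)
    for name, effect in effects.items()
}

for name, p in probabilities.items():
    print(f"{name}: {p:.3f}")
```

A model without the programme term would assign this student a single probability regardless of programme, which is the situation the abstract describes when it compares standard trees with the mixed-effects variant.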
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/140103