Automated evaluation of developer sentiment in open source communities: an empirical analysis
ZHANG, HAOTIAN
2023/2024
Abstract
Open-source communities play a vital role in modern software development, relying heavily on effective communication and collaboration among developers. This thesis presents a comprehensive framework for analyzing developer sentiment in open-source communities, integrating sentiment analysis with social metrics and network construction to understand the human factors influencing software development. We developed a cross-platform sentiment classification model, BERT-CP, tailored for software engineering contexts. The model was validated against existing datasets, demonstrating superior performance in accurately classifying sentiment across various communication platforms. We constructed layered developer communication, collaboration, and socio-technical networks from raw data sourced from open-source community platforms. By integrating these networks, we developed a framework capable of extracting a wide range of metrics reflecting sentiment, community smells, and socio-technical congruence. An empirical analysis was conducted on 20 open-source communities from the Apache Software Foundation. The analysis addressed three primary research questions: the impact of sentiment on individual well-being and productivity, the influence of sentiment on community health, and the effect of sentiment on project success. Key findings include the mirroring effect of received sentiment on developers' outgoing sentiment, the role of emotional volatility in community smells, the causal relationship between sentiment indicators and community health metrics, and the impact of positive sentiment on issue resolution times. The research highlights the significant role of sentiment in shaping the dynamics of open-source software development communities. 
Future work should expand the empirical analysis to a broader range of communities, incorporate meta-attributes of communities to understand the variability of sentiment effects, and explore the intersections among different types of community smell constructors.

File | Size | Format
---|---|---
Automated_Evaluation_of_Developer_Sentiment_in_Open_Source_Communities__An_Empirical_Analysis-F.pdf (authorized users only, from 01/07/2025) | 4.74 MB | Adobe PDF
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/223844