The job domain is still a largely unexplored field in recommender systems, although the rise of business social networks such as Xing and LinkedIn, used mainly to post and find jobs, has made this domain more popular and further research much needed. The purpose of our thesis is to explore this field, develop novel Content-Based and Collaborative Filtering approaches focused on job recommendation, and evaluate their performance by comparing them with more advanced techniques found in the literature. In this work we found that, for Collaborative Filtering algorithms, two choices greatly improve performance: adopting the IDF measure as the rating in a field where explicit ratings are missing, and including the user in their own collaborative neighbourhood. We also found that extensive feature selection and time-based considerations can improve the results of Content-Based techniques. The quality of our work was also confirmed during the RecSys Challenge 2016, where the algorithms we developed, combined with an ensemble technique, allowed us to reach fourth position overall in the final leaderboard and first position among academic teams.
Job recommendation is one of the least explored domains in recommender systems, although in recent years research focused on it has gained new momentum thanks to the rise of business-oriented social networks such as Xing and LinkedIn. Our thesis has three goals. First, we want to explore this domain in depth, assessing its advantages and disadvantages. Second, we aim to develop new Collaborative and Content-Based approaches specialized for it. Finally, we want to evaluate the results obtained by the new techniques by comparing them with other advanced methods known in the literature. Through our work we found that, for Collaborative algorithms, using the IDF metric in place of the rating (completely absent in this domain) and including the user in their own collaborative neighbourhood considerably improve final performance. For Content-Based algorithms, we observed that exhaustive attribute selection and greater attention to temporal parameters yield very good results. The quality of our work was also confirmed during the RecSysChallenge 2016: the algorithms we developed, suitably combined in an ensemble, allowed us to reach fourth position in the final ranking and first place among academic teams.
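The two Collaborative Filtering ideas named in the abstract (IDF used as an implicit rating, and the user counted in their own neighbourhood) can be illustrated with a minimal sketch. This is not the authors' implementation, which is not reproduced on this page. The function names, the cosine similarity, the neighbourhood size `k`, the scoring rule, and the toy data are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def idf_weights(interactions):
    """IDF of each item: log(number of users / users who interacted with it)."""
    n_users = len(interactions)
    counts = defaultdict(int)
    for items in interactions.values():
        for item in items:
            counts[item] += 1
    return {item: math.log(n_users / c) for item, c in counts.items()}


def cosine(a, b, idf):
    """Cosine similarity between two users, with IDF values as implicit ratings."""
    dot = sum(idf[i] ** 2 for i in a & b)
    norm_a = math.sqrt(sum(idf[i] ** 2 for i in a))
    norm_b = math.sqrt(sum(idf[i] ** 2 for i in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(user, interactions, k=2, top_n=3, include_self=True):
    """Rank items by similarity-weighted IDF over the user's neighbourhood."""
    idf = idf_weights(interactions)
    sims = {
        v: cosine(interactions[user], interactions[v], idf)
        for v in interactions
        if v != user
    }
    # Keep the k most similar users that share at least some weighted overlap.
    ranked = sorted(sims.items(), key=lambda kv: kv[1], reverse=True)[:k]
    neighbours = [(v, s) for v, s in ranked if s > 0]
    if include_self:
        # Assumption: the user joins their own neighbourhood with similarity 1,
        # so items they already interacted with can be recommended again.
        neighbours.append((user, 1.0))
    scores = defaultdict(float)
    for v, s in neighbours:
        for item in interactions[v]:
            scores[item] += s * idf[item]
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical toy data: user id -> set of job postings interacted with.
interactions = {
    "u1": {"j1", "j2"},
    "u2": {"j2", "j3"},
    "u3": {"j4"},
}
with_self = recommend("u1", interactions)
without_self = recommend("u1", interactions, include_self=False)
```

In this toy data, including the user lets their own rare item `j1` rank first, while excluding them restricts the list to items contributed by the neighbour `u2`. Whether re-recommending already seen items is desirable depends on the evaluation protocol, which this sketch does not model.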
Collaborative filtering and content-based filtering algorithms for the job recommendation problem
KAMBEROSKI, ERVIN;SACCHI, ELENA
2015/2016
File | Size | Format
---|---|---
Kamberoski_Sacchi.pdf (Thesis text; authorized users only, from 26/11/2017) | 824.34 kB | Adobe PDF
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/131897