With the ever-increasing popularity of entertainment streaming services, social media, and e-commerce websites, very large and diverse catalogs are often hard for users to navigate. Predicting each customer's tastes and creating personalized lists of products has become a necessity for companies. To alleviate this problem, recommender systems have recently become major players in the field of machine learning. These systems leverage data about the users and the items of the catalog to extract information about each user's taste. The quality of the results strictly depends on data quality and quantity. In practical applications, however, the observed interactions are extremely sparse compared to the number of possible user-item pairs. That is, users tend to interact with or review only a very small subset of the catalog. This issue is known as the data sparsity problem. Over the years, many different knowledge transfer solutions to this problem have been proposed. The majority of these techniques rely on the overlap of users, items, or both, between two datasets, a source and a target. In this thesis, we analyze a particular set of knowledge transfer techniques, called codebook transfer, which are based on the rating pattern transfer approach and require no data overlap between datasets. Since their efficacy in specific tasks is not consistent across past evaluations, we present their formulation and assess their performance in providing recommendations by analyzing their models and by performing experiments on several datasets, including multiple baselines and an ablation study. The results of the experiments provide theoretical and empirical evidence that codebook transfer is not able to encode and transfer knowledge useful for providing recommendations, and that its use cannot be generalized to real domains.
Assessing the effectiveness of rating pattern transfer models for recommender systems
Bozzano, Giovanni
2020/2021
File | Size | Format | |
---|---|---|---|
master-thesis.giovanni-bozzano.pdf (openly accessible on the internet) | 2.75 MB | Adobe PDF | View/Open |
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/177650