A combinatorial, multiple-play semi-bandit approach for automatic generation of newspaper Web pages

The homepage of an online newspaper or magazine is the main gateway for users looking for interesting and novel news. Such pages typically aggregate two types of content: organic and sponsored results. The heterogeneous nature of the content presented on these pages (large images, videos, small images, news snippets, ads, etc.) creates the need to find the best presentation for the whole set of items, so as to maximize the user's interest and attention. Traditionally, homepages of online newspapers are composed manually, with the aim of assembling a set of news items that should be interesting to a generic audience. This approach produces static pages that are not personalized for the user they are shown to. In this thesis, we present a new approach to whole-page optimization. We formalize the problem as a combinatorial multi-armed bandit problem, with the objective of producing a personalized page for each type of user. Linear optimization techniques are used to solve the combinatorial problems introduced by this type of bandit, encouraging page diversity (in terms of content category) while optimizing for the user's interest and fulfilling business constraints on ad presentation. The framework is also capable of tracking changes in user behaviour over the whole optimization horizon, proposing fresh news at every access and penalizing content that has already been shown more than once or already clicked. Through extensive offline experiments, we show that the proposed method is capable of learning both the user's tastes and behaviour and, as a consequence, of producing personalized pages based on the feedback received. We also tested our algorithms with real people and evaluated their performance through online controlled experiments.

The homepage of online newspapers and magazines plays an extremely important role, since it must capture as much of the viewer's attention as possible. These pages are typically composed of two types of content: news and advertisements. The heterogeneous nature of the elements that can be presented on this kind of page (large or small images, videos, ads, etc.) creates the need to find the presentation that maximizes the user's attention and interest. Normally, the homepages of online newspapers are created manually, trying to find the set of news items that, in general, is of interest to the average user. This approach, however, produces static pages (which do not change from one visit to the next for a given user) with no personalization that takes into account the person to whom the page is shown. We formulated the whole-page optimization problem as a combinatorial multi-armed bandit problem, with the objective of obtaining dynamic, personalized pages for each type of user. Linear optimization formulations were used to solve the combinatorial problems associated with this type of bandit, encouraging page diversity while still maximizing user satisfaction, and also taking into account the business constraints on ad display. The framework is also able to learn the changes in user behaviour that occur throughout the optimization period, favouring fresh news and penalizing news already shown more than once or already clicked. Through several offline experiments, we show that our agent is able to correctly learn the user's tastes and behaviour and, as a consequence, to produce personalized homepages based on previous interactions. Finally, we tested our algorithms and demonstrated their effectiveness with real people through A/B tests.
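
To make the combinatorial semi-bandit setting more concrete, the following is a minimal sketch and not the method developed in the thesis. It assumes a CUCB-style learner that keeps one click-rate estimate per item, observes a click signal for every item shown (semi-bandit feedback), and fills a page with a greedy oracle under a per-category cap. The greedy oracle stands in for the linear optimization step, and the cap is a simplified stand-in for the diversity and ad constraints. All names (`SemiBanditPage`, `simulate`, `max_per_category`), the click rates, and the categories are hypothetical, and the sketch leaves out the tracking of changing user behaviour.

```python
import math
import random


class SemiBanditPage:
    """CUCB-style learner (illustrative): one click-rate estimate per item."""

    def __init__(self, categories, n_slots, max_per_category):
        self.categories = categories              # item index -> category label
        self.n_slots = n_slots                    # items shown per page
        self.max_per_category = max_per_category  # simplified diversity cap
        self.pulls = [0] * len(categories)        # times each item was shown
        self.clicks = [0.0] * len(categories)     # clicks each item received
        self.t = 0                                # pages served so far

    def ucb(self, i):
        # Optimistic index: empirical click rate plus an exploration bonus.
        if self.pulls[i] == 0:
            return float("inf")                   # force every item to be tried
        mean = self.clicks[i] / self.pulls[i]
        bonus = math.sqrt(1.5 * math.log(self.t + 1) / self.pulls[i])
        return mean + bonus

    def select_page(self):
        # Greedy oracle (assumption: replaces the linear program): take items
        # in decreasing index order, skipping any that would exceed the cap.
        ranked = sorted(range(len(self.categories)), key=self.ucb, reverse=True)
        page, used = [], {}
        for i in ranked:
            cat = self.categories[i]
            if used.get(cat, 0) < self.max_per_category:
                page.append(i)
                used[cat] = used.get(cat, 0) + 1
            if len(page) == self.n_slots:
                break
        return page

    def update(self, page, feedback):
        # Semi-bandit feedback: one observation per displayed item.
        self.t += 1
        for i, clicked in zip(page, feedback):
            self.pulls[i] += 1
            self.clicks[i] += clicked


def simulate(rounds=2000, seed=0):
    """Run the learner against made-up Bernoulli click rates."""
    rng = random.Random(seed)
    true_ctr = [0.30, 0.25, 0.05, 0.20, 0.04, 0.10]   # hypothetical values
    categories = ["sport", "sport", "sport", "politics", "politics", "ad"]
    learner = SemiBanditPage(categories, n_slots=3, max_per_category=2)
    for _ in range(rounds):
        page = learner.select_page()
        feedback = [1.0 if rng.random() < true_ctr[i] else 0.0 for i in page]
        learner.update(page, feedback)
    return learner


if __name__ == "__main__":
    learner = simulate()
    print("page after learning:", learner.select_page())
    print("times each item was shown:", learner.pulls)
```

Per-item feedback is what separates the semi-bandit setting from a full-bandit one, where only a single reward for the whole page would be observed. Richer constraints, such as slot-specific layouts or ad placement rules, would likely call for an actual linear or integer program in place of the greedy step.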