Bridging the gap: improving classification accuracy score through post-processing techniques for synthetic-to-real learning

Image generative models now reach remarkable levels of visual quality. However, there is still no rigorous method for evaluating them quantitatively so that different models can be compared with each other, especially when the generated synthetic images are used for a downstream task. A recently proposed measure trains a classifier on the synthetic images and then tests it on a dataset of real images. The resulting accuracy, known as the Classification Accuracy Score (CAS), can be compared with that of a classifier trained on real images: the closer the CAS is to that accuracy, the better the generative model can be considered. In this thesis, we propose a new pipeline designed to reduce the gap between the CAS and the accuracy obtained by training the classifier on real images. More precisely, we review and adopt some of the existing techniques in this research area, and we propose a novel technique called Expansion Trick, which significantly improves performance. Although our pipeline can be applied to almost any current generative model, we focus our research on Generative Adversarial Networks (GANs). GANs are a well-studied and established architecture that does not require excessive resources to train and offers excellent sampling speed, which is crucial in our case. Through extensive testing on three different datasets, we demonstrate that our approach leads to a significant boost in performance compared to previous work, establishing a new state of the art in this field. Overall, our work provides a promising foundation for future research in this area.
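
As a minimal sketch of the CAS protocol described above: a classifier is trained on synthetic data, evaluated on real test data, and compared with the same classifier trained on real data. Everything concrete here is an assumption made for illustration, not the thesis pipeline. The nearest-centroid classifier stands in for the image classifier, the 2-D Gaussian samples stand in for real and GAN-generated images, and the function names (`fit_nearest_centroid`, `cas_and_gap`, `make_toy`) are hypothetical.

```python
import numpy as np

def fit_nearest_centroid(X, y):
    """Stand-in classifier (assumption): one mean vector per class."""
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict(model, X):
    classes, centroids = model
    # Euclidean distance from every sample to every class centroid.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]

def accuracy(model, X, y):
    return float((predict(model, X) == y).mean())

def cas_and_gap(synth, real_train, real_test):
    """Return (CAS, real-data accuracy, gap), all measured on real_test."""
    cas = accuracy(fit_nearest_centroid(*synth), *real_test)
    real_acc = accuracy(fit_nearest_centroid(*real_train), *real_test)
    return cas, real_acc, real_acc - cas

def make_toy(rng, n, offset):
    """Toy two-class data; `offset` mimics a generator's distribution shift."""
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, 2)) + 3.0 * y[:, None] + offset
    return X, y

rng = np.random.default_rng(0)
real_train = make_toy(rng, 500, offset=0.0)
real_test = make_toy(rng, 500, offset=0.0)
synth = make_toy(rng, 500, offset=1.0)  # imperfect "generated" samples

cas, real_acc, gap = cas_and_gap(synth, real_train, real_test)
print(f"CAS={cas:.3f}  real={real_acc:.3f}  gap={gap:.3f}")
```

In this framing, a post-processing step is judged by how much it shrinks `gap` while the real test set stays fixed.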
