Generative adversarial networks and real-time anomaly detection in texture images

Since their introduction in 2014, a particular category of neural networks, Generative Adversarial Networks (GANs) [23], has outperformed many other models in tasks such as producing photo-realistic images [27, 28], super-resolution [50], and domain translation [56]. One field in which GANs have not yet excelled is anomaly detection in images, which can be defined as the task of segmenting and locating anomalous regions inside the images themselves. Indeed, many GAN-based solutions are designed for, and evaluated on, the over-simplified one-class novelty detection problem, and they require the inefficient step of subdividing the input image into patches of a fixed size in order to estimate an anomaly mask for the whole image. Our work focuses on implementing the first GAN architecture that is able to perform anomaly detection efficiently on full-scale texture images, by using a fully convolutional architecture [34]. In particular, our model is built upon an extension of the original GAN model, the Bidirectional Generative Adversarial Network [18], which has already been used in the context of anomaly detection in images [54]. We also define new anomaly scores that take advantage of each sub-network, and new aggregation strategies for these scores in order to recognize anomalous regions. We evaluated our model on two datasets containing texture images of industrial products: the Scanning Electron Microscope images of nanofibers in the NanoTWICE dataset [12], and images of other materials in the MVTEC AD dataset [40], which is becoming a benchmark for various algorithms for anomaly detection in images. We show that our model can generate fake images that match the corresponding real ones. 
Most importantly, we show that our model can efficiently perform anomaly detection on full-scale texture images by combining different anomaly scores that exploit all the sub-networks in the architecture, overcoming many limitations of current state-of-the-art GAN-based approaches to anomaly detection.

Since their introduction in 2014, a particular category of neural networks, Generative Adversarial Networks (GANs) [23], has achieved impressive results in generating photo-realistic images [27, 28], in super-resolution [50], and in image-to-image translation problems [56]. One field in which GANs have not yet achieved outstanding results, however, is anomaly detection. Anomaly detection in images can be defined as the problem of locating and segmenting anomalous regions within the images themselves. Most GAN-based solutions focus on the simpler problem of recognizing entire anomalous images, rather than on locating anomalies inside the image itself. These models cannot be used to locate anomalies in high-resolution images except by subdividing those images into small patches, an operation that is expensive in terms of computational complexity. This thesis focuses on implementing the first anomaly detection method based on a GAN with a Fully Convolutional architecture [34], which can be used efficiently on high-resolution texture images. We develop our model, the Fully Convolutional Bidirectional Generative Adversarial Network (FCBiGAN), starting from an extension of Generative Adversarial Networks, the Bidirectional Generative Adversarial Network [18], with which several anomaly scores can be defined [54]. We also experiment with new anomaly scores and with new strategies for aggregating the different anomaly scores that can be defined through our model. 
To evaluate the results obtained by our model, we use two datasets containing images of industrial materials: the NanoTWICE dataset, composed of nanofiber images acquired with a Scanning Electron Microscope [12], and the MVTEC AD dataset [40], recently used as a reference benchmark for various anomaly detection methods. Our experiments show that the FCBiGAN achieves the goal of generating high-resolution synthetic images that resemble the corresponding real images, and that our model makes it possible to locate anomalies in high-resolution images efficiently by aggregating the different anomaly scores that can be defined through it.
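
To make the idea of aggregating several per-pixel anomaly scores more concrete, the following is a minimal sketch in plain Python. It is not the aggregation strategy defined in this work: the function names (`min_max_normalize`, `aggregate_scores`, `anomaly_mask`), the min-max rescaling, the weighted mean, and the fixed threshold are all assumptions chosen only for illustration. Each score map is assumed to be a 2-D list with one value per pixel, for example one map per sub-network.

```python
def min_max_normalize(score_map):
    """Rescale a 2-D score map (list of rows) to the range [0, 1]."""
    flat = [v for row in score_map for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # A constant map carries no localization information.
        return [[0.0 for _ in row] for row in score_map]
    return [[(v - lo) / span for v in row] for row in score_map]


def aggregate_scores(score_maps, weights=None):
    """Pixel-wise weighted mean of the normalized score maps.

    Hypothetical aggregation rule: all maps must share the same shape.
    """
    if weights is None:
        weights = [1.0] * len(score_maps)
    total = sum(weights)
    normalized = [min_max_normalize(m) for m in score_maps]
    rows, cols = len(normalized[0]), len(normalized[0][0])
    return [
        [
            sum(w * m[r][c] for w, m in zip(weights, normalized)) / total
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def anomaly_mask(aggregated, threshold):
    """Mark a pixel as anomalous when its aggregated score exceeds the threshold."""
    return [[v > threshold for v in row] for row in aggregated]


# Toy 2x2 example with two made-up score maps.
reconstruction_score = [[0.0, 1.0], [2.0, 10.0]]
feature_score = [[1.0, 1.0], [1.0, 3.0]]

combined = aggregate_scores([reconstruction_score, feature_score])
mask = anomaly_mask(combined, threshold=0.5)
```

Normalizing each map before averaging keeps a score with a large numeric range from dominating the others; in practice the threshold would have to be chosen on anomaly-free validation data rather than fixed by hand.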