Memes and computer vision: how automatic content recognition systems interpret internet memes

Internet memes have become a key communication tool in contemporary society, and their analysis is an area of growing interest in social and communication research. Computer vision systems are revolutionizing the analysis of complex images such as internet memes; however, how do these systems interpret and analyze the multimodality and content of such artifacts? This work focuses on the identification, classification, and evaluation of memes through the analysis of six automatic content recognition services: Amazon Rekognition, Clarifai, Google Vision, Imagga, Keras EfficientNetB7 and Microsoft Azure. In particular, it examines how these systems perceive and categorize the different elements of internet memes, comparing their performance and identifying differences in the way artificial intelligence systems interpret these multimodal digital artifacts. Object detection is used to classify the elements present in the memes and to extract information and labels from them, so that the results of the different computer vision systems can be compared. In addition, the ability of these systems to understand the cultural and semiotic context of memes is analyzed by assessing whether they prioritize formal aspects or the meaning of the content within its context. The ultimate goal of the work is to guide researchers in choosing the computer vision system best suited to their needs, based on the results obtained in this analysis. To this end, the project output of this research is an online report and archive entitled Memes through Computer Vision, in which the six automatic content recognition services are compared and the results of the analysis performed on a common dataset of internet memes are presented.
The report includes an investigation of the relationships between the labels recognized by computer vision systems, a series of examinations of the causes of performance differences between computer vision systems, and an analysis of the consistency in the interpretation of labels by automatic content recognition services.
