Automatic song segmentation based on time varying affective semantic descriptors

Cataloguing audio files is a common need among listeners, and the owners of commercial audio databases meet it by offering music search functions. Search is possible because each song is assigned one or more keywords, called tags. A tag, however, describes the whole content of a song without taking temporal variation into account. The purpose of this thesis is to give tags a temporal evolution by building an audio segmenter that automatically divides a song into sections and finds the variations within the music. Tags can describe several characteristics of a song, including the mood it expresses, and the segmentation is based on the detected mood. Mood detection is performed with regressors, which requires training and test steps to find the best regression system. The goal is then to find a good segmentation by analysing the temporal trend of the mood. The last step is to assign tags to the segments found; for this purpose a graphical user interface is developed, which can also perform the preceding operations. The system is tested with a survey of ordinary listeners, whose ratings judge its effectiveness and compare it with another working segmentation system. The proposed system obtains better ratings. Although the optimal settings depend on the song, the survey also shows that there is a general setting that works well.

Cataloguing audio files is a need of those who listen to music, and the owners of commercial databases meet it by providing search functions. This is possible because each song is assigned one or more keywords called tags. The purpose of this thesis is to give the tag a temporal evolution by creating an audio segmenter, whose task is to divide a song into sections automatically and find its variations. Tags can describe various characteristics of a song, including the mood expressed, and the segmentation will be based on that mood. The mood detection phase will be carried out using regressors, which involve a training phase and a test phase to find the best system; after that, the goal is to find a good segmentation by analysing the trend of the mood. The last step consists of assigning tags to the segments found, and for this purpose a graphical interface has been written, which can also perform the operations just described. This system will be tested with a survey given to ordinary people, to obtain ratings that judge the effectiveness of the system and compare it with another type of segmentation system. The proposed system provides improvements, since it obtains better ratings. Although the optimal setting of the system depends on the song, the study shows that there is nonetheless a setting that works well in general cases.