Analysis of machine learning methods for anomaly detection of electrical power consumption in buildings

Anomaly detection is the process of finding patterns or points that deviate from an otherwise regular collection. Anomalies can have different causes: external factors, or faults in the system that generated the data. Data often come as time series, as is the case for the power consumption of buildings. One way to perform anomaly detection is to “teach” a machine learning method what the normal behavior of a system should be and then identify deviations from it. The goal of this thesis is to investigate whether, and to what degree, some of the most promising methods, such as autoencoders, LSTM networks and statistical methods, can identify anomalies in power consumption time series of buildings. The motivation for finding such points is not only economic, since anomalies may generate unexpected costs, but also environmental, because excessive consumption of resources in buildings generates more pollution. In short, this thesis aims to determine which method works best in this setting and why some methods may perform better under certain conditions, and then to propose a solution, based on transfer learning, to the lack of datasets available for training these methods.

Anomaly detection is the process by which it is possible to find patterns or points that stand out from a more regular set. Anomalies can arise for various reasons: they may be caused by external factors or by problems internal to the system under consideration. Data on the energy consumption of buildings are organized as time series, in which the anomalies must be identified. One way to perform anomaly detection is to “teach” a machine learning method what the normal behavior of a system should be and then recognize the deviations. This thesis investigates to what extent some of the most promising machine learning methods, such as autoencoders, LSTM and statistical methods, have the potential to identify anomalies in building energy consumption data. The reason for finding these anomalous points that break the nominal trend is not only economic, since they may lead to unexpected costs, but also environmental, because excessive consumption of resources in buildings can increase pollution. In summary, the aim of this thesis is to understand which method works best in this field and why some methods may work better under certain conditions, and then to propose a solution to the lack of datasets available for training the methods mentioned above. This solution is based on transfer learning.
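
The idea of learning the normal behavior of a system and then flagging deviations can be illustrated with a very simple statistical baseline. The sketch below is not the method used in this thesis: it assumes hourly readings, learns a mean and standard deviation per hour of the day from anomaly-free data, and flags readings that fall more than k standard deviations from that profile. The function names, the threshold k = 4 and the synthetic load curve are all hypothetical choices made for illustration.

```python
import math
import random
from statistics import mean, stdev


def fit_hourly_profile(readings):
    """Learn the mean and standard deviation of consumption per hour of day.

    `readings` is a list of (hour_of_day, value) pairs assumed to be anomaly-free.
    """
    by_hour = {h: [] for h in range(24)}
    for hour, value in readings:
        by_hour[hour].append(value)
    return {h: (mean(v), stdev(v)) for h, v in by_hour.items()}


def detect_anomalies(readings, profile, k=4.0):
    """Return indices of readings more than k standard deviations from the profile."""
    flagged = []
    for i, (hour, value) in enumerate(readings):
        mu, sigma = profile[hour]
        if abs(value - mu) > k * sigma:
            flagged.append(i)
    return flagged


def synthetic_load(n_hours, rng):
    """Toy hourly load: a daily sinusoid plus Gaussian noise (illustrative only)."""
    series = []
    for t in range(n_hours):
        hour = t % 24
        value = 10 + 5 * math.sin(2 * math.pi * hour / 24) + rng.gauss(0, 0.5)
        series.append((hour, value))
    return series


rng = random.Random(0)
train = synthetic_load(24 * 60, rng)  # 60 days of "normal" behavior
test = synthetic_load(24 * 7, rng)    # one week to be checked

hour, value = test[50]
test[50] = (hour, value + 8.0)        # inject one artificial anomaly

profile = fit_hourly_profile(train)
anomalies = detect_anomalies(test, profile)
print(anomalies)
```

A per-hour profile is used here only because building consumption typically follows a daily cycle. Autoencoders and LSTM networks replace this fixed profile with a learned model of normal behavior, but the final step of thresholding the deviation from that model follows the same logic.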