Tensor based processing of electrical grid data

JAFARI GIV, DANIAL
2014/2015

Abstract

Missing data arise in many areas, such as electrical grid data, biomedical signal processing, traffic network analysis, social network services, image processing, and communication systems, where data sets are subject to occasional errors. Moreover, these data collections can be quite large and have more than two axes of variation, e.g., frequency, amplitude, and time. Many applications in these domains aim to capture the underlying hidden structure of the data; in other words, factorizing data sets with missing entries gives a clearer view of the data structure. If the issue of missing data cannot be resolved, many important data sets will be discarded or improperly analyzed. Whereas previous studies have considered only matrices, we focus here on electrical grid data in multi-way arrays (tensors), because such data usually have more than two modes of variation and are therefore best represented as multi-way arrays. For instance, in an electrical grid, the records from the buses can be arranged as a time-bus matrix; data from multiple channels are thus three-dimensional (time, bus, and voltage or power) and form a three-way array. We therefore need a robust and reliable approach for factorizing multi-way arrays in the presence of missing data. In this work we present an analysis of multi-way power grid data to evaluate the benefits of tensor-based processing by computing an approximate Canonical Polyadic decomposition and a Higher-Order Singular Value Decomposition. After interpolation of the missing data, the data exhibit a low-rank structure in the form of a dominant (periodic) rank-one component on top of the rest of the data.
Moreover, in the last chapter we employ one of the most widely used tensor factorizations, the Canonical Polyadic (CP) decomposition, and formulate the CP model as a weighted least squares problem that models only the known entries; this is our second method for factorizing data with missing entries. We apply an algorithm called Canonical Polyadic Weighted Optimization, which uses a first-order optimization approach to solve the weighted least squares problem. In numerical experiments on sampled data, this algorithm is shown to successfully factor matrices with noise and up to 70% missing data. We then compare the two methods on our power grid data; the results show that the CP weighted optimization algorithm yields a better factorization than the interpolation method.
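
The weighted least squares formulation described in the abstract can be illustrated with a minimal NumPy sketch. It is not the thesis code: the function names, the synthetic data, the binary weight array `W` (1 for known entries, 0 for missing ones), and the use of plain gradient descent with a backtracking line search are all assumptions made for illustration. The abstract only states that a first-order method is used, so the solver here is a simple stand-in.

```python
import numpy as np

def cp_reconstruct(A, B, C):
    """Rebuild a three-way array from CP factor matrices (one column per component)."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)

def weighted_cp_loss_and_grads(X, W, A, B, C):
    """Loss 0.5 * ||W * (model - X)||^2 and its gradients w.r.t. A, B, C."""
    X0 = np.where(W > 0, X, 0.0)           # placeholder at missing entries (NaN-safe)
    R = W * (cp_reconstruct(A, B, C) - X0)  # residual, zero wherever W is zero
    loss = 0.5 * np.sum(R ** 2)
    G = W * R                               # equals R when W is binary
    gA = np.einsum("ijk,jr,kr->ir", G, B, C)
    gB = np.einsum("ijk,ir,kr->jr", G, A, C)
    gC = np.einsum("ijk,ir,jr->kr", G, A, B)
    return loss, [gA, gB, gC]

def fit_weighted_cp(X, W, rank, n_iter=200, seed=0):
    """Gradient descent with backtracking (an assumed, simplified first-order solver)."""
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((n, rank)) for n in X.shape]
    loss, grads = weighted_cp_loss_and_grads(X, W, *factors)
    history = [loss]
    for _ in range(n_iter):
        sq_norm = sum(np.sum(g ** 2) for g in grads)
        step = 1.0
        for _attempt in range(40):
            trial = [F - step * g for F, g in zip(factors, grads)]
            trial_loss, trial_grads = weighted_cp_loss_and_grads(X, W, *trial)
            if trial_loss <= loss - 1e-4 * step * sq_norm:  # sufficient decrease
                break
            step *= 0.5
        else:
            break  # no acceptable step found; stop early
        factors, loss, grads = trial, trial_loss, trial_grads
        history.append(loss)
    return factors, history

# Synthetic rank-2 example (not power grid data): about 30% of entries are hidden.
rng = np.random.default_rng(1)
true_factors = [rng.standard_normal((n, 2)) for n in (6, 5, 4)]
X_full = cp_reconstruct(*true_factors)
W = (rng.random(X_full.shape) < 0.7).astype(float)
X = np.where(W > 0, X_full, np.nan)  # missing entries stored as NaN
factors, history = fit_weighted_cp(X, W, rank=2)
```

Because the residual is multiplied by `W`, missing entries contribute nothing to the loss or to the gradients, which is what "models only the known entries" amounts to in this sketch. The fitted factors can then be passed to `cp_reconstruct` to estimate the hidden entries.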
HAARDT, MARTIN
ING - Scuola di Ingegneria Industriale e dell'Informazione
18-Dec-2015
Master's thesis
Attached files

Thesis.pdf (description: THESIS), Adobe PDF, 1.39 MB, openly accessible on the internet

Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10589/114564