Tensor based processing of electrical grid data
JAFARI GIV, DANIAL
2014/2015
Abstract
Missing data is a problem in many areas, such as electrical grid data, biomedical signal processing, traffic network analysis, social network services, image processing, and communication systems, where data sets are prone to occasional errors. Moreover, these data collections can be quite large and have more than two axes of variation, e.g., frequency, amplitude, and time. Many applications in those domains aim to capture the underlying hidden structure of the data; in other words, factorizing data sets with missing entries gives a better view of the data structure. If the issue of missing data cannot be resolved, many important data sets will be discarded or improperly analyzed. Whereas previous studies have considered only matrices, we focus here on electrical grid data in multi-way arrays (tensors), because the data usually have more than two modes of variation and are therefore best represented as multi-way arrays. For instance, in an electrical grid, the data of each record from a bus can be represented as a time-bus matrix; thus, data from multiple channels are three-dimensional (time, bus, and voltage or power) and form a three-way array. Therefore, we need a robust and reliable approach for factorizing multi-way arrays in the presence of missing data. In this work we present an analysis of multi-way power grid data to evaluate the benefits of tensor-based processing by computing an approximate Canonical Polyadic decomposition and a Higher-Order Singular Value Decomposition. After interpolation of the missing data, the data show a low-rank structure in the form of a strongly dominant (periodic) rank-one component on top of the rest of the data.
Moreover, in the last chapter we employ one of the most widely applicable tensor factorizations, the Canonical Polyadic (CP) decomposition, and formulate the CP model as a weighted least squares problem that models only the known entries; this is the second method for factorizing data with missing entries. We apply an algorithm called Canonical Polyadic Weighted Optimization, which uses a first-order optimization approach to solve the weighted least squares problem. In numerical experiments on sampled data, this algorithm is shown to successfully factor matrices with noise and up to 70% missing data. We then compare the two methods on our power grid data; the results show that the CP weighted optimization algorithm yields a better factorization than the interpolation method.
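The three-way layout and the low-rank observation in this paragraph can be illustrated with a small NumPy sketch. Everything in it is an assumption made for illustration: the data are synthetic (a periodic profile scaled per bus and per quantity, plus noise), the 30% missing rate is arbitrary, linear interpolation along the time axis stands in for an interpolation step whose exact method the abstract does not specify, and the helper names `unfold`, `interpolate_time`, `hosvd`, and `reconstruct` are hypothetical. It is not the thesis code.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: rows index `mode`, columns index all other modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def interpolate_time(tensor, observed):
    """Fill unobserved entries by linear interpolation along axis 0 (time)."""
    filled = tensor.copy()
    t = np.arange(tensor.shape[0])
    for idx in np.ndindex(tensor.shape[1:]):
        known = observed[(slice(None),) + idx]
        series = filled[(slice(None),) + idx]  # a view into `filled`
        series[~known] = np.interp(t[~known], t[known], series[known])
    return filled

def hosvd(tensor):
    """Return the core tensor and one orthonormal factor matrix per mode."""
    factors = []
    for mode in range(tensor.ndim):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u)
    core = tensor
    for mode, u in enumerate(factors):
        # n-mode product of the core with u.T along `mode`
        core = np.moveaxis(np.tensordot(u.T, core, axes=(1, mode)), 0, mode)
    return core, factors

def reconstruct(core, factors):
    """Multiply the core by each factor matrix along its mode."""
    out = core
    for mode, u in enumerate(factors):
        out = np.moveaxis(np.tensordot(u, out, axes=(1, mode)), 0, mode)
    return out

# Synthetic (time, bus, quantity) tensor; all sizes and values are made up.
rng = np.random.default_rng(0)
hours = np.arange(48)
profile = 1.0 + 0.5 * np.sin(2 * np.pi * hours / 24)  # periodic pattern
bus_scale = np.linspace(0.8, 1.2, 6)                   # 6 hypothetical buses
qty_scale = np.array([1.0, 0.5])                       # e.g. voltage, power
data = np.einsum('t,b,q->tbq', profile, bus_scale, qty_scale)
data += 0.01 * rng.standard_normal(data.shape)
observed = rng.random(data.shape) > 0.3                # roughly 30% missing
data[~observed] = np.nan

filled = interpolate_time(data, observed)
sv = np.linalg.svd(unfold(filled, 0), compute_uv=False)
core, factors = hosvd(filled)
print("leading mode-0 singular values:", np.round(sv[:3], 3))
print("core shape:", core.shape)
```

On data built this way, the first mode-0 singular value should be much larger than the rest, which is the kind of dominant rank-one component the paragraph refers to.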
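The weighted least squares formulation can be sketched as minimizing 0.5 * ||W * (X - [[A, B, C]])||^2, where W is 1 for known entries and 0 for missing ones, so that only known entries contribute to the fit. The code below is a minimal sketch under stated assumptions, not the thesis implementation: the data are synthetic, the function names are hypothetical, and plain gradient descent with a backtracking line search is swapped in as a simple first-order method, since the abstract does not say which first-order solver is used.

```python
import numpy as np

def cp_model(A, B, C):
    """Rank-R CP model: sum over r of the outer products a_r, b_r, c_r."""
    return np.einsum('ir,jr,kr->ijk', A, B, C)

def loss_and_grads(X, W, A, B, C):
    """Weighted loss and its gradients; W is a binary mask (1 = known)."""
    R = W * (cp_model(A, B, C) - X)  # residual on known entries only
    loss = 0.5 * np.sum(R ** 2)
    gA = np.einsum('ijk,jr,kr->ir', R, B, C)
    gB = np.einsum('ijk,ir,kr->jr', R, A, C)
    gC = np.einsum('ijk,ir,jr->kr', R, A, B)
    return loss, (gA, gB, gC)

def cp_wopt_sketch(X, W, rank, iters=300, seed=0):
    """Gradient descent with Armijo backtracking (an illustrative stand-in)."""
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((n, rank)) for n in X.shape]
    X = np.where(W > 0, X, 0.0)  # missing entries (even NaN) never enter
    loss, grads = loss_and_grads(X, W, *factors)
    history = [loss]
    for _ in range(iters):
        gnorm2 = sum(np.sum(g ** 2) for g in grads)
        step = 1.0
        while step > 1e-12:
            trial = [F - step * g for F, g in zip(factors, grads)]
            trial_loss, trial_grads = loss_and_grads(X, W, *trial)
            if trial_loss <= loss - 1e-4 * step * gnorm2:
                break  # sufficient decrease reached
            step *= 0.5
        else:
            break  # no acceptable step found: stop iterating
        factors, loss, grads = trial, trial_loss, trial_grads
        history.append(loss)
    return factors, history

# Synthetic rank-2 tensor with 70% of the entries hidden (sizes are made up).
rng = np.random.default_rng(1)
true = [rng.standard_normal((n, 2)) for n in (12, 8, 4)]
X_full = cp_model(*true) + 0.01 * rng.standard_normal((12, 8, 4))
W = (rng.random(X_full.shape) > 0.7).astype(float)

est, history = cp_wopt_sketch(X_full, W, rank=2)
hidden = W == 0
rel_err = (np.linalg.norm((cp_model(*est) - X_full)[hidden])
           / np.linalg.norm(X_full[hidden]))
print("weighted loss: first", history[0], "last", history[-1])
print("relative error on hidden entries:", rel_err)
```

The line search guarantees that the weighted loss never increases between iterations; how well the hidden entries are recovered depends on the rank, the missing rate, and the starting point, so the printed error is only indicative.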
File: Thesis.pdf (description: THESIS; openly accessible online), 1.39 MB, Adobe PDF
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/114564