Efficient processing of graph-based data streams
Chaudhry, Hassan Nazeer
2024/2025
Abstract
Graph data structures model relations between entities in many application domains. With the emergence of the Internet of Things (IoT) and big data, graphs have become very large. Modern graph processing systems partition a graph across a distributed system and process the partitions in parallel, combining the final results on a centralized system. Although graph processing systems enable scalable distributed computations over large graphs, they are limited to static scenarios in which the graph's structure does not change. Most modern applications are dynamic, resulting in graphs that continuously evolve. Understanding the evolution of graphs is key to enabling timely reactions when necessary. During my PhD, we addressed this problem by proposing a new model to express temporal patterns over graph data structures. The model integrates computations over graphs, which extract relevant values, with temporal operators, which define patterns of interest in the evolution of the graph. During the research, we developed the syntax and semantics of this model and implemented it in a framework called FlowGraph, a middleware for temporal pattern recognition in large-scale graphs. The performance and scalability of FlowGraph are thoroughly evaluated using various workloads and use cases. FlowGraph delivers performance comparable to that of state-of-the-art graph processing tools that process static graphs. In the presence of temporal patterns, it can further optimize processing by deferring complex graph computations until they are strictly necessary for pattern evaluation.

| File | Size | Format | |
|---|---|---|---|
| Doctoral_Thesis___Politecnico_di_Milano.pdf (accessible online only to authorized users). Description: Efficient processing of graph-based Data Streams | 1.06 MB | Adobe PDF | View/Open |
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/241737