In recent years, the increasing adoption of graph databases has highlighted the need for efficient reasoning mechanisms to infer new knowledge from structured data. Although existing Knowledge Graph Management Systems (KGMSs), such as the Vadalog Engine of the Bank of Italy, provide powerful reasoning capabilities, their complexity and resource demands make them impractical for small- to medium-sized enterprises. This thesis presents a lightweight reasoning engine that leverages Datalog-based inference through database triggers in Neo4j’s APOC framework. The system introduces DatalogLR, a syntactic adaptation of Datalog designed for graph-structured data, and implements a transpiler that converts DatalogLR rules into semantically equivalent triggers. A proposed trigger controller engine ensures correct execution, handling recursion and stratified negation while simulating chase-based reasoning. The system has been evaluated on real-world datasets, including Wikipedia, DBLP, and the Italian Company Knowledge Graph, demonstrating competitive performance on small and medium-sized graphs with respect to the state-of-the-art Vadalog Engine. These contributions provide a practical alternative for reasoning in constrained environments, bridging the gap between expressivity and computational efficiency.
In recent years, the growing adoption of graph databases has highlighted the need for efficient reasoning mechanisms to infer new knowledge from this structured data. Although existing Knowledge Graph Management Systems (KGMSs), such as the Bank of Italy's Vadalog Engine, offer powerful reasoning capabilities, their complexity and high resource requirements make them impractical for small and medium-sized enterprises. This thesis presents a lightweight reasoning engine based on Datalog inference through the trigger mechanism of Neo4j's APOC library. The system introduces DatalogLR, a syntactic adaptation of Datalog designed for graph-structured data, and implements a transpiler that converts DatalogLR rules into semantically equivalent triggers. A trigger controller, itself a contribution of this thesis, guarantees correct execution, handling recursion and stratified negation and simulating reasoning based on the chase procedure. The system was evaluated on real-world datasets, including Wikipedia, DBLP, and the Italian Company Knowledge Graph, showing competitive performance on small and medium-sized graphs compared with the Vadalog Engine. These contributions provide a practical alternative for ontological reasoning in resource-constrained environments, bridging the gap between expressivity and computational efficiency.
A lightweight reasoning engine with Neo4j triggers
Pisani, Matteo
2023/2024
File | Size | Format | |
---|---|---|---|
2024_04_Pisani.pdf (thesis text; accessible online to everyone from 12/03/2026) | 1.49 MB | Adobe PDF | View/Open |
2024_04_Pisani_executive_summary.pdf (executive summary; accessible online to everyone from 12/03/2026) | 680.48 kB | Adobe PDF | View/Open |
Documents in POLITesi are protected by copyright, and all rights are reserved unless otherwise indicated.
https://hdl.handle.net/10589/236246