Building a graph database of the Swiss legislation for storage, exploration and analysis
Galetti, Filippo
2024/2025
Abstract
Graph-based representations are powerful tools for modeling complex information systems with numerous interconnected data points. This makes them particularly suitable for legislative and legal systems, which are designed to be highly structured and interdependent. Despite the availability of the Swiss legislative corpus in digital format, no system currently exists that integrates the underlying structural relationships with extensive metadata and direct access to texts. Furthermore, Swiss legal documents are distributed across multiple formats, limiting their accessibility for granular and in-depth analysis. This thesis presents the construction of a graph database designed to store, explore and analyze the Swiss legislative system. A series of processing pipelines has been developed to parse, reformat and restructure official documents, available primarily in PDF and XML formats, into an integrated graph model. The database is implemented in Neo4j, and the document extraction pipelines employ various Python libraries. The resulting system is a fully queryable graph database in which laws, articles and annexes are represented as interconnected nodes, references are modeled as directed edges, and multilingual article texts are stored as node properties. Building on this structure, network analysis techniques such as centrality measures and community detection have been applied to reveal structural features of the legislative network. In addition, statistical comparisons of a shared set of acts across the three official languages have been performed, revealing notable morphological and syntactic differences. This work demonstrates how graph-based databases can effectively model legislative systems, providing a versatile platform to support both functional exploration and research.

| File | Size | Format |
|---|---|---|
| 2025_07_Galetti.pdf (openly accessible online from 02/07/2026) | 4.3 MB | Adobe PDF |
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/240980