Building a cloud data platform for retail media: insights from a real-world client case
Sica, Francesco
2024/2025
Abstract
Retail Media, the sale of advertising space across a retailer’s online and in-store channels, depends on fast, reliable access to first-party data. Within the client’s retail group, data were scattered across several autonomous brands, each running its own checkout and customer-data systems, which left the data fragmented and inconsistent. The challenge was to merge these heterogeneous data sets into a single, trustworthy source that is fast enough for daily campaign execution and reliable enough for long-term analytics. After outlining the broader Retail Media initiative, the thesis focuses on the Cloud Data Platform that provides the technical foundation for all data-driven projects and ensures their long-term viability. Built on Google Cloud Platform and structured according to the Medallion Architecture, the platform centralizes first-party data across brands. A custom Python ingestion engine, orchestrated with Cloud Composer, delivers each daily file unchanged to a Bronze layer for auditability. Schemas are harmonized in the Silver layer, where raw records are cleaned and standardized across brands, and a shared dimensional warehouse model is built in the Gold layer for analytics and reporting. This process replaces manual file consolidation with fully automated pipelines and establishes a centrally governed repository that powers Retail Media dashboards as well as the company’s broader analytics ecosystem. Marketing and merchandising teams now explore customer segments and campaign KPIs through self-service dashboards, while other business functions can build on the same authoritative source of data.

| File | Description | Size | Format |
|---|---|---|---|
| 2025_07_Sica.pdf (not accessible) | Thesis file, Francesco Sica | 1.17 MB | Adobe PDF |
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/240818