Reimagining profiling: LLMs for the next era of hyper-personalization
Fabris, Filip
2024/2025
Abstract
In recent years, the CPaaS (Communications Platform as a Service) industry has become a key enabler of personalized marketing across channels such as SMS, WhatsApp, and Viber. Yet despite abundant messaging data, user interactions are often treated as isolated transactions, with no coherent understanding of customer behavior and intent. This thesis introduces a novel user profiling pipeline powered by generative AI, specifically large language models (LLMs), that constructs rich, interpretable profiles from unstructured message logs and behavioral signals. Unlike traditional field-based approaches, this system enables semantic search and behavioral segmentation, unlocking insights that were previously inaccessible. To implement this approach, we propose a multi-stage architecture that begins with prompt-based persona generation, followed by semantic search using bi-encoders and reranking with cross-encoders. This architecture allows marketers to query the system in natural language (e.g., “Find users interested in fashion and loyalty programs”) and retrieve user segments ranked by inferred behavioral affinity. In doing so, it transforms raw CPaaS data into actionable intelligence for targeted campaigns. The system also demonstrates how generative AI can be used for automated, personalized content creation: campaign messages are tailored to each user's inferred interests and motivations, increasing relevance and engagement while significantly reducing manual effort. Furthermore, the thesis outlines a feedback loop architecture in which campaign performance is used to iteratively refine user profiles. While not yet implemented end-to-end, this design supports a self-improving system capable of adapting over time, bringing the industry closer to the vision of Agentic AI, where campaigns are described in free text and autonomously executed and optimized by intelligent systems.
Finally, we ensure ethical and scalable deployment through a privacy-compliant design that includes support for consent-based data clean rooms, enabling richer user-level profiling while preserving regulatory compliance.

| File | Description | Access | Size | Format |
|---|---|---|---|---|
| Reimagining Profiling LLMs for the Next Era of Hyper-personalization.pdf | Transforming Unstructured Data Into Behavioral Intelligence at Scale | Openly accessible on the internet | 8.05 MB | Adobe PDF |
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/239767