Wi-Fi is a well-established technology that has become a necessity for many people. This work studies the data that devices broadcast when Wi-Fi is switched on. In particular, we analyze Probe Requests: smartphones, tablets and similar devices that want to associate with an Access Point (and obtain Wi-Fi connectivity) have to send these Probes. These packets carry a lot of information about the device and its owner. The source MAC address lets us distinguish devices and identify their vendor; the timestamp helps us understand when, and for how long, people stayed in the place where we collected the Probes; the SSIDs are the most interesting field. SSIDs are the names of Access Points (APs): each device stores a Preferred Network List (PNL) made up of the SSIDs of the APs it has previously associated with. Apple recently introduced a complication: MAC address randomization, which produces different MAC addresses for the same device. We propose to "fingerprint" devices by combining timestamp analysis, vendor information and matches between PNLs. We then tried to associate a geographic position with every SSID (through WIGLE). Given the PNL of a device, we reconstructed a set of positions where its owner had previously been. From these positions we computed a Centroid for the user, representing his or her area of influence. The aim was to classify people based on their movements; we managed to assign each person a nationality (in the best case) or a continent of provenance. The data were collected without ground truth; nevertheless, we collected data in two stations and found a satisfying match between the Centroids and the stops of the trains arriving at that time. This can be useful in a shop, for example, to know the provenance of customers, or in a given area, to target marketing at a specific group of people.
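The Centroid step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each SSID in a PNL has already been resolved to a (latitude, longitude) pair. The function name `geographic_centroid`, the sample SSIDs and the coordinates are hypothetical. Averaging the points as 3D unit vectors is also an assumption, since the abstract does not say how the Centroid is actually computed.

```python
import math


def geographic_centroid(positions):
    """Return the (lat, lon) centroid, in degrees, of a list of (lat, lon) points.

    Points are averaged as 3D unit vectors on a sphere, so that positions
    on both sides of the antimeridian are handled sensibly. This is an
    assumed method, not necessarily the one used in the thesis.
    """
    if not positions:
        raise ValueError("at least one position is required")
    x = y = z = 0.0
    for lat_deg, lon_deg in positions:
        lat, lon = math.radians(lat_deg), math.radians(lon_deg)
        x += math.cos(lat) * math.cos(lon)
        y += math.cos(lat) * math.sin(lon)
        z += math.sin(lat)
    n = len(positions)
    x, y, z = x / n, y / n, z / n
    # Convert the mean vector back to latitude and longitude.
    lon = math.atan2(y, x)
    lat = math.atan2(z, math.hypot(x, y))
    return math.degrees(lat), math.degrees(lon)


# Hypothetical PNL: SSID -> (lat, lon), as it might come out of a geolocation lookup.
pnl_positions = {
    "HomeNetwork": (45.46, 9.19),
    "OfficeWiFi": (45.48, 9.23),
    "CafeGuest": (45.07, 7.69),
}
centroid = geographic_centroid(list(pnl_positions.values()))
print(centroid)
```

A plain arithmetic mean of latitudes and longitudes would give nearly the same result for points that are close together. The vector average is used here only because the positions in a PNL may be spread across countries or continents.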
Estimating users’ provenience through analysis of wi-fi probe requests
SARTORI, CAMILLA
2015/2016
File | Size | Format
---|---|---
tesi_cs.pdf (thesis text; accessible online only to authorized users) | 14.68 MB | Adobe PDF
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/132469