Bilateral trade models the interaction between a seller and a buyer with private valuations. Within the online learning framework introduced by Nicolò Cesa-Bianchi et Al., mechanisms post fixed prices sequentially and are evaluated against the best fixed price in hindsight. More recently, Bernasconi et Al. introduce a new notion of budget balance, together with a stronger benchmark: the best randomized pricing strategy over price pairs that satisfies global budget balance in expectation. While they prove that learning against such baseline is impossible in adversarial environments, this work shows that the distributional benchmark is attainable in stochastic environment, introducing a globally budget-balanced learning algorithm that achieves O(T^(3/4)) pseudo-regret against such distributional benchmark up to logarithmic factors.
Il bilateral trade modella l’interazione tra un venditore e un acquirente con valutazioni private. Nel framework di online learning introdotto da Nicolò Cesa-Bianchi et Al., i meccanismi pubblicano sequenzialmente prezzi fissi e vengono valutati rispetto al miglior prezzo fisso a posteriori. Più recentemente, Bernasconi et Al. introducono una nuova nozione di budget balance, insieme a un benchmark più forte: la migliore strategia di pricing randomizzata su coppie di prezzi che soddisfa il vincolo di global budget balance in aspettativa. Mentre dimostrano che imparare rispetto a tale riferimento è impossibile in ambienti avversari, questo lavoro mostra che il benchmark distribuzionale è conseguibile in ambienti stocastici, introducendo un algoritmo di apprendimento globalmente budget-balanced che ottiene pseudo-regret O(T^(3/4)) rispetto a tale benchmark distribuzionale a meno di fattori logaritmici.
Online bilateral trade: learning the optimal policy in a stochastic environment
PICCINATO, MATTIA
2024/2025
Abstract
Bilateral trade models the interaction between a seller and a buyer with private valuations. Within the online learning framework introduced by Nicolò Cesa-Bianchi et Al., mechanisms post fixed prices sequentially and are evaluated against the best fixed price in hindsight. More recently, Bernasconi et Al. introduce a new notion of budget balance, together with a stronger benchmark: the best randomized pricing strategy over price pairs that satisfies global budget balance in expectation. While they prove that learning against such baseline is impossible in adversarial environments, this work shows that the distributional benchmark is attainable in stochastic environment, introducing a globally budget-balanced learning algorithm that achieves O(T^(3/4)) pseudo-regret against such distributional benchmark up to logarithmic factors.| File | Dimensione | Formato | |
|---|---|---|---|
|
Tesi.pdf
accessibile in internet per tutti
Descrizione: PDF File of: "Online Bilateral Trade: Learning the Optimal Policy in a Stochastic Environment"
Dimensione
568.52 kB
Formato
Adobe PDF
|
568.52 kB | Adobe PDF | Visualizza/Apri |
|
Executive Summary.pdf
accessibile in internet per tutti
Descrizione: Executive Summary of "Online Bilateral Trade: Learning the Optimal Policy in a Stochastic Environment"
Dimensione
423.14 kB
Formato
Adobe PDF
|
423.14 kB | Adobe PDF | Visualizza/Apri |
I documenti in POLITesi sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.
https://hdl.handle.net/10589/249177