Augmenting traders with learning machines

Financial markets comprise many participants with diverse roles and objectives. Asset management firms optimize the portfolios of pension funds, institutions and private individuals; market makers provide liquidity by continuously quoting prices and hedging their risks; proprietary traders invest their own capital using sophisticated methodologies. The approaches adopted by these actors are either manual or based on expert systems that rely on the experience of traders, and are thus subject to human bias and error. This dissertation proposes innovative techniques to address these limitations of current trading strategies. Specifically, we explore algorithms capable of autonomously learning the sequential decision-making processes described above. Developing these algorithms requires a careful reproduction of realistic environments and adherence to the trading objectives, i.e., maximizing returns while maintaining a low risk profile and minimizing costs. The algorithms share a common core structure: each makes a trading decision conditional on the current state of the financial markets. Our main theoretical and algorithmic contributions are an extension of the online learning field, in which we introduce transaction costs and conservativeness into online portfolio optimization, and an enhancement of Monte Carlo Tree Search algorithms to account for the stochasticity and high noise typical of financial markets. In terms of experimental contributions, we apply Reinforcement Learning to learn profitable quantitative trading strategies and option hedging approaches superior to the standard Black & Scholes hedge. We also find that Reinforcement Learning combined with Mean Field Games enables the development of competitive bond market making strategies. Finally, we demonstrate that dynamic optimal execution methods can be learned through Thompson Sampling combined with Reinforcement Learning.
The use of such advanced techniques in a production environment may yield a competitive advantage that translates into economic benefits.
