Elevators play a crucial role in modern cities. Besides helping people with disabilities, they were a key enabler of the advent of high-rise buildings, contributing to shaping our urban areas. When designing a lift system, a trip-scheduling algorithm is fundamental to reducing energy and space costs. Its role is to organize elevator trips in response to passenger demand, efficiently dispatching people between floors. In this work, we propose a reinforcement learning (RL) based approach to control single- and multi-elevator systems, which we call the self-learning controller. To develop a method employable in a real installation, we paid great attention, when modeling the agent-environment interaction, to building a realistic simulation of the lift system and of its interaction with passengers. Furthermore, our RL policies exploit only information typically available in a real installation. This work focuses mainly on value-based, model-free RL methods, comparing tabular Q-learning and several variants of deep Q-learning (DQN). We show that DQN achieves state-of-the-art performance on a single-elevator installation, being competitive with the collective selective control operation. We also propose an application of fully cooperative multi-agent reinforcement learning (MARL) to control a double-lift installation, with independent DQN learners sharing the same state representation and Q-network. Even though it lacks optimal cooperation between agents, this approach matches the performance of our implementation of double-elevator control based on the estimated time of arrival (ETA). This work demonstrates that, under realistic assumptions, RL can successfully control an elevator system; it is the initial contribution to the self-learning controller project, developed at the Schindler EPFL Lab.
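As an illustration of the value-based, model-free methods the abstract compares, here is a minimal sketch of tabular Q-learning on a toy single-elevator dispatch problem. The environment, state encoding, reward values, and hyperparameters below are illustrative assumptions for this sketch, not the thesis simulator or its reward design:

```python
import random

# Toy environment (hypothetical): state = (elevator_floor, call_floor).
# Actions: 0 = move down one floor, 1 = stay/serve, 2 = move up one floor.
# Reward: -1 per time step; +10 when the elevator reaches the calling floor,
# after which a new random call arrives.
N_FLOORS = 5
MOVES = (-1, 0, 1)

def step(state, action_idx):
    lift, call = state
    lift = max(0, min(N_FLOORS - 1, lift + MOVES[action_idx]))
    if lift == call:  # call served: sample the next call
        return (lift, random.randrange(N_FLOORS)), 10.0
    return (lift, call), -1.0

def train(episodes=2000, horizon=30, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    random.seed(seed)
    Q = {}  # tabular action-value estimates: state -> [q_down, q_stay, q_up]
    for _ in range(episodes):
        s = (random.randrange(N_FLOORS), random.randrange(N_FLOORS))
        for _ in range(horizon):
            qs = Q.setdefault(s, [0.0, 0.0, 0.0])
            # epsilon-greedy behavior policy
            a = random.randrange(3) if random.random() < eps \
                else max(range(3), key=lambda i: qs[i])
            s2, r = step(s, a)
            q_next = Q.setdefault(s2, [0.0, 0.0, 0.0])
            # Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            qs[a] += alpha * (r + gamma * max(q_next) - qs[a])
            s = s2
    return Q

def greedy_action(Q, state):
    qs = Q.get(state, [0.0, 0.0, 0.0])
    return max(range(3), key=lambda i: qs[i])
```

After training, the greedy policy moves toward the calling floor and serves it on arrival. The DQN variants studied in the thesis replace the table `Q` with a neural network trained on the same temporal-difference target.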
Self-learning elevator controller: optimizing elevator control with deep reinforcement learning
ISCHIA, NICOLA
2019/2020
File: 2020_10_Ischia.pdf (Adobe PDF, 1.7 MB, not accessible)
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/167445