A comparison of HPO algorithms through search space exploration

This thesis was developed during an internship at Alten Laboratory Sophia Antipolis (France). My work is part of a larger project that aims to automate the creation of a complete machine learning pipeline for different types of problems. The project comprises many tasks; during my six months of collaboration, I worked on the hyperparameter tuning problem. This task is usually performed by scientists who rely on their knowledge to find the best hyperparameters for a given model. However, in the field of Hyperparameter Optimization (HPO), many algorithms have been proposed to automate this search. I studied the literature, from the baseline approaches (Grid Search, Random Search) to the current state-of-the-art techniques. Specifically, I focused on the multi-fidelity family, in which two advanced algorithms have been introduced in recent years: one based on Bayesian Optimization (Bayesian Optimization Hyperband, BOHB) and one based on Differential Evolution (Differential Evolution Hyperband, DEHB). The goal is to compare current state-of-the-art algorithms and to analyze their differences. We implemented a test framework to perform fair and robust comparisons. The main challenge was the computational time the experiments required to reveal the methods' properties. We compared the algorithms in terms of both final performance and exploration of the search space. From our studies, we concluded that neither technique consistently outperforms the other, but that they differ in how they explore the search space. DEHB seems to explore all directions of the space more uniformly, while BOHB is guided by the knowledge acquired by its model-based component.

This thesis was developed during an internship at the company Alten. My work is part of a project on the automation of Machine Learning. During the six months of work I tackled the problem of hyperparameter selection. Usually, given a model, the choice of hyperparameters falls to scientists, who draw on their knowledge and experience to obtain the best possible results. The values of the hyperparameters can, in fact, strongly affect the final performance of the model. The field of Hyperparameter Optimization (HPO) aims to develop algorithms capable of solving this problem. I analyzed and studied the literature, from the simplest techniques to the state of the art. In particular, I focused on the family of multi-fidelity methods, on which many of the most recent algorithms are based. Among these are Bayesian Optimization Hyperband (BOHB) and Differential Evolution Hyperband (DEHB). The objective is to study the algorithms that represent the state of the art, so as to compare them and analyze their differences. For this reason we built a test framework and defined a comparison methodology. The main obstacle is the computational complexity required by the experiments in order to show the properties of the different algorithms. In the comparison tests we focused on both final performance and exploration of the search space. From the studies carried out, we can conclude that we did not find a technique able to beat the others in every experiment. However, analyzing how the space is explored in the search for the optimum can reveal some differences between the optimizers. DEHB appears to explore the space more uniformly in all directions, whereas BOHB appears to be guided more by the knowledge acquired through its model-based component.