Searching expandable architectures for incremental learning
CASTRO SOLAR, VICENTE JAVIER
2023/2024
Abstract
Incremental learning is a machine learning paradigm where the model is trained on a sequential stream of tasks. This challenging setup requires a model capable of balancing the ability to learn new tasks against not forgetting previously learned knowledge, properties usually referred to as plasticity and stability, respectively. Neural Architecture Search (NAS), a subfield of AutoML, aims to find an optimal architecture for a given ML problem by searching over a large space of possible neural network (NN) architectures. Although NAS-based models have achieved state-of-the-art results on static datasets, their application to incremental learning has been limited to solutions that search for expansions at every new task arrival, making them unfeasible in resource-constrained environments. Our work, called SEAL, focuses on a particular incremental learning scenario we refer to as data incremental learning, where disjoint dataset samples arrive sequentially and are not retained in memory for the learner to revisit in the future. SEAL counters the loss of plasticity by adding dynamic model expansions based on the measured capacity when new data samples arrive, and reduces catastrophic forgetting by training with an additional cross-distillation loss after the expansions. The proposed framework uses a NAS setup to jointly optimize the architecture and the coupled growth policy for the NN. Our results show that the learned policies effectively reduce forgetting and improve model performance, while reducing memory complexity compared to previous methods.
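To make the stability term of the abstract concrete: a cross-distillation loss of the kind described above can be read as a knowledge-distillation objective in which the pre-expansion model acts as teacher for the expanded model. The sketch below is a minimal, illustrative PyTorch version under that assumption; the function name, temperature, and mixing weight `alpha` are placeholders chosen for the example and are not taken from the thesis.

```python
# Illustrative sketch only: a generic cross-distillation objective combining a task
# loss on newly arrived data with distillation from the frozen pre-expansion model.
# Names and hyperparameters are assumptions, not SEAL's actual implementation.
import torch
import torch.nn.functional as F

def cross_distillation_loss(student_logits, teacher_logits, targets,
                            temperature=2.0, alpha=0.5):
    # Cross-entropy on the new samples: the plasticity term.
    ce = F.cross_entropy(student_logits, targets)
    # KL divergence between softened output distributions: the stability term,
    # anchoring the expanded model to the pre-expansion model's behaviour.
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * (temperature ** 2)
    return alpha * ce + (1.0 - alpha) * kd
```

In this reading, `alpha` trades plasticity (fitting the newly arrived samples) against stability (staying close to the pre-expansion model's predictions after a growth step).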
| File | Description | Size | Format | Access |
|---|---|---|---|---|
| 2024_12_Castro_Thesis_01.pdf | Thesis | 11.49 MB | Adobe PDF | Openly accessible on the internet |
| 2024_12_Castro_Executive_Summary_02.pdf | Executive Summary | 1.66 MB | Adobe PDF | Openly accessible on the internet |
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/230825