Structured Meta-Learning for Cross-Domain Few-Shot Classification. A study on structured representations and their effectiveness when dealing with tasks from heterogeneous domains
De Angeli, Nicola
2020/2021
Abstract
In Few-Shot Learning (FSL), models are challenged to learn new tasks from only a few examples. Cross-Domain Few-Shot Learning (CDFSL) takes the FSL problem one step further by additionally requiring test tasks to be subject to domain shift. Despite increasing efforts by recent works in the field, how to effectively meta-learn across multiple training domains while avoiding meta-overfitting remains an important open challenge. To this end, we propose Corrupted-Omniglot, a novel CDFSL classification benchmark. Furthermore, we present and analyze multiple techniques that rely on disentanglement, domain agnosticism, and high-quality batch normalization statistics to tackle the CDFSL problem. We discovered that large amounts of pre-training and the use of domain information during classification significantly worsen performance. More generally, we found that many ways of extending state-of-the-art FSL architectures to account for the presence of multiple training and test domains fail to boost performance in CDFSL. Nonetheless, we believe that domain agnosticism and high-quality batch normalization statistics still represent two promising research directions that are not yet sufficiently explored in CDFSL.

| File | Size | Format |
|---|---|---|
| 2021_04_De Angeli.pdf (Open Access from 08/04/2022) | 11.13 MB | Adobe PDF |
Documents in POLITesi are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/10589/173660