Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Introduction
- Meta-learning: learning to learn
- Goal of the trained model is to quickly learn a new task from a small amount of new data
- Model is trained by the meta-learner to be able to learn on a large number of different tasks
- Key idea: train the model’s initial params such that the model achieves maximal performance on a new task after the params have been updated through one or a few gradient steps on a small amount of data from that task (formalized in the update equation below).
- Can be viewed as building an internal representation that is suitable for many tasks, so that fine-tuning the params slightly yields good performance.
- The process optimizes for models that are fast and easy to fine-tune.
- From a dynamical systems view, it maximizes the sensitivity of the loss functions of new tasks with respect to the params.
- When sensitivity is high, small local changes to the params can lead to large improvements in the task loss.
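- Formally (one inner gradient step, as in the paper): for a task \(\mathcal{T}_i\) with loss \(\mathcal{L}_{\mathcal{T}_i}\) and inner step size \(\alpha\), the adapted params are \(\theta_i' = \theta - \alpha \nabla_\theta \mathcal{L}_{\mathcal{T}_i}(f_\theta)\); meta-training chooses the initialization \(\theta\) so that \(\mathcal{L}_{\mathcal{T}_i}(f_{\theta_i'})\) is small across tasks.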
- Model and task-agnostic algorithm
- meta-learning problem treats entire tasks as “training examples”
- Training phase (see the code sketch below):
- Sample a task from a distribution of tasks
- Sample K input samples from the task
- Get the model’s output on these samples
- Calculate the task loss and adapt the params with one or a few gradient steps
- Test the adapted model on unseen data from the task
- Model is improved by considering how the test error on unseen data changes with the parameters (this test error serves as the training error of the meta-learning process)
- Generally, tasks used for meta-testing are held out during meta-training.
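A minimal sketch of one meta-training iteration following the steps above, assuming PyTorch (>= 2.0 for `torch.func.functional_call`) and the paper's sinusoid-regression toy domain; the helpers `sample_task`/`sample_points`, the network size, and all step sizes are illustrative assumptions, not the paper's code:

```python
# Sketch of one MAML meta-training iteration (hyperparameters illustrative).
import math
import torch
import torch.nn as nn
from torch.func import functional_call

def sample_task():
    # A task is a sine function y = A * sin(x + b) with random amplitude/phase.
    A = torch.empty(1).uniform_(0.1, 5.0)
    b = torch.empty(1).uniform_(0.0, math.pi)
    return lambda x: A * torch.sin(x + b)

def sample_points(task, K):
    # Draw K input points and their labels from the task.
    x = torch.empty(K, 1).uniform_(-5.0, 5.0)
    return x, task(x)

model = nn.Sequential(nn.Linear(1, 40), nn.ReLU(), nn.Linear(40, 1))
meta_opt = torch.optim.Adam(model.parameters(), lr=1e-3)  # outer step size (beta)
alpha, K, meta_batch = 0.01, 10, 25                       # inner step size, shots, tasks per batch

meta_opt.zero_grad()
meta_loss = 0.0
for _ in range(meta_batch):
    task = sample_task()                       # sample a task from p(T)
    x_train, y_train = sample_points(task, K)  # sample K examples

    # Inner loop: one gradient step on the K examples. create_graph=True keeps
    # the graph so the meta-gradient can later flow through this update.
    train_loss = nn.functional.mse_loss(model(x_train), y_train)
    grads = torch.autograd.grad(train_loss, model.parameters(), create_graph=True)
    adapted = {name: p - alpha * g
               for (name, p), g in zip(model.named_parameters(), grads)}

    # Test the adapted params on unseen data from the same task; this test
    # error is the training error of the meta-learning process.
    x_test, y_test = sample_points(task, K)
    y_pred = functional_call(model, adapted, (x_test,))
    meta_loss = meta_loss + nn.functional.mse_loss(y_pred, y_test)

(meta_loss / meta_batch).backward()  # gradient through the inner gradient step
meta_opt.step()                      # update the initialization theta
```

Dropping `create_graph=True` recovers the paper's first-order approximation, which ignores the second-derivative terms in the meta-gradient.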
MAML Algorithm
- Intuition: some internal representations are more transferable than others
- A NN might learn internal features that are broadly applicable to all tasks in the task distribution, rather than to a single task
- How to encourage the emergence of such general-purpose representations?
- Aim to find model parameters that are sensitive to changes in the task
- Small changes in the parameters produce large improvements in the loss function of any task from the distribution, when altered in the direction of that task’s loss gradient.
- Only assumption: the loss function is smooth enough in the parameters that gradient-based techniques can be applied.
- Meta-optimization is performed over the model params \(\theta\), whereas the objective is computed using the updated model params \(\theta_i'\).
- The MAML meta-gradient update involves a gradient through a gradient (see the toy check below).
- Meta-objective: \(\min_\theta \sum_{\mathcal{T}_i \sim p(\mathcal{T})} \mathcal{L}_{\mathcal{T}_i}(f_{\theta_i'}) = \sum_{\mathcal{T}_i \sim p(\mathcal{T})} \mathcal{L}_{\mathcal{T}_i}\big(f_{\theta - \alpha \nabla_\theta \mathcal{L}_{\mathcal{T}_i}(f_\theta)}\big)\), optimized via SGD on the initialization: \(\theta \leftarrow \theta - \beta \nabla_\theta \sum_{\mathcal{T}_i \sim p(\mathcal{T})} \mathcal{L}_{\mathcal{T}_i}(f_{\theta_i'})\).
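To make the gradient-through-a-gradient concrete, a toy scalar check (my illustration, not from the paper): with \(f(\theta) = \theta^2\) and one inner step \(\theta' = \theta - \alpha f'(\theta) = (1 - 2\alpha)\theta\), the meta-gradient is \(\frac{d}{d\theta} f(\theta') = 2\theta(1 - 2\alpha)^2\), which autodiff reproduces as long as the inner gradient stays in the graph:

```python
# Toy check of a gradient through a gradient (second-order autograd).
import torch

alpha = 0.1
theta = torch.tensor(3.0, requires_grad=True)

inner_loss = theta ** 2
(g,) = torch.autograd.grad(inner_loss, theta, create_graph=True)  # g = 2*theta
theta_adapted = theta - alpha * g  # theta' = (1 - 2*alpha) * theta = 2.4

outer_loss = theta_adapted ** 2
outer_loss.backward()  # differentiates through the inner gradient step
print(theta.grad)      # tensor(3.8400) = 2 * 3.0 * (1 - 0.2)**2
```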