Enhancing Generalization in Meta-learning via Task Augmentation


This deck explores methods to improve generalization in meta-learning through task augmentation, addressing learner overfitting and memorization overfitting. Discussed solutions include involving more data (combining the support and query sets, label shifting) and designing effective augmentation strategies such as MetaMix.

  • Meta-learning
  • Task Augmentation
  • Generalization
  • Memorization
  • MetaMix


Presentation Transcript


  1. Improving Generalization in Meta-learning via Task Augmentation
Huaxiu Yao1, Longkai Huang2, Linjun Zhang3, Ying Wei4, Li Tian2, James Zou1, Junzhou Huang2, Zhenhui Li5
1Stanford University, 2Tencent AI Lab, 3Rutgers University, 4City University of Hong Kong, 5Pennsylvania State University

  2. Gradient-based Meta-learning
Task $\mathcal{T}_i$ is sampled from $p(\mathcal{T})$, with data $\mathcal{D}_i$ split into a support set $\mathcal{D}_i^s$ and a query set $\mathcal{D}_i^q$. The ML model is $f$ with initial parameters $\theta_0$ (the global prior). In MAML [Finn et al. 2017], our focus, adaptation starts from the learned prior $\theta_0$ rather than a random start: the inner loop adapts $\theta_0$ to task-specific parameters $\phi_i = \theta_0 - \alpha \nabla_{\theta}\mathcal{L}(f_{\theta}; \mathcal{D}_i^s)$, moving toward the parameters that are best for task $\mathcal{T}_i$, and the outer loop updates $\theta_0$ using the adapted model's loss on the query set $\mathcal{D}_i^q$. (Slide diagram: from the global prior $\theta_0$, adaptation reaches the best parameters for each of tasks $\mathcal{T}_1, \mathcal{T}_2$.) This setup carries a risk of meta-overfitting.
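The two-level update is easiest to see in code. Below is a minimal second-order MAML sketch in PyTorch; `model_fn`, `loss_fn`, the step sizes `alpha`/`beta`, and the `(xs, ys, xq, yq)` task format are illustrative assumptions, not the authors' implementation.

```python
# Minimal second-order MAML sketch (PyTorch). Assumes model_fn(params, x)
# applies the model given a list of parameter tensors, and each task is a
# (x_support, y_support, x_query, y_query) tuple.
import torch

def maml_outer_step(params, tasks, model_fn, loss_fn,
                    alpha=0.01, beta=0.001, n_inner=1):
    """One outer-loop update of the global prior theta_0 (= params)."""
    meta_grads = [torch.zeros_like(p) for p in params]
    for xs, ys, xq, yq in tasks:
        # Inner loop: adapt theta_0 to task-specific phi_i on the support set.
        phi = [p.clone() for p in params]
        for _ in range(n_inner):
            inner_loss = loss_fn(model_fn(phi, xs), ys)
            grads = torch.autograd.grad(inner_loss, phi, create_graph=True)
            phi = [p - alpha * g for p, g in zip(phi, grads)]
        # Outer objective: loss of the adapted model phi_i on the query set.
        outer_loss = loss_fn(model_fn(phi, xq), yq)
        for mg, g in zip(meta_grads, torch.autograd.grad(outer_loss, params)):
            mg.add_(g)
    # Gradient step on theta_0, averaged over the task batch.
    return [(p - beta * mg / len(tasks)).detach().requires_grad_()
            for p, mg in zip(params, meta_grads)]
```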

  3. Two Types of Meta-Overfitting
Learner overfitting: the meta-learning process overfits to the meta-training tasks and fails to generalize to the meta-testing tasks.
Memorization overfitting: the model does not rely on the support set for inner-loop adaptation, i.e., the prediction of the adapted model $f_{\theta_0, \mathcal{D}_i^s}(x^q)$ carries no information from the support set $\mathcal{D}_i^s$, so that $I(\hat{y}^q; \mathcal{D}_i^s \mid x^q, \theta_0) = 0$.

  4. To Alleviate the Meta-Overfitting: Involving More Data
Solution 1: simply combine the support and query sets for the outer-loop update.
  - Con: the support set cannot contribute to the outer-loop optimization, because the adapted parameters already fit it, $\nabla_{\theta_0}\mathcal{L}(f_{\phi_i}; \mathcal{D}_i^s) \approx 0$; little extra knowledge is introduced to train the initialization.
Solution 2: label shift [Rajendran et al. 2020]: add the same noise $\epsilon$ to the labels of both the support and query sets, $\tilde{y}_{i,j}^s = y_{i,j}^s + \epsilon$ and $\tilde{y}_{i,j}^q = y_{i,j}^q + \epsilon$ (a code sketch follows below).
  + Pro: increases the uncertainty for inference.
  - Con: learning the constant $\epsilon$ is as easy as modifying a bias, so little extra knowledge is introduced.
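A minimal sketch of the label-shift augmentation for a regression task; the noise scale `sigma` and Gaussian sampling are assumptions for illustration.

```python
# Label-shift augmentation [Rajendran et al. 2020], sketched for regression:
# the SAME shift eps is added to the support and query labels of one task,
# so the model can only recover eps by actually reading the support set.
import torch

def label_shift(ys, yq, sigma=1.0):
    eps = sigma * torch.randn(())   # one shared shift per task (assumed scale)
    return ys + eps, yq + eps
```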

  5. How to Design Augmentation Strategies?
Augmentation function: $\mathcal{A}(\cdot)$.
Criterion 1 (addressing memorization overfitting): the model should rely more heavily on the support set $\mathcal{D}_i^s$ to make predictions,
$I(\tilde{y}^q; \mathcal{D}_i^s \mid \tilde{x}^q, \theta_0) - I(\hat{y}^q; \mathcal{D}_i^s \mid x^q, \theta_0) > 0.$
Criterion 2 (addressing learner overfitting): the augmented task should contribute additional knowledge to update the initialization,
$I(\theta_0; \mathcal{A}(\mathcal{D}_i^q) \mid \mathcal{D}_i^q) > 0.$

  6. MetaMix
Mix up the support set and the query set for the outer-loop optimization.
Mixup process: sample $\lambda \sim \mathrm{Beta}(\alpha, \beta)$ and mix support/query pairs, $\tilde{x}^q = \lambda x^s + (1-\lambda)x^q$, $\tilde{y}^q = \lambda y^s + (1-\lambda)y^q$.
Outer-loop loss: the loss of the adapted model $f_{\phi_i}$ on the mixed set $(\tilde{X}^q, \tilde{Y}^q)$; a code sketch follows below.
Criterion 1 is satisfied: $I(\tilde{y}^q; \mathcal{D}_i^s \mid \tilde{x}^q, \theta_0) - I(\hat{y}^q; \mathcal{D}_i^s \mid x^q, \theta_0) > 0$.
Criterion 2 is satisfied: $I(\theta_0; \mathcal{A}(\mathcal{D}_i^q) \mid \mathcal{D}_i^q) > 0$.
In classification, MetaMix can also be enhanced by channel shuffle (see more details in the paper). MetaMix can be integrated with your favorite meta-learning algorithms.
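A minimal MetaMix sketch. Pairing support and query examples by truncation and the Beta(2, 2) default are assumptions here; for classification, labels would be one-hot vectors mixed the same way.

```python
# MetaMix sketch: build the mixed batch that replaces the query set in the
# outer-loop loss. Pairing-by-truncation and Beta(2, 2) are assumptions.
import torch

def metamix(xs, ys, xq, yq, alpha=2.0, beta=2.0):
    lam = torch.distributions.Beta(alpha, beta).sample()
    n = min(len(xs), len(xq))                     # pair support/query examples
    x_mix = lam * xs[:n] + (1.0 - lam) * xq[:n]
    y_mix = lam * ys[:n] + (1.0 - lam) * yq[:n]   # one-hot in classification
    return x_mix, y_mix

# Outer-loop usage (cf. maml_outer_step above): after inner-loop adaptation,
#   x_mix, y_mix = metamix(xs, ys, xq, yq)
#   outer_loss = loss_fn(model_fn(phi, x_mix), y_mix)
```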

  7. Empirical Comparison
Applications: 1) drug activity prediction (regression, mean $R^2$); 2) pose prediction (regression, MSE); 3) image classification (miniImagenet, accuracy).

Model                 | Drug (Mean $R^2$) | Pose Regression (MSE) | miniImagenet (Acc.)
None                  | 0.299             | 2.413                 | 51.95%
Weight Decay*         | 0.307             | 2.307                 | 52.27%
Meta-Dropout*         | 0.319             | 2.425                 | 52.40%
Meta-Regularization*  | 0.297             | 2.276                 | 54.39%
TAML*                 | 0.296             | 2.196                 | 52.78%
Label Shift Aug       | 0.317             | 2.152                 | - / 63.11%
MetaMix (Ours)        | 0.352             | 2.003                 | 58.57% / 64.91%

(* methods that passively impose regularization)

  8. Analysis
Compatibility with different backbones:

Model              | Drug (Mean $R^2$) | Pose Regression (MSE) | miniImagenet (Acc.)
MAML/ANIL          | 0.299             | 2.413                 | 51.95%
MAML/ANIL-MetaMix  | 0.347             | 2.003                 | 57.55%
MetaSGD            | 0.331             | 2.331                 | 52.14%
MetaSGD-MetaMix    | 0.364             | 1.952                 | 56.66%
T-Net              | 0.323             | 2.609                 | 54.04%
T-Net-MetaMix      | 0.352             | 2.418                 | 58.57%

Overfitting: a larger gap between the pre-update and post-update testing models indicates less memorization overfitting; MetaMix also shows less learner overfitting. (Figure: pre-update vs. post-update test performance, ANIL vs. ANIL-MetaMix.)
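The pre-/post-update gap from this slide can be measured directly. A hedged sketch, where `adapt_fn` (the inner-loop adaptation routine) and the task format are assumptions:

```python
# Memorization diagnostic: mean drop in query loss from the pre-update model
# (theta_0) to the post-update model (phi_i). A near-zero gap means the model
# is ignoring the support set, i.e., memorization overfitting.
import torch

def adaptation_gap(params, tasks, model_fn, loss_fn, adapt_fn):
    gaps = []
    for xs, ys, xq, yq in tasks:
        with torch.no_grad():
            pre = loss_fn(model_fn(params, xq), yq).item()
        phi = adapt_fn(params, xs, ys)          # inner loop (needs autograd)
        with torch.no_grad():
            post = loss_fn(model_fn(phi, xq), yq).item()
        gaps.append(pre - post)
    return sum(gaps) / len(gaps)
```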

  9. Does MetaMix lead to better generalization?
Theoretical analysis (regression setting; see the paper for the precise statement). Denote by $W_i$ the adapted model for the $i$-th task. MetaMix implicitly imposes a quadratic regularization on $W_i$. For all $W_i$ in the considered hypothesis class, with probability at least $1 - \delta$, the generalization gap is upper-bounded by a sum of four terms $c_1 + c_2 + c_3 + c_4$ that scale with $\log n$ and $\log(1/\delta)$. The MetaMix algorithm regularizes the adapted parameters and keeps this bound small, hence better generalization.

  10. Thanks. Q & A
