Yoshua Bengio: Meta-learning (NeurIPS 2019) | Summary and Q&A
TL;DR
Meta learning uses multiple time scales of learning and optimization to explicitly optimize for generalization, particularly in out-of-distribution scenarios.
Key Insights
- Meta learning has been around for several decades and involves explicitly optimizing for generalization, including out-of-distribution generalization.
- Changes in distribution can be caused by interventions on variables or mechanisms, and the right decomposition of knowledge allows for efficient adaptation.
- Meta learning can be applied in various contexts, such as disentangling causal mechanisms and learning causal models from known interventions.
Transcript
okay so next I want to talk about the meta learning aspect and another hypothesis that's important to deal with how the world changes so meta learning is something really hot and cool these days but it actually started several decades ago my brother and I have been working on this in the early 90s and it actually was Samy's PhD subject and what...
Questions & Answers
Q: What is meta learning and how does it differ from normal learning?
Meta learning involves having multiple time scales of learning, with an inner loop for normal learning and an outer loop for optimization. It differs from normal learning by explicitly optimizing for generalization and out-of-distribution scenarios.
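The two time scales can be made concrete with a toy sketch. The code below is a hypothetical, first-order MAML-style illustration (not from the talk): an inner loop takes one gradient step to adapt to a sampled task, and an outer loop slowly moves a meta-parameter toward initializations that adapt well.

```python
import numpy as np

# Hypothetical toy: meta-learning on 1-D linear regression tasks
# y = w_task * x, where each task is a different draw of w_task.
rng = np.random.default_rng(0)

def task_grad(w, x, y):
    # Gradient of mean squared error for y_hat = w * x.
    return np.mean(2 * (w * x - y) * x)

meta_w = 0.0                    # slow, outer-loop meta-parameter
inner_lr, outer_lr = 0.1, 0.05

for step in range(500):         # outer loop: slow time scale
    w_task = rng.uniform(1.0, 3.0)      # sample a task (a distribution)
    x = rng.normal(size=20)
    y = w_task * x
    # Inner loop: fast adaptation, one gradient step from the meta-init.
    w = meta_w - inner_lr * task_grad(meta_w, x, y)
    # First-order meta-update: use the post-adaptation gradient as if it
    # were the gradient with respect to meta_w (first-order MAML).
    meta_w -= outer_lr * task_grad(w, x, y)

# meta_w should end up near the middle of the task range (~2.0): the
# initialization from which a single inner-loop step adapts best on average.
print(round(meta_w, 2))
```

The key point is the separation of roles: the inner loop does ordinary learning on one task, while the outer loop's objective is the loss *after* adaptation, so the slow parameters are explicitly optimized for fast generalization to new tasks.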
Q: How can meta learning be used to optimize for out-of-distribution generalization?
By training slow-time-scale meta-parameters, agents can learn to generalize well to new environments and adapt quickly to changes in distribution. This leads to better performance in out-of-distribution scenarios.
Q: What hypothesis can be made about changes in distribution?
Underlying physics suggests that changes in distribution are caused by interventions on specific variables or mechanisms. If the mechanisms relating variables are assumed to be independent, an intervention changes only a few of them, so only a few adaptations are needed to account for a change.

Q: What is the advantage of using the right decomposition of knowledge?
With the right decomposition of knowledge, only a few bits need to change to account for a change in distribution. This leads to efficient adaptation and inference, requiring fewer observations.
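A hypothetical numerical illustration of this point (the variable names and setup are my own, not from the talk): if the true mechanism is A → B, then an intervention that changes p(A) leaves p(B|A) intact, so the decomposition p(A)p(B|A) needs to re-estimate only one small module, while the anti-causal decomposition p(B)p(A|B) has to re-fit everything.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample(p_a, n):
    # Binary cause A with a fixed mechanism p(B=1 | A).
    a = rng.random(n) < p_a
    b = rng.random(n) < np.where(a, 0.9, 0.2)
    return a.astype(int), b.astype(int)

def fit(a, b):
    # Estimate the parameters of both candidate decompositions.
    p_a = a.mean()
    p_b = b.mean()
    p_b_given_a = [b[a == v].mean() for v in (0, 1)]
    p_a_given_b = [a[b == v].mean() for v in (0, 1)]
    return p_a, p_b, p_b_given_a, p_a_given_b

a0, b0 = sample(0.3, 100_000)   # training distribution
a1, b1 = sample(0.8, 100_000)   # after an intervention on A only
p_a0, p_b0, p_bga0, p_agb0 = fit(a0, b0)
p_a1, p_b1, p_bga1, p_agb1 = fit(a1, b1)

# Right decomposition: p(B|A) barely moves, only p(A) must be re-adapted.
print(np.round(np.abs(np.array(p_bga1) - np.array(p_bga0)), 3))
# Wrong decomposition: the marginal p(B) (and p(A|B)) shift substantially.
print(round(abs(p_b1 - p_b0), 3))
```

Because only one parameter actually changed, the causally correct decomposition can recover the new distribution from far fewer post-intervention samples, which is exactly the "few bits, few observations" advantage described above.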
Summary & Key Takeaways
- Meta learning involves multiple time scales of learning, such as an inner loop for normal learning and an outer loop (for example, evolution) that optimizes the inner loop's results.
- By utilizing meta learning, agents can learn to generalize well to new environments and optimize for out-of-distribution scenarios.
- Changes in distribution can be caused by interventions on variables or mechanisms, and with the right decomposition of knowledge only a few observations are needed to adapt to or infer these changes.