Meta Learning Shared Hierarchies | Two Minute Papers #210 | Summary and Q&A

19.5K views

November 29, 2017

TL;DR

Reinforcement learning from scratch can be inefficient and tends to generalize poorly, but learning reusable sub-policies can make training more efficient and lets knowledge transfer to new tasks.

Key Insights

🥺 Reinforcement learning often starts out close to a brute-force search over actions, which produces erratic early behavior and makes learning inefficient.
👶 Sub-policies, which break a task into sequences of smaller actions, can make learning more efficient and can transfer to new tasks.
👨‍🔬 Learning algorithms that can generalize across different tasks are a major goal in AI research.
🛀 Neural Task Programming is one such technique that shows promise in generalization.
😲 Training ants to traverse different mazes showcases the potential of sub-policies and generalization.
👋 Creating a good selection of sub-policies is challenging but crucial for their effectiveness.
👾 Choosing among a handful of sub-policies is a much smaller search problem than choosing among all possible sequences of low-level actions, which makes learning more efficient.
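The search-space point can be made concrete with a little arithmetic. The sketch below is an illustration, not a result from the video: the number of actions, the episode length, the number of sub-policies, and the assumption that a high-level controller picks a sub-policy once every few steps are all hypothetical values chosen for the example. It also counts only the controller's choices, on the assumption that the sub-policies themselves have already been learned.

```python
def count_primitive_sequences(num_actions: int, horizon: int) -> int:
    """Number of distinct sequences when one low-level action is picked per step."""
    return num_actions ** horizon


def count_master_sequences(num_sub_policies: int, horizon: int, steps_per_choice: int) -> int:
    """Number of distinct sequences when one sub-policy is picked every few steps."""
    num_choices = -(-horizon // steps_per_choice)  # ceiling division
    return num_sub_policies ** num_choices


# Hypothetical numbers: 4 low-level actions, 20-step episodes,
# 3 sub-policies, one high-level choice every 5 steps.
primitive = count_primitive_sequences(num_actions=4, horizon=20)
master = count_master_sequences(num_sub_policies=3, horizon=20, steps_per_choice=5)
print(primitive, master)
```

With these made-up numbers the low-level count is 4^20, while the high-level count is 3^4, which is the sense in which the sub-policy search space is far smaller.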

Transcript

Dear Fellow Scholars, this is Two Minute Papers with Károly Zsolnai-Fehér. Reinforcement learning is a technique where we have a virtual creature that tries to learn an optimal set of actions to maximize a reward in a changing environment. Playing video games, helicopter control, and even optimizing light transport simulations are among the more aw...

Questions & Answers

Q: What is reinforcement learning and how does it work?

Reinforcement learning is a technique where a virtual creature learns a set of actions that maximizes a reward in a changing environment. Early on, its behavior is close to a brute-force search over actions; it then gradually improves its choices based on the reward feedback it receives.
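The try-then-improve loop described above can be sketched on a toy problem. This is a deliberately simplified stand-in, not the method from the video: it assumes a single-state problem with three actions whose rewards are fixed and hypothetical, and it uses optimistic initial estimates with greedy selection so that every action gets tried once before the agent settles on the best one.

```python
from typing import List


def run_greedy_agent(rewards: List[float], num_steps: int,
                     optimistic_start: float = 1.0) -> List[int]:
    """Return how often each action was chosen.

    Assumes every reward is below optimistic_start, so each untried
    action looks attractive until it has been tried at least once.
    """
    estimates = [optimistic_start] * len(rewards)
    pulls = [0] * len(rewards)
    for _ in range(num_steps):
        # Pick the action with the highest current estimate (ties: lowest index).
        action = max(range(len(rewards)), key=lambda a: estimates[a])
        reward = rewards[action]  # feedback from the (toy) environment
        pulls[action] += 1
        # Move the running-average estimate toward the observed reward.
        estimates[action] += (reward - estimates[action]) / pulls[action]
    return pulls


# Hypothetical rewards for three actions; action 1 is the best one.
pulls = run_greedy_agent([0.1, 0.9, 0.5], num_steps=10)
print(pulls)
```

In this toy run the first three steps sample every action once, and all remaining steps go to the action with the highest reward.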

Q: Why is reinforcement learning often inefficient?

Learning from scratch requires a large amount of experience, and early in training the agent's behavior is often erratic. A standard agent also cannot reuse knowledge it has already acquired when it faces a similar task.

Q: How are sub-policies used in reinforcement learning?

Sub-policies break down complex tasks into sequences of smaller actions. These sub-policies can be shared between tasks, allowing for efficient learning and transferability to new, unseen tasks.
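A minimal sketch of this idea, under stated assumptions: the grid world, the two hand-written sub-policies (`go_right`, `go_up`), and the rule-based task-specific controller (`master_to_goal`) are all hypothetical stand-ins. In the setting the video describes, both levels would be learned rather than hard-coded. The point of the sketch is only the structure: the sub-policies are shared across tasks, and only the controller changes.

```python
from typing import Callable, List, Tuple

State = Tuple[int, int]
Move = Tuple[int, int]
SubPolicy = Callable[[State], Move]
Master = Callable[[State], int]


def go_right(state: State) -> Move:
    return (1, 0)


def go_up(state: State) -> Move:
    return (0, 1)


def master_to_goal(goal: State) -> Master:
    """Task-specific controller: returns the index of the sub-policy to run."""
    def master(state: State) -> int:
        return 0 if state[0] < goal[0] else 1
    return master


def run_hierarchy(master: Master, sub_policies: List[SubPolicy], start: State,
                  horizon: int, steps_per_choice: int) -> State:
    """Run for `horizon` steps, letting the master re-pick every few steps."""
    state, active = start, 0
    for t in range(horizon):
        if t % steps_per_choice == 0:
            active = master(state)
        dx, dy = sub_policies[active](state)
        state = (state[0] + dx, state[1] + dy)
    return state


shared = [go_right, go_up]  # the same sub-policies are reused for both tasks
task_a = run_hierarchy(master_to_goal((3, 2)), shared, (0, 0), horizon=5, steps_per_choice=1)
task_b = run_hierarchy(master_to_goal((1, 4)), shared, (0, 0), horizon=5, steps_per_choice=1)
print(task_a, task_b)
```

Both toy tasks are solved with the same two sub-policies; only the small controller differs, which is where the efficiency and transfer are meant to come from.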

Q: What are the challenges in creating sub-policies?

Sub-policies need to be robust enough to be helpful in many possible tasks but not too specific to one problem. Finding the right balance of generality and usefulness is challenging.

Summary & Key Takeaways

Reinforcement learning from scratch starts out close to a brute-force search, which makes early behavior erratic and learning inefficient.
Knowledge acquired while training on one task typically cannot be reused for similar tasks.
Sub-policies, which break down tasks into smaller actions, can be shared between tasks and lead to more efficient learning and transferability to new tasks.