How Can AI Learn Without a Reward Function?

December 13, 2019, by Robert Miles AI Safety

TL;DR

AI can learn tasks without a predefined reward function by using reward modeling. This method trains a model on human feedback to predict which behaviours people want, which speeds up learning considerably. With this approach, machines can tackle complex tasks previously thought to be out of reach for AI.

Transcript

Hi. What is technology? Don't skip ahead, I promise I'm going somewhere with this. So you could have some kind of definition from a dictionary that's like "technology is machinery and equipment made using scientific knowledge", something like that. But where are the boundaries of the category? What counts? For example, is a pair of scissors technology? I think most p...

Key Insights

  • ⚖️ Defining technology is hard: the category tends to cover things that are complex, unpredictable, and built on scientific knowledge, while simple, well-understood tools tend to fall outside it.
  • 🎰 AI is about getting machines to perform cognitive tasks previously thought to require a human; as machines can do more, the definition of AI shifts.
  • 🎰 Machine learning, particularly through reward modeling, expands the range of tasks that machines can handle.
  • 🚂 Reward modeling trains a system on human feedback, so the system learns a reward function without needing explicit demonstrations.
  • 🎰 Reward modeling has been used successfully to get machines to perform tasks for which traditional programming approaches are unsuitable.
  • 🎁 As machines become capable of more complex tasks, programming them becomes more challenging and raises potential safety issues.
  • 😒 Using a neural network as the reward model can offer some protection against reward gaming and lets the system keep improving as more feedback arrives.
  • 👍 Reward modeling has proven effective for training agents to perform tasks without a predefined reward function.
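
The idea of learning a reward function from human feedback can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method shown in the video: it assumes the feedback arrives as pairwise comparisons, uses a linear model in place of a neural network, and replaces the person with a hypothetical `simulated_human_prefers` function. All function names and the two-number "clip" features are invented for the example.

```python
import math
import random

def predict_reward(weights, features):
    """Linear reward model r(x) = w . x (a stand-in for a neural network)."""
    return sum(w * f for w, f in zip(weights, features))

def train_reward_model(comparisons, n_features, lr=0.1, epochs=200):
    """Fit weights so that preferred items receive a higher predicted reward.

    comparisons: list of (preferred_features, rejected_features) pairs.
    Minimises the logistic (Bradley-Terry) loss -log sigmoid(r_pref - r_rej).
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        for preferred, rejected in comparisons:
            diff = predict_reward(weights, preferred) - predict_reward(weights, rejected)
            p_correct = 1.0 / (1.0 + math.exp(-diff))
            # Loss gradient is -(1 - p) * (preferred - rejected); step against it.
            for i in range(n_features):
                weights[i] += lr * (1.0 - p_correct) * (preferred[i] - rejected[i])
    return weights

def simulated_human_prefers(a, b):
    """Hypothetical stand-in for a person: likes feature 0, dislikes feature 1."""
    return (a[0] - a[1]) > (b[0] - b[1])

random.seed(0)
# Each "clip" of behaviour is summarised by two made-up features.
clips = [[random.random(), random.random()] for _ in range(40)]

# Collect feedback as comparisons; no numeric reward is ever written down.
comparisons = []
for _ in range(100):
    a, b = random.sample(clips, 2)
    comparisons.append((a, b) if simulated_human_prefers(a, b) else (b, a))

weights = train_reward_model(comparisons, n_features=2)
print("learned weights:", weights)
```

The learned weights can then score behaviour the simulated human never compared, for example `predict_reward(weights, [0.9, 0.1])` against `predict_reward(weights, [0.1, 0.9])`. The point of the sketch is that the reward function is inferred from judgments rather than programmed directly.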

Questions & Answers

Q: What is the difference between technology and simple tools?

The distinction lies in the complexity and unpredictability of technology compared to simple tools. Technology often involves scientific knowledge and is characterized by intricate mechanisms and systems.

Q: How do the definitions of technology and AI evolve over time?

As our understanding and capabilities increase, what counts as technology or AI shifts. Once we fully comprehend and master a specific technology or task, it tends to drop out of the category.

Q: Is a calculator considered artificial intelligence?

Arithmetic is a cognitive task, yet most people wouldn't call a calculator AI. A task tends to count as AI while it still seems to require human intelligence; once machines can do it reliably, it is no longer perceived as AI.

Q: What is the goal of AI research?

The goal is to continuously expand the range of tasks that computers can handle and to push the boundaries of what is considered AI. This involves making machines perform cognitive tasks that were previously thought to be beyond their capabilities.

Summary & Key Takeaways

  • Technology is characterized by complexity and unpredictability, distinguishing it from simpler and well-understood tools.

  • AI involves making machines perform tasks that were previously considered human cognitive tasks, and as machines become capable of performing these tasks, the boundaries of AI continue to expand.

  • Traditional programming struggles with complex tasks, but machine learning, through reward modeling, makes it possible to train systems on tasks for which traditional programming is unsuitable.
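
To make the contrast with traditional programming concrete, here is a hedged sketch of the overall loop: an agent acts, a person compares two behaviours, the reward model is updated, and the agent then favours whatever the model currently rates highest. Everything here is an assumption made for illustration: the three behaviours, the `HIDDEN_HUMAN_SCORE` table that simulates a person's taste, and the tabular reward model. The agent never reads the hidden scores; it only sees the outcomes of comparisons.

```python
import math
import random

random.seed(1)

ACTIONS = ["spin", "flail", "backflip"]  # hypothetical behaviours
# Simulates what a person happens to like; the agent never reads this table.
HIDDEN_HUMAN_SCORE = {"spin": 0.2, "flail": 0.0, "backflip": 1.0}

learned_reward = {a: 0.0 for a in ACTIONS}  # tabular reward model

def human_picks_better(a, b):
    """Simulated person comparing two behaviours; returns the preferred one."""
    return a if HIDDEN_HUMAN_SCORE[a] >= HIDDEN_HUMAN_SCORE[b] else b

def update_reward_model(winner, loser, lr=0.5):
    """One logistic (Bradley-Terry) step pushing the winner's reward above the loser's."""
    p_winner = 1.0 / (1.0 + math.exp(learned_reward[loser] - learned_reward[winner]))
    learned_reward[winner] += lr * (1.0 - p_winner)
    learned_reward[loser] -= lr * (1.0 - p_winner)

def agent_choose(epsilon=0.2):
    """Agent maximises the learned reward, with a little random exploration."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=learned_reward.get)

for step in range(200):
    a = agent_choose()                                  # agent acts
    b = random.choice([x for x in ACTIONS if x != a])   # a second behaviour to compare
    winner = human_picks_better(a, b)                   # person gives feedback
    loser = b if winner == a else a
    update_reward_model(winner, loser)                  # reward model learns

best = max(ACTIONS, key=learned_reward.get)
print(best, learned_reward)
```

Because the reward model keeps training while the agent acts, new comparisons can correct the model wherever the agent's current behaviour exposes a mistake in it. That design choice is an assumption of this sketch rather than a detail confirmed by the summary.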


Explore More Summaries from Robert Miles AI Safety 📚

  • AI Safety Career Advice! (And So Can You!)
  • Sharing the Benefits of AI: The Windfall Clause
  • What Is Objective Robustness in AI Alignment?
  • Quantilizers: AI That Doesn't Try Too Hard
  • Intro to AI Safety, Remastered
  • Is AI Safety a Pascal's Mugging?
