What is AI "reward hacking"—and why do we worry about it?

35.6K views • November 21, 2025

Transcript

The core interesting part of the story is not that the model learns to hack, 'cause we already knew that there were these cheats available in these environments. The core part is detecting, "Okay, like, is there more to this now?" We realized that these models were evil. And how we realized they're evil? Well, we had to find some way of measuring...
