5.4.7 R5. Predictive Coding - Video 6: Evaluating the Model

December 13, 2018, by MIT OpenCourseWare

TL;DR

The CART model's accuracy on the test set is 85.6%, showing a slight improvement compared to the baseline model's accuracy of 83.7%.

Transcript

Now that we've trained a model, we need to evaluate it on the test set. So let's build an object called pred that has the predicted probabilities for each class from our CART model. So we'll use predict of emailCART, our CART model, passing it newdata=test, to get test set predicted probabilities. So to recall the structure of pred, we can look at ...

Key Insights

  • 🙂 The CART model shows a slight improvement in accuracy compared to the baseline model.
  • 📜 False negatives are considered more costly in document retrieval applications.
  • ❎ Adjusting the cutoff on the ROC curve can help optimize the trade-off between false positives and false negatives.
  • 🧑‍🦽 Manual review is still required for all predicted responsive documents.
  • 🥺 On unbalanced data sets, a model often improves only slightly on the baseline's accuracy.
  • ⚾ The accuracy of a model should be evaluated based on the specific costs and consequences of false positives and false negatives.
  • 😥 The performance of the baseline model provides a reference point for evaluating the effectiveness of more complex models.

Questions & Answers

Q: How is the accuracy of the CART model on the test set calculated?

The accuracy is calculated by comparing the true outcomes with the outcomes predicted at a cutoff of 0.5. The number of correctly predicted documents (responsive and non-responsive together) is divided by the total number of documents in the test set.
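
The calculation described above can be sketched in a few lines. This is a minimal Python illustration, not the R code used in the video: the function names, the toy labels, and the toy probabilities are all hypothetical, and treating a probability of exactly 0.5 as "responsive" is an assumption.

```python
def confusion_counts(actual, probs, cutoff=0.5):
    """Return (tn, fp, fn, tp) for 0/1 labels and predicted probabilities."""
    tn = fp = fn = tp = 0
    for y, p in zip(actual, probs):
        predicted = 1 if p >= cutoff else 0  # assumption: ties count as responsive
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def accuracy(tn, fp, fn, tp):
    """Correct predictions divided by all test-set documents."""
    return (tn + tp) / (tn + fp + fn + tp)


# Hypothetical toy data: 1 = responsive, 0 = non-responsive.
actual = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
probs = [0.1, 0.2, 0.7, 0.3, 0.05, 0.4, 0.9, 0.35, 0.6, 0.15]

tn, fp, fn, tp = confusion_counts(actual, probs)
print(tn, fp, fn, tp, accuracy(tn, fp, fn, tp))
```

On this toy data, eight of the ten documents land on the diagonal of the confusion matrix, so the accuracy is 0.8.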

Q: How does the accuracy of the CART model compare to the accuracy of the baseline model?

The CART model has an accuracy of 85.6% on the test set, while the baseline model has an accuracy of 83.7%. This indicates a small improvement in accuracy using the CART model.
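
To see how such a comparison works, the sketch below computes both accuracies from one confusion matrix. The counts are illustrative assumptions, not figures given in this summary; they were chosen only so that the two ratios round to the quoted 85.6% and 83.7%. The baseline predicts every document as non-responsive, so it is correct exactly on the truly non-responsive documents.

```python
# Illustrative confusion-matrix counts (assumed, not from the summary).
tn, fp, fn, tp = 195, 20, 17, 25
total = tn + fp + fn + tp

# Model accuracy: correct predictions over all documents.
model_acc = (tn + tp) / total

# Baseline accuracy: every document is labeled non-responsive, so the
# correct ones are all truly non-responsive documents (tn + fp).
baseline_acc = (tn + fp) / total

print(f"model:    {model_acc:.3f}")
print(f"baseline: {baseline_acc:.3f}")
```

Because most documents are non-responsive, the baseline already scores high, which leaves little room for any model to improve on accuracy alone.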

Q: What are the consequences of false positives and false negatives in document retrieval applications?

False positives, where a non-responsive document is labeled as responsive, require additional work in the manual review process but do not cause further harm. False negatives, where a responsive document is labeled as non-responsive, result in the document being missed entirely in the predictive coding process, which is more costly.
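
One way to make "more costly" concrete is to weight the two error types. The sketch below is an assumption for illustration only: the function name `review_cost`, the error counts, and the weights (a false negative counted as five times a false positive) are hypothetical and do not come from the video.

```python
def review_cost(fp, fn, fp_cost=1.0, fn_cost=5.0):
    """Weighted error cost; the default weights are hypothetical."""
    return fp * fp_cost + fn * fn_cost


# Two hypothetical outcomes for the same test set:
strict = review_cost(fp=1, fn=1)   # fewer documents flagged, one missed
lenient = review_cost(fp=3, fn=0)  # more documents flagged, none missed

print(strict, lenient)
```

Under these assumed weights, the outcome with more false positives is still cheaper, because it misses no responsive document.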

Q: Why is it important to experiment with different cutoffs on the ROC curve?

Different cutoffs on the ROC curve can help find the optimal balance between false positives and false negatives. By adjusting the cutoff, the trade-off between incorrectly labeling non-responsive documents as responsive (false positives) and missing responsive documents (false negatives) can be optimized.
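
Each cutoff corresponds to one point on the ROC curve: a true positive rate paired with a false positive rate. The following Python sketch, using hypothetical toy labels and probabilities and an assumed helper named `rates_at_cutoff`, shows how those two rates move as the cutoff is lowered.

```python
def rates_at_cutoff(actual, probs, cutoff):
    """Return (true positive rate, false positive rate) at a cutoff."""
    pairs = list(zip(actual, probs))
    tp = sum(1 for y, p in pairs if y == 1 and p >= cutoff)
    fn = sum(1 for y, p in pairs if y == 1 and p < cutoff)
    fp = sum(1 for y, p in pairs if y == 0 and p >= cutoff)
    tn = sum(1 for y, p in pairs if y == 0 and p < cutoff)
    return tp / (tp + fn), fp / (fp + tn)


# Hypothetical toy data: 1 = responsive, 0 = non-responsive.
actual = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
probs = [0.1, 0.2, 0.7, 0.3, 0.05, 0.4, 0.9, 0.35, 0.6, 0.15]

for cutoff in (0.7, 0.5, 0.3):
    tpr, fpr = rates_at_cutoff(actual, probs, cutoff)
    print(f"cutoff={cutoff}: TPR={tpr:.2f}, FPR={fpr:.2f}")
```

On this toy data, lowering the cutoff from 0.5 to 0.3 catches every responsive document, at the price of flagging more non-responsive ones for manual review.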

Summary & Key Takeaways

  • An object called "pred" is built to hold the predicted probabilities for each class from the CART model on the test set.

  • The accuracy of the CART model on the test set is calculated using a cutoff of 0.5.

  • The accuracy of the baseline model, which predicts all documents as non-responsive, is compared to the CART model's accuracy.

