
How to Turn YouTube Videos into Study Notes and Flashcards

YouTube has become one of the largest learning platforms on earth, but watching a video and actually retaining the information are two very different things. This guide gives you a concrete, step-by-step workflow for turning any YouTube video into structured notes and flashcards you can review for months.

Key Takeaways
  • Watching is not learning: Research on multimedia learning (Mayer, 2009) shows that passive video consumption leads to poor retention unless paired with active note-taking and retrieval practice.
  • Transcripts are the bridge between video and notes: Pulling the transcript from a YouTube video gives you a text version you can highlight, annotate, and reorganize into your own words.
  • AI summarizers can cut your processing time by more than half: Tools like Glasp's YouTube Summary generate structured summaries in seconds, giving you a starting point for your own notes instead of forcing you to transcribe everything manually.
  • The Cornell Method and outline method work best for video notes: These two note-taking frameworks map naturally onto the structure of educational YouTube content, which tends to follow a "concept, explanation, example" pattern.
  • Flashcards built from your own notes outperform pre-made decks: Karpicke & Blunt (2011) demonstrated that retrieval practice produces stronger long-term retention than elaborative study with concept mapping, and the generation effect (Slamecka & Graf, 1978) suggests that material you produce yourself is remembered better than material you only read.
  • Combining summarization with spaced repetition creates a complete learning system: When you summarize a video into notes, convert those notes into flashcards, and review them on a spaced schedule, you engage both of the strategies that Dunlosky et al. (2013) rated as high utility: practice testing and distributed practice.

Why Video Learning Needs a Note-Taking System

YouTube hosts over 800 million videos, and a growing share of them are educational. University lectures, coding tutorials, language lessons, science explainers, and professional development content now reach millions of learners who never set foot in a classroom. The platform has democratized access to knowledge at an unprecedented scale.

But access is not the same as learning. A 2020 study by Plass et al. found that students who watched educational videos without any follow-up activity retained only 20-30% of the material after one week. Compare that to students who took notes during or after the video: their retention jumped to 50-65%.

The problem is not the video format itself. The problem is that most people treat YouTube like television. They press play, watch to the end, and move on. No notes. No review. No retrieval. The information enters working memory, hangs around for a few hours, and quietly disappears.

A note-taking system fixes this by giving you three things: a reason to pay attention (you need to capture key ideas), a tangible artifact to review later (your notes), and a foundation for active recall (flashcards and self-testing). Without all three, educational YouTube is just entertainment with a conscience.

If you want a deeper look at the research behind learning from video content, see our guide on how to learn from YouTube. This article focuses on the practical workflow: what to do, step by step, and which tools make it faster.


The Science: Why Converting Video to Text Notes Works

The research supporting this workflow comes from two major frameworks in cognitive psychology.

Mayer's Multimedia Learning Theory (2009) explains why videos can be powerful learning tools, but also why they fail without active processing. According to Mayer, humans process information through two channels: visual/pictorial and auditory/verbal. Video engages both channels simultaneously, which can boost learning, but only if the learner actively integrates the two streams into a coherent mental model. Passive viewing often leads to cognitive overload, where information arrives too fast to be properly encoded.

Taking notes forces integration. When you pause a video to write down a concept in your own words, you are performing what Mayer calls "generative processing." You are selecting the important information, organizing it into a structure, and integrating it with what you already know. This is the mechanism that converts watching into learning.

Dunlosky et al. (2013) evaluated ten common study strategies across hundreds of experiments and rated them by effectiveness:

| Strategy | Utility Rating | Relevance to Video Notes |
| --- | --- | --- |
| Practice testing (active recall) | High | Flashcards from your notes |
| Distributed practice (spaced repetition) | High | Reviewing flashcards over time |
| Elaborative interrogation | Moderate | Asking "why?" while watching |
| Self-explanation | Moderate | Writing notes in your own words |
| Summarization | Low | Writing summaries of video sections |
| Highlighting | Low | Marking transcript passages |
| Re-reading / re-watching | Low | Watching the video again |

Notice that re-watching a video (the default behavior for most learners) ranks at the very bottom. The workflow in this guide moves you from the bottom of the table to the top: from passive re-watching, through summarization and highlighting, all the way to practice testing with flashcards.

Karpicke & Blunt (2011) added a critical finding. Students who practiced retrieving what they had studied outperformed students who used elaborative study with concept mapping on a later test. Pair that with the generation effect (Slamecka & Graf, 1978), which suggests that material you produce yourself is remembered better than material you only read, and you have the case for creating your own flashcards from your own notes rather than downloading someone else's Anki deck.


Step 1: Get the Transcript

Before you can turn a video into notes, you need a text version of what was said. There are three ways to get this.

Option A: YouTube's Built-In Transcript

YouTube auto-generates transcripts for most videos. Click the three dots below the video (or expand the video description, depending on YouTube's current layout), select "Show transcript," and the text appears in a sidebar with timestamps. You can copy and paste it into your note-taking app. The downside: auto-generated transcripts have no paragraph breaks, inconsistent punctuation, and frequent errors with technical terms or accented speech.
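
If you paste the transcript into a text file, a short script can do the first round of cleanup. The following Python sketch assumes the pasted text alternates timestamp-only lines (such as 0:05 or 1:02:03) with caption lines; that layout, the function name `clean_transcript`, and the `sentences_per_paragraph` parameter are illustrative assumptions rather than anything YouTube guarantees.

```python
import re

# Matches a line that contains only a timestamp, e.g. "0:05", "12:34", "1:02:03".
TIMESTAMP = re.compile(r"^\d{1,2}:\d{2}(?::\d{2})?$")


def clean_transcript(raw: str, sentences_per_paragraph: int = 4) -> str:
    """Drop timestamp-only lines, join the captions, and add paragraph breaks."""
    lines = [line.strip() for line in raw.splitlines()]
    text = " ".join(line for line in lines if line and not TIMESTAMP.match(line))
    # Split after sentence-ending punctuation. Auto-captions often have very
    # little punctuation, in which case the result stays one long paragraph.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    paragraphs = [
        " ".join(sentences[i:i + sentences_per_paragraph])
        for i in range(0, len(sentences), sentences_per_paragraph)
    ]
    return "\n\n".join(paragraphs)


if __name__ == "__main__":
    pasted = "0:00\nwelcome back.\n0:04\ntoday we cover memory.\n0:09\nlet's start."
    print(clean_transcript(pasted, sentences_per_paragraph=2))
```

This only fixes layout. Misheard technical terms still need a manual pass, which doubles as a first read of the material.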

Option B: AI-Powered Summary Tools

This is the fastest option. Glasp's YouTube Summary generates a structured summary of any YouTube video with one click. It pulls the transcript, processes it with AI, and returns a summary organized by topic with timestamps. You get both the raw transcript and an AI-generated outline.

The advantage of starting with an AI summary is speed. Instead of reading through a 40-minute transcript (roughly 6,000 words), you start with a 500-word summary that captures the main ideas. You can then go back to the full transcript for sections that need more detail.

Option C: Manual Transcription

For short videos (under 10 minutes) or when you want maximum engagement with the material, you can transcribe key sections yourself. This is the most time-consuming option, but the act of typing what you hear forces close attention to every word. Research on the "generation effect" (Slamecka & Graf, 1978) suggests that information you produce yourself is remembered better than information you simply read.

Recommended approach: Use an AI summary tool to get the overall structure, then go back to specific sections of the transcript for detail. This balances speed with depth.


Step 2: Highlight Key Passages

Once you have the transcript (or summary), the next step is identifying what matters. Not everything in a video deserves a place in your notes. Most educational videos follow a pattern: introduction, core concept, explanation, example, tangent, recap. Your highlights should focus on core concepts and their explanations.

What to Highlight

  • Definitions and key terms: Any time the speaker introduces a new concept or vocabulary word.
  • Claims backed by evidence: Statements supported by research, data, or specific examples.
  • Frameworks and models: Any structured way of thinking about a topic (e.g., "the three types of...", "the four-step process for...").
  • Surprising or counterintuitive points: Information that challenges your existing understanding. These are the ideas most likely to be tested and most valuable to remember.
  • Practical instructions: Step-by-step directions you might want to follow later.

What Not to Highlight

  • Greetings, sponsor segments, and filler.
  • Repetitions of the same point in slightly different words.
  • Examples that illustrate a concept you already understand. (Highlight the concept, skip the example.)

If you use Glasp's web highlighter, you can highlight directly on the YouTube transcript sidebar. Your highlights are saved automatically and linked to timestamps, so you can jump back to the exact moment in the video. You can also add notes to each highlight, which becomes useful in the next step.

For more on the science behind effective highlighting, see The Science of Highlighting.


Step 3: Convert Highlights into Structured Notes

Raw highlights are not notes. They are the raw material for notes. This step is where most of the learning actually happens, because it requires you to reorganize and restate the information in your own words.

Two note-taking methods work particularly well for video content.

The Cornell Method

Divide your page (or document) into three sections:

| Section | What Goes Here | Example |
| --- | --- | --- |
| Notes column (right, wide) | Main ideas and details from the video, in your own words | "Dual coding theory: learning improves when info presented in both visual and verbal formats" |
| Cue column (left, narrow) | Questions or keywords that correspond to each note | "What is dual coding?" |
| Summary (bottom) | 2-3 sentence summary of the entire video | "Video covers three evidence-based note-taking strategies. Cornell method is best for lectures. Mind maps work better for conceptual topics." |

The cue column is the most important part. Those questions become your self-testing prompts. Cover the notes column, read a question from the cue column, and try to answer from memory. This is active recall in action.

The Outline Method

If the video has a clear linear structure (most tutorials and lectures do), an outline captures the hierarchy of ideas efficiently:

## Topic: [Video Title]

### Main Point 1: [Core concept]
- Supporting detail
- Supporting detail
  - Sub-detail or example

### Main Point 2: [Core concept]
- Supporting detail
- Key quote: "[exact words from speaker]" (timestamp)

### Main Point 3: [Core concept]
- Supporting detail
- My question: [something I want to look up later]

The outline method is faster than Cornell and works well when you plan to convert your notes into flashcards (Step 4). Each bullet point can become a flashcard.
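
To see how an outline maps onto cards, here is a minimal Python sketch that walks an outline written in the template above and returns one card stub per bullet, tagged with the main point it belongs to. The function name `outline_to_card_stubs` is hypothetical, and the parsing assumes main points start with `### ` and bullets with `- `, as in the template.

```python
def outline_to_card_stubs(outline: str) -> list[tuple[str, str]]:
    """Return (main point, bullet text) pairs, one per bullet in the outline."""
    stubs = []
    heading = ""  # most recent "### " heading seen so far
    for line in outline.splitlines():
        stripped = line.strip()
        if stripped.startswith("### "):
            heading = stripped[4:]
        elif stripped.startswith("- "):
            # Indented sub-details are treated like any other bullet.
            stubs.append((heading, stripped[2:]))
    return stubs


if __name__ == "__main__":
    sample = """## Topic: Memory techniques

### Main Point 1: Dual coding
- Uses visual and verbal channels
  - Example: diagram plus caption

### Main Point 2: Retrieval practice
- Self-testing beats re-watching
"""
    for main_point, bullet in outline_to_card_stubs(sample):
        print(f"[{main_point}] {bullet}")
```

The stubs are only the raw material. Each one still needs to be rewritten as a question and an answer in your own words.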

Pro tip: After writing your notes, close them and try to recreate the outline from memory. This single exercise, sometimes called a "brain dump," is one of the most effective study techniques available. It combines summarization with retrieval practice.


Step 4: Create Flashcards from Your Notes

Notes help you organize information. Flashcards help you remember it. The difference is in how you interact with each format. Notes are for reference; flashcards are for testing.

How to Write Effective Flashcards

Not all flashcards are equal. Research on retrieval practice suggests these principles:

One idea per card. If a card requires you to recall five facts at once, it becomes too difficult and you will either memorize the list as a meaningless sequence or avoid the card entirely. Break complex ideas into atomic pieces.

Use your own words. Copying text from the transcript verbatim defeats the purpose. Restate the idea so that the answer reflects your understanding, not the speaker's phrasing.

Ask "why" and "how," not just "what." Factual recall cards ("What is X?") have their place, but conceptual cards ("Why does X lead to Y?" or "How would you apply X to Z?") produce deeper learning.

Include context from the video. Adding a brief note about where the concept appeared ("from Dr. Smith's lecture on memory, ~12:00 mark") helps you reconstruct the full learning context during review.
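
A timestamp note can also be stored as a link that opens the video at that moment. The Python sketch below converts a timestamp such as 12:00 into seconds and appends it as the `t` query parameter that YouTube watch URLs commonly accept. The function names and the `VIDEO_ID` placeholder are illustrative assumptions, not part of any tool mentioned in this guide.

```python
def timestamp_to_seconds(timestamp: str) -> int:
    """Convert "MM:SS" or "H:MM:SS" into a total number of seconds."""
    seconds = 0
    for part in timestamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def video_link(video_id: str, timestamp: str) -> str:
    """Build a watch URL that starts playback at the given timestamp."""
    return (
        f"https://www.youtube.com/watch?v={video_id}"
        f"&t={timestamp_to_seconds(timestamp)}s"
    )


if __name__ == "__main__":
    # VIDEO_ID is a placeholder; substitute the ID from the video's own URL.
    print(video_link("VIDEO_ID", "12:00"))
```

Pasting the resulting link on the back of a card lets you replay the original explanation whenever an answer feels shaky.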

Example Flashcard Set (from a video on memory techniques)

| Front (Question) | Back (Answer) |
| --- | --- |
| What are the two channels in Mayer's multimedia learning theory? | Visual/pictorial and auditory/verbal. Learners process information through both simultaneously. |
| Why does re-watching a video produce weaker retention than self-testing? | Re-watching creates a feeling of familiarity (recognition), but does not strengthen the retrieval pathways needed for recall. Dunlosky et al. (2013) rated re-reading/re-watching as "low utility." |
| How should you modify your note-taking to support flashcard creation? | Use the Cornell Method's cue column or outline bullet points. Each cue/bullet becomes the front of a card, and the corresponding note becomes the back. |

Exporting to Spaced Repetition Apps

Once your flashcards are written, load them into a spaced repetition app so you review them at optimal intervals. The most popular options:

  • Anki (free on desktop and Android; the iOS app is paid): Import from CSV or plain text. Most flexible scheduling algorithm.
  • Quizlet (freemium): Import from spreadsheets. Better for collaborative study and shared decks.
  • RemNote (freemium): Combines note-taking and flashcard creation in one tool.

If you use Glasp, you can export your highlights in Markdown or CSV format, which makes it straightforward to convert them into flashcard imports. For a detailed guide on how spaced repetition works and how to set up your review schedule, see Spaced Repetition for Readers.
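
As a rough illustration of the import step, the Python sketch below turns a list of question and answer pairs into tab-separated text, one card per line, which is a plain-text layout that Anki's importer can read. The function name `cards_to_tsv` and the sample cards are assumptions for the example; how your highlights arrive from any export will differ.

```python
import csv
import io


def cards_to_tsv(cards: list[tuple[str, str]]) -> str:
    """Serialize (front, back) pairs as tab-separated lines, one card per line."""
    buffer = io.StringIO()
    # The csv module quotes any field that itself contains a tab or a newline.
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for front, back in cards:
        writer.writerow([front, back])
    return buffer.getvalue()


if __name__ == "__main__":
    cards = [
        ("What is dual coding?",
         "Learning improves when information is presented visually and verbally."),
        ("Why is re-watching weak for retention?",
         "It builds familiarity but does not exercise recall."),
    ]
    # Save the text to a file (for example cards.txt) and import that file.
    print(cards_to_tsv(cards), end="")
```

Keeping the cards in a plain file also means you can edit or version them before they ever reach a flashcard app.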


Workflow Comparison: Manual vs. AI-Assisted

Here is how the two approaches compare for a typical 20-minute educational YouTube video:

| Step | Manual Workflow | AI-Assisted Workflow |
| --- | --- | --- |
| Get transcript | Copy from YouTube's transcript panel, clean up formatting (10-15 min) | One-click AI summary via Glasp (30 sec) |
| Identify key points | Read full transcript, highlight manually (15-20 min) | Review AI summary, highlight key passages (5-7 min) |
| Write notes | Organize highlights into Cornell or outline format (15-20 min) | Use AI summary as skeleton, add your own notes and connections (10-12 min) |
| Create flashcards | Write each card manually from notes (10-15 min) | Use AI to draft initial cards, edit and personalize (5-8 min) |
| Total time | 50-70 min | 20-28 min |
| Learning quality | High (deep processing throughout) | High (if you actively edit and personalize AI output) |

The manual workflow takes roughly 2.5 times as long as the AI-assisted one (50-70 minutes versus 20-28). The AI-assisted workflow is faster, but only if you actively engage with the output. Simply accepting an AI summary without reading, editing, or questioning it produces the same shallow processing as passive re-watching. The AI handles the mechanical work (transcription, initial organization). You handle the cognitive work (evaluation, connection-making, self-testing).
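
The totals in the comparison table can be checked with a few lines of arithmetic. This Python sketch sums the per-step minute ranges from the table and compares the two workflows end to end; the dictionary and function names are illustrative, and the 30-second summary step is entered as 0.5 minutes.

```python
# (low, high) minutes per step, copied from the comparison table.
MANUAL = {
    "get transcript": (10, 15),
    "identify key points": (15, 20),
    "write notes": (15, 20),
    "create flashcards": (10, 15),
}
AI_ASSISTED = {
    "get transcript": (0.5, 0.5),  # 30 seconds
    "identify key points": (5, 7),
    "write notes": (10, 12),
    "create flashcards": (5, 8),
}


def total(steps: dict[str, tuple[float, float]]) -> tuple[float, float]:
    """Sum the low ends and the high ends of every step's time range."""
    low = sum(lo for lo, _ in steps.values())
    high = sum(hi for _, hi in steps.values())
    return low, high


if __name__ == "__main__":
    manual_low, manual_high = total(MANUAL)
    ai_low, ai_high = total(AI_ASSISTED)
    print("manual:", manual_low, "to", manual_high, "min")
    print("AI-assisted:", ai_low, "to", ai_high, "min")
    # Compare like with like: fastest against fastest, slowest against slowest.
    print("ratio (low ends):", round(manual_low / ai_low, 1))
    print("ratio (high ends):", round(manual_high / ai_high, 1))
```

Comparing low end with low end and high end with high end gives a ratio of about 2.4 to 2.5. Mixing the extremes (slowest manual run against fastest AI-assisted run) widens the range, which is worth keeping in mind when estimating your own time.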

The best results come from a hybrid approach: use AI for the tedious extraction and formatting, then invest your time in the steps that actually produce learning, which are writing notes in your own words, generating questions, and testing yourself.


Advanced: Using AI Chat to Quiz Yourself on Video Content

Once you have notes and flashcards, there is one more technique worth adding to your workflow: using AI chat to simulate a tutor who quizzes you on the video content.

Glasp's AI chat lets you have a conversation about any video you have highlighted. You can ask it to quiz you, explain concepts you found confusing, or generate additional practice questions based on your highlights.

Here is how to use it effectively:

1. Ask for explanation of specific concepts. After watching a video on, say, operant conditioning, ask: "Based on this video's content, explain the difference between positive and negative reinforcement using original examples not mentioned in the video." This forces the AI to work with the video's framework while generating novel material for you to evaluate.

2. Request practice questions at different difficulty levels. Ask: "Generate five multiple-choice questions from this video, ranging from basic recall to application." Then answer them without looking at your notes. Check your answers against the transcript.

3. Use the Feynman Technique via chat. Try explaining a concept from the video in your own words in the chat. Ask the AI to identify gaps or errors in your explanation based on what the speaker actually said. This is a digital version of the Feynman Technique, and it works surprisingly well with AI chat tools.

4. Generate "what if" scenarios. Ask: "How would the speaker's argument change if [different assumption]?" This pushes you into higher-order thinking and tests whether you understood the reasoning, not just the conclusion.

The key principle: AI chat is a tool for active recall, not passive review. If you are just asking the AI to summarize things you have already summarized, you are wasting your time. Use it to test yourself, challenge your understanding, and generate new questions.


Best Tools for the YouTube-to-Notes Workflow

| Tool | Best For | Key Feature | Price |
| --- | --- | --- | --- |
| Glasp | Full workflow (transcript, highlight, summarize, export) | YouTube transcript highlighting + AI summary + export to Notion/Obsidian/Anki | Free |
| Anki | Spaced repetition flashcards | Most powerful scheduling algorithm, massive community decks | Free (iOS app paid) |
| Notion | Long-form note organization | Databases, templates, linking between notes | Freemium |
| Obsidian | Networked note-taking | Bidirectional links, graph view, local storage | Free (personal) |
| Quizlet | Quick flashcard creation and sharing | Import from spreadsheets, collaborative study modes | Freemium |
| RemNote | Combined notes and flashcards | Turn any note into a flashcard inline | Freemium |

For a comprehensive comparison of highlighting tools that support this workflow, see Best Online Highlighters Compared.

Recommended stack for most learners: Glasp (transcript + highlights + AI summary) into your note-taking app of choice (Notion or Obsidian) into Anki (flashcards). This gives you a complete pipeline from video to long-term retention with minimal friction between steps.


Frequently Asked Questions

Does this workflow work for any YouTube video, or only lectures?

It works best for educational content with a clear informational structure: lectures, tutorials, explainers, interviews with experts, and documentary-style videos. For entertainment or highly visual content (cooking demos, travel vlogs), a transcript-based approach is less useful because the value is in the visuals, not the words.

How long should I spend on notes for a 20-minute video?

Using the AI-assisted workflow, plan for 20-30 minutes total (roughly 1 to 1.5 times the video length). This includes generating the summary, highlighting, writing notes, and creating flashcards. If you are doing everything manually, expect 50-70 minutes. The investment pays for itself: you will remember the content for months instead of days.

Can I just use the AI summary as my notes without rewriting?

You can, but your retention will be significantly lower. The act of restating ideas in your own words is what drives encoding into long-term memory. Think of the AI summary as a first draft, not a final product. Read it, question it, reorganize it, and add your own connections. That processing is where the learning happens.

What is the best flashcard format for video content?

Question-and-answer cards work well for factual content. For conceptual material, use "explain" prompts ("Explain why X happens") or "compare" prompts ("Compare X and Y"). Keep each card focused on one idea. If you need more than 15 seconds to answer a card, it is too broad and should be split.

How often should I review my flashcards?

Follow a spaced repetition schedule. Review new cards the day after you create them, then again after 3 days, then 7 days, then 14 days, then 30 days. Apps like Anki automate this scheduling for you. For a detailed guide, see Spaced Repetition for Readers.
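
If you are reviewing without an app, the schedule above is easy to compute by hand or with a short script. This Python sketch assumes the intervals (1, 3, 7, 14, and 30 days) are counted from the day the card was created; that interpretation, along with the names `REVIEW_OFFSETS_DAYS` and `review_dates`, is an assumption for illustration. Apps like Anki adjust intervals based on how well you answer, which this fixed schedule does not attempt.

```python
from datetime import date, timedelta

# Days after card creation on which to review (assumed to count from creation).
REVIEW_OFFSETS_DAYS = [1, 3, 7, 14, 30]


def review_dates(created: date) -> list[date]:
    """Return the calendar dates on which a card created on `created` is due."""
    return [created + timedelta(days=offset) for offset in REVIEW_OFFSETS_DAYS]


if __name__ == "__main__":
    for due in review_dates(date(2024, 1, 1)):
        print(due.isoformat())
```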

Is it better to take notes during or after the video?

Both approaches have research support. Taking notes during the video captures more detail but can split your attention. Taking notes after (from the transcript or summary) allows you to focus fully on the video first and then process the content. The AI-assisted workflow favors the "after" approach: watch the video once for understanding, then work with the transcript.


Conclusion: Build a System, Not a Watch History

Most people use YouTube as an infinite stream of content. They watch, feel informed, and move on. A week later, they could not tell you the main points of that "life-changing" video they watched last Tuesday.

The workflow in this guide turns that pattern on its head. By extracting the transcript, highlighting the key passages, converting those highlights into structured notes, and building flashcards for long-term review, you transform passive watching into active studying. Every step moves you higher on the effectiveness scale identified by decades of learning science research.

You do not need to apply this workflow to every video you watch. Save it for the content that matters: the lecture that covers material for your exam, the tutorial that teaches a skill you need for work, the interview that contains ideas you want to carry with you for years.

Start with one video today. Pull up Glasp's YouTube Summary, generate the transcript, and follow the four steps. By the time you finish, you will have a set of notes and flashcards that will keep that knowledge accessible for months, not minutes.

The videos are free. The knowledge is free. The only cost is the 20-30 minutes it takes to actually learn what you watched.
