Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 13: Data 1

Transcript
So today's lecture is going to be on data. In the previous lectures, up until now, we've discussed how you train a model, given data. So we've talked about the architecture. We've talked about the optimizer, tokenization, scaling laws, parallelism. That's all given a fixed data set. And now, we're going to talk about what data do we train on. So my...