Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 14: Data 2

Transcript
So this is the second lecture on data. In the last lecture, we talked about different data sets that were used to train various language models. We did a historical overview, from the data sets that were used to train BERT all the way up to OLMo and everything in between. And one of the things I wanted to emphasize is that data doesn't just fall ...