I wasted 100+ hours trying to understand AI from academic papers.
It's a terrible way to start.
Dense math. Vague diagrams. "Attention Is All You Need" is brilliant, but it's not a starting point.
Then I found this Stanford CS229 guest lecture on Building LLMs.
Here's why it's different:
→ Explains the 8-step data filtering pipeline (it's not just "train on the internet").
→ Shows why post-training (SFT, DPO) is what turns a "model" into a useful "assistant" (the DPO loss is sketched below).
→ Breaks down Scaling Laws in simple terms (what actually makes models better).
→ Covers how to evaluate models (like Chatbot Arena) when there's no single "right" answer.
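A taste of that post-training point: the whole DPO idea fits in one loss. This is the standard objective from the original DPO paper (Rafailov et al., 2023), sketched here for flavor rather than quoted from the lecture:

L_DPO(θ) = −E_(x, y_w, y_l) [ log σ( β · ( log π_θ(y_w|x)/π_ref(y_w|x) − log π_θ(y_l|x)/π_ref(y_l|x) ) ) ]

Here y_w is the preferred response, y_l the rejected one, π_ref is the frozen SFT model, and β controls how far the new policy can drift from it. No reward model, no RL loop.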
The kicker?
This 1-hour lecture gives more practical intuition than a week of trying to decode dense academic papers.
Topics that actually matter for builders:
• Pre-training vs. Post-training (what they are and why you need both)
• Tokenization (the part everyone skips but shouldn't; quick example after this list)
• Scaling Laws (how compute, data, and parameters relate)
• The Ops (SFT, DPO, and Evals)
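If you've never looked at tokenization up close, here's a minimal sketch using OpenAI's tiktoken library. This is my illustration, not code from the lecture:

import tiktoken

# Load the GPT-4-era BPE tokenizer
enc = tiktoken.get_encoding("cl100k_base")

tokens = enc.encode("LLMs don't read words.")
print(tokens)                             # a list of integer token IDs
print([enc.decode([t]) for t in tokens])  # the text chunk behind each ID

The model only ever sees those integers, which is why spelling tricks, arithmetic, and non-English text often behave strangely.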
The uncomfortable truth:
While everyone's reading paper summaries,
it's this lecture's focus on data pipelines and evals
that builds real intuition.
Full lecture here: https://lnkd.in/gUzsiN_e
♻️ Repost to help someone escape "paper summary" hell.