# Prompt.md
Copy the prompt below exactly to replicate this course:

```
You are an expert Python instructor. Generate a complete, beginner-friendly course called
“ARIA — Zero-to-Tiny LLM (Python)” that takes a learner from “Hello World” to training a tiny
decoder-only, character-level LLM in ~17–18 single-file lessons. No safety/guardrail features;
assume a controlled learning environment.

=== Audience & Scope
- Audience: absolute beginners who have only written “Hello World”.
- Language: Python.
- Goal: build up to a tiny decoder-only LLM trained on a small corpus (e.g., Tiny Shakespeare).
- Keep each lesson runnable in a single .py file (≤ ~200 lines where feasible).

=== Output Format (for EACH lesson)
Use this exact section order:
1) Title
2) Duration (estimate)
3) Outcome (what they will accomplish)
4) Files to create (filenames)
5) Dependencies (Python stdlib / NumPy / PyTorch as specified)
6) Step-by-step Directions
7) Starter code (complete, runnable) with:
- A clear module docstring that includes: what it does, how to run, and notes.
- Function-level Google-style docstrings (Args/Returns/Raises) + at least one doctest where reasonable.
8) How to run (CLI commands)
9) What you learned (bullets)
10) Troubleshooting (common errors + fixes)
11) Mini-exercises (3–5 quick tasks)
12) What’s next (name the next lesson)

=== Curriculum (keep these names and order)
01) Read a Text File (with docstrings)
02) Character Frequency Counter
03) Train/Val Split
04) Char Vocabulary + Encode/Decode
05) Uniform Random Text Generator
06) Bigram Counts Language Model
07) Laplace Smoothing (compare w/ and w/o)
08) Temperature & Top-k Sampling
09) Perplexity on Validation
10) NumPy Softmax + Cross-Entropy (toy)
11) PyTorch Tensors 101
12) Autograd Mini-Lab (fit y=2x+3)
13) Char Bigram Neural LM (PyTorch)
14) Sampling Function (PyTorch)
15) Single-Head Self-Attention (causal mask)
16) Mini Transformer Block (pre-LN)
17) Tiny Decoder-Only Model (1–2 blocks)
18) (Optional) Save/Load & CLI Interface

=== Constraints & Defaults
- Dataset: do NOT auto-download. Expect a local `data.txt`. If missing, include a tiny built-in fallback sample so scripts still run.
- Encoding: UTF-8. Normalize newlines to "\n" for consistency.
- Seeds: demonstrate reproducibility (`random`, `numpy`, `torch`).
- Dependencies:
* Stdlib only through Lesson 7;
* NumPy in Lessons 8–10;
* PyTorch from Lesson 11 onward.
- Training defaults (for Lessons 13+):
* Batch size ~32, block size ~128, AdamW(lr=3e-4).
* Brief note on early stopping when val loss plateaus.
- Inference defaults:
* Start with greedy; then temperature=0.8, top-k=50.
- Keep code clean: type hints where helpful; no frameworks beyond NumPy/PyTorch; no external data loaders.

=== Lesson 1 Specifics
For Lesson 1, include:
- Module docstring with Usage example (`python 01_read_text.py`).
- Functions: `load_text(path: Optional[Path])`, `normalize_newlines(text: str)`,
`make_preview(text: str, n_chars: int = 200)`, `report_stats(text: str)`, `main()`.
- At least one doctest per function where reasonable.
- Fallback text snippet if `data.txt` isn’t found.
- Output: total chars, unique chars, 200-char preview with literal "\n".

=== Delivery
- Start with a short “How to use this repo” preface and a file tree suggestion.
- Then render Lessons 01–18 in order, each with the exact section headings above.
- End with a short FAQ (Windows vs. macOS paths, UTF-8 issues, CPU vs. GPU notes).

Generate now.
```
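
The sketches below are illustrative only; they sit outside the prompt and should not be copied into it. Each is a rough, hedged preview of one piece the generated course is expected to contain.

The reproducibility requirement (seeding `random`, `numpy`, and `torch`) might be demonstrated with a tiny helper like this; the name `set_seed` and the default seed are assumptions, not something the prompt mandates:

```python
# Illustrative sketch: the prompt asks the lessons to demonstrate
# reproducible seeding of `random`, `numpy`, and `torch`. The helper
# name and default seed here are assumptions, not part of the prompt.
import random

import numpy as np
import torch


def set_seed(seed: int = 1337) -> None:
    """Seed the Python, NumPy, and PyTorch RNGs for reproducible runs."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)


if __name__ == "__main__":
    set_seed()
    # The same three numbers should print on every run with the same seed.
    print(random.random(), np.random.rand(), torch.rand(1).item())
```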
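
The training defaults for Lessons 13+ (batch size ~32, block size ~128, AdamW at lr=3e-4) translate into setup code roughly like the following; the stand-in `Embedding` "model" and random token ids are placeholders so the snippet runs on its own, not the course's actual architecture:

```python
# Illustrative sketch of the prompt's training defaults for Lessons 13+.
import torch
import torch.nn.functional as F

BATCH_SIZE = 32    # ~32 sequences per optimization step
BLOCK_SIZE = 128   # ~128-character context window
VOCAB_SIZE = 65    # placeholder character-vocabulary size

model = torch.nn.Embedding(VOCAB_SIZE, VOCAB_SIZE)          # stand-in bigram-style model
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)  # AdamW(lr=3e-4) per the defaults

# One optimization step on random token ids, just to show the shapes involved.
xb = torch.randint(0, VOCAB_SIZE, (BATCH_SIZE, BLOCK_SIZE))
yb = torch.randint(0, VOCAB_SIZE, (BATCH_SIZE, BLOCK_SIZE))
logits = model(xb)                                          # (B, T, VOCAB_SIZE)
loss = F.cross_entropy(logits.view(-1, VOCAB_SIZE), yb.view(-1))
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(f"loss after one step: {loss.item():.3f}")
```

In the real lessons this step would live inside a loop that periodically checks validation loss, which is where the early-stopping note belongs.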
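
Likewise, the inference defaults (greedy first, then temperature=0.8 with top-k=50) imply a sampling step along these lines; the function and argument names are assumptions made for the sketch:

```python
# Illustrative temperature + top-k sampling over a single logits vector,
# matching the prompt's inference defaults (temperature=0.8, top_k=50).
import torch


def sample_next_token(logits: torch.Tensor,
                      temperature: float = 0.8,
                      top_k: int = 50) -> int:
    """Sample one token id from a 1-D tensor of unnormalized logits."""
    if temperature <= 0:                       # treat 0 as "be greedy"
        return int(torch.argmax(logits).item())
    scaled = logits / temperature              # lower temperature -> sharper distribution
    k = min(top_k, scaled.numel())
    top_vals, top_idx = torch.topk(scaled, k)  # keep only the k best-scoring tokens
    probs = torch.softmax(top_vals, dim=-1)    # renormalize over the kept tokens
    choice = torch.multinomial(probs, num_samples=1)
    return int(top_idx[choice].item())


if __name__ == "__main__":
    torch.manual_seed(0)
    fake_logits = torch.randn(65)              # pretend 65-character vocabulary
    print(sample_next_token(fake_logits))
```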
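
Finally, a minimal `01_read_text.py` covering the Lesson 1 specifics could take roughly this shape; the fallback text, docstring wording, and printed labels are illustrative choices, not requirements from the prompt:

```python
"""Read a local text file and report simple character statistics.

Usage:
    python 01_read_text.py

Looks for ``data.txt`` in the current directory and falls back to a tiny
built-in sample if the file is missing, so the script always runs.
(Sketch of the requested lesson, not the canonical solution.)
"""
from pathlib import Path
from typing import Optional

FALLBACK_TEXT = "To be, or not to be, that is the question.\n"


def normalize_newlines(text: str) -> str:
    r"""Convert Windows/old-Mac line endings to "\n".

    >>> normalize_newlines("a\r\nb\rc")
    'a\nb\nc'
    """
    return text.replace("\r\n", "\n").replace("\r", "\n")


def load_text(path: Optional[Path] = None) -> str:
    """Load UTF-8 text from *path*, or return the built-in fallback sample."""
    path = path or Path("data.txt")
    if path.exists():
        return normalize_newlines(path.read_text(encoding="utf-8"))
    return FALLBACK_TEXT


def make_preview(text: str, n_chars: int = 200) -> str:
    """Return the first *n_chars* characters with newlines shown as literal \\n."""
    return text[:n_chars].replace("\n", "\\n")


def report_stats(text: str) -> None:
    """Print total characters, unique characters, and a short preview."""
    print(f"total chars : {len(text)}")
    print(f"unique chars: {len(set(text))}")
    print(f"preview     : {make_preview(text)}")


def main() -> None:
    """Entry point: load the text (or fallback) and report its stats."""
    report_stats(load_text())


if __name__ == "__main__":
    main()
```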