docs: update README with improved formatting and structured content

- Reformatted the README for better readability with consistent indentation and line breaks
- Restructured course outline with clear lesson numbering and descriptions
- Added detailed getting started instructions with step-by-step setup process
- Included repository layout diagram showing file organization
- Enhanced requirements section with clearer dependency structure
- Added what to expect section outlining project characteristics and learning approach
2025-09-23 12:01:01 -04:00
parent 272172e87c
commit b1bb6fc705
2 changed files with 117 additions and 46 deletions

@@ -51,28 +51,28 @@ Use this exact section order:
18) (Optional) Save/Load & CLI Interface
=== Constraints & Defaults
- Dataset: do NOT auto-download. Expect a local `data.txt`. If missing, include a tiny built-in fallback sample so scripts still run.
- Dataset: do NOT auto-download. Expect a local data.txt. If missing, include a tiny built-in fallback sample so scripts still run.
- Encoding: UTF-8. Normalize newlines to "\n" for consistency.
- Seeds: demonstrate reproducibility (`random`, `numpy`, `torch`).
- Seeds: demonstrate reproducibility (random, numpy, torch).
- Dependencies:
* Stdlib only until Lesson 9;
* NumPy in Lessons 8-10;
* PyTorch from Lesson 11 onward.
* Stdlib only until Lesson 9
* NumPy in Lessons 8-10
* PyTorch from Lesson 11 onward
- Training defaults (for Lessons 13+):
* Batch size ~32, block size ~128, AdamW(lr=3e-4).
* Brief note on early stopping when val loss plateaus.
* Batch size ~32, block size ~128, AdamW(lr=3e-4)
* Brief note on early stopping when val loss plateaus
- Inference defaults:
* Start with greedy; then temperature=0.8, top-k=50.
* Start with greedy; then temperature=0.8, top-k=50
- Keep code clean: type hints where helpful; no frameworks beyond NumPy/PyTorch; no external data loaders.
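A minimal sketch of how the seeding and inference defaults above could look in code. The helper names `seed_everything` and `sample_next`, and the seed value 1337, are hypothetical and do not come from the repo. The sampling part uses only the stdlib for illustration; a real lesson would presumably work on PyTorch tensors.

```python
import math
import random
from typing import Optional, Sequence


def seed_everything(seed: int = 1337) -> None:
    """Seed `random`, plus NumPy and PyTorch when they are installed.

    Hypothetical helper; the seed value is an arbitrary choice.
    """
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy is only needed in the later lessons
    try:
        import torch
        torch.manual_seed(seed)
    except ImportError:
        pass  # PyTorch is only needed from Lesson 11 onward


def sample_next(
    logits: Sequence[float],
    temperature: float = 0.8,
    top_k: int = 50,
    rng: Optional[random.Random] = None,
) -> int:
    """Pick the next token index from raw logits.

    temperature <= 0 is treated as greedy decoding (argmax). Otherwise only
    the top_k highest logits are kept, scaled by 1/temperature, and sampled
    from the resulting softmax distribution.
    """
    if temperature <= 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    k = max(1, min(top_k, len(logits)))
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    scaled = [logits[i] / temperature for i in ranked]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    chooser = rng if rng is not None else random
    return chooser.choices(ranked, weights=weights, k=1)[0]
```

Calling `seed_everything(...)` before sampling makes repeated runs produce the same choices, which is the reproducibility the Seeds bullet asks lessons to demonstrate.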
=== Lesson 1 Specifics
For Lesson 1, include:
- Module docstring with Usage example (`python 01_read_text.py`).
- Functions: `load_text(path: Optional[Path])`, `normalize_newlines(text: str)`,
`make_preview(text: str, n_chars: int = 200)`, `report_stats(text: str)`, `main()`.
- At least one doctest per function where reasonable.
- Fallback text snippet if `data.txt` isn't found.
- Output: total chars, unique chars, 200-char preview with literal "\n".
- Module docstring with Usage example (python 01_read_text.py)
- Functions: load_text(path: Optional[Path]), normalize_newlines(text: str),
make_preview(text: str, n_chars: int = 200), report_stats(text: str), main()
- At least one doctest per function where reasonable
- Fallback text snippet if data.txt isn't found
- Output: total chars, unique chars, 200-char preview with literal "\n"
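One possible shape for the Lesson 1 script described above, offered as a sketch rather than the repo's actual `01_read_text.py`. The function names and parameters follow the spec; the return types, the `FALLBACK_TEXT` constant and its contents, and the output wording are assumptions.

```python
"""Read a local text file and report basic statistics.

Usage:
    python 01_read_text.py
"""
from pathlib import Path
from typing import Dict, Optional

# Placeholder fallback sample (assumption: any short text would do here).
FALLBACK_TEXT = "To be, or not to be,\nthat is the question.\n"


def load_text(path: Optional[Path] = None) -> str:
    """Return the file's text, or the built-in fallback if it is missing.

    >>> load_text(Path("no_such_file_for_doctest.txt")) == FALLBACK_TEXT
    True
    """
    path = path if path is not None else Path("data.txt")
    if path.is_file():
        # Decode bytes directly so newline handling stays explicit.
        return path.read_bytes().decode("utf-8")
    return FALLBACK_TEXT


def normalize_newlines(text: str) -> str:
    r"""Convert CRLF and bare CR line endings to "\n".

    >>> normalize_newlines("a\r\nb\rc")
    'a\nb\nc'
    """
    return text.replace("\r\n", "\n").replace("\r", "\n")


def make_preview(text: str, n_chars: int = 200) -> str:
    r"""Return the first n_chars with newlines shown as a literal "\n".

    >>> make_preview("ab\ncd", n_chars=4)
    'ab\\nc'
    """
    return text[:n_chars].replace("\n", "\\n")


def report_stats(text: str) -> Dict[str, int]:
    """Count total and unique characters.

    >>> report_stats("hello")
    {'total_chars': 5, 'unique_chars': 4}
    """
    return {"total_chars": len(text), "unique_chars": len(set(text))}


def main() -> None:
    text = normalize_newlines(load_text(Path("data.txt")))
    stats = report_stats(text)
    print(f"Total chars:  {stats['total_chars']}")
    print(f"Unique chars: {stats['unique_chars']}")
    print(f"Preview: {make_preview(text)}")


if __name__ == "__main__":
    main()
```

The doctests can be run with `python -m doctest 01_read_text.py -v`. Reading bytes and decoding by hand, instead of `Path.read_text`, avoids Python's universal-newline translation, so `normalize_newlines` is the only place line endings change.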
=== Delivery
- Start with a short “How to use this repo” preface and a file tree suggestion.