nanochat by Andrej Karpathy
Andrej Karpathy recently released nanochat, a project in which he builds a ChatGPT-like LLM from scratch, purely for the purpose of learning.
The project is roughly 8,000 lines of code, hand-written by him.
It covers:
- Building a tokenizer (rust)
- Pretraining a Transformer LLM on FineWeb
- Midtraining on user-assistant conversations from SmolTalk, tool use, etc.
- Efficient inference of the model in an Engine with KV cache
- Simple prefill/decode
- Tool use (Python interpreter in a lightweight sandbox)
- Talking to it over the CLI or a ChatGPT-like WebUI
And much more.
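To make the "KV cache" and "prefill/decode" items above a bit more concrete, here is a minimal sketch in plain Python. It is not nanochat's code: the `KVCache` class, the helper names, and the toy single-head attention are all assumptions made for illustration. The idea is that keys and values of past tokens are stored once, so each new decode step only computes projections for the newest token.

```python
import math
import random

def project(x, w):
    """Multiply vector x (length d_in) by matrix w (d_in x d_out)."""
    return [sum(xi * w[i][j] for i, xi in enumerate(x)) for j in range(len(w[0]))]

def attend(q, keys, values):
    """Softmax attention of one query vector over a list of keys/values."""
    scale = math.sqrt(len(q))
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(len(values[0]))]

class KVCache:
    """Hypothetical cache: one key and one value vector per token seen so far."""
    def __init__(self):
        self.keys = []
        self.values = []

    def append(self, k, v):
        self.keys.append(k)
        self.values.append(v)

def step(x, cache, wq, wk, wv):
    """Decode step: cache the new token's K/V, then attend over the whole cache."""
    cache.append(project(x, wk), project(x, wv))
    return attend(project(x, wq), cache.keys, cache.values)

def prefill(prompt, cache, wq, wk, wv):
    """Fill the cache from the prompt (token by token here, for simplicity;
    a real engine would typically process the prompt in one batched pass)."""
    out = None
    for x in prompt:
        out = step(x, cache, wq, wk, wv)
    return out

def full_recompute(tokens, wq, wk, wv):
    """Baseline without a cache: recompute every K/V for the last position."""
    keys = [project(x, wk) for x in tokens]
    values = [project(x, wv) for x in tokens]
    return attend(project(tokens[-1], wq), keys, values)

# Toy data: random weights and six random "token embeddings" of width 4.
rng = random.Random(0)
d = 4
wq, wk, wv = ([[rng.uniform(-1, 1) for _ in range(d)] for _ in range(d)]
              for _ in range(3))
tokens = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(6)]

cache = KVCache()
prefill(tokens[:5], cache, wq, wk, wv)            # prompt: first five tokens
cached_out = step(tokens[5], cache, wq, wk, wv)   # decode: one new token
baseline = full_recompute(tokens, wq, wk, wv)

# The cached path should give the same result as recomputing from scratch.
assert all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(cached_out, baseline))
```

The decode step does a constant amount of projection work per new token, while the baseline redoes the projections for the entire sequence every time; that difference is what makes cached inference cheaper as the sequence grows.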
And the best part: for ~$100 (~4 hours on an 8xH100 node), you should be able to replicate it 🤯
👉 https://github.com/karpathy/nanochat/discussions/1
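As a quick back-of-the-envelope check, the figures above imply a rough hourly rate. The numbers below are just the rounded values from this post, so treat the result as an estimate rather than a quoted cloud price.

```python
# Rounded figures taken from the post; both are approximate.
total_cost_usd = 100.0
hours = 4.0
gpus_per_node = 8

node_rate = total_cost_usd / hours      # implied cost per node-hour
gpu_rate = node_rate / gpus_per_node    # implied cost per GPU-hour

print(f"~${node_rate:.2f} per node-hour, ~${gpu_rate:.2f} per GPU-hour")
```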
Reference
https://x.com/karpathy/status/1977755427569111362
Happy LLM building!