How to run LLMs locally with Docker
To run AI models locally, I generally use Ollama.
But TIL: you can actually run them via Docker itself.
How to get started?
This feature comes enabled by default if you're on Docker Desktop 4.40+ for macOS on Apple silicon.
If not, you can enable it by running the following command:
docker desktop enable model-runner
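As a quick sanity check, you can ask Model Runner for its status (docker model status is part of the Model Runner CLI and reports whether it's running):
docker model status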
How to use?
Just pull the model and run it. Model tags follow the format
{model}:{parameters}-{quantization}
docker model pull ai/smollm2:360M-Q4_K_M
docker model run ai/smollm2:360M-Q4_K_M "Why is the sky blue?"
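If you've pulled a few models and lost track, docker model list shows everything available locally:
docker model list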
How to use it with an OpenAI-compatible SDK?
Just set the base URL to
http://localhost:12434/engines/v1
and set the model to whatever model you're running. In the above case, it'll be ai/smollm2:360M-Q4_K_M.
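Since this is just an OpenAI-compatible HTTP API, you can sanity-check the endpoint with curl before wiring up an SDK. A minimal sketch, assuming the TCP endpoint on port 12434 is reachable from the host (on some setups you may need to enable it with docker desktop enable model-runner --tcp 12434 first):

curl http://localhost:12434/engines/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ai/smollm2:360M-Q4_K_M",
    "messages": [
      {"role": "user", "content": "Why is the sky blue?"}
    ]
  }'

Any OpenAI client should work the same way: point its base URL at the endpoint above and pass the model name as the model; the API key can typically be a dummy value, since the local server doesn't appear to check it.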
Here is a good video by Travis Media about it.
Happy local LLM!