
How to run LLMs locally with Docker

To run AI models locally, I generally use Ollama.

But TIL you can actually run them with Docker itself.

How to get started?

This feature comes enabled by default if you're on Docker Desktop 4.40+ for macOS on Apple silicon.

If not, you can enable it by running the following command:

docker desktop enable model-runner
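Assuming your Docker Desktop version ships the status subcommand, you can confirm the runner is up with:

docker model status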

How to use?

Just pull the model and use it. Model tags follow the pattern:

{model}:{parameters}-{quantization}
docker model pull ai/smollm2:360M-Q4_K_M
docker model run ai/smollm2:360M-Q4_K_M "Why is the sky blue?"
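Recent versions of the CLI should also let you list the models you've pulled:

docker model list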

How to use it with an OpenAI-compatible SDK?

Just set the base URL to:

http://localhost:12434/engines/v1

And set the model to whatever model you're running. In the above case, that's ai/smollm2:360M-Q4_K_M.
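Here's a minimal sketch with the official openai Python package, assuming the endpoint above is reachable from your host (the api_key value is just a placeholder, since the local runner shouldn't check it):

from openai import OpenAI

# Point the client at the local Docker Model Runner endpoint
client = OpenAI(
    base_url="http://localhost:12434/engines/v1",
    api_key="docker",  # placeholder; any non-empty string should do
)

# Use the same model tag you pulled earlier
response = client.chat.completions.create(
    model="ai/smollm2:360M-Q4_K_M",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(response.choices[0].message.content)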

Here is a good video by Travis Media about it.


Happy local LLM!

