How to use Google Gemma 4 locally with llama.cpp

02 Apr, 2026 by

Similar to Ollama, you can also use llama.cpp to run LLM in your machine as well.

Getting Started

If you’re on macOS, you can install llama.cpp using brew like this

brew install llama.cpp

Depending on the machine, you might want to run appropriate LLM

2026-04-02-at-23.30.382x.png

llama-server -hf ggml-org/gemma-4-E4B-it-GGUF:Q8_0