AI EngineerGuide

How to use Google Gemma 4 locally with llama.cpp

by Ashik Nesin Ashik Nesin

Similar to Ollama, you can also use llama.cpp to run LLM in your machine as well.

Getting Started

If you’re on macOS, you can install llama.cpp using brew like this

brew install llama.cpp

Running the LLM

Depending on the machine, you might want to run appropriate LLM

2026-04-02-at-23.30.382x.png

llama-server -hf ggml-org/gemma-4-E4B-it-GGUF:Q8_0

Reference

#Google-Gemma-4

Stay Updated

Get the latest AI engineering insights delivered to your inbox.

No spam. Unsubscribe at any time.