Response Caching in OpenRouter
OpenRouter now lets you cache responses across all models.
Just add the X-OpenRouter-Cache: true header and it'll cache chat completions, responses, messages, and embeddings requests.
The first call hits the AI provider; subsequent identical requests return the same cached response, and you aren't billed for them.
If you're using an LLM in a test environment, local development, or anything similar, you can enable this feature so you don't end up paying for those calls (see the environment-gating sketch after the curl example below).
How to use it?
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-OpenRouter-Cache: true" \
  -d '{
    "model": "google/gemini-2.5-flash",
    "messages": [{"role": "user", "content": "Why is the sky blue?"}]
  }'
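
If you'd rather not shell out to curl, the same header works through an SDK. Here's a minimal sketch assuming the official openai Python package, which works against OpenRouter because its API is OpenAI-compatible; the base URL and header come from the curl example above:

import os

from openai import OpenAI

# Point the standard OpenAI client at OpenRouter's OpenAI-compatible API.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

completion = client.chat.completions.create(
    model="google/gemini-2.5-flash",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
    # Opt this request into response caching (same header as the curl example).
    extra_headers={"X-OpenRouter-Cache": "true"},
)
print(completion.choices[0].message.content)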
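
And for the dev/test use case mentioned earlier, you could gate the header on an environment variable so production traffic is never served stale responses. A sketch, reusing the client setup above; the APP_ENV variable name is just a hypothetical convention, not anything OpenRouter requires:

# Hypothetical convention: cache everywhere except production, so repeated
# identical prompts in dev and test runs don't get billed.
cache_headers = (
    {"X-OpenRouter-Cache": "true"}
    if os.environ.get("APP_ENV", "development") != "production"
    else {}
)

completion = client.chat.completions.create(
    model="google/gemini-2.5-flash",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
    extra_headers=cache_headers,
)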