RAG vs Agentic RAG Explained Visually (with Diagram)

30 Jun, 2025

Back in the days (early 2024-ish 😜), context window of LLM models were very limited like 4k, 8k, etc

In order to do something meaningful, we’ll have to give proper context while keeping context window limit in the mind.

At that time, RAG (Retrieval Augmented Generation) had become common technique to achieve this.

In simple words, Here is how it works:

User asks a question
We search it against the vector database index (think of vector database as a special database in which we can search semantically)
Attach top 5 or 10 documents that matches the user query
LLM uses that context and generates the response

Drawbacks:

It’s good for static sites like docs but not helpful in case if you’re dealing with real time data.

Agentic RAG addresses that limitation.

TLDR: What is Agentic RAG?

AI Agent + RAG = Agentic RAG

The major difference with Agentic RAG and regular RAG is that, it has access to various tools like database, search engine API, memory, MCP servers, etc.

Here is a good comparison diagram by ByteByteGo 👇

Pasted image 20250630215752.png

Reference

https://blog.bytebytego.com/p/ep169-rag-vs-agentic-rag

Happy Agentic RAG

#TIL #Bookmark #Define

RAG vs Agentic RAG Explained Visually (with Diagram)

Reference

Want to stay updated?Sign up for the newsletter.

Want to stay updated?
Sign up for the newsletter.