Lazy retrieval = retrieval happens only when it is actually needed (at execution time), not before.
In simple words:
The retriever does NOT run when you build the chain.
It runs only when you call chain.invoke().
chain = (
    {"context": retriever | format_docs,
     "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
At this point:
❌ Retriever is NOT executed
❌ No documents are fetched
❌ No vector search happens
You're just defining a pipeline blueprint.
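The same idea can be shown without LangChain at all. Below is a minimal sketch in plain Python, with hypothetical stand-ins (`fake_retriever`, `format_docs`, `fake_llm`) for the real components; a call counter proves that building the chain does no retrieval work:

```python
calls = {"retriever": 0}

def fake_retriever(query):
    # hypothetical stand-in for a vector-store retriever
    calls["retriever"] += 1
    return [f"doc about {query}"]

def format_docs(docs):
    return "\n".join(docs)

def fake_llm(prompt):
    # hypothetical stand-in for an LLM call
    return f"Answer based on: {prompt}"

def build_chain():
    # Composing the steps just returns a function: the blueprint.
    def chain(query):
        context = format_docs(fake_retriever(query))
        return fake_llm(f"Context: {context}\nQuestion: {query}")
    return chain

chain = build_chain()
assert calls["retriever"] == 0   # defining the chain fetched nothing

answer = chain("lazy retrieval")
assert calls["retriever"] == 1   # retrieval ran only at invoke time
```

The key design point: composition produces a new callable, and the callable's body only runs when someone calls it.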
response = chain.invoke(query)
Now LangChain:
Takes the query
Sends it to the retriever
Retrieves the documents
Formats them
Passes them to the prompt
Calls the LLM
That is lazy execution.
If you do this:
retrieved_docs = retriever.invoke(query)
Now retrieval runs immediately.
Even if:
You never call the LLM
You never use the chain
You discard the result
That is called eager retrieval.
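A quick sketch of the eager case, again with a hypothetical retriever and a call counter so the immediate execution is visible:

```python
search_count = 0

def retriever_invoke(query):
    # a real vector search would happen here
    global search_count
    search_count += 1
    return [f"doc about {query}"]

retrieved_docs = retriever_invoke("some query")   # executes right now
assert search_count == 1   # work already done: LLM never called, chain unused
```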
Think of a food-delivery app (lazy):
You open the app → nothing cooks yet
You place an order → food starts cooking
Execution happens only when required.
Now think of a buffet (eager):
You cook everything in advance
Even if nobody orders
Wasteful if unused.
Lazy execution means:
Functions are deferred
Computation happens on demand
Improves efficiency
Enables composability
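Python has this idea built in: a generator pipeline is deferred, computes values only on demand, and composes freely. A small sketch, tracking which values have actually been computed:

```python
produced = []

def numbers():
    # Records which values have actually been computed.
    for n in range(6):
        produced.append(n)
        yield n

# Building the pipeline is deferred: nothing has been produced yet.
pipeline = (n * n for n in numbers() if n % 2 == 0)
assert produced == []

# Computation happens on demand, one value at a time.
first = next(pipeline)
assert first == 0 and produced == [0]
```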
LangChain LCEL works lazily by default.
Laziness is what makes the following possible:
Dynamic queries
Streaming responses
Parallel retrievers
Conditional chains
Tool routing
Multi-step agents
If retrieval were eager, you would lose this flexibility.
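Conditional chains illustrate the point concretely. In this hypothetical router, the retrieval step runs only on the branch that needs documents; an eager retriever would have searched for every query:

```python
retrievals = 0

def retrieve(query):
    global retrievals
    retrievals += 1
    return [f"doc about {query}"]

def chain(query):
    if query.startswith("smalltalk:"):
        return "No documents needed."     # retrieval skipped entirely
    docs = retrieve(query)                # runs only on this branch
    return f"Answer grounded in {len(docs)} document(s)"

chain("smalltalk:hi")
assert retrievals == 0    # no vector search for the chit-chat route
chain("what is lazy retrieval?")
assert retrievals == 1
```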
Lazy retrieval = Retrieval runs only when the chain is executed, not when it is defined.
Since you're preparing seriously for AI Engineer roles, this concept is important because:
Lazy execution is used in:
LangChain
Spark
Dask
TensorFlow graphs
SQL query planners
It’s a core distributed systems idea.
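The shared idea across those systems can be sketched as a toy deferred-computation graph, in the spirit of a Spark DAG or a Dask task graph (names and structure here are illustrative, not any library's API): building the graph records operations, and nothing runs until `.compute()` is called.

```python
class Lazy:
    """A node in a deferred-computation graph."""

    def __init__(self, fn, *deps):
        self.fn = fn
        self.deps = deps

    def compute(self):
        # Recursively evaluate dependencies, then apply this node's function.
        args = [d.compute() if isinstance(d, Lazy) else d for d in self.deps]
        return self.fn(*args)

data = Lazy(lambda: [1, 2, 3, 4])
doubled = Lazy(lambda xs: [x * 2 for x in xs], data)
total = Lazy(sum, doubled)

# So far only a plan exists, like a SQL query plan before execution.
result = total.compute()   # the whole pipeline executes here
assert result == 20
```

This "plan first, execute later" shape is exactly what lets such engines optimize, parallelize, or skip work before anything runs.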