LLM Chain Filter and Embedding Filter: LangChain (05 Feb, 2026)

🧠 Big Picture First (Mental Model)

Imagine you are talking to an AI assistant 📚
You have 100 documents, but the AI can only read 10 properly.

So you need a filter before sending documents to the LLM.

There are two popular filters:

  1. 🗣️ LLM Chain Filter: “Ask an LLM to decide relevance”

  2. 📐 Embedding Filter: “Use math (vectors) to measure similarity”



1️⃣ LLM Chain Filter 🗣️✨

🔍 What is LLM Chain Filter?

LLM Chain Filter uses an LLM itself to decide whether a document is relevant or not.

Think of it as:

🧑‍🏫 “Hey GPT, read this document and tell me:
Is it useful for answering the user’s question?”

📌 It reads text, understands meaning, context, and intent.


🧩 How it Works (Step-by-step)

  1. User asks a question
    👉 “What are side effects of Atorvastatin?”

  2. Retriever fetches many documents

    • Blogs

    • Medical PDFs

    • Reviews

  3. LLM Chain Filter:

    • Sends each document + question to the LLM

    • LLM answers:
      ✔ Relevant / ❌ Not relevant

  4. Only approved documents move forward 🚀
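
The steps above can be sketched in plain Python. The `judge_relevance` function below is a hypothetical stand-in for the real LLM call: it just counts shared keywords so the sketch runs offline, whereas an actual LLM would read and reason about each document.

```python
def judge_relevance(question: str, document: str) -> bool:
    """Hypothetical stand-in for an LLM's yes/no relevance judgment."""
    # A real LLM would understand meaning and intent; this stub only
    # checks for overlapping words so the example is self-contained.
    q_words = set(question.lower().split())
    d_words = set(document.lower().split())
    return len(q_words & d_words) >= 2

def llm_chain_filter(question: str, documents: list[str]) -> list[str]:
    # One judgment per document: only approved documents move forward.
    return [doc for doc in documents if judge_relevance(question, doc)]

docs = [
    "Atorvastatin side effects include muscle pain",
    "History of the Roman Empire",
]
kept = llm_chain_filter("What are side effects of Atorvastatin?", docs)
```

Note the cost pattern this makes visible: one judgment call per document, which is exactly why the real thing gets expensive at scale.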


🎨 Intuitive Example

🗂 Documents

  • 📄 Doc 1: “Atorvastatin side effects include muscle pain…”

  • 📄 Doc 2: “History of Roman Empire”

  • 📄 Doc 3: “Fenofibrate vs Statin comparison”

🧠 LLM Decision

  • ✅ Doc 1 → YES (directly useful)

  • ❌ Doc 2 → NO (irrelevant)

  • ⚠️ Doc 3 → YES (contextually helpful)

➡️ LLM understands nuance, not just keywords.


🧪 Code-style Example (Conceptual)

from langchain.retrievers.document_compressors import LLMChainFilter

# `llm` is any chat model instance; the filter sends each
# (document, query) pair to it for a yes/no relevance judgment.
llm_filter = LLMChainFilter.from_llm(llm)

compressed_docs = llm_filter.compress_documents(
    documents=docs,
    query="Side effects of statins"
)

✅ Pros of LLM Chain Filter

🌟 Very intelligent

  • Understands context

  • Handles synonyms

  • Works well for complex medical, legal, financial text

🧠 Human-like judgment

  • Can reason

  • Can ignore misleading keywords


❌ Cons of LLM Chain Filter

💰 Expensive

  • Every document = LLM call

🐢 Slow

  • Not ideal for 1,000+ docs

🔁 Non-deterministic

  • Slight variation in answers

📛 Overkill

  • Simple keyword queries don’t need it


🏥 Best Use Cases

✔ Medical Q&A
✔ Legal documents
✔ Policy interpretation
✔ Complex RAG pipelines
✔ Small–medium datasets


2️⃣ Embedding Filter 📐⚡

🔍 What is Embedding Filter?

Embedding Filter uses vector similarity (math) to decide relevance.

Think of it as:

📏 “How close is this document to the user query in meaning?”

No LLM reasoning — only numerical similarity.


🧩 How it Works

  1. Convert query → vector

  2. Convert documents → vectors

  3. Measure cosine similarity

  4. Keep documents above a threshold

📐 Higher similarity = more relevant
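
The four steps above fit in a few lines of plain Python. The 3-dimensional vectors below are made up for illustration; a real embedding model produces vectors with hundreds of dimensions.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # cos(theta) = (a . b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def embedding_filter(query_vec, doc_vecs, threshold=0.75):
    # Keep the indices of documents whose similarity clears the threshold.
    return [i for i, v in enumerate(doc_vecs)
            if cosine_similarity(query_vec, v) >= threshold]

query = [1.0, 0.0, 0.5]    # pretend embedding of the query
docs = [
    [0.9, 0.1, 0.6],       # close in meaning -> high similarity
    [0.0, 1.0, 0.0],       # unrelated -> near-zero similarity
]
kept = embedding_filter(query, docs)
```

No LLM call anywhere: relevance is pure arithmetic, which is why this approach is fast, cheap, and deterministic.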


🎨 Intuitive Example

User Query

“Statin muscle pain”

Document Similarity Scores

| Document | Similarity |
| --- | --- |
| Atorvastatin side effects | 0.91 ✅ |
| Fenofibrate comparison | 0.78 ✅ |
| Heart anatomy basics | 0.45 ❌ |
| Roman history | 0.02 ❌ |

➡️ Only top matches survive 🔥


🧪 Code-style Example (Conceptual)

from langchain.retrievers.document_compressors import EmbeddingsFilter

# `embedding_model` is any Embeddings implementation
# (e.g. OpenAIEmbeddings or a local sentence-transformers model).
embeddings_filter = EmbeddingsFilter(
    embeddings=embedding_model,
    similarity_threshold=0.75
)

compressed_docs = embeddings_filter.compress_documents(
    documents=docs,
    query="Statin side effects"
)

✅ Pros of Embedding Filter

⚡ Very fast
💸 Cheap
📈 Scales well
🔁 Deterministic results
🧮 Perfect for large datasets


❌ Cons of Embedding Filter

🤖 No reasoning

  • Can’t understand intent deeply

🎯 Threshold sensitive

  • Wrong threshold = missed info

🧩 Semantic confusion

  • Similar words ≠ useful content


📊 Best Use Cases

✔ Large document collections
✔ E-commerce search
✔ Product reviews
✔ FAQ systems
✔ Fast RAG pipelines


🆚 LLM Chain Filter vs Embedding Filter

| Feature | LLM Chain Filter 🗣️ | Embedding Filter 📐 |
| --- | --- | --- |
| Intelligence | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ |
| Speed | 🐢 Slow | ⚡ Fast |
| Cost | 💰 High | 💸 Low |
| Reasoning | Yes | No |
| Scalability | Medium | High |
| Best for | Complex meaning | Large datasets |

🧠 Pro Tip

🔥 Use BOTH together

Retriever
   ↓
Embedding Filter (fast pruning)
   ↓
LLM Chain Filter (deep reasoning)
   ↓
Final Answer

📌 This is called Contextual Compression Retriever
— exactly what you are implementing in your project 👏
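
In LangChain, this combination is typically wired as a `DocumentCompressorPipeline` inside a `ContextualCompressionRetriever`. The plain-Python sketch below shows the same two-stage flow; both `fake_embed` and `fake_llm_judge` are hypothetical stand-ins so it runs offline.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(x * x for x in b)) or 1.0
    return dot / (na * nb)

def fake_embed(text: str) -> list[float]:
    """Hypothetical embedder: keyword counts over a toy vocabulary."""
    vocab = ["statin", "muscle", "pain", "roman", "empire"]
    words = text.lower().split()
    return [float(words.count(w)) for w in vocab]

def fake_llm_judge(query: str, doc: str) -> bool:
    """Hypothetical LLM stand-in: approves docs sharing >= 2 query words."""
    return len(set(query.lower().split()) & set(doc.lower().split())) >= 2

def two_stage_filter(query: str, docs: list[str], threshold: float = 0.5):
    qv = fake_embed(query)
    # Stage 1: fast, cheap vector pruning over everything.
    pruned = [d for d in docs if cosine(qv, fake_embed(d)) >= threshold]
    # Stage 2: expensive LLM reasoning only on the survivors.
    return [d for d in pruned if fake_llm_judge(query, d)]

docs = [
    "Atorvastatin side effects include muscle pain",
    "History of the Roman Empire",
]
kept = two_stage_filter("statin muscle pain", docs)
```

The design point: the cheap filter shrinks the candidate set so the expensive filter only ever sees a handful of documents.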


🎯 When YOU should use what (Based on your work)

Since you are working on:

  • 🛒 Product reviews

  • 🧠 LLM-based assistants

  • 📊 Large scraped datasets

👉 Best choice for you:

  • Embedding Filter first

  • LLM Chain Filter optionally