1️⃣ HyDE (Hypothetical Document Embeddings)
HyDE is a retrieval technique where the system first imagines a perfect answer to the query and then uses that imagined answer to retrieve real documents.
Instead of embedding the question, you embed a hypothetical answer.
User questions are often:

- Short
- Vague
- Poorly aligned with document language

Documents, however, are:

- Long
- Structured
- Answer-oriented

HyDE bridges this mismatch.
How it works:

1. The user asks a question.
2. An LLM generates a hypothetical ideal answer.
3. That answer is embedded.
4. Vector search is performed using this embedding.
5. Real documents matching that “ideal answer” are retrieved.
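A minimal sketch of that flow, assuming hypothetical `llm`, `embed`, and `vector_search` helpers standing in for whichever LLM client, embedding model, and vector store you actually use:

```python
# Hypothetical helpers (swap in your own stack):
#   llm(prompt) -> str, embed(text) -> list[float],
#   vector_search(vector, k) -> list[str]

def hyde_retrieve(question: str, k: int = 5) -> list[str]:
    # 1. Imagine a perfect answer to the question.
    hypothetical_answer = llm(
        f"Write a short passage that directly answers:\n{question}"
    )
    # 2. Embed the hypothetical answer instead of the question.
    answer_vector = embed(hypothetical_answer)
    # 3. Retrieve real documents that live near that "ideal answer".
    return vector_search(answer_vector, k=k)
```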
You don’t search a library using a question like:
“How does kidney failure progress?”
You search using a paragraph describing kidney failure.
HyDE auto-writes that paragraph for you.
Strengths:

- Works extremely well for vague queries
- Improves recall dramatically
- Especially useful in scientific / medical / technical RAG

Weaknesses:

- Depends on LLM quality
- Can introduce bias
- Extra generation cost

Best use cases:

- Research RAG
- Medical / legal QA
- Exploratory queries
2️⃣ Window Search
Window search retrieves the neighboring chunks around a matched chunk to preserve context.
Instead of retrieving isolated chunks, you retrieve a window of context.
Chunking breaks:

- Narrative flow
- Logical continuity
- Cause-effect relationships

Window search fixes this.
How it works:

1. Retrieve a relevant chunk.
2. Also fetch the previous chunk and the next chunk.
3. Combine them into one context window.
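A self-contained sketch, using a toy word-overlap score as a stand-in for real vector similarity:

```python
def best_match_index(query: str, chunks: list[str]) -> int:
    """Toy relevance score: shared-word count (a stand-in for vector search)."""
    q = set(query.lower().split())
    return max(range(len(chunks)),
               key=lambda i: len(q & set(chunks[i].lower().split())))

def window_retrieve(query: str, chunks: list[str], window: int = 1) -> str:
    """Return the best-matching chunk plus its neighbors as one context."""
    i = best_match_index(query, chunks)
    # Clamp the window so it never runs past the document boundaries.
    start, end = max(0, i - window), min(len(chunks), i + window + 1)
    return "\n".join(chunks[start:end])
```

Widening `window` trades extra tokens for more surrounding context.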
Reading only one paragraph from a book rarely gives full meaning.
Window search gives you the paragraph before and after.
Strengths:

- Preserves context
- Improves answer coherence
- Simple and effective

Weaknesses:

- Adds tokens
- May include irrelevant text

Best use cases:

- PDFs
- Policies
- Books
- Manuals
3️⃣ Self-Query Retriever
A self-query retriever allows the LLM to:

- Understand the user’s intent
- Extract structured filters
- Generate the retrieval query itself

The LLM becomes the query planner, not just the answer generator.
Users mix:

- Natural language
- Implicit constraints
- Metadata requirements

Example: “Show me beginner-level Python tutorials after 2022” combines a topic, a difficulty level, and a time filter.
How it works:

1. The LLM parses the user query.
2. It extracts a semantic query and metadata filters.
3. The system executes a filtered retrieval.
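A minimal sketch, again assuming hypothetical `llm`, `embed`, and `vector_search_filtered` helpers; the JSON schema shown is an illustrative choice, not a fixed standard:

```python
import json

# Hypothetical helpers: llm(prompt) -> str, embed(text) -> list[float],
# vector_search_filtered(vector, filters, k) -> list[dict]

def self_query_retrieve(user_query: str, k: int = 5) -> list[dict]:
    # 1. Ask the LLM to act as the query planner.
    plan = llm(
        'Return JSON with keys "query" (search string) and "filters" '
        f"(metadata constraints) for this request: {user_query}"
    )
    parsed = json.loads(plan)
    # e.g. {"query": "Python tutorials",
    #       "filters": {"difficulty": "beginner", "year_after": 2022}}
    # 2. Run the semantic search with the extracted filters applied.
    return vector_search_filtered(
        embed(parsed["query"]), filters=parsed["filters"], k=k
    )
```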
It’s like an intelligent librarian who understands what you really want and applies filters automatically.
Strengths:

- Very powerful for enterprise RAG
- Handles structured + unstructured data
- Reduces irrelevant results

Weaknesses:

- Depends on clean metadata
- Needs schema alignment

Best use cases:

- Product catalogs
- Course platforms
- Enterprise document search
4️⃣ Contextual Compression
(Often confused with “Contractual”; the correct term is Contextual.)
Contextual compression shrinks retrieved documents to keep only the parts relevant to the query.
Retrieval stays the same; only the context gets compressed.
Even good retrieval often returns:

- Long documents
- Partially relevant sections

LLMs, meanwhile, have:

- Token limits
- Cost constraints
How it works:

1. Retrieve documents.
2. Use an LLM, an extractor, or a reranker to remove the irrelevant sections.
3. Pass only the high-signal text to the LLM.
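A minimal sketch using an LLM as the extractor (the hypothetical `llm` helper is assumed, as above); a cheaper embedding-similarity filter could play the same role:

```python
def compress_context(query: str, documents: list[str]) -> str:
    """Keep only the passages from each document that matter for the query."""
    compressed = []
    for doc in documents:
        # Ask the extractor to copy out query-relevant sentences verbatim.
        relevant = llm(
            "Copy only the sentences from the text below that are relevant "
            f"to the question '{query}'. Return an empty string if none.\n\n{doc}"
        )
        if relevant.strip():
            compressed.append(relevant)
    # Only this high-signal text is handed to the answering LLM.
    return "\n\n".join(compressed)
```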
Instead of handing someone a full book, you highlight only the important sentences.
Strengths:

- Saves tokens
- Reduces hallucination
- Improves precision

Weaknesses:

- Additional compute
- Risk of over-compression

Best use cases:

- Long documents
- Token-sensitive RAG
- High-cost LLMs
5️⃣ RAG Fusion
RAG Fusion improves retrieval by:

- Generating multiple query variants
- Retrieving documents for each variant
- Merging and deduplicating the results

One question → many perspectives → better recall.
Single queries suffer from:

- Wording bias
- Vocabulary mismatch
- A narrow perspective
How it works:

1. The LLM rewrites the query into multiple forms.
2. Each query retrieves its own set of documents.
3. The results are merged.
4. The merged list is deduplicated and reranked.
You ask:

“How to manage diabetes?”

But the system also searches:

- “Diabetes treatment guidelines”
- “Blood sugar control methods”
- “Lifestyle changes for diabetes”
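A minimal sketch that merges the per-variant rankings with Reciprocal Rank Fusion (RRF); `llm_rewrite`, `embed`, and `vector_search` are hypothetical helpers, as in the earlier sketches:

```python
from collections import defaultdict

# Hypothetical helpers: llm_rewrite(question, n) -> list[str],
# embed(text) -> list[float], vector_search(vector, k) -> list[str]

def rag_fusion_retrieve(question: str, n_variants: int = 3, k: int = 5) -> list[str]:
    # 1. Generate several rewordings, keeping the original question too.
    variants = [question] + llm_rewrite(question, n=n_variants)
    # 2. Retrieve a ranked list for every variant and fuse with RRF:
    #    documents that rank well across many variants score highest.
    scores: dict[str, float] = defaultdict(float)
    for q in variants:
        for rank, doc in enumerate(vector_search(embed(q), k=k)):
            scores[doc] += 1.0 / (60 + rank)  # 60 is the customary RRF constant
    # 3. Dict keys deduplicate; sort by fused score and keep the top k.
    return sorted(scores, key=scores.get, reverse=True)[:k]
```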
More doors → more relevant knowledge.
Strengths:

- Massive recall improvement
- Reduces retrieval blind spots
- Excellent for research RAG

Weaknesses:

- Higher cost
- More latency
- Needs reranking

Best use cases:

- Knowledge-heavy systems
- Scientific QA
- Open-domain RAG
At a glance:

| Technique | Solves |
|---|---|
| HyDE | Poor query embeddings |
| Window Search | Lost context |
| Self-Query | Hidden constraints |
| Contextual Compression | Token overload |
| RAG Fusion | Narrow recall |
They are not competitors — they are complementary tools.
- HyDE → “Imagine the answer first”
- Window Search → “Read around the paragraph”
- Self-Query → “Let the system understand filters”
- Contextual Compression → “Keep only what matters”
- RAG Fusion → “Search from multiple angles”
Advanced RAG retrieval is not about finding more data — it’s about finding the right data, in the right shape, with the right context.