🌈 FAISS search_kwargs
When you convert a LangChain FAISS vector store into a retriever:

    retriever = vectorstore.as_retriever(
        search_type="similarity",
        search_kwargs={"k": 5}
    )
search_kwargs decides HOW the retriever searches your vector index.
Let’s break down all keyword arguments in a colorful and beautiful way 👇
k: Number of Results
Meaning:
How many similar documents do you want back?
Default: 4 in LangChain's FAISS wrapper
You use it when:
You want the top k most similar chunks.
Example:

    {"k": 5}
🔍 “Bring me the top 5 closest matches!”
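To make k concrete, here is a minimal pure-Python sketch of top-k selection. It ranks by cosine similarity for readability; a real FAISS index runs this search natively, and LangChain's wrapper defaults to L2 distance. The names top_k and cosine and the toy vectors are made up for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, docs, k=4):
    """Return the texts of the k documents most similar to the query.

    docs is a list of (text, vector) pairs.
    """
    ranked = sorted(docs, key=lambda d: cosine(query, d[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Toy 2-D "embeddings", invented for this example.
docs = [
    ("refund policy", [1.0, 0.0]),
    ("office hours", [0.0, 1.0]),
    ("return window", [0.9, 0.1]),
]
print(top_k([1.0, 0.0], docs, k=2))  # ['refund policy', 'return window']
```

Raising k only lengthens the list; it never changes the order of the matches that were already in it.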
score_threshold: Minimum Similarity Needed
Used only in:
search_type="similarity_score_threshold"
Meaning:
Return results ONLY when the relevance score (normalized to the range 0 to 1, higher = more similar) is at or above the threshold.
Example:

    {
        "score_threshold": 0.8,
        "k": 20  # upper limit on results; fewer come back if few pass the threshold
    }
🛑 “Ignore weak matches. Give me only strong, high-score documents!”
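The thresholding step can be sketched in a few lines of plain Python. This is an illustration of the idea, not LangChain's code: filter_by_score and the sample scores are hypothetical, and the scores are assumed to be relevance values already normalized to the range 0 to 1.

```python
def filter_by_score(scored_docs, score_threshold, k=4):
    """Keep at most k documents whose relevance score passes the threshold.

    scored_docs is a list of (text, relevance) pairs, relevance in [0, 1].
    """
    # Take the k best candidates first, then drop the weak ones.
    ranked = sorted(scored_docs, key=lambda d: d[1], reverse=True)[:k]
    return [(text, score) for text, score in ranked if score >= score_threshold]


# Made-up relevance scores for three chunks.
scored = [
    ("refund policy", 0.91),
    ("office hours", 0.42),
    ("return window", 0.84),
]
strong = filter_by_score(scored, score_threshold=0.8, k=20)
print(strong)  # [('refund policy', 0.91), ('return window', 0.84)]
```

Because the filter runs after the top-k cut, a high threshold can leave you with fewer than k results, or none at all.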
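fetch_k: Candidates Before Final Selection
Used in:
search_type="mmr" (Maximum Marginal Relevance). LangChain's FAISS wrapper also uses it with the other search types when a metadata filter is set, as the number of documents fetched before filtering.
Meaning:
Search this many candidates internally before selecting diverse ones.
Default: 20
Example:

    {
        "k": 5,         # final 5 results
        "fetch_k": 25   # initial candidate pool
    }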
🔎 “Search 25 items first, then pick the 5 best diverse ones!”
lambda_mult: Diversity Factor (MMR Only)
Used only in:
search_type="mmr"
Meaning:
Controls the trade-off between similarity to the query and diversity among the results. Default: 0.5.

| lambda_mult value | Meaning |
|---|---|
| 0.0 | Max diversity |
| 1.0 | Max similarity |
| 0.3 to 0.7 | Good balanced mix |

Example:

    {"lambda_mult": 0.5}
🎨 “Blend similarity and diversity 50-50!”
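Here is a rough pure-Python sketch of how fetch_k and lambda_mult interact in MMR. It follows the usual greedy formulation (score = lambda_mult * relevance - (1 - lambda_mult) * redundancy) and is not LangChain's implementation; the function names and toy vectors are assumptions made for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mmr(query, vectors, k=4, fetch_k=20, lambda_mult=0.5):
    """Return indices of up to k vectors chosen by greedy MMR."""
    # Step 1: candidate pool = the fetch_k vectors closest to the query.
    by_relevance = sorted(
        range(len(vectors)), key=lambda i: cosine(query, vectors[i]), reverse=True
    )
    pool = by_relevance[:fetch_k]

    # Step 2: repeatedly pick the candidate with the best
    # relevance-minus-redundancy score.
    selected = []
    while pool and len(selected) < k:

        def score(i):
            relevance = cosine(query, vectors[i])
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected), default=0.0
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(pool, key=score)
        selected.append(best)
        pool.remove(best)
    return selected


# Index 1 is a near-duplicate of index 0; index 2 points somewhere else.
vectors = [[1.0, 0.0], [1.0, 0.1], [0.6, 0.8]]
query = [1.0, 0.0]

print(mmr(query, vectors, k=2, lambda_mult=1.0))             # [0, 1]
print(mmr(query, vectors, k=2, lambda_mult=0.2))             # [0, 2]
print(mmr(query, vectors, k=2, fetch_k=2, lambda_mult=0.2))  # [0, 1]
```

The last call shows why fetch_k matters: with a pool of only 2 candidates, the diverse document never gets considered, no matter how low lambda_mult goes.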
filter: Metadata Filtering
(LangChain's FAISS wrapper takes this as "filter", not "filters"; other wrappers may differ)
Meaning:
Return only documents whose metadata matches.
Example:

    {
        "k": 4,
        "filter": {"source": "policies"}
    }
📁 “Search only inside documents where source = policies!”
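A plain FAISS index has no notion of metadata, so wrappers typically filter after the vector search. The sketch below assumes that post-filtering approach: fetch fetch_k candidates, drop the ones whose metadata does not match, keep the first k. The helper names and documents are hypothetical.

```python
def matches(metadata, flt):
    """True when every key in the filter has the same value in the metadata."""
    return all(metadata.get(key) == value for key, value in flt.items())


def filtered_top_k(ranked_docs, flt, k=4, fetch_k=20):
    """Post-filter a best-first list of (text, metadata) pairs.

    Only the first fetch_k candidates are inspected, so a small fetch_k
    can hide matching documents that rank lower.
    """
    candidates = ranked_docs[:fetch_k]
    kept = [doc for doc in candidates if matches(doc[1], flt)]
    return kept[:k]


# Already sorted best-first by similarity; contents invented for the example.
ranked_docs = [
    ("A", {"source": "policies"}),
    ("B", {"source": "blog"}),
    ("C", {"source": "blog"}),
    ("D", {"source": "policies"}),
    ("E", {"source": "policies"}),
]
flt = {"source": "policies"}

print([t for t, _ in filtered_top_k(ranked_docs, flt, k=2, fetch_k=3)])  # ['A']
print([t for t, _ in filtered_top_k(ranked_docs, flt, k=2, fetch_k=5)])  # ['A', 'D']
```

If a filtered search returns fewer documents than you asked for, raising fetch_k is usually the first thing to try.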
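score_threshold (for Distance)
Used in:
search_type="similarity". LangChain retrievers have no "similarity_distance_threshold" search type; the accepted values are "similarity", "similarity_score_threshold", and "mmr".
Meaning:
Based on raw distance (lower distance = closer match).
In recent versions of LangChain's FAISS wrapper, a score_threshold passed with plain "similarity" search is compared against the raw score, so with the default L2 distance only documents where distance <= threshold are returned. Check this against your installed version.

| Search Type | Allowed search_kwargs |
|---|---|
| "similarity" | k, filter, fetch_k, score_threshold (raw distance cutoff) |
| "similarity_score_threshold" | k, score_threshold, filter, fetch_k |
| "mmr" | k, fetch_k, lambda_mult, filter |
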
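One way to catch mismatched settings early is a small lookup that mirrors the search-type pairing. This is a hypothetical helper, not part of LangChain: the ALLOWED_KWARGS sets are this sketch's assumption about the FAISS wrapper, and check_search_kwargs is a made-up name.

```python
# Assumed pairing of search types and kwargs for LangChain's FAISS wrapper.
ALLOWED_KWARGS = {
    "similarity": {"k", "filter", "fetch_k", "score_threshold"},
    "similarity_score_threshold": {"k", "filter", "fetch_k", "score_threshold"},
    "mmr": {"k", "filter", "fetch_k", "lambda_mult"},
}


def check_search_kwargs(search_type, search_kwargs):
    """Return a list of problems found; an empty list means the combo looks fine."""
    if search_type not in ALLOWED_KWARGS:
        raise ValueError(f"unknown search_type: {search_type!r}")
    problems = [
        f"unexpected kwarg: {name}"
        for name in sorted(set(search_kwargs) - ALLOWED_KWARGS[search_type])
    ]
    # The threshold search type is meaningless without a threshold.
    if search_type == "similarity_score_threshold" and "score_threshold" not in search_kwargs:
        problems.append("missing kwarg: score_threshold")
    return problems


print(check_search_kwargs("mmr", {"k": 5, "fetch_k": 25, "lambda_mult": 0.5}))  # []
print(check_search_kwargs("similarity", {"k": 5, "lambda_mult": 0.5}))
# ['unexpected kwarg: lambda_mult']
```

Running a check like this before calling as_retriever turns a silent misconfiguration into an explicit message.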

| kwarg | Color | Meaning | Use When |
|---|---|---|---|
| k | 🔵 | How many results to return | Always |
| score_threshold | 🟢 | Minimum similarity score | When filtering weak matches |
| fetch_k | 🟣 | Initial candidate pool for MMR (and for metadata filtering) | For diverse or filtered results |
| lambda_mult | 🟠 | Diversity vs similarity | For diversity control |
| filter | 🟡 | Metadata filtering | For scoped search |