LLMs often return unstructured or semi-structured data. The Pydantic Output Parser helps:

- Validate that output matches the expected schema.
- Automatically convert model output into Python objects using Pydantic.
- Raise clear errors if the structure or data types are incorrect.
```python
from langchain.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field

class Person(BaseModel):
    name: str
    age: int = Field(..., description="Age in years")

parser = PydanticOutputParser(pydantic_object=Person)

# This is what you'd expect from an LLM
llm_output = '{"name": "Abhi", "age": 30}'

parsed = parser.parse(llm_output)
print(parsed)  # name='Abhi' age=30
```
`.parse()` converts a string (usually JSON) into a Pydantic model. If the output is malformed (e.g., wrong types or missing fields), a `ValidationError` is raised. The parser is compatible with few-shot prompts or tools that require structured output.
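To steer the model toward the right shape in the first place, LangChain's parser exposes `get_format_instructions()`, which you append to your prompt. As a rough sketch of the same idea using only Pydantic (the exact wording of LangChain's instructions differs):

```python
import json
from pydantic import BaseModel, Field

class Person(BaseModel):
    name: str
    age: int = Field(..., description="Age in years")

# Hand-rolled stand-in for parser.get_format_instructions():
# embed the model's JSON schema so the LLM knows the expected shape.
# (.schema() is the Pydantic v1 spelling; v2 keeps it as a deprecated
# alias for model_json_schema().)
schema = json.dumps(Person.schema())
format_instructions = "Return a JSON object matching this schema: " + schema
prompt = "Extract name and age: {text}\n" + format_instructions
```

The point is simply that the schema travels inside the prompt, so the LLM's output is much more likely to survive validation.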
| Feature | Benefit |
|---|---|
| Validation | Ensures structure and type correctness |
| Reliability | Surfaces LLM hallucinations or missing fields as clear errors |
| Serialization | Pydantic models convert to dicts or JSON easily (`.dict()` / `.json()`) |
| Integration | Works with LangChain agents, tools, and output parsers |
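The serialization row in practice — note that Pydantic v2 renames these methods to `model_dump()` / `model_dump_json()`, though the v1 spellings still work as deprecated aliases:

```python
import json
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int

person = Person(name="Abhi", age=30)

as_dict = person.dict()  # model_dump() in Pydantic v2
as_json = person.json()  # model_dump_json() in Pydantic v2

print(as_dict)  # {'name': 'Abhi', 'age': 30}
```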
```python
llm_output = '{"name": "Abhi"}'  # missing 'age'

try:
    parsed = parser.parse(llm_output)
except Exception as e:
    print(e)
```

📤 Output:

```
1 validation error for Person
age
  field required (type=value_error.missing)
```
```python
from langchain.prompts import PromptTemplate
from langchain.chat_models import ChatOpenAI

prompt = PromptTemplate(
    template="Extract name and age: {text}",
    input_variables=["text"],
)

model = ChatOpenAI()
output = model.predict(prompt.format(text="My name is Abhi and I'm 30 years old."))

parser = PydanticOutputParser(pydantic_object=Person)
person_data = parser.parse(output)
```
Even without LangChain, you can manually use Pydantic models to parse LLM output as long as it's in valid JSON or Python-like structure.
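A minimal sketch of doing this by hand, assuming the output is a clean JSON string:

```python
import json
from pydantic import BaseModel, Field, ValidationError

class Person(BaseModel):
    name: str
    age: int = Field(..., description="Age in years")

def parse_person(llm_output: str) -> Person:
    data = json.loads(llm_output)  # raises json.JSONDecodeError on invalid JSON
    return Person(**data)          # raises ValidationError on wrong/missing fields

person = parse_person('{"name": "Abhi", "age": 30}')
```

In practice you may also want to strip markdown code fences or leading prose before calling `json.loads`, since LLMs often wrap JSON in extra text.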