The Complete Guide to Using Pydantic for Validating LLM Outputs




In this article, you will learn how to turn free-form large language model (LLM) text into reliable, schema-validated Python objects with Pydantic.

Topics we will cover include:

  • Designing robust Pydantic models (including custom validators and nested schemas).
  • Parsing “messy” LLM outputs safely and surfacing precise validation errors.
  • Integrating validation with OpenAI, LangChain, and LlamaIndex plus retry strategies.

Let’s break it down.


Introduction

Large language models generate text, not structured data. Even when you prompt them to return structured data, they’re still generating text that looks like valid JSON. The output may have incorrect field names, missing required fields, wrong data types, or extra text wrapped around the actual data. Without validation, these inconsistencies cause runtime errors that are difficult to debug.

Pydantic helps you validate data at runtime using Python type hints. It checks that LLM outputs match your expected schema, converts types automatically where possible, and provides clear error messages when validation fails. This gives you a reliable contract between the LLM’s output and your application’s requirements.

This article shows you how to use Pydantic to validate LLM outputs. You’ll learn how to define validation schemas, handle malformed responses, work with nested data, integrate with LLM APIs, implement retry logic with validation feedback, and more. Let’s not waste any more time.

🔗 You can find the code on GitHub. Before you go ahead, install Pydantic version 2.x with the optional email dependencies: pip install "pydantic[email]" (the quotes keep shells like zsh from tripping over the brackets).

Getting Started

Let’s start with a simple example by building a tool that extracts contact information from text. The LLM reads unstructured text and returns structured data that we validate with Pydantic:
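A minimal sketch of such a model might look like this (the 7-to-15-digit phone check is an illustrative choice):

```python
from typing import Optional

from pydantic import BaseModel, EmailStr, field_validator


class ContactInfo(BaseModel):
    name: str
    email: EmailStr
    phone: Optional[str] = None  # may be missing or null

    @field_validator("phone")
    @classmethod
    def clean_phone(cls, v: Optional[str]) -> Optional[str]:
        if v is None:
            return v
        # Keep only the digits, then sanity-check the length
        digits = "".join(ch for ch in v if ch.isdigit())
        if not 7 <= len(digits) <= 15:  # illustrative bounds
            raise ValueError("phone number must contain 7 to 15 digits")
        return digits
```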

All Pydantic models inherit from BaseModel, which provides automatic validation. Type hints like name: str help Pydantic validate types at runtime. The EmailStr type validates email format without needing a custom regex. Fields marked with Optional[str] = None can be missing or null. The @field_validator decorator lets you add custom validation logic, like cleaning phone numbers and checking their length.

Here’s how to use the model to validate sample LLM output:
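For instance, with one well-formed response and one deliberately broken one (both strings are fabricated for the demo):

```python
import json

from pydantic import ValidationError

llm_output = '{"name": "Ada Lovelace", "email": "ada@example.com", "phone": "(555) 123-4567"}'

contact = ContactInfo(**json.loads(llm_output))
print(contact.phone)  # 5551234567 -- normalized by the validator

try:
    ContactInfo(name="Ada Lovelace", email="not-an-email")
except ValidationError as e:
    print(e)  # pinpoints the `email` field and explains the failure
```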

When you create a ContactInfo instance, Pydantic validates everything automatically. If validation fails, you get a clear error message telling you exactly what went wrong.

Parsing and Validating LLM Outputs

LLMs don’t always return perfect JSON. Sometimes they add markdown formatting, explanatory text, or mess up the structure. Here’s how to handle these cases:
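One way to implement this, sketched with an illustrative ProductReview schema (its fields are assumptions):

```python
import json
import re
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class ProductReview(BaseModel):
    product_name: str
    rating: int = Field(..., ge=1, le=5)
    summary: str


def extract_json_from_llm_response(response: str) -> str:
    """Pull the first JSON object out of a response that may contain extra text."""
    match = re.search(r"\{.*\}", response, re.DOTALL)
    if not match:
        raise ValueError("No JSON object found in response")
    return match.group(0)


def parse_review(response: str) -> Optional[ProductReview]:
    try:
        data = json.loads(extract_json_from_llm_response(response))
        return ProductReview.model_validate(data)
    except json.JSONDecodeError as e:
        print(f"Malformed JSON: {e}")
    except ValidationError as e:
        print(f"Schema mismatch: {e}")
    except Exception as e:
        print(f"Unexpected error: {e}")
    return None
```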

This approach uses regex to find JSON within response text, handling cases where the LLM adds explanatory text before or after the data. We catch different exception types separately:

  • JSONDecodeError for malformed JSON,
  • ValidationError for data that doesn’t match the schema, and
  • General exceptions for unexpected issues.

The extract_json_from_llm_response function handles text cleanup while parse_review handles validation, keeping concerns separated. In production, you’d want to log these errors or retry the LLM call with an improved prompt.

This example shows an LLM response with extra text that our parser handles correctly:
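For example (the response string is fabricated):

```python
messy_response = """Sure! Here's the review you asked for:

{"product_name": "Acme Coffee Grinder", "rating": 4, "summary": "Solid build, a bit loud."}

Let me know if you need anything else."""

review = parse_review(messy_response)
print(review)
# product_name='Acme Coffee Grinder' rating=4 summary='Solid build, a bit loud.'
```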

The parser extracts the JSON block from the surrounding text and validates it against the ProductReview schema.

Working with Nested Models

Real-world data is rarely flat. Here’s how to handle nested structures like a product with multiple reviews and specifications:
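One possible set of models is sketched below; the specific fields and the 0.1 tolerance in the cross-field check are illustrative choices:

```python
from typing import List

from pydantic import BaseModel, Field, ValidationInfo, field_validator


class Specification(BaseModel):
    name: str
    value: str


class Review(BaseModel):
    author: str
    rating: int = Field(..., ge=1, le=5)
    comment: str


class Product(BaseModel):
    name: str
    price: float = Field(..., gt=0)
    specifications: List[Specification]
    reviews: List[Review]
    average_rating: float = Field(..., ge=1, le=5)

    @field_validator("average_rating")
    @classmethod
    def check_average_matches_reviews(cls, v: float, info: ValidationInfo) -> float:
        # info.data holds the already-validated fields, including the reviews
        reviews = info.data.get("reviews")
        if reviews:
            expected = sum(r.rating for r in reviews) / len(reviews)
            if abs(v - expected) > 0.1:  # illustrative tolerance
                raise ValueError(
                    f"average_rating {v} does not match computed average {expected:.2f}"
                )
        return v
```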

The Product model contains lists of Specification and Review objects, and each nested model is validated independently. Using Field(..., ge=1, le=5) adds constraints directly in the field definition, where ge means “greater than or equal”, le means “less than or equal”, and gt means “greater than”.

The check_average_matches_reviews validator accesses other fields using info.data, allowing you to validate relationships between fields. When you pass nested dictionaries to Product(**data), Pydantic automatically creates the nested Specification and Review objects.

This structure ensures data integrity at every level. If a single review is malformed, you’ll know exactly which one and why.

This example shows how nested validation works with a complete product structure:
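Continuing with the models above (the data is fabricated):

```python
product_data = {
    "name": "Acme Coffee Grinder",
    "price": 79.99,
    "specifications": [
        {"name": "Weight", "value": "1.2 kg"},
        {"name": "Capacity", "value": "250 g"},
    ],
    "reviews": [
        {"author": "Sam", "rating": 5, "comment": "Excellent grind consistency."},
        {"author": "Priya", "rating": 4, "comment": "Great, but a little loud."},
    ],
    "average_rating": 4.5,  # (5 + 4) / 2 -- passes the cross-field check
}

product = Product(**product_data)
print(product.reviews[0].author)  # Sam -- the nested dicts became Review objects
```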

Pydantic validates the entire nested structure in one call, checking that specifications and reviews are properly formed and that the average rating matches the individual review ratings.

Using Pydantic with LLM APIs and Frameworks

So far, we’ve learned that we need a reliable way to convert free-form text into structured, validated data. Now let’s see how to use Pydantic validation with OpenAI’s API, as well as frameworks like LangChain and LlamaIndex. Be sure to install the required SDKs (for example: pip install openai langchain-openai llama-index).

Using Pydantic with OpenAI API

Here’s how to extract structured data from unstructured text using OpenAI’s API with Pydantic validation:
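A sketch along these lines; the BookSummary fields and the gpt-4o-mini model name are assumptions:

```python
from typing import List, Optional

from openai import OpenAI
from pydantic import BaseModel, ValidationError

client = OpenAI()  # reads OPENAI_API_KEY from the environment


class BookSummary(BaseModel):
    title: str
    author: str
    year_published: int
    genres: List[str]


EXTRACTION_PROMPT = """Extract book information from the text below.
Return ONLY a JSON object with this exact structure:
{{"title": "...", "author": "...", "year_published": 1999, "genres": ["..."]}}

Text: {text}"""


def extract_book_summary(text: str) -> Optional[BookSummary]:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # model choice is an assumption
        temperature=0,        # deterministic output suits extraction
        messages=[
            {"role": "system", "content": "You are a precise data extractor. Respond with valid JSON only."},
            {"role": "user", "content": EXTRACTION_PROMPT.format(text=text)},
        ],
    )
    raw = response.choices[0].message.content or ""
    try:
        return BookSummary.model_validate_json(raw)
    except ValidationError as e:
        print(f"Validation failed: {e}")
        return None
```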

The prompt includes the exact JSON structure we expect, guiding the LLM to return data matching our Pydantic model. Setting temperature=0 makes the LLM more deterministic and less creative, which is what we want for structured data extraction. The system message primes the model to be a data extractor rather than a conversational assistant. Even with careful prompting, we still validate with Pydantic because you should never trust LLM output without verification.

This example extracts structured information from a book description:
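For instance:

```python
book_text = """Published in 1965, Frank Herbert's Dune is a landmark of science
fiction, blending politics, religion, and ecology on the desert planet Arrakis."""

summary = extract_book_summary(book_text)
if summary:
    print(f"{summary.title} by {summary.author} ({summary.year_published})")
```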

The function sends the unstructured text to the LLM with clear formatting instructions, then validates the response against the BookSummary schema.

Using LangChain with Pydantic

LangChain provides built-in support for structured output extraction with Pydantic models. There are two main approaches that handle the complexity of prompt engineering and parsing for you.

The first method uses PydanticOutputParser, which works with any LLM by using prompt engineering to guide the model’s output format. The parser automatically generates detailed format instructions from your Pydantic model:
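Roughly like so, assuming an illustrative MovieInfo schema:

```python
from langchain_core.output_parsers import PydanticOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field


class MovieInfo(BaseModel):
    title: str = Field(description="The movie's title")
    director: str = Field(description="The movie's director")
    year: int = Field(description="Year of theatrical release")


parser = PydanticOutputParser(pydantic_object=MovieInfo)

# Inject the auto-generated format instructions into the prompt
prompt = ChatPromptTemplate.from_messages([
    ("system", "Extract movie information.\n{format_instructions}"),
    ("user", "{text}"),
]).partial(format_instructions=parser.get_format_instructions())

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # model choice is an assumption

# Compose prompt -> model -> parser with the pipe (LCEL) syntax
parser_chain = prompt | llm | parser
```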

The PydanticOutputParser automatically generates format instructions from your Pydantic model, including field descriptions and type information. It works with any LLM that can follow instructions and doesn’t require function calling support. The chain syntax makes it easy to compose complex workflows.

The second method uses the native function calling capabilities of modern LLMs through the with_structured_output() method:
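With the same MovieInfo model, this collapses to a couple of lines:

```python
# Bind the schema to the model via its native tool/function calling
structured_llm = ChatOpenAI(model="gpt-4o-mini", temperature=0).with_structured_output(MovieInfo)
```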

This method produces cleaner, more concise code and makes use of the model’s native function calling capabilities for more reliable extraction. You don’t need to manually create parsers or format instructions, and it’s generally more accurate than prompt-based approaches.

Here’s an example of how to use these functions:
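```python
text = "Denis Villeneuve's Dune: Part Two arrived in theaters in 2024."

movie = parser_chain.invoke({"text": text})  # prompt-engineering route
print(movie)

movie = structured_llm.invoke(text)          # function-calling route
print(movie)  # MovieInfo(title='Dune: Part Two', director='Denis Villeneuve', year=2024)
```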

Using LlamaIndex with Pydantic

LlamaIndex provides multiple approaches for structured extraction, with particularly strong integration for document-based workflows. It’s especially useful when you need to extract structured data from large document collections or build RAG systems.

The most straightforward approach in LlamaIndex is using LLMTextCompletionProgram, which requires minimal boilerplate code:
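A sketch, assuming an illustrative ProductInfo schema and the OpenAI LLM integration:

```python
from llama_index.core.program import LLMTextCompletionProgram
from llama_index.llms.openai import OpenAI
from pydantic import BaseModel


class ProductInfo(BaseModel):
    name: str
    price: float
    category: str


program = LLMTextCompletionProgram.from_defaults(
    output_cls=ProductInfo,  # validation against the Pydantic model is automatic
    prompt_template_str="Extract product information from this text: {text}",
    llm=OpenAI(model="gpt-4o-mini", temperature=0),  # model choice is an assumption
)
```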

The output_cls parameter automatically handles Pydantic validation. This works with any LLM through prompt engineering and is good for quick prototyping and simple extraction tasks.

For models that support function calling, you can use FunctionCallingProgram. And when you need explicit control over parsing behavior, you can use the PydanticOutputParser method:
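The explicit-parser variant of the same program might look like this:

```python
from llama_index.core.output_parsers import PydanticOutputParser

# Supplying the parser yourself gives you a hook for custom parsing logic
explicit_program = LLMTextCompletionProgram.from_defaults(
    output_parser=PydanticOutputParser(output_cls=ProductInfo),
    prompt_template_str="Extract product information from this text: {text}",
    llm=OpenAI(model="gpt-4o-mini", temperature=0),
)
```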

Here’s how you’d extract product information in practice:
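```python
result = program(text="The UltraBrew 3000 espresso machine sells for $449 in our kitchen range.")
print(result)        # a validated ProductInfo instance
print(result.price)  # 449.0
```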

Use explicit parsing when you need custom parsing logic, are working with models that don’t support function calling, or are debugging extraction issues.

Retrying LLM Calls with Better Prompts

When the LLM returns invalid data, you can retry with an improved prompt that includes the error message from the failed validation attempt:
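A generic helper for this pattern might look like the following; the extract_with_retries name and the llm_call_function signature are illustrative:

```python
from typing import Callable, Optional, Type, TypeVar

from pydantic import BaseModel, ValidationError

T = TypeVar("T", bound=BaseModel)


def extract_with_retries(
    llm_call_function: Callable[[str, Optional[str]], str],
    prompt: str,
    model_cls: Type[T],
    max_retries: int = 3,
) -> Optional[T]:
    error_message: Optional[str] = None
    for attempt in range(1, max_retries + 1):
        # The callable receives the previous attempt's error, if any
        response = llm_call_function(prompt, error_message)
        try:
            # In Pydantic v2 this raises ValidationError for malformed JSON too
            return model_cls.model_validate_json(response)
        except ValidationError as e:
            error_message = str(e)
            print(f"Attempt {attempt} failed: {error_message}")
    return None
```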

Each retry includes the previous error message, helping the LLM understand what went wrong. After max_retries, the function returns None instead of crashing, allowing the calling code to handle the failure gracefully. Printing each attempt’s error makes it easy to debug why extraction is failing.

In a real application, your llm_call_function would construct a new prompt including the Pydantic error message, like "Previous attempt failed with error: {error}. Please fix and try again."

This example shows the retry pattern with a mock LLM function that progressively improves:
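A toy version with canned responses (all fabricated):

```python
from typing import List, Optional

from pydantic import BaseModel


class MeetingDetails(BaseModel):
    title: str
    attendees: List[str]


_canned = iter([
    '{"title": "Sprint Planning"}',                                 # missing attendees
    '{"title": "Sprint Planning", "attendees": "Alice, Bob"}',      # wrong type
    '{"title": "Sprint Planning", "attendees": ["Alice", "Bob"]}',  # valid
])


def mock_llm_call(prompt: str, error: Optional[str] = None) -> str:
    # A real implementation would fold `error` back into the prompt;
    # here we simply return progressively better canned responses.
    return next(_canned)


meeting = extract_with_retries(mock_llm_call, "Extract the meeting details.", MeetingDetails)
print(meeting)  # title='Sprint Planning' attendees=['Alice', 'Bob']
```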

The first attempt misses the required attendees field, the second attempt includes it but with the wrong type, and the third attempt gets everything correct. The retry mechanism handles these progressive improvements.

Conclusion

Pydantic helps you turn unreliable LLM outputs into validated, type-safe data structures. By combining clear schemas with robust error handling, you can build AI-powered applications that are both powerful and reliable.

Here are the key takeaways:

  • Define clear schemas that match your needs
  • Validate everything and handle errors gracefully with retries and fallbacks
  • Use type hints and validators to enforce data integrity
  • Include schemas in your prompts to guide the LLM

Start with simple models and add validation as you find edge cases in your LLM outputs. Happy exploring!
