A Practical Guide to Using Ollama’s Structured JSON Output
Integrating Large Language Models (LLMs) into your applications through tools like Ollama can enhance data processing and automate a variety of tasks. A key feature that facilitates this integration is Ollama’s ability to produce structured output in JSON format. This guide will help you understand and use the feature effectively, keeping your LLM integrations consistent and reliable.
What is Ollama’s Structured JSON Output?
Ollama’s structured JSON output allows developers to define specific schemas for the responses generated by the LLM. By enforcing a predefined structure, you ensure that the data returned is consistent and easy to work with, making integration into your applications straightforward.
Benefits of Structured JSON Output
- Standardization: Ensures all responses follow a specific format.
- Ease of Integration: JSON is widely supported, simplifying incorporation into various systems.
- Error Reduction: Validates responses against your schema to identify discrepancies early.
- Maintainability: Organized data structures make your codebase easier to manage and update.
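The error-reduction benefit comes from validating every response against a schema before your code touches it. As a minimal sketch (using a hypothetical `WeatherReport` model, not one of the examples below):

```python
from pydantic import BaseModel, ValidationError

class WeatherReport(BaseModel):
    city: str
    temperature_c: float

# A well-formed JSON response passes validation and becomes a typed object.
ok = WeatherReport.model_validate_json('{"city": "Oslo", "temperature_c": 3.5}')
print(ok.temperature_c)

# A response with a wrong type is rejected before it reaches your code.
try:
    WeatherReport.model_validate_json('{"city": "Oslo", "temperature_c": "cold"}')
except ValidationError as e:
    print("Rejected:", e.error_count(), "error(s)")
```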
Getting Started
To use Ollama’s structured JSON output, you’ll need Python along with the ollama and pydantic libraries. Follow these steps to set up and implement structured outputs in your projects.
Installation
First, install the required libraries using pip:
pip install ollama pydantic
Step-by-Step Examples
Example 1: Retrieving Book Information
Suppose you want to get details about a specific book. Here’s how you can structure the output to get consistent data.
from typing import List
from ollama import chat
from pydantic import BaseModel
class Book(BaseModel):
    title: str
    author: str
    genres: List[str]
    published_year: int

response = chat(
    messages=[
        {
            'role': 'user',
            'content': 'Can you provide details about "1984" by George Orwell?'
        }
    ],
    model='llama3.2',
    format=Book.model_json_schema(),
)
book = Book.model_validate_json(response.message.content)
print("Book Information:")
print(book)
Example Output:
Book Information:
title='1984' author='George Orwell' genres=['Dystopian', 'Political Fiction', 'Social Science Fiction'] published_year=1949
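Once `model_validate_json` succeeds, you are working with a typed object rather than raw text. The snippet below validates a sample JSON payload standing in for the live `chat` response, so it runs without an Ollama server:

```python
from typing import List
from pydantic import BaseModel

class Book(BaseModel):
    title: str
    author: str
    genres: List[str]
    published_year: int

# Stand-in for response.message.content from the chat call above.
sample = (
    '{"title": "1984", "author": "George Orwell", '
    '"genres": ["Dystopian", "Political Fiction"], "published_year": 1949}'
)
book = Book.model_validate_json(sample)

print(book.title)            # typed attribute access instead of dict lookups
print(book.published_year)   # already an int, no casting needed
print(book.model_dump())     # back to a plain dict for downstream code
```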
Example 2: Extracting Recipe Details
If you want to extract detailed information about a recipe from a user’s description, follow this example.
class Ingredient(BaseModel):
    name: str
    quantity: str
    optional: bool = False

class Recipe(BaseModel):
    name: str
    ingredients: List[Ingredient]
    steps: List[str]
    prep_time_minutes: int
    cook_time_minutes: int

response = chat(
    messages=[
        {
            'role': 'user',
            'content': '''
                I want to make spaghetti carbonara.
                Ingredients:
                - 200g spaghetti
                - 100g pancetta
                - 2 large eggs
                - 50g Pecorino cheese
                - 50g Parmesan
                - Freshly ground black pepper
                - Salt
                Instructions:
                1. Cook the spaghetti.
                2. Fry the pancetta.
                3. Beat the eggs and mix with cheese.
                4. Combine everything together with spaghetti.
            ''',
        }
    ],
    model='llama3.2',
    format=Recipe.model_json_schema(),
)
recipe = Recipe.model_validate_json(response.message.content)
print("Recipe Details:")
print(recipe)
Example Output:
Recipe Details:
name='Spaghetti Carbonara' ingredients=[Ingredient(name='spaghetti', quantity='200g', optional=False), Ingredient(name='pancetta', quantity='100g', optional=False), Ingredient(name='eggs', quantity='2 large', optional=False), Ingredient(name='Pecorino cheese', quantity='50g', optional=False), Ingredient(name='Parmesan', quantity='50g', optional=False), Ingredient(name='Freshly ground black pepper', quantity='', optional=False), Ingredient(name='Salt', quantity='', optional=False)] steps=['Cook the spaghetti.', 'Fry the pancetta.', 'Beat the eggs and mix with cheese.', 'Combine everything together with spaghetti.'] prep_time_minutes=15 cook_time_minutes=20
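Because the validated `Recipe` is an ordinary pydantic model, derived values such as the total time fall out naturally. A small sketch with a hand-built instance (no chat call needed):

```python
from typing import List
from pydantic import BaseModel

class Ingredient(BaseModel):
    name: str
    quantity: str
    optional: bool = False

class Recipe(BaseModel):
    name: str
    ingredients: List[Ingredient]
    steps: List[str]
    prep_time_minutes: int
    cook_time_minutes: int

recipe = Recipe(
    name='Spaghetti Carbonara',
    ingredients=[
        Ingredient(name='spaghetti', quantity='200g'),
        Ingredient(name='parsley', quantity='a handful', optional=True),
    ],
    steps=['Cook the spaghetti.'],
    prep_time_minutes=15,
    cook_time_minutes=20,
)

total = recipe.prep_time_minutes + recipe.cook_time_minutes
required = [i.name for i in recipe.ingredients if not i.optional]
print(f"Total time: {total} minutes")  # 35 minutes
print("Required:", required)           # ['spaghetti']
```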
Example 3: Image Description
This example demonstrates how to analyze an image and receive a structured description.
from typing import List, Optional, Literal
class ObjectDetected(BaseModel):
    name: str
    confidence: float
    attributes: str

class ImageDescription(BaseModel):
    summary: str
    objects: List[ObjectDetected]
    scene: str
    colors: List[str]
    time_of_day: Literal['Morning', 'Afternoon', 'Evening', 'Night']
    setting: Literal['Indoor', 'Outdoor', 'Unknown']
    text_content: Optional[str] = None

path = './images/image-01.png'

response = chat(
    model='llava',
    format=ImageDescription.model_json_schema(),
    messages=[
        {
            'role': 'user',
            'content': 'Analyze this image and describe what you see, including any objects, the scene, colors and any text you can detect.',
            'images': [path],
        },
    ],
    options={'temperature': 0},
)
image_description = ImageDescription.model_validate_json(response.message.content)
print("Image Description:")
print(image_description)
Example Output:
Image Description:
summary='A bustling city park during the afternoon.'
objects=[ObjectDetected(name='bench', confidence=0.98, attributes='wooden, occupied'), ObjectDetected(name='tree', confidence=0.95, attributes='lush green foliage'), ObjectDetected(name='people', confidence=0.99, attributes='various ages and ethnicities')] scene='Park' colors=['green', 'blue', 'brown'] time_of_day='Afternoon' setting='Outdoor' text_content=None
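The `Literal` fields are doing real work here: they restrict those fields to a fixed vocabulary, so an answer outside the allowed set fails validation instead of slipping through. A sketch with a trimmed-down version of the `ImageDescription` model above:

```python
from typing import Literal, Optional
from pydantic import BaseModel, ValidationError

class ImageDescription(BaseModel):
    summary: str
    time_of_day: Literal['Morning', 'Afternoon', 'Evening', 'Night']
    setting: Literal['Indoor', 'Outdoor', 'Unknown']
    text_content: Optional[str] = None

# A model that answers "dusk" instead of one of the four allowed
# values fails validation immediately.
try:
    ImageDescription(summary='A park.', time_of_day='dusk', setting='Outdoor')
except ValidationError:
    print('rejected out-of-vocabulary time_of_day')
```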
Sharing My Experience
While Ollama’s documentation recommends adding “Return as JSON” in your prompt and setting the temperature to 0 to achieve more deterministic results, my experience has shown that these settings can sometimes lead to less desirable outcomes. Specifically, even with these configurations, the LLM tends to output excessive and lengthy information for fields that are intended to receive single values. This can complicate the validation process and require additional post-processing to extract the necessary data.
Additionally, when using the image description script that leverages a vision model, I noticed that response times can vary significantly based on your machine configuration. On more powerful machines, the analysis is relatively swift, but on less capable setups, the process can take noticeably longer, potentially affecting the responsiveness of your application.
Lessons Learned
- Prompt Refinement: Simply adding “Return as JSON” may not be sufficient. Refining the prompt to be more specific about the expected format for each field can help guide the LLM more effectively.
- Schema Precision: Ensure that your pydantic schemas are as precise as possible. Clearly defining the expected data types and constraints can aid in catching discrepancies early.
- Post-Processing Needs: Be prepared to implement additional post-processing steps to clean up and extract the relevant information from the LLM’s output, especially when dealing with single-value fields.
- Iterative Testing: Continuously test and iterate on your prompts and schemas to find the optimal balance that minimizes excessive output while maintaining the integrity of the data structure.
- Performance Considerations: When using vision models for image descriptions, be aware that the response time can be influenced by your machine’s capabilities. Optimizing your hardware or considering distributed processing can help mitigate delays in applications where speed is critical.
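For the post-processing point, one option is to do the cleanup inside the schema itself with a pydantic `field_validator`, so padded single-value fields are trimmed before validation. A sketch (the first-line trimming rule is my own illustration, not something from the Ollama docs):

```python
from pydantic import BaseModel, field_validator

class BookInfo(BaseModel):
    title: str
    author: str

    # If the model pads a single-value field with extra commentary,
    # keep only the first line.
    @field_validator('title', 'author', mode='before')
    @classmethod
    def first_line_only(cls, value):
        if isinstance(value, str):
            return value.splitlines()[0].strip()
        return value

info = BookInfo.model_validate_json(
    '{"title": "1984\\nA dystopian novel, first published by...", '
    '"author": "George Orwell"}'
)
print(info.title)  # '1984'
```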
Best Practices
1. Define Clear and Accurate Schemas
Ensure your pydantic models accurately reflect the data you expect. This minimizes mismatches and ensures smooth data handling.
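pydantic’s `Field` lets you encode that precision directly, with descriptions that also end up in the JSON schema passed to the model. A sketch extending the `Book` model from Example 1 (the specific constraints are illustrative):

```python
from typing import List
from pydantic import BaseModel, Field, ValidationError

class Book(BaseModel):
    title: str = Field(description='Exact title only, no commentary')
    author: str
    genres: List[str] = Field(min_length=1)        # at least one genre
    published_year: int = Field(ge=1400, le=2100)  # plausible year range

# Constraint violations surface as validation errors:
try:
    Book(title='1984', author='George Orwell', genres=[], published_year=1949)
except ValidationError:
    print('empty genres list rejected')
```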
2. Craft Specific Prompts
The clarity of your prompts directly affects the quality of the structured output. Be as specific as possible to guide the LLM towards generating the desired structure.
3. Implement Robust Error Handling
Even with structured outputs, unexpected responses can occur. Use try-except blocks to catch validation errors and handle them appropriately.
from pydantic import ValidationError

try:
    book = Book.model_validate_json(response.message.content)
except ValidationError as e:
    print("Validation failed:", e)
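When validation does fail, simply retrying the chat call often recovers. A generic sketch (the `fetch` callable and retry count are my own assumptions, not part of the Ollama API):

```python
from pydantic import BaseModel, ValidationError

def validate_with_retries(model_cls, fetch, attempts=3):
    """Call fetch() (e.g. a thin wrapper around chat) until the output validates."""
    last_error = None
    for _ in range(attempts):
        raw = fetch()
        try:
            return model_cls.model_validate_json(raw)
        except ValidationError as err:
            last_error = err
    raise last_error

# Demo with a fake fetch that fails once, then returns valid JSON.
class Item(BaseModel):
    name: str

replies = iter(['not json at all', '{"name": "ok"}'])
item = validate_with_retries(Item, lambda: next(replies))
print(item.name)  # 'ok'
```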
4. Optimize for Performance
Validating JSON can add some overhead. Optimize your schemas and validation processes to ensure your application remains responsive, especially under heavy load.
5. Keep Schemas Updated
As your application evolves, update your schemas to accommodate new data requirements or changes in response structures. This keeps your integration robust and adaptable.
Conclusion
Ollama’s Structured JSON Output feature simplifies the integration of LLMs into your applications by ensuring consistent and predictable data formats. By defining clear schemas and using tools like pydantic, you can manage the data from LLMs effectively, reducing the unpredictability inherent in language models. However, it’s important to recognize that following documentation guidelines alone might not always yield perfect results. Fine-tuning prompts and schemas based on practical experiences can lead to more reliable integrations.
Whether you’re extracting basic information, detailed data, or analyzing images, structured JSON output helps standardize your workflows and makes your integrations more reliable.
Get the Code
Access the complete code on GitHub. Clone the repository, install dependencies, and run the examples to try out Ollama’s structured JSON output.