April 15, 2026

Building a GTM Intelligence Agent with Pydantic AI

What I learned building a competitive analysis agent using Pydantic AI, including structured outputs, tool design, and the importance of validation.

8 min read

I spent the last month building a go-to-market (GTM) intelligence agent using Pydantic AI. The goal was simple: analyze a SaaS company’s competitive positioning and identify messaging gaps. The reality was more complicated.

What I tried

Pydantic AI promises structured outputs from LLMs: you pass a Pydantic model as the result type, and the framework validates the model’s response against that schema. This sounded perfect for my use case, since I wanted the agent to return specific fields like “messaging_gaps” and “differentiation_score” rather than freeform text.

I set up a basic agent with three tools:

  • scrape_website — Fetch and clean landing page copy
  • analyze_positioning — Extract value proposition and target audience
  • compare_competitors — Rank differentiation on a 1-10 scale

How I set it up

First, I defined the output schema:

from pydantic import BaseModel

class PositioningAnalysis(BaseModel):
    value_proposition: str
    target_audience: str
    differentiation_score: int  # 1-10
    messaging_gaps: list[str]

Then I wired it into a Pydantic AI agent:

from pydantic_ai import Agent

agent = Agent(
    'openai:gpt-4o',
    result_type=PositioningAnalysis,
    system_prompt=(
        "You are a GTM strategist. Analyze the given company's "
        "positioning and score their differentiation."
    )
)
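
The tools attach to the agent with decorators. Here’s a simplified sketch of scrape_website (the real cleaner does more than strip tags, but this is the shape):

import re

import httpx

@agent.tool_plain
async def scrape_website(url: str) -> str:
    """Fetch a landing page and return its cleaned text copy."""
    async with httpx.AsyncClient(follow_redirects=True, timeout=15) as client:
        response = await client.get(url)
        response.raise_for_status()
    # Drop scripts/styles, then all remaining tags; a real HTML parser is better.
    text = re.sub(r"<script.*?</script>|<style.*?</style>", " ", response.text, flags=re.S | re.I)
    text = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"\s+", " ", text).strip()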

The nice part: Pydantic AI handles the validation retry loop automatically. If the LLM returns invalid JSON or misses a required field, the framework feeds the validation errors back to the model and retries.
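
One refinement worth making early: the retry loop only fires when validation actually fails, so constraints belong in the schema rather than in a comment. With Pydantic’s Field, the 1-10 range becomes enforceable:

from pydantic import BaseModel, Field

class PositioningAnalysis(BaseModel):
    value_proposition: str
    target_audience: str
    # An out-of-range score now fails validation and triggers a retry.
    differentiation_score: int = Field(ge=1, le=10)
    messaging_gaps: list[str]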

What actually happened

It worked—mostly. The structured outputs were reliable about 80% of the time. But I hit some unexpected issues:

1. Token cost was higher than expected

Structured outputs make the model spell out JSON keys and punctuation, and every validation failure triggers a retry that re-sends context; both add tokens. My original estimate of $0.50 per analysis ended up closer to $1.20. Not a dealbreaker, but it adds up.

2. The scores were inconsistent

The 1-10 differentiation score varied too much between runs. Same company, similar context, but the model would give a 7 one time and a 4 the next. I switched to rubric-based evaluation (with explicit criteria for each score level) and consistency improved dramatically.
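
Concretely, the rubric lives in the system prompt, one criterion per score band. The wording below is a condensed, illustrative version; the structure is what matters:

RUBRIC = """Score differentiation with this rubric:
1-3: messaging is interchangeable with competitors (generic claims, no named wedge)
4-6: some distinct claims, but the core value proposition overlaps heavily
7-8: a clear wedge backed by specific proof points
9-10: category-defining positioning a competitor could not credibly copy
Justify the score against the rubric before emitting it."""

agent = Agent(
    'openai:gpt-4o',
    result_type=PositioningAnalysis,
    system_prompt="You are a GTM strategist. " + RUBRIC,
)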

3. Tool outputs needed validation

The scrape_website tool sometimes returned boilerplate text instead of actual copy. I added a relevance filter that discards pages with less than 50% unique content.
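
The filter is a cheap heuristic: break each page into overlapping word shingles and measure how many haven’t already appeared on previously scraped pages. Roughly (raw_pages is a stand-in for the scraper’s output):

def unique_ratio(text: str, seen_shingles: set[str], n: int = 8) -> float:
    """Fraction of this page's n-word shingles not yet seen elsewhere."""
    words = text.lower().split()
    if len(words) < n:
        return 0.0  # too short to be real landing page copy
    shingles = {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}
    fresh = len(shingles - seen_shingles) / len(shingles)
    seen_shingles |= shingles  # remember for subsequent pages
    return fresh

seen: set[str] = set()
pages = [p for p in raw_pages if unique_ratio(p, seen) >= 0.5]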

What I’d do differently

  • Use smaller models for scraping tasks — GPT-4o-mini is sufficient for extracting text from HTML
  • Add a human-in-the-loop step — The agent should flag low-confidence analyses for review
  • Cache aggressively — Many competitors have similar messaging; there’s no need to re-analyze (a sketch follows this list)
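
The caching version is almost embarrassingly simple: key on a hash of the cleaned page copy, so identical messaging never hits the model twice. In-memory here; a persistent store is the obvious upgrade:

import hashlib

_analysis_cache: dict[str, PositioningAnalysis] = {}

async def analyze_cached(page_text: str) -> PositioningAnalysis:
    """Run the agent only for copy we haven't already analyzed."""
    key = hashlib.sha256(page_text.encode()).hexdigest()
    if key not in _analysis_cache:
        result = await agent.run(page_text)
        _analysis_cache[key] = result.data  # the validated PositioningAnalysis
    return _analysis_cache[key]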

The biggest win wasn’t the agent itself—it was the schema design process. Thinking carefully about what fields I actually needed forced me to clarify my own mental model of GTM analysis.

Pydantic AI is still early, but the structured output approach feels like the right direction for agentic workflows. The model doesn’t just produce text—it produces data you can build on.