Mastering Agentic AI: The Multi-Agent Collaboration Pattern

In the previous post, Planning gave our agents the ability to decompose complex objectives into actionable steps. That’s the bridge from reacting to strategizing. But there’s still an implicit assumption underneath every pattern we’ve covered so far: one agent does the work. One LLM, one set of instructions, one persona.

That assumption breaks the moment the problem genuinely spans multiple domains. A research project needs a researcher, an analyst, and a writer — each with different prompts, different tools, different ways of thinking. Stuffing all three roles into a single prompt produces mediocre versions of all three.

Multi-Agent Collaboration is the answer. It’s where the agentic team metaphor stops being a metaphor and becomes the actual architecture.

Pattern #7: Multi-Agent Collaboration

The Problem

A monolithic agent is fine for well-defined, single-domain problems. It collapses on anything that requires multiple specialties — and not just in quality. It also collapses in maintainability. Every new capability you bolt on adds another section to the system prompt. The prompt becomes a 2,000-token novel that the model partially ignores. The temperature setting that’s right for creative writing is wrong for fact-checking. The tools that are useful for one task are noise for another.

The fix isn’t a bigger prompt. The fix is to break the work into separate agents — each with its own role, its own model parameters, its own tools — and design how they communicate.

The Solution

A multi-agent system is three things: a set of specialized agents, a communication structure between them, and an interaction protocol that orchestrates the flow. LangGraph is purpose-built for this — every agent is a node, every handoff is an edge, and shared state is the message bus.
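To make the node/edge/state vocabulary concrete, here is a minimal sketch in plain Python. It is not LangGraph and not how LangGraph is implemented. The `run_graph` helper, the sentinel strings, and the stub agents are all hypothetical stand-ins so the example runs without any dependency or LLM.

```python
from typing import Callable, Dict

State = Dict[str, str]
Node = Callable[[State], State]

START, END = "__start__", "__end__"  # hypothetical sentinels for this sketch only


def run_graph(nodes: Dict[str, Node], edges: Dict[str, str], state: State) -> State:
    """Walk the edges from START to END, merging each node's partial update into state."""
    current = edges[START]
    while current != END:
        update = nodes[current](state)  # the agent reads the shared state
        state = {**state, **update}     # and writes back only the keys it owns
        current = edges[current]        # the edge decides who runs next
    return state


# Stub agents: a real agent would call an LLM here.
def researcher(state: State) -> State:
    return {"research": f"notes on {state['topic']}"}


def writer(state: State) -> State:
    return {"blog_post": f"Post based on: {state['research']}"}


nodes = {"researcher": researcher, "writer": writer}
edges = {START: "researcher", "researcher": "writer", "writer": END}

final = run_graph(nodes, edges, {"topic": "multi-agent systems"})
print(final["blog_post"])
```

Each agent returns only the keys it is responsible for, and the runner merges them into the shared state. That is the sense in which the state acts as the message bus.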

The book (Antonio Gulli’s Agentic Design Patterns) identifies four collaboration archetypes that cover most real systems. Let me walk through each with a concrete example from my repo.

1. Sequential Handoff — Agent A finishes, hands off to Agent B. The cleanest pattern, and the right starting point.

from typing import TypedDict

from langgraph.graph import StateGraph, START, END

class BlogState(TypedDict):
    topic: str
    research: str
    blog_post: str

def researcher(state: BlogState) -> dict:
    """Thorough researcher: facts, trends, expert viewpoints."""
    # ... LLM call with temperature=0 (deterministic)
    return {"research": research_output}

def writer(state: BlogState) -> dict:
    """Skilled blog writer: turns research into engaging post."""
    # ... LLM call with temperature=0.7 (creative)
    return {"blog_post": blog_output}

# Wire the two agents into a linear graph: START -> researcher -> writer -> END
builder = StateGraph(BlogState)
builder.add_node("researcher", researcher)
builder.add_node("writer", writer)
builder.add_edge(START, "researcher")
builder.add_edge("researcher", "writer")
builder.add_edge("writer", END)
graph = builder.compile()

Two agents, two distinct personas, two different temperatures. The Researcher is cold and factual; the Writer is warm and engaging. Each uses a system prompt tuned to its role. The graph state carries the research output from one to the other. This is where you start feeling the difference from a single-prompt agent: the same task split between two specialists tends to produce better output than one generalist attempting both roles.

2. Parallel Agents — Multiple agents working independently, results merged. This is where Multi-Agent meets Parallelization:

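import operator
from typing import Annotated, List, TypedDict

from langgraph.graph import StateGraph, START, END

class ParallelState(TypedDict):
    city: str
    results: Annotated[List[str], operator.add]  # reducer for merging

def weather_agent(state): ...   # returns {"results": ["[WEATHER] ..."]}
def news_agent(state): ...      # returns {"results": ["[NEWS] ..."]}

builder = StateGraph(ParallelState)
builder.add_node("weather_agent", weather_agent)
builder.add_node("news_agent", news_agent)
# Two edges out of START: both agents run in the same step
builder.add_edge(START, "weather_agent")
builder.add_edge(START, "news_agent")
builder.add_edge("weather_agent", END)
builder.add_edge("news_agent", END)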

Same reducer trick we saw in Parallelization. The Annotated[List[str], operator.add] annotation tells LangGraph to concatenate updates from concurrent branches. Without a reducer, a state key holds one value per step, so two branches writing results at the same time would collide instead of merging. The weather agent and the news agent fire simultaneously, both append to results, no coordination needed.
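The merge step is easier to see with the framework stripped away. The sketch below is a plain-Python illustration of the idea, not LangGraph's code: `merge_updates` and its `reducers` argument are hypothetical, and the city and result strings are made-up sample data.

```python
import operator
from typing import Any, Callable, Dict, List

Reducers = Dict[str, Callable[[Any, Any], Any]]


def merge_updates(
    state: Dict[str, Any], updates: List[Dict[str, Any]], reducers: Reducers
) -> Dict[str, Any]:
    """Fold branch updates into state, using a per-key reducer where one is declared."""
    merged = dict(state)
    for update in updates:
        for key, value in update.items():
            if key in reducers and key in merged:
                merged[key] = reducers[key](merged[key], value)  # e.g. list + list
            else:
                merged[key] = value  # plain key: the newest value replaces the old one
    return merged


state = {"city": "Madrid", "results": []}
branch_updates = [
    {"results": ["[WEATHER] sunny"]},
    {"results": ["[NEWS] local headline"]},
]

with_reducer = merge_updates(state, branch_updates, {"results": operator.add})
without_reducer = merge_updates(state, branch_updates, {})

print(with_reducer["results"])     # both branch outputs survive
print(without_reducer["results"])  # only the last write survives in this sketch
```

The last-write-wins behaviour of the no-reducer case is a property of this sketch only. The point is that a reducer turns two competing writes into one combined value.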

3. Supervisor / Coordinator — A manager agent classifies the request and routes it to the right specialist. This composes Routing (Chapter 2) with Multi-Agent:

def coordinator(state: CoordinatorState) -> dict:
    """Classifies the request: greeter or task_executor?"""
    # ... LLM classifies, returns route
    return {"route": route}

def greeter(state): ...
def task_executor(state): ...

builder.add_conditional_edges(
    "coordinator", route_request,
    {"greeter": "greeter", "task_executor": "task_executor"}
)

This is the classic hierarchical structure: a supervisor at the top, specialists below. The supervisor doesn’t do the work; it decides who does. From here you scale up easily: add a third specialist node, add one entry to the routing map, done. In production this is how I model anything that looks like “customer support escalation,” “incident triage,” or “first-pass classification before deep work.”
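The routing function itself can stay tiny. Below is a hedged sketch of what it might look like, with no LangGraph dependency. The keyword-based `classify` stands in for the coordinator's LLM call, and the `request` field, `DEFAULT_ROUTE` fallback, and `dispatch` helper are assumptions made for illustration, not taken from the repo.

```python
from typing import Dict

ROUTES = {"greeter", "task_executor"}
DEFAULT_ROUTE = "task_executor"  # assumption: unknown labels fall back to the worker


def classify(request: str) -> str:
    """Stub classifier standing in for the LLM: greetings go to the greeter."""
    words = request.lower().split()
    first = words[0].strip(",.!?") if words else ""
    return "greeter" if first in {"hi", "hello", "hey"} else "task_executor"


def coordinator(state: Dict[str, str]) -> Dict[str, str]:
    """Decides who should handle the request; does none of the work itself."""
    return {"route": classify(state["request"])}


def route_request(state: Dict[str, str]) -> str:
    """Returns the next node name, guarding against labels outside the routing map."""
    route = state.get("route", "")
    return route if route in ROUTES else DEFAULT_ROUTE


def dispatch(request: str) -> str:
    """Runs coordinator then router, mimicking one pass through the conditional edge."""
    state = {"request": request}
    state = {**state, **coordinator(state)}
    return route_request(state)


print(dispatch("Hello there!"))
print(dispatch("Summarize this report"))
```

The guard in `route_request` matters once a real LLM does the classifying, because a model can return a label that is not in the routing map.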

4. Agent-as-Tool — A sub-agent wrapped as a callable tool from a parent agent. This is where composition becomes truly hierarchical:

from langchain_core.tools import tool
from langgraph.prebuilt import create_react_agent

# Sub-agent: a full graph that generates image descriptions
image_graph = build_image_subgraph()

@tool
def generate_image(prompt: str) -> str:
    """Generates a detailed image description based on a creative prompt."""
    result = image_graph.invoke({"prompt": prompt})
    return result["description"]

parent_agent = create_react_agent(llm, [generate_image])

The parent agent doesn’t know — or care — that generate_image is a graph with its own state and its own LLM calls. It looks like a tool, behaves like a tool. This is the pattern you reach for when sub-tasks are themselves agents: nested specialization, where a high-level orchestrator delegates to sub-agents that have their own internal complexity.
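The encapsulation can be shown without any framework. In this plain-Python sketch, `build_image_subgraph` is a hypothetical stand-in with two stubbed internal steps (the real one is a LangGraph graph with LLM calls), and `parent_agent` is a trivial dispatcher rather than a ReAct agent.

```python
from typing import Callable, Dict

State = Dict[str, str]


def build_image_subgraph() -> Callable[[State], State]:
    """Hypothetical stand-in for the sub-agent: two internal steps behind one callable."""

    def expand(state: State) -> State:
        return {**state, "scene": f"a scene showing {state['prompt']}"}

    def describe(state: State) -> State:
        return {**state, "description": f"Detailed view of {state['scene']}, soft lighting."}

    def invoke(state: State) -> State:
        return describe(expand(state))

    return invoke


image_subgraph = build_image_subgraph()


def generate_image(prompt: str) -> str:
    """Generates a detailed image description based on a creative prompt."""
    return image_subgraph({"prompt": prompt})["description"]


# The parent sees only a name, a docstring, and a str -> str signature.
TOOLS = {generate_image.__name__: generate_image}


def parent_agent(tool_name: str, argument: str) -> str:
    """Stub parent: a real agent would let the LLM choose the tool and its argument."""
    return TOOLS[tool_name](argument)


print(parent_agent("generate_image", "a lighthouse at dusk"))
```

Swapping the two-step stub for a ten-node graph would not change a line in `parent_agent`, which is the property that makes this pattern compose.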

The Six Architectures

The book describes six topologies for how agents can be wired together. The four archetypes above slot into these:

Single Agent — one agent, optional tools. Our baseline.
Network — peer-to-peer agents talking directly. Resilient but coordination is hard.
Supervisor — central coordinator delegates to specialists. What I use 80% of the time.
Supervisor as Tool — supervisor offers tools/data to peers, doesn’t command. Useful when autonomy matters.
Hierarchical — multi-level supervisors. The org chart pattern. Necessary when complexity warrants it; usually overkill.
Custom — hybrid designs for specific problems. The honest answer for most production systems.

Pick the simplest topology that works. The temptation with multi-agent systems is to over-engineer immediately — to build a 10-agent hierarchy because it looks like serious engineering. Resist. Most problems are solved by 2–3 agents in a sequential or supervisor pattern.

Why This Matters

Multi-Agent collaboration is the pattern that lets you compose everything we’ve covered so far into a coherent system. A Supervisor can route to a chain that runs a Planner that produces a plan executed by parallel specialists that each use Tool Use and Reflection internally. That sentence sounds like a stack-overflow disaster, but in LangGraph it’s a graph you can draw on a whiteboard.

Real wins from this pattern:

Modularity. Each agent is independently testable. A bug in the Researcher doesn’t break the Writer.
Specialization. Right model for each role. Use a fast/cheap model for routing; reserve the frontier model for synthesis.
Scalability. Adding capability means adding an agent and an edge — not surgery on a 2,000-token mega-prompt.
Robustness. The book makes this point well: failure of one agent doesn’t necessarily cause total system failure. With proper fallbacks (which we’ll get to in Chapter 12), individual agents can fail without taking the system down.
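The Specialization point can be captured as a small per-role configuration table. This is a sketch under assumptions: `AgentConfig`, `config_for`, the system prompts, and the model names "small-fast-model" and "frontier-model" are placeholders, not real model identifiers or code from the repo. The temperatures match the Researcher and Writer examples earlier in the post.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class AgentConfig:
    model: str          # placeholder name, not a real model identifier
    temperature: float
    system_prompt: str


AGENTS: Dict[str, AgentConfig] = {
    # Routing is a cheap classification task, so it gets the small model.
    "coordinator": AgentConfig("small-fast-model", 0.0, "Classify the request."),
    # Research should be factual and repeatable.
    "researcher": AgentConfig("frontier-model", 0.0, "Gather facts, trends, expert viewpoints."),
    # Writing benefits from some variation.
    "writer": AgentConfig("frontier-model", 0.7, "Turn research into an engaging post."),
}


def config_for(role: str) -> AgentConfig:
    """Look up a role's settings, failing loudly on roles nobody configured."""
    try:
        return AGENTS[role]
    except KeyError:
        raise ValueError(f"unknown role: {role}") from None


print(config_for("writer"))
```

Keeping these settings in one table also makes the Scalability point visible: a new capability is one more entry here plus a node and an edge in the graph.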

Trade-offs are real and worth naming:

Complexity. More agents = more moving parts. Debugging a 6-node graph that misbehaves requires tracing state across nodes — not just reading a prompt.
Latency. Sequential handoffs add round trips. Parallel branches help, but only when the work is genuinely independent.
Cost. More LLM calls = more tokens. Multi-agent systems are not cheap.
Coordination overhead. The smartest decision in a multi-agent system is often “should this be multi-agent at all?” If a single well-prompted agent does the job, leave it alone.
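The latency and cost trade-offs reduce to simple arithmetic, sketched below. The helper names are hypothetical, and every number is a made-up input chosen for illustration, not a measurement. The model also ignores merge overhead and network jitter.

```python
from typing import List, Tuple


def sequential_latency(step_seconds: List[float]) -> float:
    """Handoffs run one after another, so their latencies add up."""
    return sum(step_seconds)


def parallel_latency(branch_seconds: List[float]) -> float:
    """Independent branches overlap, so the slowest branch sets the pace."""
    return max(branch_seconds, default=0.0)


def total_tokens(calls: List[Tuple[int, int]]) -> int:
    """Every LLM call is billed, whether it ran in sequence or in parallel."""
    return sum(prompt + completion for prompt, completion in calls)


# Illustrative inputs only: (researcher, writer) and (weather, news).
handoff = sequential_latency([4.0, 6.0])
fan_out = parallel_latency([2.0, 3.0])
tokens = total_tokens([(800, 400), (1200, 900)])

print(handoff, fan_out, tokens)
```

Parallelism shortens the wall-clock time but leaves the token bill unchanged, which is why the Latency and Cost items are listed separately.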

Rule of thumb: decompose only when the roles are genuinely different. Different domain, different tone, different tools, different model. If two “agents” share 80% of the prompt, they’re one agent.
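One rough way to put a number on the 80% rule is to compare two system prompts word by word. The sketch below uses `difflib.SequenceMatcher` from the standard library. Treating its similarity ratio as "share of the prompt" is an assumption of this example, and `prompt_overlap`, `should_merge`, and the sample prompts are hypothetical.

```python
from difflib import SequenceMatcher


def prompt_overlap(prompt_a: str, prompt_b: str) -> float:
    """Word-level similarity between two prompts, from 0.0 (disjoint) to 1.0 (identical)."""
    return SequenceMatcher(None, prompt_a.split(), prompt_b.split()).ratio()


def should_merge(prompt_a: str, prompt_b: str, threshold: float = 0.8) -> bool:
    """Flags two 'agents' whose prompts are similar enough to be one agent."""
    return prompt_overlap(prompt_a, prompt_b) >= threshold


facts_prompt = "You are a careful researcher. Cite sources and list key facts."
trends_prompt = "You are a careful researcher. Cite sources and list key trends."
writer_prompt = "Write an engaging blog post with a warm tone."

print(should_merge(facts_prompt, trends_prompt))  # near-duplicate roles
print(should_merge(facts_prompt, writer_prompt))  # genuinely different roles
```

Textual similarity is a crude proxy. Two prompts can differ in wording and still describe the same role, so a check like this is a prompt for review, not a verdict.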

The Bigger Picture

This is post #7 in my series documenting Antonio Gulli’s Agentic Design Patterns. As always, full credit for the conceptual framework goes to him.

Multi-Agent closes Part 1 of the book — the foundational patterns. From here we move into Part 2: Advanced Capabilities. Memory Management, Learning, MCP, Goal Setting. The patterns that turn an agent from a one-shot executor into something that persists, learns, and works against long-term objectives.

All the code from this post is in my repository: carlosprados/Agentic_Design_Patterns, under 07_Multi_Agent/. Six runnable examples covering the four archetypes plus an iterative loop and a custom coordinator. All work with Gemini and Ollama through the shared get_llm() abstraction.

What’s Next

In the next post we’ll tackle Memory Management — the difference between an agent that forgets every conversation and one that remembers who you are, what you’ve discussed, and what you’ve already tried. Short-term context, long-term persistence, and the LangGraph patterns for both.

Stay tuned.
