
Managing Multiple AI Agents: Architecture for Complex Development Projects


How to Manage Multiple AI Agents for Complex Development Projects

  1. Decompose the project into vertical slices with minimal shared mutable state between agents.
  2. Define task manifests specifying each agent's goal, context files, constraints, and output artifacts.
  3. Stand up Redis and the orchestrator service using Docker Compose with shared volumes.
  4. Register agent roles with scoped context access matching their task manifests.
  5. Dispatch tasks via Redis Streams with consumer groups and track lifecycle state in Redis hashes.
  6. Enforce per-agent token budgets and parallelism caps to control cost and respect rate limits.
  7. Merge agent outputs using branch-per-agent Git strategy and run a reviewer agent for validation.


When One Agent Isn't Enough

Managing multiple AI agents becomes a practical necessity once development tasks exceed the capacity of a single coding session. Context window limits, task confusion from juggling disparate concerns, and the lack of specialization all impose a hard ceiling on what one agent can accomplish reliably. Consider a full-stack feature that requires an API endpoint, a React component, integration tests, and API documentation. Feeding all of that into a single Claude Code session leads to observable degradation as context fills up: hallucinated file paths, dropped requirements from earlier in the conversation, and contradictory code across files as the agent loses coherence.

This tutorial walks through building a multi-agent orchestration layer using a custom MultiAgentOrchestrator class with Claude Code agents, Redis for shared state management, LangChain for agent execution abstractions, and Docker for reproducible infrastructure. The result is a system that decomposes that full-stack feature into parallel agent tasks, coordinates their execution, and merges the outputs.

Prerequisites:

  • Node.js ≥ 18 (with native ESM support)
  • Docker Engine ≥ 20 and Docker Compose V2
  • Git
  • A valid Anthropic API key with access to the model you intend to use
  • Working familiarity with Node.js, basic Docker and Docker Compose usage, and experience prompting at least one AI coding agent

Project setup:

mkdir multi-agent-orchestrator && cd multi-agent-orchestrator
npm init -y

# Enable ES module syntax:
npm pkg set type=module

npm install ioredis@^5 @langchain/anthropic@^0.2 langchain@^0.2 @langchain/core@^0.2

Create a .env file in the project root containing your API key and Redis password. Do not commit this file — add .env to your .gitignore.

ANTHROPIC_API_KEY=your_key_here
REDIS_PASSWORD=your_redis_password_here
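Node does not load .env files automatically. On Node ≥ 20.6 you can pass --env-file=.env when starting the process; on Node 18, a minimal loader sketch like the following works (it assumes simple KEY=value lines with # comments — for quoting and multiline values, prefer the dotenv package):

```javascript
// env.js — minimal .env loader sketch; assumes simple KEY=value lines
import { readFileSync } from "fs";

export function parseEnv(text) {
  const vars = {};
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    // Skip blank lines and comments
    if (!trimmed || trimmed.startsWith("#")) continue;
    const eq = trimmed.indexOf("=");
    if (eq === -1) continue;
    vars[trimmed.slice(0, eq).trim()] = trimmed.slice(eq + 1).trim();
  }
  return vars;
}

export function loadEnv(path = ".env") {
  for (const [key, value] of Object.entries(parseEnv(readFileSync(path, "utf8")))) {
    // Never overwrite variables already set in the real environment
    process.env[key] ??= value;
  }
}
```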

Multi-Agent Architecture Overview

Core Components and Data Flow

The architecture follows a straightforward data flow. The orchestrator (the MultiAgentOrchestrator class built in this tutorial) sits at the center. It pushes task definitions into a Redis-backed task queue. A pool of Claude Code agent instances consumes tasks from their dedicated channels, executes them within scoped contexts, and writes output artifacts to a shared artifact store implemented as Docker volumes mounted across services.

In text-diagram form:

Orchestrator (MultiAgentOrchestrator) → Task Queue (Redis Streams) → Agent Pool (Claude Code Instances) → Shared Artifact Store (Docker Volumes / Filesystem)

Each component has a distinct responsibility. The orchestrator owns task decomposition, dispatch ordering, and status aggregation. Redis serves as both message broker and state store, tracking each task through its lifecycle. The orchestrator treats Claude Code agents as stateless workers — each task spawns a fresh session so the agent receives a task manifest, operates within a constrained file scope, and produces defined output artifacts. The shared volume layer ensures agents can read common source files and write results to predictable paths.

Choosing an Orchestration Pattern

Three patterns dominate multi-agent system design. The router pattern uses a single orchestrator that dispatches tasks to specialized agents based on task type. This tutorial implements the router pattern because the orchestrator decides which model, context window, and token budget each task receives — unlike the pipeline pattern (fixed sequential order) or the swarm pattern (agents self-select work).


The pipeline pattern chains agents sequentially, with each agent's output feeding the next agent's input. This fits workflows with strict ordering dependencies, such as "generate schema, then generate migrations, then generate seed data." It sacrifices parallelism for simplicity in dependency management.
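For contrast, the pipeline pattern reduces to a simple async fold — a sketch, separate from the router implementation built below, where each stage is a hypothetical async function:

```javascript
// pipeline.js — chain agent stages sequentially; each stage's output
// becomes the next stage's input (schema -> migrations -> seed data)
export async function runPipeline(stages, initialInput) {
  let output = initialInput;
  for (const stage of stages) {
    // Each stage is an async function: (input) => output
    output = await stage(output);
  }
  return output;
}
```

Usage would look like runPipeline([generateSchema, generateMigrations, generateSeedData], spec), where the stage names are placeholders for your own agent-invoking functions.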

The swarm pattern allows agents to self-organize, picking up tasks and delegating subtasks among themselves. Swarm patterns differ from the router pattern in that there is no central dispatcher — agents autonomously claim and subdivide work. While powerful for open-ended exploration, swarm patterns introduce coordination complexity and make cost control significantly harder. They are not covered here.

// orchestrator.js
import { ChatAnthropic } from "@langchain/anthropic";
import Redis from "ioredis";

// Fail fast if the API key is missing — do not wait until first invocation
if (!process.env.ANTHROPIC_API_KEY) {
  throw new Error("ANTHROPIC_API_KEY is required. Set it in your .env file.");
}

class MultiAgentOrchestrator {
  constructor() {
    this.redis = new Redis({
      host: process.env.REDIS_HOST ?? "redis",
      port: parseInt(process.env.REDIS_PORT ?? "6379", 10),
      password: process.env.REDIS_PASSWORD,
      maxRetriesPerRequest: 3,
      enableReadyCheck: true,
    });
    this.redis.on("error", (err) =>
      console.error("[Redis] connection error:", err)
    );
    this.agentRegistry = new Map();
  }

  registerAgent(role, config) {
    // Verify current model IDs at https://docs.anthropic.com/en/docs/models-overview
    const model = new ChatAnthropic({
      modelName: "claude-sonnet-4-20250514",
      apiKey: process.env.ANTHROPIC_API_KEY,
    });
    this.agentRegistry.set(role, { model, ...config });
  }

  async getAgentByRole(role) {
    const agent = this.agentRegistry.get(role);
    if (!agent) {
      throw new Error(`Unknown role: ${role}`);
    }
    return agent;
  }

  async dispatchTask(task) {
    await this.redis.xadd(
      `tasks:${task.assignedAgent}`,
      "*",
      "payload",
      JSON.stringify(task)
    );
    await this.redis.hset(`task:${task.id}`, "status", "pending", "retries", "0");
  }
}

export { MultiAgentOrchestrator };

const orchestrator = new MultiAgentOrchestrator();
orchestrator.registerAgent("api-developer", { contextScope: ["src/api/**"] });
orchestrator.registerAgent("test-engineer", { contextScope: ["tests/**"] });
orchestrator.registerAgent("frontend-developer", {
  contextScope: ["src/ui/**"],
});
orchestrator.registerAgent("docs-writer", { contextScope: ["docs/**"] });

Note: The contextScope stored in the registry defines each agent's intended file boundaries. Context scope enforcement is the responsibility of the Claude Code invocation layer (not shown here); the orchestrator passes contextFiles from the task manifest to the agent process, which should use them to restrict file access.
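One way the invocation layer could enforce those boundaries — a sketch that only handles the dir/** prefix patterns used in this tutorial (a production implementation might use a glob library such as minimatch):

```javascript
// scopeGuard.js — check whether a path falls inside an agent's contextScope.
// Only handles the "dir/**" prefix patterns used in this tutorial.
import path from "path";

export function isInScope(filePath, contextScope) {
  const normalized = path.posix.normalize(filePath);
  // Reject path traversal outright
  if (normalized.startsWith("..")) return false;
  return contextScope.some((pattern) => {
    if (pattern.endsWith("/**")) {
      // Strip the "**" but keep the trailing "/" so "src/apiv2" doesn't match "src/api/**"
      return normalized.startsWith(pattern.slice(0, -2));
    }
    return normalized === pattern;
  });
}
```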

Decomposing Projects for Multi-Agent Workflows

Task Decomposition Strategy

Effective decomposition follows three rules. First, minimize shared mutable state between agents. If two agents need to modify the same file, that signals the split is wrong. Second, define clear input/output contracts so each agent knows exactly what files it can read and what artifacts it must produce. Third, prefer vertical slices over horizontal layers. An agent that builds an entire API endpoint (route, controller, validation) produces more coherent output than one that writes only route definitions across multiple endpoints.

For the full-stack feature scenario, decomposition yields four parallel tasks: an API endpoint agent that owns route, controller, and model code; a frontend agent that builds the React component and its local state management; an integration test agent that writes end-to-end tests against the API contract; and a documentation agent that generates OpenAPI specs and usage guides. The test and docs agents depend on the API agent's output contract but can begin scaffolding work in parallel.
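The dependency relationships described above can be made explicit by grouping tasks into parallel "waves" — a sketch of a topological grouping over the dependsOn field from the task manifests:

```javascript
// waves.js — group tasks into waves; each wave can run in parallel once
// every task in the preceding waves has completed
export function planWaves(tasks) {
  const done = new Set();
  const remaining = [...tasks];
  const waves = [];
  while (remaining.length > 0) {
    // A task is ready when all of its dependencies are already complete
    const ready = remaining.filter((t) =>
      (t.dependsOn ?? []).every((dep) => done.has(dep))
    );
    if (ready.length === 0) {
      throw new Error("Circular or unsatisfiable dependsOn among tasks");
    }
    waves.push(ready.map((t) => t.id));
    for (const t of ready) {
      done.add(t.id);
      remaining.splice(remaining.indexOf(t), 1);
    }
  }
  return waves;
}
```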

Defining Agent Roles and Contracts

Each agent task spec must declare its goal, the context files it can access, constraints on its behavior, and the exact output artifacts expected. This eliminates ambiguity and prevents agents from wandering outside their scope.

Conflict prevention requires either a branch-per-agent strategy, where each agent works on a dedicated Git branch that gets merged by the orchestrator, or file-locking conventions that guarantee no two agents write to the same path. Branch-per-agent is safer because it uses Git's existing merge and conflict detection machinery. The exception: when multiple agents must co-edit the same file within the same function, branch-per-agent will produce merge conflicts that require manual resolution, and file-level locking with coordinated write ordering works better. This assumes an initialized Git repository with a remote configured for the project.

tasks:
  - id: "task-api-endpoint"
    role: "api-developer"
    goal: "Implement POST /api/v1/projects endpoint with validation and error handling"
    contextFiles:
      - "src/api/routes/index.js"
      - "src/api/models/project.js"
      - "src/api/middleware/validation.js"
      - "src/api/middleware/errors.js"
    constraints:
      - "Do not modify files outside src/api/"
      - "Use Zod for request validation"
      - "Follow existing error response format in src/api/middleware/errors.js"
    outputArtifacts:
      - "src/api/routes/projects.js"
      - "src/api/controllers/projectController.js"
    branch: "agent/api-endpoint"

  - id: "task-integration-tests"
    role: "test-engineer"
    goal: "Write integration tests for POST /api/v1/projects"
    dependsOn: ["task-api-endpoint"]
    contextFiles:
      - "tests/helpers/setup.js"
      - "src/api/routes/projects.js"
    constraints:
      - "Use Vitest and Supertest"
      - "Do not modify source files"
    outputArtifacts:
      - "tests/integration/projects.test.js"
    branch: "agent/integration-tests"
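Once parsed into plain objects (for example with the yaml package — any YAML parser works), each entry can be validated before dispatch. A minimal sketch of that check:

```javascript
// validateManifest.js — verify a parsed task entry has the fields the
// orchestrator and worker rely on before it is dispatched
const REQUIRED = ["id", "role", "goal", "contextFiles", "outputArtifacts", "branch"];

export function validateTask(task) {
  const errors = [];
  for (const field of REQUIRED) {
    if (task[field] === undefined) errors.push(`missing field: ${field}`);
  }
  if (task.dependsOn && !Array.isArray(task.dependsOn)) {
    errors.push("dependsOn must be an array of task ids");
  }
  return errors;
}
```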

Implementing the Orchestration Layer

Setting Up the Orchestrator with Claude Code Agents

The infrastructure layer requires Redis for task queuing and state management, plus a Node.js service that runs the orchestrator logic. Docker Compose ties these together with shared volume mounts so agents can access the project source tree and write their output artifacts to predictable locations.

services:
  redis:
    image: redis:7-alpine
    command: redis-server --requirepass ${REDIS_PASSWORD}
    healthcheck:
      test: ["CMD", "redis-cli", "-a", "${REDIS_PASSWORD}", "ping"]
      interval: 5s
      timeout: 3s
      retries: 5
    volumes:
      - redis-data:/data
    # Uncomment the following only for local development debugging.
    # This exposes Redis to all host interfaces with no network-level protection.
    # ports:
    #   - "6379:6379"

  orchestrator:
    build: ./orchestrator
    depends_on:
      redis:
        condition: service_healthy
    environment:
      - REDIS_HOST=redis
      - REDIS_PORT=6379
      - REDIS_PASSWORD=${REDIS_PASSWORD}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY} # Set in .env file; never hardcode or commit this value
    volumes:
      - type: bind
        source: ./
        target: /workspace
      - agent-artifacts:/artifacts

volumes:
  redis-data:
  agent-artifacts:

Note on volumes: The project source is bind-mounted from the host so agents can read existing files. The agent-artifacts named volume stores generated output. You will also need to create an orchestrator/Dockerfile for the build: ./orchestrator directive to work.

Security note: Redis is configured with requirepass via the REDIS_PASSWORD environment variable. The ports block is commented out by default. If you uncomment it for local debugging, be aware it exposes Redis to all host interfaces. For non-local environments, keep it removed and configure Redis ACLs for additional protection.

Claude Code agent sessions are spawned programmatically. The orchestrator reads a task from the Redis stream, resolves the agent configuration from the registry, and launches a Claude Code process scoped to the task's declared context files and output paths.

Task Dispatch and State Management with Redis

Redis Streams serve as the task distribution mechanism. The orchestrator publishes tasks to agent-specific streams. Each agent process consumes from its dedicated channel using consumer groups. The worker routes each task to the correct specialist and acknowledges it after execution completes. Task status tracking uses Redis hashes, transitioning through pending, running, completed, and failed states.
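These transitions can be made explicit with a small guard table — a sketch whose state names match the Redis hash values used in this tutorial (the running → pending edge covers a retry re-enqueue):

```javascript
// taskStates.js — legal task lifecycle transitions, mirroring the
// status values stored in the task:<id> Redis hashes
const TRANSITIONS = {
  pending: ["running"],
  running: ["completed", "failed", "pending"], // "pending" = retry re-enqueue
  completed: [],
  failed: [],
};

export function canTransition(from, to) {
  return (TRANSITIONS[from] ?? []).includes(to);
}

export function assertTransition(taskId, from, to) {
  if (!canTransition(from, to)) {
    throw new Error(`Task ${taskId}: illegal transition ${from} -> ${to}`);
  }
  return to;
}
```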

Handling partial failures requires a retry policy. If an agent fails, the worker acknowledges the original message to remove it from the pending entries list, then re-enqueues a fresh copy of the task, up to a configured maximum retry count. Tasks that exceed the retry limit are marked permanently failed and written to a tasks:dead-letter stream for inspection, and the orchestrator halts any tasks that depend on them.


// worker.js
import Redis from "ioredis";
import { ChatAnthropic } from "@langchain/anthropic";
import { AgentExecutor, createToolCallingAgent } from "langchain/agents";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { readFile } from "fs/promises";

// Note: Building the agent passed to AgentExecutor (e.g., via createToolCallingAgent)
// is specific to your tool setup. See LangChain.js docs:
// https://js.langchain.com/docs/modules/agents/

const redis = new Redis({
  host: process.env.REDIS_HOST ?? "redis",
  port: parseInt(process.env.REDIS_PORT ?? "6379", 10),
  password: process.env.REDIS_PASSWORD,
  maxRetriesPerRequest: 3,
  enableReadyCheck: true,
});
redis.on("error", (err) => console.error("[Redis] connection error:", err));

async function publishTask(task) {
  await redis.xadd(`tasks:${task.role}`, "*", "payload", JSON.stringify(task));
  await redis.hset(`task:${task.id}`, "status", "pending", "retries", "0");
  console.log(`Dispatched task ${task.id} to ${task.role}`);
}

async function buildContextString(contextFiles) {
  const entries = await Promise.all(
    (contextFiles ?? []).map(async (filePath) => {
      try {
        const content = await readFile(filePath, "utf8");
        return `### ${filePath}
\`\`\`
${content}
\`\`\``;
      } catch {
        return `### ${filePath}
[File not found]`;
      }
    })
  );
  return entries.join("\n\n");
}

async function consumeAndExecute(role, agentConfig) {
  const stream = `tasks:${role}`;
  const group = "orchestrator-group";
  const consumer = `consumer-${role}-${process.pid}`;

  // Ensure consumer group exists (safe to call if already created)
  await redis.xgroup("CREATE", stream, group, "0", "MKSTREAM").catch((e) => {
    if (!e.message.includes("BUSYGROUP")) throw e;
  });

  // Persistent consume loop: continuously read and process messages
  while (true) {
    try {
      // Use XREADGROUP to consume via the consumer group.
      // ">" means: deliver only messages not yet delivered to this consumer.
      const results = await redis.xreadgroup(
        "GROUP", group, consumer,
        "COUNT", 1,
        "BLOCK", 2000,
        "STREAMS", stream, ">"
      );
      if (!results) continue;

      const [, messages] = results[0];
      for (const [messageId, fields] of messages) {
        // fields is a flat array: ["payload", "{...}", ...]; find "payload" explicitly
        const payloadIdx = fields.indexOf("payload");
        if (payloadIdx === -1 || payloadIdx + 1 >= fields.length) {
          console.error(`Malformed message ${messageId}: missing payload field`);
          await redis.xack(stream, group, messageId);
          continue;
        }
        const task = JSON.parse(fields[payloadIdx + 1]);
        await redis.hset(`task:${task.id}`, "status", "running");

        try {
          // Construct the agent from the registered model configuration.
          // Verify current model IDs at https://docs.anthropic.com/en/docs/models-overview
          if (!agentConfig || !agentConfig.model) {
            throw new Error(`No model registered for role: ${role}`);
          }
          const prompt = ChatPromptTemplate.fromMessages([
            [
              "system",
              "You are a {role} agent. Operate only within the provided context.",
            ],
            ["human", "{input}"],
          ]);
          const agent = createToolCallingAgent({
            llm: agentConfig.model,
            tools: agentConfig.tools ?? [],
            prompt,
          });
          const executor = new AgentExecutor({
            agent,
            tools: agentConfig.tools ?? [],
          });

          // Inject contextFiles into the prompt so the agent receives scoped file context
          const contextString = await buildContextString(task.contextFiles);
          const result = await executor.invoke({
            input: `${task.goal}

## Context Files
${contextString}`,
            role,
          });

          await redis.hset(
            `task:${task.id}`,
            "status", "completed",
            "output", JSON.stringify(result)
          );
          await redis.xack(stream, group, messageId);
        } catch (error) {
          const rawRetries =
            (await redis.hget(`task:${task.id}`, "retries")) ?? "0";
          const retries = parseInt(rawRetries, 10);

          // Always acknowledge the original message before re-enqueuing
          // to prevent duplicate delivery from the pending entries list (PEL)
          await redis.xack(stream, group, messageId);

          if (retries < 3) {
            await redis.hset(
              `task:${task.id}`,
              "retries", String(retries + 1),
              "status", "pending"
            );
            await redis.xadd(stream, "*", "payload", JSON.stringify(task));
          } else {
            await redis.hset(
              `task:${task.id}`,
              "status", "failed",
              "error", error.stack ?? error.message
            );
            await redis.xadd(
              "tasks:dead-letter", "*",
              "taskId", task.id,
              "role", role,
              "error", error.message
            );
          }
        }
      }
    } catch (outerError) {
      console.error("Stream read error:", outerError);

      // Back off briefly before retrying the outer read loop
      await new Promise((r) => setTimeout(r, 1000));
    }
  }
}

// Graceful shutdown — handle both SIGTERM (containers) and SIGINT (Ctrl-C in development)
const shutdown = async (signal) => {
  console.log(`Received ${signal}, disconnecting...`);
  // quit() waits for pending replies before closing; disconnect() would drop them
  await redis.quit();
  process.exit(0);
};
process.on("SIGTERM", () => shutdown("SIGTERM"));
process.on("SIGINT", () => shutdown("SIGINT"));

export { publishTask, consumeAndExecute };

Merging Agent Outputs

Once agents complete their tasks, the orchestrator collects artifacts from each agent's branch. The orchestrator detects conflicts automatically by running git merge --no-commit --no-ff <branch> against the target branch to surface actual merge conflicts; git diff alone only shows textual differences and does not reliably detect merge conflicts. For most well-decomposed tasks, conflicts should be rare or nonexistent.
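A sketch of that conflict check, with the shell runner injected so the logic can be exercised without a repository (in production you would pass a wrapper around execSync from node:child_process):

```javascript
// mergeCheck.js — dry-run merge an agent branch and report conflicts.
// `exec` is injected: (command) => stdout string, throwing on non-zero exit.
export function checkMerge(branch, exec) {
  try {
    exec(`git merge --no-commit --no-ff ${branch}`);
    // Merge applied cleanly; abort so nothing is committed until review passes
    exec("git merge --abort");
    return { branch, clean: true, conflicts: [] };
  } catch {
    // Non-zero exit signals conflicts: list the unmerged paths, then abort
    const conflicts = exec("git diff --name-only --diff-filter=U")
      .split("\n")
      .filter(Boolean);
    exec("git merge --abort");
    return { branch, clean: false, conflicts };
  }
}
```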

A "reviewer agent" can serve as a final validation pass: a Claude Code instance that receives the merged diff, checks for consistency across API contracts, verifies import paths, and flags potential integration issues before the orchestrator commits the merge. Implementing the reviewer agent follows the same dispatch pattern described above — define a task manifest with the merged diff as context and consistency checks as the goal.

Practical Patterns and Pitfalls

Context Window Management

Keeping per-agent context lean is critical for output quality. Scope each agent's visible file tree to only the files declared in its task manifest, and provide summarized project context (coding conventions, architecture notes) rather than full source files. Use explicit ignore lists to prevent agents from pulling in irrelevant dependencies or configuration files.
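A sketch of such a context builder — it drops ignored paths and caps each file at a character budget (the ignore patterns and budget shown are illustrative assumptions, not fixed values):

```javascript
// contextBudget.js — keep per-agent context lean: filter an ignore list
// and truncate any file beyond a per-file character budget
const DEFAULT_IGNORES = ["node_modules/", "dist/", ".env"]; // example patterns

export function trimContext(files, { maxCharsPerFile = 8000, ignores = DEFAULT_IGNORES } = {}) {
  return files
    .filter(({ path }) => !ignores.some((prefix) => path.startsWith(prefix)))
    .map(({ path, content }) => ({
      path,
      content:
        content.length > maxCharsPerFile
          ? content.slice(0, maxCharsPerFile) + "\n[...truncated...]"
          : content,
    }));
}
```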

Cost and Rate-Limit Controls

The orchestrator should enforce per-agent token budgets, capping the maximum tokens any single agent task can consume. In practice, set the maxTokens parameter in the ChatAnthropic constructor (or wrap each executor.invoke call with a token-counting check that aborts execution if the budget is exceeded). Parallelism caps prevent hitting API rate limits: consult your Anthropic usage tier limits at https://docs.anthropic.com/en/api/rate-limits to determine your requests-per-minute (RPM) allowance, then cap the number of agents running simultaneously below the level that would saturate it, leaving headroom for retries and status checks.
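Parallelism caps can be enforced with a small promise semaphore around agent invocations — a sketch, where the cap value comes from the rate-limit headroom calculation above:

```javascript
// semaphore.js — limit how many agent tasks run simultaneously
export function createLimiter(maxConcurrent) {
  let active = 0;
  const queue = [];
  const next = () => {
    if (active >= maxConcurrent || queue.length === 0) return;
    active++;
    const { fn, resolve, reject } = queue.shift();
    fn()
      .then(resolve, reject)
      .finally(() => {
        active--;
        next(); // free slot: start the next queued task, if any
      });
  };
  // Returns a promise for fn's result; fn starts only when a slot is free
  return (fn) =>
    new Promise((resolve, reject) => {
      queue.push({ fn, resolve, reject });
      next();
    });
}
```

Usage: const limit = createLimiter(3); await Promise.all(tasks.map((t) => limit(() => runAgentTask(t)))); — where runAgentTask is a placeholder for your own dispatch function.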

Common Anti-Patterns

Giving every agent full repository access defeats the purpose of scoped decomposition and inflates context unnecessarily. When you over-decompose into too many micro-tasks, coordination overhead exceeds the time saved through parallelism. Skipping the contract step and relying on agents to self-coordinate leads to conflicting outputs, duplicated work, and subtle integration bugs that are harder to diagnose than the original monolithic approach.


Implementation Checklist

The following checklist captures the complete workflow for standing up multi-agent orchestration on a development project:

  1. Map the feature to vertical slices with minimal shared mutable state. Define project scope and identify which work units can run in parallel.
  2. Write task manifests with input/output contracts for each agent. Specify goal, context files, constraints, output artifacts, and dependencies.
  3. Stand up Redis and the orchestrator via Docker Compose. Use the bind-mount configuration to give agents access to the project source tree, and named volumes for generated artifacts. Ensure the Redis healthcheck passes before the orchestrator starts.
  4. Register agent roles and assign context scopes. Each role gets a constrained view of the codebase matching its task manifests.
  5. Implement the dispatch loop with status tracking in Redis. Use Redis Streams with consumer groups for task distribution and hashes for lifecycle state.
  6. Set per-agent token budgets and parallelism caps. Enforce hard limits in the orchestrator to control cost and stay within rate limits.
  7. Configure branch-per-agent or file-lock strategy to prevent write conflicts before they happen.
  8. Add a reviewer agent or automated merge-conflict check. Use git merge --no-commit --no-ff to validate merged output before committing to the target branch.
  9. Run an end-to-end dry run with a small feature before scaling. Verify the full cycle from decomposition through merge on a low-risk task.
  10. Monitor costs, latency, and failure rates. Refine task boundaries based on observed agent performance and coordination friction. Inspect the tasks:dead-letter stream for permanently failed tasks.

What Comes Next

Multi-agent orchestration is fundamentally about disciplined decomposition, clear contracts, and minimal external dependencies — no dedicated database beyond Redis, no service mesh, no additional infrastructure beyond Docker. The stack is straightforward: Redis, Docker, and a Node.js orchestrator are sufficient to coordinate Claude Code agents across complex features.

A useful decision heuristic: if a task fits comfortably in a single agent's context window and does not benefit from specialization, a single agent remains the right choice. Multi-agent orchestration pays off when tasks are naturally parallelizable, span distinct domains (backend, frontend, testing, documentation), or exceed context limits.

For teams looking to extend this pattern further, natural next steps include adding human-in-the-loop approval gates between task completion and merge, and integrating the orchestrator into CI/CD pipelines so agent-generated code flows through existing quality gates. Swarm patterns for open-ended research or refactoring tasks — where the decomposition itself is ambiguous — are worth evaluating once the router-based workflow is stable.