
When do I drop down to the Agent SDK instead of using filesystem agents?

Answer

The Agent SDK is for building Claude-powered applications, not for using Claude Code itself. Use programmatic agents (defined in code via `agents` in query options) when you need dynamic agent definitions, when you are integrating into your own app, or when filesystem markdown is too static for your runtime.

By Kalle Lamminpää Verified May 7, 2026

Once you are inside an SDK app, programmatic agents (passed via the agents option in query()) outclass filesystem agents whenever the agent definition needs to come from runtime state rather than a static markdown file.

When to reach for the SDK at all

Use the SDK when you are writing an application that uses Claude as the agent loop:

  • A backend service that receives webhooks and runs Claude as a worker.
  • A custom UI (web app, Slack bot, Discord bot) that fronts a Claude conversation.
  • An internal tool with its own UX where Claude handles the reasoning and tool-calling.
  • A CI integration that runs Claude programmatically with hard-coded inputs.

Use Claude Code (the CLI) when the user is you, sitting at a terminal. The CLI is built on the same agent loop that the SDK exposes; the SDK is the same engine without the terminal UI on top.

The easy mistake is reaching for the SDK to “extend Claude Code” when a skill, a plugin, or a slash command would do. Skills and plugins are the mature extension points within Claude Code; the SDK is a lower-level abstraction designed for “Claude inside my app”.

Three ways to define subagents in the SDK

Once you are in the SDK, the docs describe three ways to define a subagent:

  1. Programmatic (recommended for SDK apps): pass an agents map in your query() options. Each entry is an AgentDefinition with a description, a prompt, an optional tools list, and other fields.
  2. Filesystem-based: drop markdown files in .claude/agents/. The SDK reads them on startup the same way Claude Code does.
  3. Built-in general-purpose: Claude can invoke the built-in general-purpose subagent via the Agent tool with no definition required.

A minimal programmatic subagent

query() returns an async iterator, so consume it with for await (TypeScript) or async for (Python). The Agent tool must be in allowedTools, since subagents are invoked through it.

TypeScript:

```typescript
import { query } from "@anthropic-ai/claude-agent-sdk";

for await (const message of query({
  prompt: "Review the current branch for SQL safety issues and missing tests.",
  options: {
    allowedTools: ["Read", "Glob", "Grep", "Bash", "Agent"],
    agents: {
      "sql-reviewer": {
        description:
          "Reviews a git diff for SQL injection risks, missing parameter binding, and unbounded queries. Returns a numbered list of findings.",
        tools: ["Read", "Glob", "Grep", "Bash"],
        prompt:
          "You review SQL safety in a git diff. Flag any string-interpolated SQL, any query without LIMIT, and any direct concatenation of user input into a query. Report file:line for each finding.",
      },
      "test-coverage-reviewer": {
        description:
          "Identifies code paths added in the diff that lack test coverage. Returns paths and a brief note per path.",
        tools: ["Read", "Glob", "Grep", "Bash"],
        prompt:
          "You review test coverage. For each new function or branch added in the diff, check whether a corresponding test exists. Report missing coverage as path:function with one-line context.",
      },
    },
  },
})) {
  if ("result" in message) console.log(message.result);
}
```

Python:

```python
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, AgentDefinition


async def main():
    async for message in query(
        prompt="Review the current branch for SQL safety issues and missing tests.",
        options=ClaudeAgentOptions(
            allowed_tools=["Read", "Glob", "Grep", "Bash", "Agent"],
            agents={
                "sql-reviewer": AgentDefinition(
                    description="Reviews a git diff for SQL injection risks...",
                    tools=["Read", "Glob", "Grep", "Bash"],
                    prompt="You review SQL safety in a git diff...",
                ),
                "test-coverage-reviewer": AgentDefinition(
                    description="Identifies code paths added in the diff that lack test coverage...",
                    tools=["Read", "Glob", "Grep", "Bash"],
                    prompt="You review test coverage...",
                ),
            },
        ),
    ):
        if hasattr(message, "result"):
            print(message.result)


asyncio.run(main())
```

The user prompt asks for a code review; Claude reads both description fields, picks sql-reviewer and test-coverage-reviewer to delegate to, runs them in parallel, and synthesizes the result.

Why programmatic beats filesystem in an SDK app

Definitions can come from runtime state. A multi-tenant SaaS that runs a code review agent for each customer can build the agent’s prompt and tools from the customer’s settings (their style guide, their allowed tools). Filesystem markdown does not change at runtime.
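
As a rough sketch of what "definitions from runtime state" can look like, the helper below builds one tenant's agents map from settings your app already holds. TenantSettings, build_review_agent, and agents_for are hypothetical names, not part of the SDK; the dicts only mirror the description, tools, and prompt fields used earlier, on the assumption that each would be unpacked into an AgentDefinition (for example AgentDefinition(**spec)) before being passed to query().

```python
from dataclasses import dataclass, field


@dataclass
class TenantSettings:
    """Hypothetical per-customer settings loaded from your own database."""
    name: str
    style_guide: str
    allowed_tools: list[str] = field(default_factory=lambda: ["Read", "Glob", "Grep"])


def build_review_agent(tenant: TenantSettings) -> dict:
    """Return description/tools/prompt for one tenant's code review agent."""
    return {
        "description": f"Reviews a git diff against the {tenant.name} style guide.",
        "tools": list(tenant.allowed_tools),  # copy so later edits do not leak back
        "prompt": (
            "You review code changes. Apply this style guide strictly:\n"
            f"{tenant.style_guide}"
        ),
    }


def agents_for(tenant: TenantSettings) -> dict[str, dict]:
    """Build the agents map for a single query() call."""
    return {"code-reviewer": build_review_agent(tenant)}
```

Because the map is rebuilt per call, a settings change takes effect on the tenant's next query with no file to rewrite.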

Hot-reloading without restarts. A new customer signs up; the next query() includes their agent. Filesystem agents reload only on session start.

Type-safe definitions. TypeScript’s AgentDefinition type catches missing required fields and wrong shapes at compile time. (Note: tools is typed as string[], so a misspelled tool name compiles fine and only fails at runtime when Claude tries to use it.) Markdown frontmatter has no type checking at all.
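
Since a misspelled tool name passes the type checker, one option is a small startup check of your own. The sketch below is an assumption-laden example: KNOWN_TOOLS contains only the tool names that appear in this article, and unknown_tools is a hypothetical helper that works on plain dicts shaped like the definitions above.

```python
# Assumed allow-list: only the tool names mentioned in this article.
# Extend it with whatever tools your app actually enables.
KNOWN_TOOLS = {"Read", "Glob", "Grep", "Bash", "Agent"}


def unknown_tools(agents: dict[str, dict]) -> dict[str, list[str]]:
    """Map each agent name to the tool names that are not in KNOWN_TOOLS."""
    problems = {}
    for name, spec in agents.items():
        bad = [tool for tool in spec.get("tools", []) if tool not in KNOWN_TOOLS]
        if bad:
            problems[name] = bad
    return problems
```

Running this before the first query() turns a runtime surprise into a startup error you can log or raise on.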

Custom UI for invocation. Programmatic agents can be invoked by name from your app’s logic, not just by the model auto-deciding from the description. Useful when your UI has explicit “Run code review” buttons.

The recommended path in the SDK docs is programmatic for application code; filesystem for “I have an SDK app but I also want my Claude Code-style markdown agents to work”. The two coexist; pick programmatic when the app shape demands it.

Footguns

Built-in general-purpose subagent steals invocations from yours. Claude can invoke the built-in general-purpose agent via the Agent tool at any time, regardless of which agents you defined. If the user prompt is ambiguous and your custom agent’s description is weaker than “general-purpose”, Claude reaches for general-purpose instead. Tighten your descriptions with specific verbs and nouns; consider explicit invocation in the user prompt (“Use the sql-reviewer agent to…”).
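
Explicit invocation can be wired into app logic with a small prompt builder. This is only a sketch: explicit_prompt is a hypothetical helper, and it relies on the "Use the ... agent to" phrasing suggested above rather than on any dedicated SDK call.

```python
def explicit_prompt(agent_name: str, task: str, agents: dict[str, dict]) -> str:
    """Build a user prompt that names the subagent to delegate to."""
    # Fail early if the UI asks for an agent this query does not define.
    if agent_name not in agents:
        raise KeyError(f"unknown agent: {agent_name}")
    return f"Use the {agent_name} agent to {task}"
```

A "Run code review" button could call explicit_prompt("sql-reviewer", "review the current diff.", agents) and pass the result as the prompt.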

tools is a restriction, not a pre-approval. Listing tools: ["Read", "Glob", "Grep"] means the agent has only those tools. Forgetting Bash when the agent needs to run git diff produces a confused subagent that says “I cannot run shell commands”. The mcpServers field is similar: list the MCP servers the agent should be able to reach.

Subagents run in a fresh conversation; CLAUDE.md only loads when settingSources allows it. Each subagent starts with no parent conversation history; the only channel from parent to subagent is the prompt string passed via the Agent tool. Project CLAUDE.md is loaded into the subagent only if your SDK options include settingSources (TypeScript) or setting_sources (Python) covering it. With default options the SDK reads .claude/ from the working directory, so CLAUDE.md does flow through, but if you set settingSources: [] to lock down config, the subagent runs without it. If you want shared rules regardless of settingSources, put them in the agent’s prompt field explicitly.
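
One way to put shared rules in every prompt field is to prepend them when the agents map is built. The helper below is a minimal sketch on plain dicts; with_shared_rules is a hypothetical name, and it assumes every entry already has a prompt key.

```python
def with_shared_rules(agents: dict[str, dict], shared_rules: str) -> dict[str, dict]:
    """Return a copy of the agents map with shared_rules prepended to each prompt."""
    rules = shared_rules.strip()
    return {
        # {**spec, ...} copies each entry, so the input map is left untouched.
        name: {**spec, "prompt": f"{rules}\n\n{spec['prompt']}"}
        for name, spec in agents.items()
    }
```

This keeps the rules with the agent regardless of how settingSources is configured.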

Authentication setup is your problem now. Claude Code handles login. The SDK requires you to configure auth: an ANTHROPIC_API_KEY, or CLAUDE_CODE_USE_BEDROCK / CLAUDE_CODE_USE_VERTEX / CLAUDE_CODE_USE_FOUNDRY for cloud providers. Forgetting this in production produces 401s during the first agent call.
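
A preflight check at startup can surface a missing credential before the first agent call. The sketch below only tests whether one of the variables named above is set to a non-empty value; auth_source and require_auth are hypothetical helpers, and the check does not validate the key or the cloud provider's own credentials.

```python
import os

PROVIDER_FLAGS = (
    "CLAUDE_CODE_USE_BEDROCK",
    "CLAUDE_CODE_USE_VERTEX",
    "CLAUDE_CODE_USE_FOUNDRY",
)


def auth_source(env=None):
    """Return the first auth-related variable that is set, or None."""
    env = os.environ if env is None else env
    for var in ("ANTHROPIC_API_KEY", *PROVIDER_FLAGS):
        if env.get(var):  # unset and empty values both count as missing
            return var
    return None


def require_auth(env=None) -> str:
    """Raise at startup instead of failing on the first agent call."""
    source = auth_source(env)
    if source is None:
        raise RuntimeError(
            "No auth configured: set ANTHROPIC_API_KEY or one of "
            + ", ".join(PROVIDER_FLAGS)
        )
    return source
```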

The TypeScript SDK bundles a native Claude Code binary as an optional dependency. That is convenient for local dev; it can be a problem in slim container images that prune optional deps. Test your prod image early; if the binary is missing, the SDK falls back to API-only mode and some Claude Code features (sandboxing, certain tools) are unavailable.

Filesystem and programmatic agents can collide. If you have a .claude/agents/code-reviewer.md file and you also pass agents: { "code-reviewer": {...} } programmatically, the programmatic definition wins for that query but the filesystem one still loads in other contexts. Resolution rules are documented; verify which one Claude is using when behavior is surprising.
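
To spot collisions before they cause confusion, an app can compare its programmatic agent names with the markdown files on disk. This is a sketch under one loud assumption: it treats the file stem (code-reviewer.md gives code-reviewer) as the agent name, so a file whose frontmatter declares a different name would slip through. colliding_agents is a hypothetical helper.

```python
from pathlib import Path


def colliding_agents(programmatic: dict, project_dir: str = ".") -> list[str]:
    """Names defined both programmatically and as .claude/agents/<name>.md."""
    agents_dir = Path(project_dir) / ".claude" / "agents"
    if not agents_dir.is_dir():
        return []
    # Assumption: the file stem is the agent name.
    on_disk = {path.stem for path in agents_dir.glob("*.md")}
    return sorted(on_disk & set(programmatic))
```

Logging the returned names at startup makes it clear which definitions are shadowed for that query.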

When NOT to use the SDK

  • You are using Claude Code at the terminal. Skills, plugins, and .claude/agents/ cover the same ground without the application boilerplate.
  • You can express the work as a single prompt. A one-shot claude --print "..." from a shell script is cheaper than spinning up an SDK app.
  • You want filesystem agents specifically because of the markdown layout. Then keep them in .claude/agents/ even inside an SDK app; you do not have to convert.
  • The work needs Claude Code’s specific UI. The SDK is the engine; the CLI’s permission prompts, /commands, autocomplete, and TUI are not part of the SDK. Replicate them yourself if your app needs them.
  • You are not building an application. Internal tooling, ad-hoc analysis, “I want Claude to fix this bug”: stay in the CLI. The SDK rewards apps; it punishes one-off scripts.

Sources

  • Agent SDK overview
    Authoritative: SDK packaging, language bindings (TypeScript + Python), `ClaudeAgentOptions`, the relationship between the SDK and the Claude Code agent loop.
  • Subagents in the SDK
    Three ways to create subagents (programmatic via `agents`, filesystem `.claude/agents/`, built-in general-purpose). Programmatic is the documented recommendation for SDK applications.
  • Subagents (Claude Code reference)
    Filesystem-based subagent layout. Mentioned for comparison: same concept, but the markdown files only update on session restart.
