When do I drop down to the Agent SDK instead of using filesystem agents?

Question

Kalle Lamminpää · Accepted Answer

The Agent SDK is for building Claude-powered applications, not for using Claude Code itself. Once you are inside an SDK app, programmatic agents (passed via the agents option in query()) outclass filesystem agents whenever the agent definition needs to come from runtime state rather than a static markdown file.

When to reach for the SDK at all

Use the SDK when you are writing an application that uses Claude as the agent loop:

A backend service that receives webhooks and runs Claude as a worker.
A custom UI (web app, Slack bot, Discord bot) that fronts a Claude conversation.
An internal tool with its own UX where Claude handles the reasoning and tool-calling.
A CI integration that runs Claude programmatically with hard-coded inputs.

Use Claude Code (the CLI) when the user is you, sitting at a terminal. The CLI is built on the same agent loop that the SDK exposes; the SDK is the same engine without the terminal UI on top.

The cheap mistake is reaching for the SDK to "extend Claude Code" when a skill, a plugin, or a slash command would do. Skills and plugins are mature within Claude Code; the SDK is a lower abstraction designed for "Claude inside my app".

Three ways to define subagents in the SDK

Once you are in the SDK, the SDK docs document three ways to define a subagent:
Programmatic (recommended for SDK apps): pass an agents map in your query() options. Each entry is an AgentDefinition with description, tools, optional prompt, and other fields.
Filesystem-based: drop markdown files in .claude/agents/. The SDK reads them on startup the same way Claude Code does.
Built-in general-purpose: Claude can invoke the built-in general-purpose subagent via the Agent tool with no definition required.

A minimal programmatic subagent

query() returns an async iterator, so consume it with for await (TypeScript) or async for (Python). The Agent tool must be in allowedTools, since subagents are invoked through it.

TypeScript:

Python:

The user prompt asks for a code review; Claude reads both description fields, picks sql-reviewer and test-coverage-reviewer to delegate to, runs them in parallel, and synthesizes the result.

Why programmatic beats filesystem in an SDK app

Definitions can come from runtime state. A multi-tenant SaaS that runs a code review agent for each customer can build the agent's prompt and tools from the customer's settings (their style guide, their allowed tools). Filesystem markdown does not change at runtime.

Hot-reloading without restarts. A new customer signs up; the next query() includes their agent. Filesystem agents reload only on session start.

Type-safe definitions. TypeScript's AgentDefinition type catches missing required fields and wrong shapes at compile time. (Note: tools is typed as string[], so a misspelled tool name compiles fine and only fails at runtime when Claude tries to use it.) Markdown frontmatter has no type checking at all.

Custom UI for invocation. Programmatic agents can be invoked by name from your app's logic, not just by the model auto-deciding from the description. Useful when your UI has explicit "Run code review" buttons.

The recommended path in the SDK docs is programmatic for application code; filesystem for "I have an SDK app but I also want my Claude Code-style markdown agents to work". The two coexist; pick programmatic when the app shape demands it.

Footguns

Built-in general-purpose subagent steals invocations from yours. Claude can invoke the built-in general-purpose agent via the Agent tool at any time, regardless of which agents you defined. If the user prompt is ambiguous and your custom agent's description is weaker than "general-purpose", Claude reaches for general-purpose instead. Tighten your descriptions with specific verbs and nouns; consider explicit invocation in the user prompt ("Use the sql-reviewer agent to...").

tools is restriction, not pre-approval. Listing tools: ["Read", "Glob", "Grep"] means the agent has only those tools. Forgetting Bash when the agent needs to run git diff produces a confused subagent that says "I cannot run shell commands". The mcpServers field is similar: list the MCP servers the agent should be able to reach.

Subagents run in a fresh conversation; CLAUDE.md only loads when settingSources allows it. Each subagent starts with no parent conversation history; the only channel from parent to subagent is the prompt string passed via the Agent tool. Project CLAUDE.md is loaded into the subagent only if your SDK options include settingSources (TypeScript) or settingsources (Python) covering it. With default options the SDK reads .claude/ from the working directory, so CLAUDE.md does flow through, but if you set settingSources: [] to lock down config, the subagent runs without it. If you want shared rules regardless of settingSources, put them in the agent's prompt field explicitly.

Authentication setup is your problem now. Claude Code handles login. The SDK requires you to configure auth: an ANTHROPICAPIKEY, or CLAUDECODEUSEBEDROCK / CLAUDECODEUSEVERTEX / CLAUDECODEUSEFOUNDRY for cloud providers. Forgetting this in production produces 401s during the first agent call.

The TypeScript SDK bundles a native Claude Code binary as an optional dependency. That is convenient for local dev; it can be a problem in slim container images that prune optional deps. Test your prod image early; if the binary is missing, the SDK falls back to API-only mode and some Claude Code features (sandboxing, certain tools) are unavailable.

Filesystem and programmatic agents can collide. If you have an .claude/agents/code-reviewer.md file and you also pass agents: { "code-reviewer": {...} } programmatically, the programmatic definition wins for that query but the filesystem one still loads in other contexts. Resolution rules are documented; verify which one Claude is using when behavior is surprising.

When NOT to use the SDK

You are using Claude Code at the terminal. Skills, plugins, and .claude/agents/ cover the same ground without the application boilerplate.
You can express the work as a single prompt. A one-shot claude --print "..." from a shell script is cheaper than spinning up an SDK app.
You want filesystem agents specifically because of the markdown layout. Then keep them in .claude/agents/ even inside an SDK app; you do not have to convert.
The work needs Claude Code's specific UI. The SDK is the engine; the CLI's permission prompts, /commands, autocomplete, and TUI are not part of the SDK. Replicate them yourself if your app needs them.
You are not building an application. Internal tooling, ad-hoc analysis, "I want Claude to fix this bug": stay in the CLI. The SDK rewards apps; it punishes one-off scripts.

When do I drop down to the Agent SDK instead of using filesystem agents?

When to reach for the SDK at all

Three ways to define subagents in the SDK

A minimal programmatic subagent

Why programmatic beats filesystem in an SDK app

Footguns

When NOT to use the SDK

Sources

Read more