Four extension mechanisms, one decision. Pick the wrong one and you waste tokens, break context isolation, or end up shipping a binary when a markdown file would have done.
The four-way comparison
| Skill | Subagent | MCP server | Plugin | |
|---|---|---|---|---|
| Lives at | .claude/skills/<name>/SKILL.md | .claude/agents/<name>.md | External process via .mcp.json | Plugin root with .claude-plugin/plugin.json plus skills/, agents/, hooks, .mcp.json |
| Context | Loaded into parent context when invoked | Fresh context per invocation by default; can be resumed with retained history | Stays in parent context | Whatever its parts use |
| Persistence | Stateless | Default one-shot; resumable for agent-team workflows | Stateful across turns (connections, OAuth) | Plugin data persists via ${CLAUDE_PLUGIN_DATA} |
| Tool list | Inherits the session’s tools; allowed-tools pre-approves listed tools while the skill is active | Static (tools, disallowedTools, mcpServers) | Dynamic (server can update its own tool list) | Whatever its parts have |
| Reach | Local to the project or user | Local | Network-capable, holds connection state | All of the above |
| Distribution | Copy the directory | Copy the file | Configure a local stdio process or a remote HTTP server | Marketplace or git URL with one config block |
Decision tree
Do you need persistent connection state across turns?
A connection to a database, a long-lived OAuth session, a websocket to a service: only an MCP server keeps a connection alive across turns. Skills are stateless; default subagent invocations start a fresh context (resumable agent-team workflows are an exception); plugins are containers, not connections, though their parts can persist data via ${CLAUDE_PLUGIN_DATA}.
Do you need a dynamic tool list (tools that change at runtime)? A server exposing “every column of every table in your DB” or “the actions in your Linear workspace” cannot enumerate them at config time. MCP server is the only mechanism that updates its tool list during the session.
Do you need an isolated context for noisy research? Reading dozens of files to answer one question, getting a code-review second opinion, parallelizing independent lookups: subagent. The parent only sees the summary, which is the whole point.
Do you need a repeated prompt with optional supporting files? A code-review checklist, a release-notes formatter, a PR-template generator: skill. The body stays in context once invoked, supporting files load on demand, and the model can auto-invoke from the description.
Do you need to ship two or more of the above as one installable thing? A team toolkit that bundles a skill, a hook, and an MCP server: plugin. Plugins are containers, not a fifth primitive. They exist to make distribution and config one step instead of three.
When the answer is “more than one”
Skill that calls a subagent. A /research skill whose body says “spawn three subagents in parallel…” is a thin wrapper around a subagent call. Use this when the prompt itself is the reusable part and the underlying work needs context isolation.
Subagent that uses an MCP tool. The subagent’s frontmatter lists the MCP tool name in tools (or whitelists the server in mcpServers), so the fresh context still has access to the persistent server. Use this when the research workload needs database lookups but should not pollute the parent with query payloads.
Plugin that ships a skill plus an MCP server. A linear-bridge plugin that includes a /linear-search skill and a Linear MCP server. The user installs the plugin once, both pieces become available, and userConfig in plugin.json prompts for the API key at install time.
Footguns
Skills inflate parent context; subagents protect it. A 2,000-line SKILL.md is a 2,000-line tax on every turn after invocation. The same logic delegated to a subagent stays out of parent context entirely; the parent gets the summary back. If your “skill” reads many files, you have written a subagent with the wrong frontmatter. Move the body to .claude/agents/<name>.md and have the skill call it instead.
MCP servers cannot be a “private skill”. They are processes. A team that wants a private internal linter ships an MCP server and discovers that every developer needs to install and run a binary. If the workflow is “type a slash command, get a result with no external state”, a skill is cheaper. MCP earns its keep when the work needs persistent connection state, OAuth, or per-request tool selection that a static allowed-tools cannot express.
Plugins are not a fifth primitive. A plugin is a manifest plus pointers. The actual work happens in the skills, agents, hooks, or MCP servers it contains. A plugin with no internal pieces does nothing. The right time to reach for a plugin is when you have already built two or more of the other things and want to distribute them as one unit.
Subagents do not auto-invoke unless the description is right. A subagent at .claude/agents/code-reviewer.md only fires when Claude decides to invoke it, and that decision is based on the description in the file. A description like “Reviews code” is too generic; the model picks the agent only when the user prompt explicitly says “review this code”. Tighten descriptions to the verbs and nouns a real prompt would contain (for example: “Review a git diff for SQL safety, missing tests, and unhandled error paths”).
MCP server output bloats parent context unless you wrap noisy queries in a subagent. A mcp__postgres__query that returns 500 rows lands in the parent context as 500 rows. The fix is the previous footgun in reverse: invoke the MCP tool from inside a subagent so only the summary returns. MCP plus subagent is the standard pattern for “let Claude run queries without flooding the parent conversation”.
When NOT to extend Claude Code at all
- You will use the prompt once. Just type it. Skill, plugin, agent: all three pay an authoring tax that only earns out at three or more invocations.
- The work belongs in
CLAUDE.md. “Always X for this project” is a project rule, not an invocation-shaped extension.CLAUDE.mdloads at session start; you do not type a slash command to apply it. - The work is a real production service. A plugin or MCP server is for Claude-shaped workflows, not for replacing your CI, your formatter, or your linter. Production tooling should be production tooling; Claude calls it through Bash or HTTP when needed, not through an extension you maintain alongside it.
- You have not tried the basic CLI yet. New users sometimes browse plugin marketplaces before they have written ten prompts. The basic CLI plus
CLAUDE.mdcovers more ground than any extension matrix; learn the floor before you build the ceiling.