Spec-Driven Development Tools — Market Comparison, Features & Pricing (2026)
Compare Amazon Kiro, BMAD-METHOD, GitHub Spec Kit, Claude Code, Cursor, and OpenSpec
Spec-driven development (SDD) is a methodology where a formal specification is written and approved before any code is generated — the spec is the source of truth, and code is the build output. It emerged in 2025 as a direct response to "vibe coding" failures: AI agents that produce plausible code that drifts from intent, hallucinates APIs, and degrades as projects scale. By mid-2026, six tools dominate the SDD landscape: Amazon Kiro (spec-first IDE), BMAD-METHOD (multi-agent SDLC framework), GitHub Spec Kit (agent-agnostic CLI), Claude Code with Managed Agents, Cursor with .cursor/rules, and OpenSpec (CI/CD-native YAML specs). Each solves a different slice of the problem.
How to use this spec-driven development comparison
1. Answer the static vs living spec question first: do you need specs that agents read once (GitHub Spec Kit, BMAD, Cursor rules) or specs that are enforced continuously at runtime (Kiro Agent Hooks, OpenSpec CI/CD, Claude Code Outcomes)?
2. Assess your project profile: greenfield with clear scope → BMAD pays off. Brownfield or iterative feature work → Cursor or OpenSpec. Multi-tool team → GitHub Spec Kit. AWS/regulated industry → Kiro.
3. Model token costs before adopting BMAD at scale: real-world usage averages $800–$2,000 per developer per month on frontier models. OpenSpec and GitHub Spec Kit run at a fraction of that cost.
4. For teams new to SDD, start with GitHub Spec Kit: it has the lowest learning curve, is model-agnostic and MIT-licensed, and its /specify + /plan + /tasks workflow maps cleanly to any existing development process.
5. Combine tools for complex projects: GitHub Spec Kit for structured spec creation, Claude Code for deep architectural reasoning on implementation, Cursor for rapid iteration on individual components.
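The first two steps can be read as a small lookup. The sketch below only encodes the pairings stated in the list above; the profile keys and the `shortlist` function are hypothetical names invented for illustration, and an empty result simply means the two criteria conflict for that profile and one of them has to give.

```python
from typing import Dict, List

# Profile -> tools, as paired in step 2 above. The keys are hypothetical labels.
RECOMMENDATIONS: Dict[str, List[str]] = {
    "greenfield-clear-scope": ["BMAD-METHOD"],
    "brownfield-or-iterative": ["Cursor", "OpenSpec"],
    "multi-tool-team": ["GitHub Spec Kit"],
    "aws-or-regulated": ["Amazon Kiro"],
}

# Tools that step 1 lists as enforcing specs continuously.
LIVING_SPEC_TOOLS = {"Amazon Kiro", "OpenSpec", "Claude Code"}


def shortlist(profile: str, needs_living_spec: bool) -> List[str]:
    """Return the tools suggested for a profile, optionally keeping only living-spec tools."""
    candidates = RECOMMENDATIONS[profile]
    if not needs_living_spec:
        return list(candidates)
    # An empty list signals that the profile and the lifecycle requirement conflict.
    return [tool for tool in candidates if tool in LIVING_SPEC_TOOLS]


if __name__ == "__main__":
    print(shortlist("brownfield-or-iterative", needs_living_spec=True))
    print(shortlist("greenfield-clear-scope", needs_living_spec=True))
```

A lookup like this is only a starting point for discussion; the per-tool sections further down carry the trade-offs that a table cannot.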
What to consider when adopting spec-driven development
Spec lifecycle: static vs living
Static spec tools (GitHub Spec Kit, BMAD, Cursor rules) produce documents that agents read at generation time. Specs can drift from implementation without the tool detecting it — requires manual upkeep. Living spec tools (Kiro Agent Hooks, OpenSpec CI/CD validation, Claude Code Outcomes) enforce specs continuously — drift is caught at file save, PR, or deployment. Living specs have higher setup cost but lower long-term maintenance burden.
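To make the drift problem concrete, here is a minimal sketch of the kind of check a static-spec team might bolt on themselves. It is not how Kiro, OpenSpec, or Claude Code detect drift; it assumes a hypothetical workflow in which a fingerprint of the spec and of the implementation is recorded whenever the two are last known to agree, and it can only say that something changed, not whether the change violates the spec.

```python
import hashlib
from typing import NamedTuple


class Baseline(NamedTuple):
    """Fingerprints taken when spec and code were last reviewed together."""
    spec_hash: str
    code_hash: str


def fingerprint(text: str) -> str:
    """SHA-256 hex digest of the text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def record_baseline(spec_text: str, code_text: str) -> Baseline:
    return Baseline(fingerprint(spec_text), fingerprint(code_text))


def drift_status(baseline: Baseline, spec_text: str, code_text: str) -> str:
    """Classify what has changed since the baseline was recorded."""
    spec_changed = fingerprint(spec_text) != baseline.spec_hash
    code_changed = fingerprint(code_text) != baseline.code_hash
    if spec_changed and code_changed:
        return "both-changed"
    if code_changed:
        return "code-changed-without-spec"  # the classic silent-drift case
    if spec_changed:
        return "spec-changed-without-code"  # implementation may be stale
    return "in-sync"


if __name__ == "__main__":
    base = record_baseline("The system shall log logins.", "def login(): log()")
    print(drift_status(base, "The system shall log logins.", "def login(): pass"))
```

Run on every PR, a check like this turns silent drift into a review prompt, which is roughly the gap that living-spec tools close automatically.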
Agent flexibility
GitHub Spec Kit is the only tool explicitly designed to work with all major AI coding agents simultaneously — Copilot, Claude Code, Cursor, Codex CLI, Gemini CLI, Windsurf, and Qwen Code. BMAD is similarly model-agnostic. Kiro is locked to Claude Sonnet + Amazon Nova. Cursor has a model selector but is primarily a single-IDE tool. Consider agent flexibility before committing if your team uses or plans to use multiple AI coding assistants.
Brownfield support
Claude Code and Cursor both excel at brownfield work — existing codebases with legacy architecture, undocumented behaviour, and partial test coverage. BMAD's documentation-first assumptions cause friction on legacy codebases (confirmed in active GitHub issues #446 and #563). Kiro and GitHub Spec Kit handle brownfield reasonably well. OpenSpec's YAML schema approach works well for adding spec discipline to new features in existing codebases.
Token and API costs
BMAD has the highest real-world token cost — ~31,667 tokens per workflow run, $800–$2,000/developer/month on frontier models for large projects. One independent test put a full planning-to-implementation cycle at ~$200. GitHub Spec Kit and OpenSpec are significantly cheaper. Claude Code and Cursor use flat-rate or credit-based subscriptions that are more predictable. For teams considering BMAD, model the monthly API cost at your expected project volume before adopting.
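Modelling the monthly cost is straightforward arithmetic. The sketch below uses the per-run token figure quoted above; the runs per day, working days, and price per million tokens are hypothetical placeholders, not any vendor's actual rate, and should be replaced with your own numbers.

```python
from dataclasses import dataclass

TOKENS_PER_RUN = 31_667  # per-workflow figure quoted in the text above


@dataclass
class CostAssumptions:
    runs_per_day: int              # hypothetical: workflow runs per developer per working day
    usd_per_million_tokens: float  # hypothetical blended input/output price
    working_days: int = 21         # hypothetical working days per month


def monthly_tokens(a: CostAssumptions) -> int:
    """Tokens consumed by one developer in a month."""
    return TOKENS_PER_RUN * a.runs_per_day * a.working_days


def monthly_cost_per_developer(a: CostAssumptions) -> float:
    """Estimated monthly API spend in USD for one developer."""
    return monthly_tokens(a) / 1_000_000 * a.usd_per_million_tokens


if __name__ == "__main__":
    example = CostAssumptions(runs_per_day=100, usd_per_million_tokens=15.0)
    print(f"{monthly_tokens(example):,} tokens")              # 66,500,700 tokens
    print(f"${monthly_cost_per_developer(example):.2f}/mo")   # $997.51/mo
```

The useful exercise is to vary `runs_per_day` and the price until the result brackets your expected usage, then compare that range against a flat-rate subscription.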
EARS notation
EARS (Easy Approach to Requirements Syntax) is a structured natural language format used by Kiro and GitHub Spec Kit for writing AI-readable requirements. Templates follow patterns like "The system shall [action]" and "When [event], the system shall [response]." EARS statements are machine-parseable enough for agent validation while remaining readable for non-technical stakeholders — unlike formal logic notation. Learning basic EARS patterns significantly improves spec quality regardless of which SDD tool you choose.
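As a rough illustration of why EARS is considered machine-parseable, the following sketch classifies a requirement by its leading keyword. It covers the two templates quoted above plus the While, If/then, and Where forms from the published EARS templates as commonly described. The regexes are a simplification written for this example and are not how Kiro or GitHub Spec Kit parse requirements.

```python
import re
from typing import Optional

# Checked in order; the plain "The <system> shall ..." form is the fallback.
_EARS_PATTERNS = [
    ("event-driven", re.compile(r"^when\b.+,\s*the\s+.+\s+shall\s+.+", re.IGNORECASE)),
    ("state-driven", re.compile(r"^while\b.+,\s*the\s+.+\s+shall\s+.+", re.IGNORECASE)),
    ("unwanted-behaviour", re.compile(r"^if\b.+,\s*then\s+the\s+.+\s+shall\s+.+", re.IGNORECASE)),
    ("optional-feature", re.compile(r"^where\b.+,\s*the\s+.+\s+shall\s+.+", re.IGNORECASE)),
    ("ubiquitous", re.compile(r"^the\s+.+\s+shall\s+.+", re.IGNORECASE)),
]


def classify_ears(requirement: str) -> Optional[str]:
    """Return the EARS pattern name, or None if the sentence fits no template."""
    text = requirement.strip()
    for name, pattern in _EARS_PATTERNS:
        if pattern.match(text):
            return name
    return None


if __name__ == "__main__":
    for req in [
        "The system shall log every login attempt.",
        "When a payment fails, the system shall notify the user.",
        "Make login faster",
    ]:
        print(classify_ears(req), "<-", req)
```

A requirement that comes back as `None` is a useful signal in review: it usually lacks either a named system or a testable "shall" clause.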
CI/CD integration
OpenSpec is the only fully CI/CD-native tool — YAML spec files are validated in the pipeline, spec changes trigger automated checks, and contract-first enforcement runs on every PR. Kiro integrates with CI/CD via Agent Hooks. GitHub Spec Kit and BMAD do not have native CI/CD integration — specs are documents, not pipeline-executable artifacts. For teams where pipeline enforcement is the goal, OpenSpec is the right architectural choice.
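The idea of a pipeline-executable spec can be sketched in a few lines. The example below is not OpenSpec's actual schema or CLI: the field names (`id`, `title`, `requirements`, `acceptance`) are hypothetical, and the spec is shown as an already-parsed dictionary so the sketch stays dependency-free. In a real pipeline the dictionary would come from a YAML parser, and the job would fail when the returned exit code is non-zero.

```python
from typing import Any, Dict, List

# Hypothetical required top-level fields and their expected types.
REQUIRED_FIELDS = {"id": str, "title": str, "requirements": list}


def validate_spec(spec: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the spec passes."""
    errors: List[str] = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in spec:
            errors.append(f"missing field: {field}")
        elif not isinstance(spec[field], expected):
            errors.append(f"{field} must be of type {expected.__name__}")

    requirements = spec.get("requirements")
    if isinstance(requirements, list):
        for index, req in enumerate(requirements):
            # Each requirement needs at least one acceptance criterion to be checkable.
            if not isinstance(req, dict) or not req.get("acceptance"):
                errors.append(f"requirements[{index}] has no acceptance criteria")
    return errors


def exit_code(errors: List[str]) -> int:
    """Exit code a CI step would return: 0 on success, 1 on any validation error."""
    return 1 if errors else 0


if __name__ == "__main__":
    spec = {
        "id": "checkout-001",
        "title": "Checkout flow",
        "requirements": [
            {"text": "When a payment fails, the system shall notify the user.",
             "acceptance": ["User sees an error banner"]},
            {"text": "The system shall log every payment attempt."},
        ],
    }
    problems = validate_spec(spec)
    for problem in problems:
        print("spec error:", problem)
    print("exit code:", exit_code(problems))
```

The design point is that the check is deterministic and cheap, so it can run on every PR without involving a model at all.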
Tips and things to know
- ✓ Write a CLAUDE.md or .cursor/rules file even if you are using BMAD or Spec Kit. These architecture constraint files complement formal specs rather than duplicating them: they encode the standards that every agent interaction should respect without requiring a full spec for each feature.
- ✓ The "Spec Tax" is real: Amazon Kiro, BMAD, and full GitHub Spec Kit workflows add meaningful planning time before first code. For features you are confident about, the overhead exceeds the benefit. Reserve full SDD for features with genuine complexity, ambiguous requirements, or high downstream risk.
- ✓ For BMAD users: tune your model choice aggressively. Switching from Claude Opus to Claude Sonnet for development tasks (reserving Opus for the Architect agent only) can reduce token costs by 60–70% with minimal quality impact on implementation tasks.
- ✓ GitHub Spec Kit's /constitution command is the highest-leverage starting point for any new project: a well-written constitution file prevents entire categories of architectural drift across all subsequent feature specs.
- ✓ Combine Kiro for planning and spec creation with Claude Code for implementation on complex features. The common real-world pattern in 2026: Kiro for structured requirements, Claude Code for deep multi-file implementation reasoning, Cursor for rapid component iteration.
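The 60–70% saving from splitting work between a premium and a cheaper model follows from a simple blended-cost formula. In the sketch below, the share of tokens that stays on the premium model and the price ratio between the two models are hypothetical inputs chosen for illustration, not measured BMAD figures or published prices.

```python
def blended_cost_ratio(premium_share: float, price_ratio: float) -> float:
    """Cost after the switch, relative to running everything on the premium model.

    premium_share: fraction of tokens that stay on the premium model (0 to 1).
    price_ratio:   cheaper model's price divided by the premium model's price.
    """
    if not 0.0 <= premium_share <= 1.0:
        raise ValueError("premium_share must be between 0 and 1")
    if price_ratio < 0.0:
        raise ValueError("price_ratio must be non-negative")
    return premium_share + (1.0 - premium_share) * price_ratio


def savings(premium_share: float, price_ratio: float) -> float:
    """Fraction of the original spend that is saved."""
    return 1.0 - blended_cost_ratio(premium_share, price_ratio)


if __name__ == "__main__":
    # Hypothetical: 15% of tokens stay on the premium model,
    # and the cheaper model costs one fifth as much per token.
    print(f"{savings(0.15, 0.20):.0%}")  # 68%
```

Under these assumptions the saving lands inside the quoted range; it shrinks quickly if the Architect agent accounts for a larger share of tokens, so it is worth measuring that share before relying on the estimate.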
Official resources and further reading
Amazon Kiro — Official Documentation
Kiro's official documentation covering spec-driven workflows, Agent Hooks configuration, Steering files, and the three-phase development process.
GitHub Spec Kit — GitHub Repository
The official GitHub Spec Kit repository — MIT-licensed, 90,000+ stars, with full documentation on the /specify, /plan, /tasks, and /implement workflow.
BMAD-METHOD — GitHub Repository
The BMAD-METHOD repository — MIT-licensed, ~49,000 stars, with installation instructions, agent documentation, and workflow guides.
Spec-Driven Development Tools 2026 — Market Comparison
Spec-driven development (SDD) emerged in 2025 as a direct response to the failure mode of "vibe coding" — AI agents that produce plausible code that drifts from intent, hallucinates APIs, and decays as projects scale. By mid-2026, every major AI coding tool has shipped its own SDD flavour. The tools split on a fundamental question: does the spec live as a static document agents read once, or as a living asset agents execute against continuously?
📄 Static spec tools
Specs are Markdown/YAML documents. Agents read them at generation time but cannot detect drift. GitHub Spec Kit, BMAD, Cursor rules. Fast to adopt, but specs and code diverge without manual upkeep.
⚡ Living spec tools
Specs are enforced at runtime — Agent Hooks, CI/CD schema validation, or grader agents verify implementation matches spec continuously. Kiro, Claude Code Outcomes, OpenSpec. Higher setup cost, lower drift risk.
Amazon Kiro
Agentic IDE that turns natural language into EARS specs before writing a line of code
Learning curve: Medium
Brownfield: Good
Spec lifecycle: Living (agent-enforced)
Free tier: 50 interactions/month
Token/API cost: Credit-based, with per-prompt visibility into cost
Amazon's ground-up replacement for Q Developer, launched internationally May 7, 2026. The three-phase workflow (requirements → design → tasks) enforces structured planning before any code generation begins. Agent Hooks are the standout feature: event-driven automations that fire on file save, commit, or deletion — tests update when you save a component, docs refresh when you change an API endpoint. Steering files let teams encode architecture standards that apply to every contributor. The model routing is intelligent: Claude Sonnet 4.5 for reasoning-heavy spec generation, Amazon Nova for high-throughput code generation. The honest limitation: the credit system makes costs less predictable than flat-rate tools, and the spec overhead ("Spec Tax") slows down small single-developer features where the planning cost exceeds the benefit.
BMAD-METHOD
Multi-agent SDLC framework with named role personas across 19+ specialised agents
Learning curve: High
Brownfield: Limited
Spec lifecycle: Static (document only)
GitHub stars: ~49,000 stars (June 2026)
Free tier: Free (MIT open-source)
Token/API cost: ~31,667 tokens per workflow run
BMAD (Breakthrough Method for Agile AI-Driven Development) is the most thorough documentation-generation framework in this comparison — and the most expensive to run. The role-based agent system (Mary the BA, Winston the Architect, Devon the Developer) structures the full SDLC with formal handoffs between specialised personas. V6 ships with 19+ agents, 50+ named workflows, and native Claude Code hooks. Community is substantial: ~49,000 GitHub stars and 5,700 forks as of June 2026. The trade-off is time and token cost: one real-world benchmark put the same CRM dashboard at 12 minutes with OpenSpec vs 5.5 hours with BMAD. Heavy planning pays off on greenfield projects with clear scope; it fights brownfield codebases where BMAD's documentation-first assumptions do not map cleanly to existing architecture. Token costs are the highest in this comparison — model well before adopting at scale.
GitHub Spec Kit
Agent-agnostic CLI and slash commands for spec-first workflows across all major AI coding tools
Learning curve: Low
Brownfield: Good
Spec lifecycle: Static (document only)
GitHub stars: 90,000+ stars (May 2026)
Free tier: Free (MIT open-source)
Token/API cost: Moderate: lighter than BMAD, heavier than OpenSpec
The reference implementation of spec-driven development, open-sourced by GitHub in September 2025. The headline feature is genuine agent-agnosticism: the same spec artifacts work with Copilot, Claude Code, Cursor, Codex CLI, Gemini CLI, Windsurf, and Qwen Code without modification. The four-stage workflow (Specification → Plan → Tasks → Implementation) is the most widely adopted SDD pattern in the ecosystem. Slash commands (/constitution, /specify, /clarify, /plan, /tasks, /analyze, /implement, /checklist) map cleanly to development milestones and make the workflow teachable to new team members. 90,000+ GitHub stars — the most starred SDD tool in this comparison — reflect strong community adoption. Limitation: spec artifacts are static documents that drift from implementation unless a developer actively updates them; there is no automated sync between the spec file and the running code.
Claude Code (native SDD)
Native spec-driven workflows via CLAUDE.md, sub-agents, and the May 2026 Managed Agents update
Learning curve: Low
Brownfield: Excellent
Spec lifecycle: Living (agent-enforced)
Free tier: Included in Claude Pro ($20/mo)
Token/API cost: Token-based on Claude API pricing
Claude Code's May 2026 Managed Agents update introduced three spec-relevant capabilities: Outcomes (success rubrics evaluated by a separate grader), webhook notifications on completion, and multiagent orchestration where a lead agent delegates to specialists. Context Compaction (beta) summarises older context as conversations grow, enabling longer spec-coherent agentic runs. Independent benchmarks show Claude Code processing a 40-page specification document and maintaining consistency through the final implementation file — the strongest long-context coherence in this comparison. Routines (April 2026) enable scheduled tasks on Anthropic infrastructure that continue when your machine is off. The primary limitation for SDD: understanding resets between sessions, so persistent spec management across multiple sessions requires an external file like CLAUDE.md or a framework like BMAD or Spec Kit on top. Best combined with GitHub Spec Kit for structure.
Cursor (.cursor/rules)
AI-first IDE with rules-based spec enforcement via .cursor/rules and @Spec context
Learning curve: Low
Brownfield: Excellent
Spec lifecycle: Static (document only)
Free tier: Hobby plan, free (2,000 completions/month)
Token/API cost: Flat-rate subscription, the most predictable cost model in this comparison
Cursor is the speed benchmark in this comparison — the common real-world pattern is Cursor for rapid iteration plus Kiro or BMAD for structured planning on complex features. The .cursor/rules approach to spec-driven development is lightweight: rules files constrain agent behaviour globally (coding standards, architecture conventions, forbidden patterns) without requiring formal spec documents. @Spec context injection lets you reference a spec file in any chat for targeted generation. Background agents (beta in 2026) run tasks asynchronously. The honest assessment: Cursor is not a spec-driven tool by design — it is a speed-first AI IDE that can be made more spec-aware through rules files. For genuine SDD workflows, combine Cursor with GitHub Spec Kit or BMAD rather than treating .cursor/rules as a spec layer. Flat-rate pricing ($20/mo Pro) is the most predictable cost model in this comparison.
OpenSpec
Lightweight open-source spec framework optimised for speed: 12 minutes vs 5.5 hours for BMAD on the same benchmark task
Learning curve: Low
Brownfield: Good
Spec lifecycle: Living (synced to code)
GitHub stars: Smaller community than BMAD/Spec Kit; actively maintained
Free tier: Free (MIT open-source)
Token/API cost: Low, significantly cheaper than BMAD per workflow run
OpenSpec is the pragmatist's spec tool: YAML-validated spec files that are machine-readable (unlike BMAD's Markdown PRDs), CI/CD-native (spec changes trigger pipeline checks), and dramatically faster than heavier frameworks. The benchmark is stark: the same CRM dashboard task took 12 minutes with OpenSpec vs 90 minutes with GitHub Spec Kit and 5.5 hours with BMAD. The YAML schema format means specs can be linted, diffed, and validated automatically — treating specs as code rather than documents. Token cost is the lowest in this comparison. The trade-off: smaller community (no 90,000-star GitHub presence), less tooling ecosystem, and the YAML format is less readable for non-technical stakeholders reviewing requirements. Best for developer-led teams where CI/CD integration and iteration speed matter more than stakeholder-friendly documentation.
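The benchmark figures above are easier to compare as ratios. This small sketch only restates the three timings quoted in the text (12 minutes, 90 minutes, 5.5 hours); the function name is hypothetical, and the numbers describe a single task, so they should not be read as a general speed ranking.

```python
from typing import Dict

# Timings for the same CRM dashboard task, in minutes, as quoted above.
BENCHMARK_MINUTES: Dict[str, float] = {
    "OpenSpec": 12,
    "GitHub Spec Kit": 90,
    "BMAD": 5.5 * 60,
}


def relative_time(baseline: str, timings: Dict[str, float] = BENCHMARK_MINUTES) -> Dict[str, float]:
    """Express every timing as a multiple of the baseline tool's timing."""
    base = timings[baseline]
    return {tool: minutes / base for tool, minutes in timings.items()}


if __name__ == "__main__":
    for tool, factor in relative_time("OpenSpec").items():
        print(f"{tool}: {factor:.1f}x")
    # OpenSpec: 1.0x
    # GitHub Spec Kit: 7.5x
    # BMAD: 27.5x
```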
Feature-by-feature comparison
| Feature | Amazon Kiro (Best Structured IDE) | BMAD-METHOD (Most Comprehensive Framework) | GitHub Spec Kit (Best Agent-Agnostic) | Claude Code, native SDD (Best Reasoning Depth) | Cursor (.cursor/rules) | OpenSpec |
|---|---|---|---|---|---|---|
| Type | IDE | Framework | CLI | CLI | IDE | Framework |
| Licence | Commercial | Open-source (MIT) | Open-source (MIT) | Commercial | Commercial | Open-source (MIT) |
| Best for | Teams wanting a complete spec-first IDE with event-driven automation and AWS ecosystem integration — especially regulated industries where auditable requirements are mandatory | Complex greenfield projects where exhaustive documentation has clear value — regulated industries, large teams, or projects where "almost right" has high downstream cost | Teams using multiple AI coding tools who want a portable spec workflow that is not locked to any single agent or IDE | Teams that need deep architectural reasoning on complex codebases — especially brownfield refactors, legacy modernisation, and large multi-file changes where context coherence is critical | Teams prioritising speed and iteration velocity over formal documentation — spec-via-rules as a lightweight constraint layer rather than a full SDD workflow | Teams that need spec-driven discipline with CI/CD integration and minimal planning overhead — especially brownfield additions and iterative feature work |
| Spec format | EARS notation (Easy Approach to Requirements Syntax) — structured natural language stored as Markdown in .kiro/specs/ | PRD + Architecture document + User stories + Task breakdown — all as Markdown files tracked in Git | Constitution file (EARS-style project principles) + per-feature spec files — all Markdown, Git-tracked | CLAUDE.md project context file + inline spec in conversation + Outcomes (success rubrics) via Managed Agents | .cursor/rules files (Markdown constraints applied to every agent interaction) + @Spec file references in chat | YAML schema-validated spec files — machine-readable, CI/CD-compatible |
| Spec lifecycle | Living — agent-enforced | Static — document only | Static — document only | Living — agent-enforced | Static — document only | Living — synced to code |
| Agent flexibility | Locked to vendor | Model-agnostic | Model-agnostic | Single-model optimised | Model-agnostic | Model-agnostic |
| Human-in-loop | ✓ Yes | ✓ Yes | ✓ Yes | ✓ Yes | ✓ Yes | ✗ No |
| Brownfield support | Good | Limited | Good | Excellent | Excellent | Good |
| CI/CD native | ✓ Yes | ✗ No | ✗ No | ✓ Yes | ✗ No | ✓ Yes |
| Verification | Acceptance criteria in spec; Agent Hooks run checks on file save/commit; Steering files enforce architecture standards | QA agent generates test plans; Developer agent runs tests; no automated enforcement between spec and code | Checklist-based — /checklist command verifies implementation against spec items; no automated runtime enforcement | Outcomes feature: success rubrics evaluated by a separate grader agent; Context Compaction maintains spec coherence over long sessions | Rules files constrain generation; no automated spec-vs-code verification; relies on developer review | YAML validation in CI/CD pipeline; spec changes trigger automated checks; contract-first enforcement via schema |
| Learning curve | Medium | High | Low | Low | Low | Low |
| GitHub stars | Closed-source — no public repo | ~49,000 stars (June 2026) | 90,000+ stars (May 2026) | Closed-source | Closed-source — $10B valuation (2026) | Smaller community than BMAD/Spec Kit — actively maintained |
| Free tier | Free: 50 interactions/month | Free — MIT open-source | Free — MIT open-source | Included in Claude Pro ($20/mo) | Hobby plan: free (2,000 completions/month) | Free — MIT open-source |
| Paid from | $19/month (Pro) | Free (framework). API costs: $800–$2,000/developer/month on Claude Opus 4.5 or Sonnet 5 | Free (framework). API costs depend on chosen AI coding agent | $20/month (Claude Pro, usage-capped) | $20/month (Pro) | Free. API costs: fraction of BMAD due to lighter planning overhead |
| Token cost | Credit-based — per-prompt visibility into cost. Claude Sonnet 4.5 + Amazon Nova routing | ~31,667 tokens per workflow run. Large projects: 230M tokens/week. Independent test: ~$200 for a full planning-to-implementation cycle | Moderate — lighter than BMAD; heavier than OpenSpec. Benchmark: 90 min for same task BMAD took 5.5 hr | Token-based on Claude API pricing. Context Compaction (beta) reduces long-session costs by summarising older context | Flat-rate subscription — most predictable cost model in comparison | Low — significantly cheaper than BMAD per workflow run. Best cost-efficiency in comparison |
Verified June 2026. Pricing and GitHub stars change rapidly — confirm on each tool's official page before committing to a workflow.
Which SDD tool should you use?
You need a complete spec-first IDE with event-driven automation
Amazon Kiro. Three-phase spec workflow (requirements → design → tasks), Agent Hooks that fire on file events, Steering files for team-wide architecture enforcement. Best for regulated industries where auditable requirements are mandatory. $19/month Pro.
→ Use Amazon Kiro
You are building complex greenfield software with a clear scope
BMAD-METHOD. 19+ specialised agent personas, formal SDLC handoffs, exhaustive documentation. Token costs are the highest in this comparison ($800–$2,000/developer/month) but the planning depth pays off on large, complex projects where "almost right" has high cost.
→ Use BMAD-METHOD
Your team uses multiple AI coding tools and needs portability
GitHub Spec Kit. Works with Copilot, Claude Code, Cursor, Codex CLI, Gemini CLI, and Windsurf without modification. 90,000+ GitHub stars. The reference SDD implementation for agent-agnostic teams. Free, MIT-licensed.
→ Use GitHub Spec Kit
You need deep reasoning on brownfield or complex existing codebases
Claude Code with CLAUDE.md. Best long-context coherence in the comparison — processes 40-page specs without consistency loss. May 2026 Outcomes feature adds grader-agent verification. Combine with GitHub Spec Kit for persistent spec management across sessions.
→ Use Claude Code
You prioritise iteration speed over formal documentation
Cursor with .cursor/rules. Flat-rate $20/month, lowest learning curve, excellent brownfield support, fastest time-to-code. Use .cursor/rules for lightweight spec constraints. Combine with BMAD or Spec Kit when a feature genuinely needs formal planning.
→ Use Cursor
You need CI/CD-native spec enforcement at low token cost
OpenSpec. YAML-validated specs that run in CI/CD pipelines — machine-readable, lintable, automatically checked on every PR. The fastest framework in benchmarks (12 min vs 5.5 hr for BMAD on the same task). MIT-licensed, free.
→ Use OpenSpec
Pricing summary (June 2026)
| Tool | Licence | Free tier | Paid from | Real API cost |
|---|---|---|---|---|
| Amazon Kiro | Commercial | 50 interactions/mo | $19/mo (Pro) | Credit-based — per-prompt visibility |
| BMAD-METHOD | MIT open-source | Free (npx bmad-method install) | Free framework | $800–$2,000/dev/mo on frontier models |
| GitHub Spec Kit | MIT open-source | Free | Free framework | Depends on chosen AI coding agent |
| Claude Code | Commercial | Included in Claude Pro ($20/mo) | $20/mo (Claude Pro) | Token-based; Context Compaction reduces cost |
| Cursor | Commercial | Hobby: 2,000 completions/mo | $20/mo (Pro) | Flat-rate — most predictable in comparison |
| OpenSpec | MIT open-source | Free | Free framework | Low — fraction of BMAD per run |
API/token costs for open-source frameworks (BMAD, Spec Kit, OpenSpec) depend entirely on your choice of underlying AI coding assistant. BMAD costs are the most significant variable — model well before adopting at scale.
Frequently asked questions
What is spec-driven development (SDD)?
Spec-driven development is a methodology where a formal specification is written before any code is generated, and that specification serves as the source of truth throughout the project lifecycle. In the context of AI coding agents, the spec defines what the code must do, what constraints it must satisfy, and how success is measured — so the agent generates against a verified target rather than improvising from a chat prompt. The spec is to SDD what a type signature is to typed programming: a constraint that prevents an entire class of errors.
What is the difference between SDD and TDD?
Test-driven development (TDD) defines success at the function level through unit tests written before implementation. Spec-driven development defines success at the feature or system level through formal requirements documents before implementation. They are complementary: SDD generates the feature requirements; TDD generates the verification tests for each requirement. BMAD and GitHub Spec Kit both produce user stories that map to TDD test cases — making them stackable rather than competing methodologies.
What is EARS notation and why do SDD tools use it?
EARS (Easy Approach to Requirements Syntax) is a structured natural language format for writing requirements that AI agents can parse unambiguously. Requirements follow templates like "The system shall [action]" or "When [event], the system shall [response]." Amazon Kiro and GitHub Spec Kit both use EARS for spec files. The benefit: EARS statements are machine-readable enough for agent validation but human-readable enough for stakeholder review — unlike formal logic, which is agent-friendly but inaccessible to non-engineers.
Is BMAD-METHOD worth the token cost?
For complex greenfield projects where exhaustive documentation has real value, yes. Independent benchmarks put BMAD's full planning-to-implementation cycle at approximately $200 in API costs, with large projects burning $800–$2,000 per developer per month. The value equation: BMAD's planning depth catches design issues before implementation that would cost significantly more to fix after. For small features, brownfield work, or tight budgets, OpenSpec or GitHub Spec Kit deliver SDD discipline at a fraction of the cost.
Can I use GitHub Spec Kit with Claude Code and Cursor at the same time?
Yes — that is explicitly what GitHub Spec Kit is designed for. The spec artifacts (constitution file, per-feature specs, task breakdowns) are plain Markdown files in your Git repository. Any AI coding agent that can read files can consume them. Teams commonly use Spec Kit for spec creation and planning, then implement with whichever agent is best for the specific task — Claude Code for complex architectural reasoning, Cursor for rapid iteration.