Teju's Blog

Full stack engineer and AI architect. Notes from the work.


OpenSpec: spec-driven development for AI coding agents

Most of the time when an AI coding agent ships something wrong, it is because neither the prompt nor the agent’s instructions pinned down what “right” looked like. OpenSpec is a tool for closing that gap. It makes the spec a first-class artifact the agent has to read before any code gets written.

What it is

OpenSpec is an npm CLI (@fission-ai/openspec) and a set of slash commands for AI coding tools (Claude Code, Cursor, Windsurf, Codex, and around 25 others). You install it once, run openspec init in your project, and it scaffolds an openspec/ directory that the agent now treats as the source of truth for what should exist in the codebase.

Three things live in that directory:

  • openspec/specs/ is the current state. Requirements and scenarios for each feature, in markdown.
  • openspec/changes/<change-name>/ is a proposed change in flight. Each change is its own folder containing a proposal, scoped spec edits, design notes, and a task checklist.
  • openspec/changes/archive/YYYY-MM-DD-<name>/ holds completed changes. Permanent record of what shipped and why.
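To make the layout concrete, here is a minimal sketch that walks an openspec/ directory and reports what is current, in flight, and archived. It assumes only the three locations listed above; summarize_openspec and its return keys are hypothetical names of mine, not part of the OpenSpec CLI.

```python
from pathlib import Path


def _subdirs(path: Path) -> list[str]:
    """Names of the immediate subdirectories of `path`, or [] if it is missing."""
    if not path.is_dir():
        return []
    return sorted(p.name for p in path.iterdir() if p.is_dir())


def summarize_openspec(project_root: Path) -> dict[str, list[str]]:
    """Report spec files, in-flight changes, and archived changes.

    Hypothetical helper: it only reads the directory layout described
    in the post and does not call the OpenSpec CLI.
    """
    base = project_root / "openspec"
    specs_dir = base / "specs"
    changes_dir = base / "changes"

    # Current state: every markdown file under openspec/specs/.
    spec_files = (
        sorted(p.relative_to(specs_dir).as_posix() for p in specs_dir.rglob("*.md"))
        if specs_dir.is_dir()
        else []
    )

    return {
        "specs": spec_files,
        # archive/ sits inside changes/, so exclude it from the in-flight list.
        "in_flight": [name for name in _subdirs(changes_dir) if name != "archive"],
        "archived": _subdirs(changes_dir / "archive"),
    }
```

Calling summarize_openspec(Path(".")) from a project root gives a quick view of which changes are still open before you start another one.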

The package is MIT-licensed, requires Node 20.19 or higher, and exposes an openspec update command that refreshes the agent’s instruction file (CLAUDE.md, .cursorrules, AGENTS.md, depending on the tool) so the latest slash commands are active.
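If you script the install, the Node requirement is easy to check up front. A minimal sketch, assuming the version string has the shape that node --version prints (for example v20.19.0); parse_node_version and meets_minimum are hypothetical helper names, not OpenSpec functions.

```python
import re

# Minimum (major, minor) Node version stated for the package.
MIN_NODE = (20, 19)


def parse_node_version(text: str) -> tuple[int, ...]:
    """Parse a string like 'v20.19.0' into (20, 19, 0)."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized Node version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(text: str, minimum: tuple[int, int] = MIN_NODE) -> bool:
    """True if the version's (major, minor) is at least `minimum`."""
    return parse_node_version(text)[:2] >= minimum
```

In a real script you would feed it the output of subprocess.run(["node", "--version"], capture_output=True, text=True).stdout.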

The workflow

Three commands cover the whole loop.

[Diagram: /opsx:propose creates a change under openspec/changes/, /opsx:apply implements it, and /opsx:archive moves it to changes/archive/ and updates openspec/specs/.]

/opsx:propose <name> is the planning step. The agent reads the existing specs, writes a proposal explaining the why, drafts the scoped spec edits inside the change folder to describe the new desired state (the main specs/ tree is not touched until archive), writes a design doc for the non-obvious technical decisions, and breaks the work into a numbered task list. Nothing in your code is touched yet.

/opsx:apply is the implementation step. The agent picks tasks off tasks.md, writes code, and marks each task complete as it goes. If you stop it midway, the unchecked tasks are still there when you come back. The propose-and-apply split exists so you can read the plan before any file is touched. Without that gap, you are watching the agent type and trying to catch it in real time.

/opsx:archive is the audit step. The proposal, the spec edits, the design notes, and the completed task list get moved into a dated archive directory. The main specs/ tree is updated with whatever the change altered. The change no longer has a folder among the in-flight changes under changes/, and you can start the next one.

A real session, lifted from the OpenSpec docs:

You: /opsx:propose add-dark-mode
AI:  Created openspec/changes/add-dark-mode/
     proposal.md, specs/, design.md, tasks.md
     Ready for implementation.

You: /opsx:apply
AI:  Implementing tasks...
     1.1 Add theme context provider
     1.2 Create toggle component
     2.1 Add CSS variables
     2.2 Wire up localStorage
     All tasks complete.

You: /opsx:archive
AI:  Archived to openspec/changes/archive/2026-05-26-add-dark-mode/
     Specs updated.
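The folder move in the archive step of that session can be sketched in a few lines. This is an illustration of the naming convention only, not how the OpenSpec CLI implements it: archive_change is a hypothetical function, and it deliberately skips the part where the spec edits are merged into openspec/specs/.

```python
import shutil
from datetime import date
from pathlib import Path


def archive_change(openspec_dir: Path, name: str, today: date) -> Path:
    """Move changes/<name>/ to changes/archive/YYYY-MM-DD-<name>/.

    Hypothetical sketch: the real /opsx:archive also updates the main
    specs/ tree, which this function does not attempt.
    """
    source = openspec_dir / "changes" / name
    target = openspec_dir / "changes" / "archive" / f"{today.isoformat()}-{name}"

    if not source.is_dir():
        raise FileNotFoundError(f"no in-flight change named {name!r}")
    if target.exists():
        raise FileExistsError(f"archive entry already exists: {target.name}")

    # Create changes/archive/ on first use, then move the whole folder.
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(source), str(target))
    return target
```

The date prefix is what makes the archive sort chronologically with a plain directory listing.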

There is an expanded profile (/opsx:new, /opsx:continue, /opsx:ff, /opsx:verify, /opsx:bulk-archive, /opsx:onboard) for teams running OpenSpec at scale, switchable via openspec config profile. The base three above are what you need for everyday work.

OpenSpec is a workflow tool. It constrains how a change moves from idea to code, but it does not check that the code actually matches the spec; that is on you and your tests.

What a change folder actually contains

A change folder for one feature ends up looking like this:

openspec/changes/add-rate-limiter/
├── proposal.md
├── design.md
├── tasks.md
└── specs/
    └── api.md

proposal.md is two or three paragraphs. Why are we doing this, what changes at the user-visible level, what stays out of scope. Short enough that you can read it in a minute and decide whether to keep going.

design.md is the part you would otherwise put on a whiteboard. Which algorithm, which library, which trade-off you took, the reasoning for each. Three to five paragraphs is typical. If design.md is empty, the change is probably too small to need OpenSpec.

tasks.md is the only file the agent actively checks off as it works. Numbered, granular enough that a single task is a single file edit or a single test addition. The granularity matters because the agent uses task boundaries to decide when to stop and re-orient.
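Because tasks.md is a plain checklist, you can see what is left without asking the agent. A minimal sketch, assuming tasks are written as markdown checkboxes followed by a dotted number (for example "- [ ] 1.1 Add theme context provider"); that line format is my assumption, and Task, parse_tasks, and remaining are hypothetical names.

```python
import re
from typing import NamedTuple

# Assumed line shape: "- [ ] 1.1 Title" or "- [x] 1.1 Title".
TASK_LINE = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(\d+(?:\.\d+)*)\s+(.*\S)\s*$")


class Task(NamedTuple):
    number: str
    title: str
    done: bool


def parse_tasks(markdown: str) -> list[Task]:
    """Extract checkbox tasks, ignoring headings and any other lines."""
    tasks = []
    for line in markdown.splitlines():
        match = TASK_LINE.match(line)
        if match:
            mark, number, title = match.groups()
            tasks.append(Task(number, title, mark.lower() == "x"))
    return tasks


def remaining(markdown: str) -> list[Task]:
    """Tasks that are still unchecked, in file order."""
    return [task for task in parse_tasks(markdown) if not task.done]
```

Running remaining() on a change you paused midway shows exactly where /opsx:apply would pick up again.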

specs/api.md (or whichever spec file the change touches) is the diff against the existing spec. When the change archives, the proposed edits are merged into openspec/specs/api.md and the standalone copy in the change folder gets archived.

How OpenSpec slots into the harness

A harness is the production-engineering code around a model call: retries, tool dispatch, audit logs, capability gates, the rest of the machinery covered in the harness engineering post. OpenSpec is not itself a harness. It is a piece of context that lives inside one.

How the pieces line up:

  • System prompt extension. openspec update writes a section into the agent’s instruction file telling it to consult openspec/specs/ before coding and to use the propose flow for any non-trivial change. This is the AGENTS.md pattern.
  • Lazy-loaded artifacts. Proposals, designs, and task lists do not get inlined into every prompt. The agent reads them on demand from disk, the same way it reads any other file. A complex change with five spec files in flight does not bloat the per-request token count.
  • Tool surface. /opsx:propose, /opsx:apply, /opsx:archive are tools (slash commands wired to the agent’s runtime). Each has a defined schema and a predictable filesystem side effect, so the harness can give them retries, timeouts, and audit logging without knowing anything specific about specs.
  • Confirmation gate. The propose/apply split is a built-in human-in-the-loop checkpoint. You can read the proposal and the task list before any code gets written, and revise or reject either before apply runs.
  • Audit log. The archive/ directory is the audit log. Every shipped change is a dated folder containing the proposal, the design, and the task checklist exactly as they existed at apply time. Six months from now you can answer “why is this code here” by reading one folder.
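The audit-log point is easy to try by hand. A minimal sketch that searches archived proposals for a keyword, assuming each archived folder keeps its proposal.md as described above; find_archived_changes is a hypothetical helper, not an OpenSpec command.

```python
from pathlib import Path


def find_archived_changes(openspec_dir: Path, keyword: str) -> list[str]:
    """Names of archived changes whose proposal.md mentions `keyword`.

    Folder names start with YYYY-MM-DD, so sorting by name also sorts
    the results oldest first.
    """
    archive = openspec_dir / "changes" / "archive"
    if not archive.is_dir():
        return []

    needle = keyword.lower()
    hits = []
    for folder in sorted(p for p in archive.iterdir() if p.is_dir()):
        proposal = folder / "proposal.md"
        # Case-insensitive substring match; skip folders with no proposal.
        if proposal.is_file() and needle in proposal.read_text(encoding="utf-8").lower():
            hits.append(folder.name)
    return hits
```

For example, find_archived_changes(Path("openspec"), "rate limit") would list every shipped change whose proposal talks about rate limiting.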

The place I have found OpenSpec most useful is when the codebase is large enough that the agent cannot fit a complete mental model into a single context window. The specs directory acts like a retrieval store the agent itself maintains. It goes there for ground truth instead of grepping or asking. The propose-then-apply split is what lets a human review the plan in English before any file gets written.

OpenSpec works less well on greenfield projects with no shape yet. The spec is a description of the current and desired state. If “desired state” is “I do not know yet, let me think for an afternoon”, the agent will write an aspirational spec that drifts from the code on the very first apply. Better to wait until there is something concrete to anchor against.

Setup

For most projects, three commands:

npm install -g @fission-ai/openspec@latest
cd your-project
openspec init

openspec init picks the right config file for your AI tool. It knows about Claude Code, Cursor, Codex, Windsurf, Zed, and around two dozen others, and writes the OpenSpec section into whichever instruction file your tool uses. Run openspec config profile if you want the expanded command set, and openspec update whenever the package upgrades to regenerate the agent instructions.

On model choice: use a reasoning-grade model for both planning and implementation. The vendor recommends Codex 5.5 or Claude Opus 4.7. I have run it with smaller models on small projects and the proposal step degrades fastest. The implementation step is more forgiving because tasks.md does most of the heavy lifting.
