# Specnaut
> Specnaut is an enhanced fork of the
> [`specify` CLI from GitHub Spec Kit](https://github.com/github/spec-kit), distributed as a single
> native binary (no Python prerequisites). It scaffolds the files your AI coding harness consumes —
> SpecKit slash-commands, spec / plan / tasks templates, a constitution, agents, and a backlog
> system — directly into an existing project, in one command.
Specnaut does **not** call any LLM and does **not** orchestrate any agent at runtime. Your AI
harness (Claude Code, Cursor, Codex, GitHub Copilot CLI, Windsurf, OpenCode, Antigravity) is what
reads the generated files and acts on them.
This page is the canonical documentation. The same content is available as raw Markdown at
[`/llms.txt`](/llms.txt) for LLM consumption — see [llmstxt.org](https://llmstxt.org/) for the
convention.
## Install
The fastest path on macOS or Linux:
```bash
curl -fsSL https://raw.githubusercontent.com/specnaut/specnaut-cli/main/install.sh | bash
```
The installer downloads the platform binary, verifies the SHA256 checksum, and places it in
`/usr/local/bin` (auto-elevating via `sudo` if needed). On non-writable prefixes with no terminal it
falls back to `~/.local/bin`.
Pin a specific version:
```bash
curl -fsSL https://.../install.sh | VERSION=v0.7.1 bash
```
Custom install dir:
```bash
curl -fsSL https://.../install.sh | PREFIX=$HOME/.local/bin bash
```
Or via Homebrew (macOS / Linux):
```bash
brew tap specnaut/tap
brew install specnaut
```
Manual download: pick the binary for your OS/arch from
[GitHub Releases](https://github.com/specnaut/specnaut-cli/releases), `chmod +x`, place it on your
`$PATH`. On macOS clear the quarantine attribute with
`xattr -d com.apple.quarantine /path/to/specnaut`.
### Install as a plugin / extension (five harnesses)
If you want Specnaut's skills and sub-agents available across **all your projects** without running
`specnaut init`, install Specnaut as a plugin / extension in your harness. Specnaut ships adapters
for five harnesses with the same skill content across all of them — the bundled router skill, the
phase docs, the bootstrap skill, the sub-agents, and the SessionStart hook (where supported).
| Harness | Install command |
| ---------------------- | ------------------------------------------------------------------------------------------------------------------------ |
| **Claude Code** | `/plugin install specnaut/specnaut-cli-plugin` |
| **Codex CLI / App** | `/plugins` → search "specnaut" → install (once the marketplace listing lands; see Notes) |
| **Cursor** | `/add-plugin specnaut/specnaut-cli` |
| **OpenCode** | Add `"plugin": ["specnaut@git+https://github.com/specnaut/specnaut-cli.git"]` to `opencode.json` |
| **GitHub Copilot CLI** | `copilot plugin marketplace add specnaut/specnaut-cli-marketplace`
`copilot plugin install specnaut@specnaut-marketplace` |
The skill content is identical across harnesses; only the surface conventions differ (slash-command
prefix, auto-activation mechanism, tool naming). See the per-harness tool-mapping references at
`plugin/skills/using-specnaut/references/-tools.md` for the equivalent of every Claude Code
tool on each harness.
#### Claude Code — slash-command prefix
```
/specnaut-plugin:specnaut specify ""
/specnaut-plugin:specnaut plan
```
Slightly verbose, but unambiguous (the plugin's slash-commands are namespaced and the consolidated
router itself is named `specnaut`). If you scaffold project-local with `specnaut init` instead, you
get the shorter `/specnaut specify "..."` form.
To test a local checkout of the plugin without publishing:
```bash
claude --plugin-dir /path/to/specnaut/plugin
```
#### Auto-activation across harnesses
Specnaut ships a `using-specnaut` bootstrap skill loaded automatically at session start on every
harness that supports it (via plugin/hooks/hooks.json on Claude Code, plugin/hooks/hooks-cursor.json
on Cursor, the `experimental.chat.messages.transform` hook in `.opencode/plugins/specnaut.js` on
OpenCode). The bootstrap skill teaches the agent Specnaut's skill registry, agent registry, and
routing principles so you don't need to invoke `/specnaut` explicitly — typing "plan this issue" or
"review my work" is enough for the right skill to fire.
#### Notes on Codex CLI and the shared marketplace
Codex CLI and Copilot CLI distribute plugins through marketplaces. Specnaut has two adapter targets
that need a one-time human setup before the marketplace listings are live:
- **Codex CLI** — `.codex-plugin/plugin.json` ships in this repo; the
`scripts/sync-to-codex-plugin.sh` script (fires on every release tag) mirrors the Specnaut plugin
content into `specnaut/plugins` (a fork of `openai/plugins`). Until that fork is rebased into
upstream and the `CODEX_SYNC_TOKEN` PAT is provisioned (see issues #298–#300), the sync emits a
workflow warning and skips — same fail-safe pattern as the Homebrew tap bump.
- **Copilot CLI + shared marketplace** — `.claude-plugin/marketplace.json` lives in
`specnaut/specnaut-cli-marketplace` (a separate repo). `scripts/sync-to-marketplace.sh` bumps the
version on every release. Until the marketplace repo + `MARKETPLACE_SYNC_TOKEN` are provisioned
(see issues #309–#310), the sync skips with a warning.
**Plugin vs `specnaut init`** — they complement each other:
| Aspect | Binary (`specnaut init`) | Plugin (`/plugin install`) |
| ---------------------------------- | --------------------------- | ----------------------------------- |
| Scope | Project-local (`.claude/`) | User-scope (all projects) |
| Slash-command style | `/specnaut specify` (short) | `/specnaut-plugin:specnaut specify` |
| Customizable per-project | Yes | No (user-scope, shared) |
| Backlog skill, hooks, `.specnaut/` | Yes | No (project-stateful — binary-only) |
| Kept in sync | `specnaut upgrade` | `/plugin update` |
Most teams use both: the plugin provides discoverability and keeps the agents up-to-date across all
projects; `specnaut init` provides the short slash-commands and project-local customization.
## Quickstart
### Create a new project
```bash
specnaut init my-project
cd my-project
```
This scaffolds a tree configured for the **Claude Code** harness by default (`.claude/`,
`.specnaut/`, `AGENTS.md`, `.specnaut/backlog.md`, …). Open the project in your harness — that's
where you'll run the rest.
### Step 1 after `init`: run `/specnaut constitution`
`/specnaut constitution` is the expected first action after `specnaut init`. It scaffolds your
project's guiding principles (architecture, quality gates, ways of working) into
`.specnaut/memory/constitution.md` so the rest of the pipeline (`/specnaut specify`,
`/specnaut plan`, `/specnaut tasks`, `/specnaut implement`) has something to anchor on.
The generated constitution comes pre-populated with four opinionated baseline blocks (all
user-tunable): **Engineering methodology** (TDD / DDD / SOLID-DRY-KISS-YAGNI /
Boy-Scout-escalation), **Architecture layers** (hexagonal default: `domain/` / `application/` /
`infrastructure/` / `presentation/`), **Back-end patterns** (Repository, service objects, DI through
constructors, thin controllers, errors as domain types, pure domain), and **Front-end patterns**
(view/logic separation, no business rules in templates, smart vs dumb components, single source of
truth for state, typed API client, accessibility mandatory). New projects via `specnaut init`
inherit all four blocks automatically. `specnaut upgrade` delivers updated agents and skills but
does **not** rewrite an existing constitution — to adopt the new baselines in an existing project,
rebase your constitution manually.
Refine the generated constitution and the root `AGENTS.md` for your stack, then move on to
`/specnaut specify ""` for your first feature.
### Add Specnaut to an existing project
```bash
cd my-existing-project
specnaut init --here
```
Specnaut merges its `.gitignore` block into your existing file (non-destructively, fenced with
`# --- Specnaut: gitignore ---` markers). Other specnaut-managed files use upgrade-aware semantics:
if you customize a generated file, `specnaut upgrade` will preserve it unless you pass `--force`.
**Declaring a file preserved across a forced refresh.** `specnaut upgrade` auto-preserves a file
whose hash diverged, but `specnaut init --force` would otherwise overwrite every managed file. To
keep a customized file (e.g. a tailored `.claude/agents/product-owner.md`) even through a forced
refresh, list it in a version-controllable `.specnaut/preserve.yml` manifest:
```yaml
preserved:
- .claude/agents/product-owner.md
```
Both `init --force` and `upgrade` then leave that file untouched and print one notice per preserved
path — never a silent skip. The file stays lock-tracked, so `specnaut diff` keeps showing how it has
drifted from the evolving bundle. A project with no `preserve.yml` behaves exactly as before. To
deliberately discard a customization and restore the bundled version for one run, add
`--reset-preserved` (it overrides every declaration for that run and reports each override; it is
never the default). A declared path that is not a managed bundle file is reported as an ineffective
declaration (a warning) rather than silently honored.
Pass `--dry-run` to preview the plan without touching disk — combined with `--force` it shows which
files would be overwritten and which would be merged, but writes nothing. `--dry-run` is the trump
card: it wins over `--force`.
**Inside a monorepo workspace?** When the target sits inside an enclosing Specnaut workspace (an
ancestor with `.specnaut/` whose `deno.json` `workspace` list declares the target as a member),
`specnaut init` and `specnaut upgrade` provision `.specnaut/` as usual but skip the agentic files
(`.claude/skills`, `.claude/agents`, `.claude/commands`) — those are inherited from the parent, so
no copy is scattered into the sub-repo. To force full provisioning anyway, drop an empty
`.specnaut/standalone.yml` marker in the target.
### What's in `.specnaut/installed.lock` and should I commit it?
`specnaut init` writes a small YAML file at `.specnaut/installed.lock`. It records the harness you
chose, the templates version installed, and a SHA-256 + install timestamp for every file Specnaut
emitted. It contains no secrets — only file paths, content hashes, and version strings.
**Commit it.** `specnaut upgrade` reads this lock to know which harness to map templates to, to
detect files you have customized (so it doesn't clobber them), and to drop orphaned files that are
no longer part of the bundle. `specnaut check --project` also surfaces the harness, templates
version, and backlog backend from this file (and warns when `backlog-config.yml` has empty required
fields for the github / gitlab backends). Without the lock, both commands degrade gracefully but
cannot do their real job — `specnaut upgrade` will refuse and ask you to re-run
`specnaut init --here --force` to rebuild the lock from scratch.
### Pick a different harness
```bash
specnaut init my-project --ai cursor
specnaut init my-project --ai antigravity
specnaut init my-project --ai codex
# … etc.
```
Seven harness targets are supported: `claude` (default), `cursor`, `codex`, `windsurf`, `copilot`,
`opencode`, `antigravity`. Each emits files in the convention that harness expects.
### Pick a backlog backend
```bash
specnaut init my-project --backlog github
specnaut init my-project --backlog gitlab
specnaut init my-project --backlog local # default
```
Three backends are supported: `local` (default), `github`, `gitlab`. See
[Backlog as product source of truth](#3-backlog-as-product-source-of-truth) for what each one stores
and how the PO agent talks to it.
#### Pre-fill the backlog config with `--backlog-url`
When the chosen backend is `github` or `gitlab`, `specnaut init` can take the project's Kanban URL
up front and write a fully-populated `.specnaut/backlog-config.yml` — no manual edit needed before
running `/backlog`. Pass the project URL via `--backlog-url`:
```bash
# GitHub org-owned project
specnaut init --here --ai claude --backlog github \
--backlog-url https://github.com/orgs/myorg/projects/1
# GitHub user-owned project
specnaut init --here --ai claude --backlog github \
--backlog-url https://github.com/users/alice/projects/12
# GitLab (gitlab.com or self-hosted)
specnaut init --here --ai claude --backlog gitlab \
--backlog-url https://gitlab.com/mygroup/myproject
```
Three URL formats are supported:
- GitHub org-owned: `https://github.com/orgs//projects/`
- GitHub user-owned: `https://github.com/users//projects/`
- GitLab project: `https:////`
For GitHub, the `repo:` field of the populated config is derived from `git remote get-url origin`
(both HTTPS and SSH remote shapes are recognised). Pass `--backlog-repo /` to override
that derivation when the project lives across multiple repos or the local remote isn't `origin`.
Without `--backlog-url` on a TTY, `specnaut init` interactively prompts for the URL after the
backend picker. **In non-TTY mode (CI / scripted setup) `--backlog-url` is required when `--backlog`
is `github` or `gitlab`** — omitting it exits with code `2` and a clear error message. The
non-clobber invariant still holds: re-running `init` against a project with an existing
`backlog-config.yml` does NOT overwrite it.
### Run `init` non-interactively (CI / scripts)
When you pass both `--ai` and `--backlog` (and `--backlog-url` when the backend is remote), no
interactive prompt is shown — `specnaut init` runs fully unattended, which is what you want in CI or
scripted setup:
```bash
# Local backend — zero-config, just the two flags
specnaut init my-project --ai claude --backlog local
# GitHub backend — --backlog-url is required in non-TTY mode
specnaut init my-project --ai claude --backlog github \
--backlog-url https://github.com/orgs/myorg/projects/1
# GitLab backend — same shape
specnaut init --here --no-git --ai cursor --backlog gitlab \
--backlog-url https://gitlab.com/mygroup/myproject
```
Without those flags, `specnaut init` shows an arrow-key picker (↑/↓ to move, space/enter to select)
when stdin is a TTY, and falls back to a numeric prompt — or the defaults — when stdin is piped.
### Pick a versioning scheme
`specnaut init` asks which scheme to use for the bundled `/specnaut tag-version` and
`/specnaut release-version` commands. Two options:
- **SemVer** (`v1.2.3`) — recommended for libraries / SDKs whose consumers reason about breaking
changes by version number.
- **Date-based** (`vYY.M.Da`) — recommended for apps / SaaS / deployed products where the version
number is just a release identifier. No major/minor/patch guesswork; the letter suffix handles
same-day re-tags.
Specnaut pre-selects a sensible default by scanning the project for SemVer signals:
- **Library publishing markers** — `package.json` `exports`, `pyproject.toml` `[project]` /
`[tool.poetry]`, `Cargo.toml` `[lib]`, `composer.json` `type=library`.
- **Semver-shaped git tags** — any local tag matching `v?MAJOR.MINOR.PATCH` (with optional
pre-release / build suffix), e.g. `v1.2.3`, `1.0.0-rc.1`, `v2.0.0+build.5`. Date-shaped tags like
`v25.5.16a` are explicitly excluded so brownfield repos already on date scheme don't get
mis-suggested.
- **CHANGELOG.md** — Keep-a-Changelog style headers (`## [1.2.0]`, `## v1.2.0`, `## 1.2.0`).
Any one signal flips the suggestion to SemVer. When zero signals are found, Specnaut suggests
date-based. The user can always override at the picker. The choice is persisted by **rewriting the
scaffolded skill** itself (the unchosen scheme's blocks are stripped at scaffold time), so the
on-disk `.specnaut/scripts/release/tag.sh` only contains the chosen scheme's logic. To switch
schemes later, re-run `specnaut init` and pick the other option.
Pass `--scheme semver|date` to bypass the picker in non-TTY mode.
### Other commands
```bash
specnaut check # diagnose your environment
specnaut check --project # also diagnose the current specnaut project
# (warns if the plugin was uninstalled after migration)
specnaut upgrade # update templates to the binary's version
# (when specnaut-plugin is installed + harness=claude:
# vanilla agent/command files are auto-migrated to the plugin)
specnaut upgrade --dry-run # preview the upgrade plan
specnaut upgrade --force # apply destructive changes (backs up customizations)
specnaut upgrade --reset-preserved # ignore .specnaut/preserve.yml for this run (reports each override)
specnaut diff # show how managed files diverge from the bundle (read-only)
specnaut diff --only-customised # restrict the diff to files you actually changed
specnaut init --here --force --reset-preserved # forced refresh that overrides preserve declarations
specnaut reconcile --status # list files pending post-upgrade reconciliation (JSON)
specnaut reconcile --accept-upstream # take new template version (backs up local)
specnaut reconcile --accept-current # keep local version (re-stamps lock SHA)
specnaut self-update # upgrade the binary itself
specnaut self-update --check # only report whether an update is available
specnaut --version # print version
specnaut --help # full usage
```
### Bundled tag + release commands
Every scaffolded project ships two router commands under `/specnaut`:
- **`/specnaut tag-version`** — creates an annotated git tag using the project's versioning scheme.
Bumps automatically (latest tag → next). For SemVer, `--bump major|minor|patch` controls the
direction (default `patch`); for date-based, the letter suffix increments. Pushes to `origin` if a
remote is configured, else stays local. Pass `--no-push` to skip.
- **`/specnaut release-version`** — generates **categorized release notes** for a tag (default:
latest) covering every commit since the previous tag. The output is the release-body Markdown, one
section per non-empty Conventional Commits bucket (Features / Bug Fixes / Performance / Refactors
/ Documentation / Tests / Build & CI / Chores / Style / Other). Pipe the output into
`gh release create` / `glab release create` to publish.
The scripts live at `.specnaut/scripts/release/{tag,release}.sh` — the same path across all 8
harnesses.
For **GitHub**-hosted projects, the bundled `release-github.sh` wrapper is the one-command path:
```bash
bash .specnaut/scripts/release/release-github.sh # latest tag, auto-baseline, publish
bash .specnaut/scripts/release/release-github.sh --draft # create as draft
```
For **GitLab**-hosted projects, `release-gitlab.sh` mirrors the same contract:
```bash
bash .specnaut/scripts/release/release-gitlab.sh # latest tag
bash .specnaut/scripts/release/release-gitlab.sh v1.2.3 # specific tag
```
Both wrappers compute the baseline as **the previous tag with a published release attached** (not
the previous tag by date) — tags pushed without a release are "subsumed" and their commits land in
this release, with the subsumed tag names listed inline. They push the tag to `origin` if needed,
then call `gh release create` / `glab release create`. Idempotent: a second run against an
already-released tag exits 0 with an explanatory message.
For **local-only** projects (no remote, or you just want a Markdown artifact), the bundled
`release-local.sh` wrapper writes the categorized body to a file:
```bash
bash .specnaut/scripts/release/release-local.sh # latest tag → RELEASE_NOTES_.md
bash .specnaut/scripts/release/release-local.sh --out NOTES.md v1.2.3
```
No remote API calls, no auth — paste the output into any release UI, attach to a deploy email, or
pipe to a custom publisher.
## Available harnesses
| Key | Display name | Output root |
| ------------- | ------------------ | ----------------------- |
| `claude` | Claude Code | `.claude/` |
| `cursor` | Cursor | `.cursor/` |
| `codex` | Codex CLI | `.codex/`, `.agents/` |
| `windsurf` | Windsurf | `.windsurf/` |
| `copilot` | GitHub Copilot CLI | `.github/instructions/` |
| `opencode` | OpenCode | `.opencode/` |
| `antigravity` | Antigravity | `.agent/` |
All harnesses share the same source-of-truth content in `templates/core/`. The per-harness adapters
in `src/infrastructure/harness/` map that core bundle to each harness's directory layout and
frontmatter conventions.
Some harnesses also ship harness-specific helper files alongside the core scaffold:
- **Claude** — `.claude/CLAUDE.md` (harness reference, including `/goal`, `/loop`, and
`claude agents` usage notes) + `.claude/loop.md` (default prompt for `/loop`, Claude's recurring
periodic-maintenance feature).
- **Codex** — `.codex/AGENTS.md` (harness reference) + `.codex/goal.md` (default prompt for `/goal`,
Codex's experimental one-shot long-horizon feature; enable via `goals = true` under `[features]`
in `config.toml`).
## Project-specific skill overlays
Specnaut's skill folders are plain markdown — anything you put under your harness's `skills/`
directory (e.g. `.claude/skills//`, `.cursor/skills//`) is a skill, full stop. To make
the common "override an upstream skill" pattern discoverable, Specnaut recognises two optional
fields in `SKILL.md` frontmatter:
| Field | Meaning |
| ------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `alias_of: ` | This skill is a thin wrapper that delegates to the named upstream skill. Dotted notation (e.g. `specnaut.tag-version`) makes the distribution explicit. |
| `overlays:` | A list of pre/post hooks. Each entry carries `when: before \| after` and `path: ./scripts/.sh` relative to the SKILL.md. |
The Specnaut binary itself **never resolves or dispatches** aliases / overlays — the harness (Claude
Code, Cursor, Codex, …) is responsible for honouring the frontmatter at invocation time. Specnaut's
role is to standardise the contract.
To see what's installed and which aliases / overlays are active:
```bash
/specnaut list-skills
```
The phase walks your harness's skills directory, parses every SKILL.md frontmatter, and renders a
`NAME · KIND · ALIAS OF · OVERLAYS · DESCRIPTION` table. Skills without `alias_of` show
`KIND = skill`; aliases show `KIND = alias` and the target.
A reference example lives at
[`templates/core/skills/alias-example/SKILL.md`](https://github.com/specnaut/specnaut-cli/blob/main/templates/core/skills/alias-example/SKILL.md)
in the Specnaut source tree. It is **not** installed by `specnaut init` — copy it manually when you
want to introduce your first alias.
## What makes Specnaut different from upstream Spec Kit
Specnaut is a fork of the official `specify` CLI with the following additions:
### 1. Auto-chained pipeline
The generated `specify` skill chains `clarify → plan → tasks → analyze → implement → review → merge`
in a single session. Upstream stops at every step and asks the human to invoke the next one.
Specnaut only stops twice: when clarification is genuinely required, and once before merging.
The chain is invoked through the bundled `/specnaut` skill:
```
/specnaut specify ""
```
When the idea is still fuzzy and you can't yet write that one-line description, start one phase
earlier with the optional **step 0**:
```
/specnaut brainstorm ""
```
`brainstorm` runs a discovery dialogue (one question at a time, 2–3 approaches, design approval),
then chains into `specify` with the agreed brief — so `brainstorm → specify → clarify → …` flows in
one session. It reuses the bundled `brainstorming` skill for the dialogue; when your brief is
already clear, skip it and start at `specify`.
Two checkpoints inside the chain:
- **STOP #1 — clarify** runs after `clarify`. If `spec.md` still has `[NEEDS CLARIFICATION]`
markers, the model surfaces the top 3 questions and waits. Once you answer, the chain resumes
automatically. If there are no markers, the chain continues silently.
- **STOP #2 — pre-merge** runs after `review`. The model summarises the work (files changed, tests,
open risks, business outcome) and asks `Ready to merge?` before invoking `merge`. Reply `yes` to
finish.
#### Linking a feature to a backlog issue
Pass `--issue ` to `/specnaut specify` (or to the bundled `create-new-feature.sh`) to record the
originating backlog issue in `.specnaut/feature.json`:
```
/specnaut specify "Fix the off-by-one in pagination" --issue 42
```
After `/specnaut merge` fast-forwards the branch onto `main` and you push, the merge phase reads
`feature.json.linked_issue`, runs `cascade-check.sh` (github / gitlab) to confirm no sub-issues
block the close, asks `Close issue #42 on the board now? (yes/no)`, and on `yes` flips the project
column to `Done` via `move.sh` then dispatches the `product-owner` agent to post a close comment
with the merged commit range and `gh issue close --reason completed`. The board stays in sync with
`main` instead of drifting.
`--issue` is opt-in; existing feature trees without the field skip the auto-close silently.
To opt out of the chain entirely (run only `specify` and stop):
```
/specnaut specify --manual ""
```
#### Mid-chain re-entry
Any phase other than `specify` can also enter the chain when invoked mid-flow — useful for two real
workflows:
- **Manual review between early phases** — read `spec.md` after `specify` lands, then
`/specnaut clarify N` resumes the chain through `plan → tasks → … → STOP #2`.
- **Context-budget recovery** — open a fresh session after compaction and run
`/specnaut implement N` to pick up the tail (`→ review → STOP #2`).
The default is **context-aware**: if downstream artefacts under `.specnaut/specs//` are
missing, the chain fires; if they exist, the invocation is treated as a single-phase re-run (so
regenerating `plan.md` doesn't accidentally cascade through the rest). Two explicit overrides when
the default guesses wrong:
- `/specnaut N --continue` — force the chain regardless of artefact state.
- `/specnaut N --once` — force one-shot regardless.
#### Lite chain (small-feature shortcut)
For small, single-file features (markdown documentation, agent definitions, README / AGENTS / CLAUDE
/ CHANGELOG tweaks), the chain runs in a lighter shape that skips `clarify` and `tasks`:
```
specify → plan → analyze → implement → review → merge
```
Selection happens once, in `phases/specify.md`:
- The router's `--lite` / `--full` flag forces the shape and skips the heuristic.
- Otherwise the brief is scored against `phases/lite-heuristic.md` (file-path hints like `.md` /
`AGENTS.md`, verb hints like `write` / `document`, subject hints like `doc` / `paragraph`, brief
length, suppressors like `API` / `migration` / `auth`). On a positive score the user is prompted
once: `This brief looks small — run the lite chain? [Y/n]`.
- The chosen shape persists to `.specnaut/feature.json` as `workflow_shape: "lite" | "full"`.
Downstream phases consult it at every chain transition. Backward-compat: absent field treated as
`"full"`.
In lite mode STOP #1 is n/a (no `clarify` phase runs) — `specify` makes informed guesses for
ambiguities and records them in the spec's Assumptions section. STOP #2 (pre-merge) behaves
identically to the full chain.
```
/specnaut specify --lite "Document the OSS/proprio boundary in AGENTS.md"
/specnaut specify --full "Add OAuth2 login" # opt out of auto-detected lite
```
The `/specnaut-auto` slash command is kept for one release as a deprecation alias and will be
removed in the next major version.
### 2. `review` phase post-implement
After `implement`, the generated workflow runs a dedicated `review` phase that checks structure
(architecture boundaries, silent error swallowing, leaked internal IDs, cache layering, test
coverage) and the quality gates (format, lint, typecheck, tests). If `review` flags something, the
loop is `implement → review → fix → re-review` — also automatic.
### 3. Developer agent doctrine
Every scaffold ships a `developer` agent that implements tasks from `tasks.md`. The agent operates
under a strict doctrine that applies to every task, regardless of project stack:
**Domain Model gate (NON-NEGOTIABLE)** — before writing a single line of code, the developer reads
the `## Domain Model` block in `spec.md` (spec path) or in the Product Owner's `/backlog brief`
output (direct-implementation path). If the block is absent, empty, or still contains template
placeholders, the agent halts and returns `BLOCKED` with reason
`awaiting:product-owner-domain-brief`. The `implement` phase skill enforces the same gate — it reads
the section at step 3 and surfaces the same BLOCKED report with a recommendation to run
`/specnaut clarify` first. The `clarify` phase cannot advance while the Domain Model is incomplete,
and `specify` step 5.7 is responsible for populating the full block (Bounded context, Vocabulary,
Entities, Value objects, Invariants, Out of scope) rather than just listing key entities.
**Test-Driven Development (NON-NEGOTIABLE)** — red → green → refactor on every implementation. No
business logic ships untested. If the project has no test infrastructure, the developer bootstraps
the language-idiomatic test runner (Vitest for TS/JS, Pytest for Python, JUnit for Java, `go test`
for Go, `cargo test` for Rust, PHPUnit for PHP, RSpec for Ruby, etc.) as part of the task and
records it explicitly in the `Decisions` block of the completion report.
**Domain-Driven Design (NON-NEGOTIABLE)** — every change respects the project's domain boundaries.
Domain layer stays pure (no I/O, no framework). Application layer holds use cases and ports.
Infrastructure layer holds adapters. Presentation talks only to use cases. Cross-bounded-context
bleed-through is forbidden — split or use an anti-corruption layer.
**Boy Scout Rule with escalation** — small in-scope cleanups (≤ 1 file, ~15 lines of diff, no public
API change) are done in the same PR and noted in `Decisions`. Larger out-of-scope cleanups are
logged in a `Tech debt surfaced` block of the completion report rather than ballooning the PR. The
Product Owner reads that block and opens a classified tech-debt ticket (`tech-debt` label, default
Size XS/S, Priority P3, bumped to P2 on correctness/security risk) for each item.
**SOLID / DRY / KISS / YAGNI** — explicitly required. Framework-specific patterns (Repository, DI,
React hooks, MVC controllers) come from the constitution's Back-end and Front-end pattern blocks.
**No silent catches** — every `catch` either logs at ERROR/WARN or re-throws. **In-code
documentation** — doc-comments on every function, method, or class encoding a business rule or
non-obvious design decision, in the idiomatic format for the language.
### 4. Backlog as product source of truth
A Product Owner agent gates every mutation, and supports three backends:
- **Local Markdown** (`--backlog local`, default) — index at `.specnaut/backlog.md`, task files at
`.specnaut/backlog/NNN-slug.md` (typed frontmatter: id, title, category, priority, complexity,
status, parent, depends_on, spec, tags, created). Sub-tasks reference their parent via
`parent: "#NNN"`.
- **GitHub Issues + Projects** (`--backlog github`) — the agent talks directly to the backend via
`gh` CLI; epics use the native sub-issues API. Read paths use
`gh issue list/view --json
projectItems` (REST-ish CLI projection of Project V2 fields, ~1–2
GraphQL points per call), and raw `gh api graphql` is reserved for the one operation with no CLI
equivalent (`gh project
item-edit`'s underlying `updateProjectV2ItemFieldValue` mutation). Keeps
backlog grooming under the shared 5,000-points-per-hour GitHub API quota. No local mirror, no sync
command — the remote is the source of truth.
- **GitLab Issues** (`--backlog gitlab`) — the agent talks to GitLab via `glab` CLI. Status is
tracked via scoped `Status::*` labels rather than a native column field; sub-tasks use a
`parent::#NNN` scoped label (Free-tier compatible — native GitLab Epics are Premium-only).
Otherwise the model mirrors the GitHub backend (no local mirror, no sync command).
The user picks one backend per project. The chosen backend is recorded in `.specnaut/installed.lock`
so the PO knows which one to use without auto-detection.
**Semantic label bootstrap.** For GitHub and GitLab backends, `specnaut init` scaffolds
`.specnaut/scripts/backlog/ensure-labels.sh`. Run it once to seed seven canonical labels —
`security`, `refactor`, `docs`, `tech-debt`, `dx`, `performance`, `dependency` — into the remote
repo. Idempotent; never edits or deletes existing labels. The GitHub default `bug` label is verified
but never re-created. The full reference lives in `.specnaut/LABELS.md` next to the install —
including a guidance note for local backend users on tagging via task-file frontmatter.
**Mandatory classification — every groomed item is sized, prioritised, typed, and labelled.** The PO
classifies every item it creates or clarifies along four axes — Size, Priority, Issue Type (`Task` /
`Bug` / `Feature`), and at least one label — before the item is done; classification is a gate, not
optional polish. On a GitHub project with native `Priority` / `Size` single-select fields and native
Issue Types, the bundled
`.specnaut/scripts/backlog/set-field.sh ` writes each to
its native field or type — and the PO **never** also applies a `priority:*` / `size:*` / `type:*`
label on an item that already carries the native value. Labels are a strict fallback for projects or
orgs without the native field/type; they are not a peer signal. `set-field.sh` exit codes tell the
caller which path applies: `0` = set, `10` = field/type absent (fall back to label), `11` = value
unrecognised, `12` = issue not on the project / not in the repo. `detect-fields.sh` (run once per
groom) emits the field/option IDs into env vars for case-insensitive matching. On GitLab the four
axes are scoped labels via `glab`; on the local Markdown backend they live in task-file frontmatter.
**Bounded context (soft fifth axis)** — a `domain:` label (e.g. `domain:checkout`) is
optional on mono-domain projects but the `## Domain Model` block in every `/backlog brief` output is
always mandatory. Items touching ≥ 2 bounded contexts automatically trigger the epic detection
heuristic with reason "cross-bounded-context".
**`/backlog brief` — Domain Model is mandatory.** Every brief the PO generates for a developer MUST
include a `## Domain Model` block with: Bounded context, Vocabulary (ubiquitous language), Entities
(with aggregate root flag), Value objects, Invariants, and Out of scope. A brief without this block
is incomplete — the PO clarifies with the user before issuing it. If a `spec.md` is attached, the
block is written into the spec too (the spec template carries the section).
**Tech-debt intake protocol.** When a developer's completion report carries a `Tech debt surfaced`
block, the PO parses it, deduplicates against the current backlog, and opens one classified ticket
per surfaced item: `tech-debt` label, default Size XS or S, Priority P3 (bumped to P2 when the item
involves correctness or security risk). This is automatically triggered — no manual step required.
**Epics & sub-tasks.** Big work that needs decomposition lives as a parent **epic** with one or more
**sub-tasks**. The link mechanism differs per backend, but the contract is the same: parents cannot
close while any child is still open.
| Backend | Parent → child link |
| ------- | ------------------------------------------------------------------------------------------------------------------ |
| local | `parent: "#NNN"` in the child's frontmatter, plus a `## Sub-tasks` cross-link in the parent file |
| github | Native sub-issues API — children render automatically under the parent's "Sub-issues progress" field on Project V2 |
| gitlab | Scoped label `parent::#NNN` on the child (Free-tier compatible) |
Create a child on any backend with the bundled `add.sh --parent ` flag — the script writes the
link, attaches to the project/board, and refuses (exit 3) when the named parent doesn't exist:
```bash
.specnaut/scripts/backlog/add.sh "Child title" "Child body" "" --parent 42
```
The bundled `cascade-check.sh ` (github + gitlab) is the close gate — exits 11 with the open
children listed when close is unsafe, exits 12 (informational) when the parent is already closed so
callers don't issue a redundant `close` and 422, exits 3 when the parent doesn't exist, exits 0 when
all children are closed. The PO runs it before `gh issue close` / `glab issue close`. The local
backend uses an inline `grep` equivalent.
A companion `propagate-parent-status.sh` keeps the parent's board column honest as children move: a
child entering **In progress** or **In review** promotes a stalled parent (Backlog or Ready → In
progress), and once every child reaches **Done** the parent rolls up to Done. A child moving to
**Ready** is deliberately a no-op — Ready means groomed-and-waiting, not active work, so it must not
promote the parent.
The Product Owner agent **proactively** proposes epic decomposition during `/backlog add` and during
grooming whenever a request crosses ≥2 subsystems, has more than 5 acceptance-criteria bullets, or
carries trigger phrases like "break down", "phased", "rewrite", "end-to-end". Obvious splits get
auto-created; ambiguous ones get a concrete sub-task list back as a question. You don't have to ask
for the breakdown — the PO surfaces it on its own.
### 5. Claude Code plugin distribution
Specnaut ships a first-class Claude Code plugin (`specnaut-plugin`) available via the Claude Code
marketplace:
```
/plugin install specnaut/specnaut-cli-plugin
```
The plugin gives any Claude Code user instant access to the full Specnaut slash-command suite and
sub-agents — no binary, no `specnaut init` required. The plugin assets (the consolidated `specnaut`
router skill with 19 phase docs including the five `audit-*` axes — `security` / `performance` /
`accessibility` from Epic #302 and `architecture` / `dependencies` from Epic #320, the
`specnaut-review` auto-invoke alias, the deprecated `specnaut-auto` alias, and 15 sub-agents
including the manual-only `performance-auditor`, `a11y-auditor`, `architecture-auditor`, and
`dependency-auditor` introduced with the audit family) are namespaced under `/specnaut-plugin:*` so
they coexist with project-local copies without collision.
When both the plugin and the binary are in use, `specnaut upgrade` detects the plugin and
auto-migrates vanilla on-disk agents and command files (backed up, then deleted — the plugin serves
them going forward). `specnaut check --project` warns when covered files are missing and the plugin
is not installed, with a recovery hint.
### 6. Bundled `specnaut-expert` agent
Every scaffold ships a `specnaut-expert` agent that knows Specnaut itself — its commands, harnesses,
backlog backends, and what changed between releases. It auto-triggers on Specnaut-related questions
("how does specnaut X", "what is /specnaut Y", "quoi de neuf") so users on a Specnaut-scaffolded
project can ask the harness about the tool without copy-pasting docs. It uses a vendored knowledge
snapshot for offline / deterministic answers and `WebFetch` against
+ the GitHub Releases API for live "what's new" queries.
Manual dispatch via `/specnaut-expert ` is also supported.
The agent also handles **bug reports**: ask "report this as a bug" (or hit a Specnaut failure) and
it pre-fills a structured GitHub issue against `specnaut/specnaut-cli` with a 6-section template (Summary
/ Repro / Observed / Expected / Environment / Logs), auto-populating the environment block from
`.specnaut/installed.lock` + `specnaut --version` + `uname -srm`, scrubbing common token shapes
(GitHub PATs, GitLab PATs, Anthropic / OpenAI keys, AWS access keys), and handing you a pre-filled
`https://github.com/specnaut/specnaut-cli/issues/new?…` URL to review and submit. The agent never
auto-submits — you always see the body before clicking.
### 7. Bundled `security-auditor` agent — two modes
Every scaffold also ships a `security-auditor` agent with two dispatch shapes:
1. **PR review** — spawned by the `review-coordinator` during `/specnaut review`. Audits the diff
against eight rules (secrets in source, input validation, authz, injection, path traversal, SSRF,
silent catches, internal-ID exposure) and emits a `FINDING` / `VERDICT` report.
2. **Alert triage** — invoked by the maintainer's `/release` flow when the `security-preflight`
workflow surfaces open GitHub-side alerts (secret-scanning, dependabot, code-scanning, private
advisories). The agent decides per-alert: open a backlog ticket via the PO, dismiss via
`gh api -X PATCH` with a documented `resolution=` reason, or escalate to the user.
The triage mode is release-time only and uses a tightly-constrained `Bash` grant — only the three
`gh api` alert-dismissal endpoints are permitted. End users never trigger this mode; PR review
remains the user-facing path.
### 8. Bundled `ui-ux-designer` agent
Every scaffold also ships a `ui-ux-designer` agent that owns a single source of truth — the
project's `DESIGN.md` — that every other agent consults to keep generated UI on-brand. Three modes
auto-select from `DESIGN.md` state:
1. **Discovery** — when `DESIGN.md` is absent. The agent runs a 2-4 question interview (project +
audience, visual mood, brand seed, optional stack hint) and writes a complete first `DESIGN.md`
from a canonical template covering typography, palette (light + dark with WCAG-AA contrast
rules), 4-point spacing scale, radius / shadow tokens, component primitives, and motion.
2. **Edit** — when `DESIGN.md` is present and the dispatch is a refactor request. The agent edits
the spec in place with a one-line rationale per change and a Decision-log append.
3. **Audit** — when the dispatch contains the word `audit`. The agent scans
`**/*.{tsx,jsx,vue,
svelte,html,css,scss}` under `src/` for literal hex colours, off-system
fonts, and off-grid spacing values, reports drift in a
`| File | Line | Found | Expected token | Severity |` table, and emits `clean` / `drift_minor` /
`drift_major`.
The agent is **manual-dispatch only** (`disable-model-invocation: true`) — design decisions are
intentional and the agent never auto-runs. It produces Markdown, never code; the developer agent is
what translates `DESIGN.md` into a Tailwind theme, CSS vars, or component library. `DESIGN.md` is
NOT scaffolded by `specnaut init`; it materialises on the agent's first invocation when the user
actually wants a design system, so backend-only and CLI-only projects don't carry stub spec files
they never read.
## Design principles
- **Agnostic of the user project's language** — Python, TypeScript, Go, PHP, Rust… your project,
your stack.
- **Agnostic of the LLM** — Claude, OpenAI, Gemini, local models, anything your harness supports.
- **Agnostic of the AI harness** — eight first-class targets today, with the same core content for
all.
- **Agnostic of the backlog source** — pick local Markdown or your remote tracker (GitHub Issues +
Projects, GitLab Issues; Bitbucket planned). The PO agent talks to whichever you chose.
- **Single binary** — distributed via `deno compile` for macOS arm64/x64, Linux arm64/x64, and
Windows x64. No Python, no `pip`, no extra runtimes on the user's machine.
## Contributing
### Agent adoption
Every `feat:` PR body must include an `## Agent adoption` section with a `` ```prompt `` fenced
block. The release pipeline extracts these into a structured `### Adoption guide` block on the
GitHub Release; `specnaut-expert review-upgrade` plays them back in the user's project after
`specnaut upgrade`.
See the
[CONTRIBUTING guide](https://github.com/specnaut/specnaut-cli/blob/main/CONTRIBUTING.md#agent-adoption)
for the convention and examples. The CI workflow `pr_adoption_lint.yml` in the CLI repo enforces
presence.
### Release notes shape
GitHub Releases bodies are auto-generated by `scripts/gen-changelog.ts` and follow this structure:
1. `### Features` — `feat:` commits, one bullet per commit.
2. `### Bug fixes` — `fix:` commits.
3. `### Adoption guide` — one block per `feat:` PR that has a `## Agent adoption` section in its PR
body. Format:
````
**#NNN — Feature title**
```prompt
```
````
Consumed by `specnaut-expert review-upgrade` (Phase 4 of the upgrade adoption flow).
4. `### Internal / chores` — `chore:` / `refactor:` / `docs:` / `test:` etc., collapsed under a
`` block.
If a `feat:` PR is merged without `## Agent adoption`, `gen-changelog.ts` emits a stderr warning
during the release workflow. The release still ships; the missing entry can be amended manually in
the GitHub Release body.
## Upgrades & adoption
`specnaut upgrade` updates templates in place and prints a handoff line inviting the user to review
what changed via the `specnaut-expert` agent.
### Files written
- `.specnaut/upgrade-pending.json` — a marker recording the upgrade range:
```json
{
"from": "1.4.0",
"to": "1.6.0",
"at": "2026-05-16T14:33:00.000Z"
}
```
Written on every successful apply. On chained upgrades, the existing marker's `from` is preserved.
Consumed by `specnaut-expert review-upgrade` and by `specnaut reconcile`. Deleted by the agent at
the end of a successful review.
- `.specnaut/upgrade-staging/` — for every file the upgrade preserved (i.e., on-disk version
was customized vs. lock SHA), the upstream (bundled-template) version is written here under the
same relative path. The on-disk project file is untouched. The staging directory is the source for
`specnaut reconcile` (see below).
Both are gitignored (`templates/core/root/.gitignore` ships the lines).
### Handoff line
`specnaut upgrade` ends with:
```
✓ upgraded to templates 1.4.0 → 1.6.0
→ Walk through what's new with your AI:
`@specnaut-expert review-upgrade`
```
An AI agent that sees `.specnaut/upgrade-pending.json` in a project should proactively suggest
running `@specnaut-expert review-upgrade`.
### `specnaut reconcile`
Per-file post-upgrade reconciliation. Run after `specnaut upgrade` for each file that was preserved
(customized locally — see the `Upgrades & adoption` section for context).
```
specnaut reconcile --status
Print JSON listing files currently pending reconciliation. Reads
`.specnaut/upgrade-staging/`. Output:
{
"pending": [".claude/agents/developer.md", ...],
"stagingDir": ".specnaut/upgrade-staging" | null
}
specnaut reconcile --accept-upstream
Take the new template version for . Backs up the local file to
`.specnaut.bak`, copies upstream content from
`.specnaut/upgrade-staging/` into place, and updates the lock
SHA. Removes the staging entry.
specnaut reconcile --accept-current
Keep the local customized version. Re-stamps the lock SHA to match
on-disk content, so the next upgrade does not re-flag this file as
preserved. Removes the staging entry.
```
`specnaut-expert review-upgrade` is the recommended way to walk through reconciliation interactively
— it surfaces a `keep / take / merge / view / skip` choice per file and dispatches the `developer`
subagent for intelligent merges.
### `specnaut-expert review-upgrade`
Dispatching `@specnaut-expert review-upgrade` triggers a 7-step guided workflow inside the
`specnaut-expert` agent:
1. **Read marker** — reads `.specnaut/upgrade-pending.json`; exits with instructions if absent.
2. **Fetch releases** — fetches GitHub Release bodies for every tag in `(from, to]` and parses each
`### Adoption guide` section into structured adoption prompts. Falls back to the vendored
snapshot if the GitHub API is unreachable.
3. **Present plan** — shows releases in range, adoption prompts count, and files pending
reconciliation (`specnaut reconcile --status`). Offers to create a branch
`specnaut-upgrade-v{to}` for review-as-PR.
4. **Walk adoption prompts** — presents each prompt one by one with four options: `[a]` run it
(dispatches `developer` agent), `[s]` skip, `[c]` show raw prompt, `[q]` quit.
5. **Reconcile customized files** — for each file pending reconciliation, shows a diff summary and
offers `[k]` keep local, `[t]` take upstream, `[m]` intelligent merge (dispatches `developer`),
`[v]` view full diff, `[s]` skip.
6. **Cleanup** — when both walks complete with nothing skipped, deletes the marker and (if on the
review branch) commits a summary. Skipped items are left on disk for the next `review-upgrade`
run.
Trigger keyword: `review-upgrade` in the dispatch message.
## Repository
Source, releases, and issue tracker:
**[github.com/specnaut/specnaut-cli](https://github.com/specnaut/specnaut-cli)**.
The `AGENTS.md` file at the repo root is the canonical context document for any future Claude Code,
Codex, or other agent session contributing to the project itself.
## Recent releases
- [v1.13.1](https://github.com/specnaut/specnaut-cli/releases/tag/v1.13.1) — Thin orchestrator skill + extracted preflight/postflight contracts (#364)
- [v1.13.0](https://github.com/specnaut/specnaut-cli/releases/tag/v1.13.0) — Specflow Cloud funnel — landing CTA + CLI nudges (#361)
- [v1.12.0](https://github.com/specnaut/specnaut-cli/releases/tag/v1.12.0) — Drop Gemini CLI support (Closes #341) (#344)
- [v1.11.1](https://github.com/specnaut/specnaut-cli/releases/tag/v1.11.1) — Add.sh rejects --help and unknown flags instead of creating junk issues (#334)
- [v1.11.0](https://github.com/specnaut/specnaut-cli/releases/tag/v1.11.0) — Don't auto-promote parent Epic on child→Ready (#328)