How many of the spec's requirements were verified for this tool.
See /coverage for the full matrix.
Level
Total
Verified
Unverified
MUST
28
19
9
SHOULD
21
13
8
MAY
10
10
0
Top Issues
FAILEach subcommand's `--help` ships at least one invocation exampleProgressive Help Discoverysubcommands missing example invocations in their `--help`: auth, run, init, add, remove, version, sync, lock, export, tree, format, audit, tool, python, pip, venv, build, publish, workspace, cache, self. Examples teach agents the call shape faster than option tables; use clap's `after_help` or a dedicated `Examples:` block.
FAILDestructive subcommands require `--force` or `--yes`Safe Retries & Mutation Boundariesdestructive subcommand(s) without `--force` or `--yes`: format. Irreversible operations must require explicit confirmation so they can't be invoked accidentally.
WARNStructured output supportStructured, Parseable Output--output/--format flag detected but could not validate JSON via safe probes (--help/--version override output flags in most CLIs)
target satisfies P1 via alternative gate (help-on-bare or stdin-primary)
PASS
Flags advertise env-var bindings in --help
PASS
Secret-bearing flags expose stdin or *-file companion
WARN
`--help` advertises default values for flags
no default-value annotations found in --help. SHOULD-tier — agents reading help text need to see what value a flag falls back to when omitted (`[default: <value>]` per clap convention).
WARN
Rich-TUI affordance for TTY contexts
no rich-TUI affordance detected (no `--tui`/`--interactive`/`--ui` flag, no spinner/progress/tui mention in --help). MAY-tier — rich TUI in TTY contexts is a nice-to-have, not required.
`examples` subcommand or `--examples` flag for curated usage patterns
no `examples` subcommand or `--examples` flag found. MAY-tier — a curated usage block keeps agents from hunting through long help text.
WARN
Short `-h` summary differs from `--help` long form
`-h` and `--help` produce byte-identical output. SHOULD-tier — clap renders the short summary on `-h` and the full description on `--help` when `long_about` is set; collapsing them gives agents no concise list-level grep target.
FAIL
Each subcommand's `--help` ships at least one invocation example
subcommands missing example invocations in their `--help`: auth, run, init, add, remove, version, sync, lock, export, tree, format, audit, tool, python, pip, venv, build, publish, workspace, cache, self. Examples teach agents the call shape faster than option tables; use clap's `after_help` or a dedicated `Examples:` block.
WARN
Help text pairs human and `--output json` example invocations
no paired text + `--output json` example found within 5 lines in top-level or any subcommand `--help`. Pairing keeps agents from reverse-engineering the JSON invocation from the text one.
Destructive subcommands require `--force` or `--yes`
destructive subcommand(s) without `--force` or `--yes`: format. Irreversible operations must require explicit confirmation so they can't be invoked accidentally.
WARN
Read and write surfaces are both visible in subcommand list
write-pattern subcommand(s) present (add, remove, format) but no read-pattern surface detected. If the CLI is write-only by design the MUST is satisfied vacuously; otherwise expose the read surface with agent-recognizable verbs (list/get/show/query/find/search).
pager referenced in --help but no --no-pager escape hatch advertised
PASS
Respects NO_COLOR
WARN
Subcommand verbs follow community-standard names
11/22 subcommand(s) follow standard verb names. Non-standard: lock, export, tree, format, tool, python, pip, venv, workspace, cache, self. MAY-tier — community-standard verbs (get/list/create/update/delete) help agents predict subcommand behavior across CLIs.
PASS
`--color` flag for explicit color control
WARN
Input-accepting commands read from stdin when no file is given
input-accepting subcommand present but `--help` does not mention stdin or `-` as a path placeholder. SHOULD-tier — agents piping data into the tool expect stdin to work when no file arg is provided.
WARN
Subcommand naming follows a consistent verb/noun convention
subcommand naming is inconsistent: 5 non-verb subcommand(s) (tool, python, pip, workspace, self) mix verb and non-verb children at the second level, so an agent cannot predict where the action lives. SHOULD-tier: pick a consistent shape (all verb-first, all noun-verb hierarchy, or any combination where each non-verb group's children are uniformly verbs). The verb list is a heuristic; inspect `--help` to confirm.
`--limit` / `--max-results` flag for list operations
no list-style subcommand detected (list/ls/search/query/find/show/get); vacuous skip for the list-only SHOULD.
SKIP
Cursor-based pagination flags for list traversal
no list-style subcommand detected; vacuous skip for the list-only MAY.
WARN
`--timeout` flag for long-running operations
long-running subcommand present but no timeout flag advertised (looked for --timeout, --deadline, --max-time). SHOULD-tier — without a bound, agents that hit a hung operation have to enforce timeouts externally.
WARN
Help text advertises TTY-aware verbosity behavior
no TTY-aware language found in `--help`. MAY-tier — automatic verbosity reduction when stdout is piped or redirected lets agents skip the explicit `--quiet` flag. Behavioral probes cannot simulate a real TTY without a pty crate, so this audit relies on documented intent.