# codex

Lightweight coding agent that runs in your terminal

- **Score:** 74% pass rate
- **Principles:** 2/8 met

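The report does not break down how the 74% figure is computed. As a rough check, the audit table below contains 16 PASS, 14 WARN, 1 FAIL, and 12 SKIP rows. One weighting that reproduces the headline number gives full credit to PASS, half credit to WARN, none to FAIL, and leaves SKIP rows out of the denominator: (16 + 0.5 × 14) / 31 ≈ 74%. This is an assumption, not anc's documented formula, and the `score` helper is a hypothetical name used only to show the arithmetic.

```python
from collections import Counter

# Status counts tallied by hand from the audit table in this report.
counts = Counter(PASS=16, WARN=14, FAIL=1, SKIP=12)

# ASSUMPTION: these weights are a guess that happens to match the reported
# 74%; anc's actual scoring formula is not stated in this report.
WEIGHTS = {"PASS": 1.0, "WARN": 0.5, "FAIL": 0.0}


def score(counts: Counter) -> float:
    """Weighted share of scored (non-SKIP) audits, as a percentage."""
    scored = sum(counts[status] for status in WEIGHTS)
    earned = sum(WEIGHTS[status] * counts[status] for status in WEIGHTS)
    return 100.0 * earned / scored if scored else 0.0


print(f"{score(counts):.0f}%")  # 23 / 31, prints "74%"
```

Under this weighting, each WARN that becomes a PASS adds half an audit to the numerator, so the score moves by about 1.6 percentage points per fixed warning.
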
## Embed the badge

This score (74%) clears the [badge floor](https://anc.dev/badge) (70%). Copy this into your README:

```markdown
[![agent-native](https://anc.dev/badge/codex.svg)](https://anc.dev/score/codex)
```

## Audit results

| Status | Audit | Principle | Evidence |
|--------|-------|-----------|----------|
| PASS | Help flag produces useful output | [P3](https://anc.dev/p3) |  |
| PASS | Version flag works (`--version` plus short alias) | [P3](https://anc.dev/p3) |  |
| PASS | Version flag works (`--version` plus short alias) | [P3](https://anc.dev/p3) |  |
| WARN | Structured output support | [P2](https://anc.dev/p2) | --output/--format flag detected but could not validate JSON via safe probes (--help/--version override output flags in most CLIs) |
| PASS | Rejects invalid arguments | [P4](https://anc.dev/p4) |  |
| WARN | Quiet mode available | [P7](https://anc.dev/p7) | no --quiet/-q flag detected in --help output |
| PASS | Handles SIGPIPE gracefully | [P6](https://anc.dev/p6) |  |
| PASS | Non-interactive by default | [P1](https://anc.dev/p1) |  |
| SKIP | Non-interactive gate flag advertised in --help | [P1](https://anc.dev/p1) | target satisfies P1 via alternative gate (help-on-bare or stdin-primary) |
| PASS | Flags advertise env-var bindings in --help | [P1](https://anc.dev/p1) |  |
| SKIP | Pager-using CLI ships --no-pager escape hatch | [P6](https://anc.dev/p6) | no pager signal (less/more/$PAGER/--pager) in --help |
| PASS | Respects NO_COLOR | [P6](https://anc.dev/p6) |  |
| FAIL | Secret-bearing flags expose stdin or *-file companion | [P1](https://anc.dev/p1) | secret-bearing flag(s) without `*-file` companion or stdin path: --remote-auth-token-env. Flag values leak via process tables, shell history, and CI logs; provide stdin support or a `--<flag>-file` variant. |
| SKIP | Structured-output CLI exposes its schema at runtime | [P2](https://anc.dev/p2) | no structured-output indicator (--output / --format / json / jsonl) in --help |
| WARN | --json / --jsonl short aliases for --output | [P2](https://anc.dev/p2) | no --json or --jsonl short alias found. Agents and pipelines benefit from short forms alongside the canonical `--output` enum. |
| WARN | Subcommand verbs follow community-standard names | [P6](https://anc.dev/p6) | 7/24 subcommand(s) follow standard verb names. Non-standard: review, mcp, plugin, mcp-server, app-server, remote-control, completion, sandbox, debug, working, resume, the, fork, most, cloud, exec-server, features. MAY-tier — community-standard verbs (get/list/create/update/delete) help agents predict subcommand behavior across CLIs. |
| PASS | Skill bundle has install path (`tool skill install [<host>]`) | [P8](https://anc.dev/p8) |  |
| PASS | `skill install --all` for multi-runtime install | [P8](https://anc.dev/p8) |  |
| PASS | `skill update` / `skill upgrade` for bundle refresh | [P8](https://anc.dev/p8) |  |
| WARN | `--raw` flag for pipe-safe unformatted output | [P2](https://anc.dev/p2) | no `--raw` flag advertised. MAY-tier — useful for pipelines that want to strip formatting before piping to other tools. |
| SKIP | `--output` advertises additional formats beyond text/json | [P2](https://anc.dev/p2) | no `--output` or `--format` flag advertised; vacuous skip for MAY-tier extra formats. |
| WARN | `examples` subcommand or `--examples` flag for curated usage patterns | [P3](https://anc.dev/p3) | no `examples` subcommand or `--examples` flag found. MAY-tier — a curated usage block keeps agents from hunting through long help text. |
| WARN | `--color` flag for explicit color control | [P6](https://anc.dev/p6) | no `--color` flag advertised. MAY-tier — `auto\|always\|never` lets agents and pipelines override the TTY-based default. |
| WARN | `--verbose` flag for diagnostic escalation | [P7](https://anc.dev/p7) | no `--verbose` / `-v` flag advertised. SHOULD-tier — agents debugging failures need a way to escalate diagnostic detail. |
| SKIP | `--limit` / `--max-results` flag for list operations | [P7](https://anc.dev/p7) | no list-style subcommand detected (list/ls/search/query/find/show/get); vacuous skip for the list-only SHOULD. |
| SKIP | Cursor-based pagination flags for list traversal | [P7](https://anc.dev/p7) | no list-style subcommand detected; vacuous skip for the list-only MAY. |
| WARN | `--help` advertises default values for flags | [P1](https://anc.dev/p1) | no default-value annotations found in --help. SHOULD-tier — agents reading help text need to see what value a flag falls back to when omitted (`[default: <value>]` per clap convention). |
| PASS | Rich-TUI affordance for TTY contexts | [P1](https://anc.dev/p1) |  |
| PASS | Short `-h` summary differs from `--help` long form | [P3](https://anc.dev/p3) |  |
| SKIP | Input-accepting commands read from stdin when no file is given | [P6](https://anc.dev/p6) | no input-accepting subcommand detected (process/parse/convert/transform/analyze/validate/format/lint/audit); vacuous skip for the conditional SHOULD. |
| WARN | Subcommand naming follows a consistent verb/noun convention | [P6](https://anc.dev/p6) | subcommand naming is inconsistent: 7 non-verb subcommand(s) (mcp, plugin, working, the, most, cloud, features) mix verb and non-verb children at the second level, so an agent cannot predict where the action lives. SHOULD-tier: pick a consistent shape (all verb-first, all noun-verb hierarchy, or any combination where each non-verb group's children are uniformly verbs). The verb list is a heuristic; inspect `--help` to confirm. |
| SKIP | `--timeout` flag for long-running operations | [P7](https://anc.dev/p7) | no long-running subcommand detected (serve/daemon/watch/tail/monitor/follow/run/start/stream); vacuous skip for the conditional SHOULD. |
| PASS | Bad invocation exits with structured usage-error code (2) | [P2](https://anc.dev/p2) |  |
| PASS | Error messages include a hint or remediation phrase | [P4](https://anc.dev/p4) |  |
| SKIP | Errors emit JSON envelope with `error`/`kind`/`message` under `--output json` | [P2](https://anc.dev/p2) | binary does not advertise `--output json` in --help; MUST applies only to CLIs that opt into the JSON contract. |
| SKIP | `--output json` produces JSON-formatted errors | [P4](https://anc.dev/p4) | binary does not advertise `--output json` in --help; SHOULD applies only to CLIs that opt into the JSON contract. |
| SKIP | JSON success and error envelopes share their non-payload key set | [P2](https://anc.dev/p2) | binary does not advertise `--output json` in --help; envelope-consistency only applies to CLIs that opt into the JSON contract. |
| PASS | Each subcommand's `--help` ships at least one invocation example | [P3](https://anc.dev/p3) |  |
| WARN | Help text pairs human and `--output json` example invocations | [P3](https://anc.dev/p3) | no paired text + `--output json` example found within 5 lines in top-level or any subcommand `--help`. Pairing keeps agents from reverse-engineering the JSON invocation from the text one. |
| WARN | Operations are subcommands, not verb-shaped flags | [P6](https://anc.dev/p6) | top-level verb-shaped flag(s) found: --search. Operations belong under the `Commands:` block (`tool search "q"`), not on the flag namespace where they fight the `--help` filtering agents rely on. |
| SKIP | Destructive subcommands require `--force` or `--yes` | [P5](https://anc.dev/p5) | no destructive subcommands detected; MUST applies conditionally to CLIs with destructive operations. |
| WARN | Read and write surfaces are both visible in subcommand list | [P5](https://anc.dev/p5) | write-pattern subcommand(s) present (update) but no read-pattern surface detected. If the CLI is write-only by design the MUST is satisfied vacuously; otherwise expose the read surface with agent-recognizable verbs (list/get/show/query/find/search). |
| WARN | Help text advertises TTY-aware verbosity behavior | [P7](https://anc.dev/p7) | no TTY-aware language found in `--help`. MAY-tier — automatic verbosity reduction when stdout is piped or redirected lets agents skip the explicit `--quiet` flag. Behavioral probes cannot simulate a real TTY without a pty crate, so this audit relies on documented intent. |

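The one FAIL row asks for a stdin path or a `--<flag>-file` companion for secret-bearing flags. A minimal sketch of that pattern follows, written in Python for brevity rather than in codex's own Rust. The `--token-file` flag, the `tool` program name, and the `read_secret` helper are hypothetical and are not part of codex; the in-memory stream stands in for a piped stdin so the demo does not block.

```python
import argparse
import io
import sys
from pathlib import Path


def read_secret(source: str, stdin=sys.stdin) -> str:
    """Read a secret from a file path, or from stdin when source is "-"."""
    raw = stdin.read() if source == "-" else Path(source).read_text()
    secret = raw.strip()
    if not secret:
        raise ValueError("secret source was empty")
    return secret


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical CLI: the secret never appears as a flag value, so it stays
    # out of process tables, shell history, and CI logs.
    parser = argparse.ArgumentParser(prog="tool")
    parser.add_argument(
        "--token-file",
        metavar="PATH",
        help='file holding the auth token, or "-" to read it from stdin',
    )
    return parser


# Demo: parse an explicit argv and feed the token through a stand-in stream.
args = build_parser().parse_args(["--token-file", "-"])
token = read_secret(args.token_file, stdin=io.StringIO("example-token\n"))
# Report only the length so the secret itself is never echoed.
print(f"token loaded ({len(token)} characters)")
```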
- **Repo:** [openai/codex](https://github.com/openai/codex)
- **Language:** Rust
- **Version scored:** 0.135.0
- **Audit date:** 2026-06-01 17:35:49 UTC
- **Duration:** 3.1s
- **Platform:** `linux/x86_64`
- **Mode:** command
- **Anc build:** 0.5.0
- **Install:** `bun add -g @openai/codex`

## Reproduce locally

```bash
anc audit --command codex --output json
```

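To consume the audit from a script, the command above can be wrapped in a small helper. The sketch below assumes only what the command line implies, namely that `--output json` writes a single JSON document to stdout. The shape of that document is not described in this report, so the hypothetical `run_json` helper returns the parsed value without interpreting it.

```python
import json
import shutil
import subprocess


def run_json(argv: list[str], timeout: float = 60.0):
    """Run a command and parse its stdout as one JSON document."""
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    # The exit code is deliberately not checked: whether anc exits non-zero
    # when audits fail is not stated here, so stdout is parsed either way.
    return json.loads(proc.stdout)


if shutil.which("anc"):  # only attempt the audit if anc is installed
    report = run_json(["anc", "audit", "--command", "codex", "--output", "json"])
    # ASSUMPTION: no schema is documented here, so only the top-level type
    # is reported instead of guessing at field names.
    print(type(report).__name__)
```

If stdout is not valid JSON, `json.loads` raises `json.JSONDecodeError`, which is left uncaught so that a broken run fails loudly.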