Codex CLI, fast.

A working cheat sheet for engineers who already drive agentic CLIs (Claude Code, Aider, Cursor CLI…) and want the shortest path to being productive in OpenAI’s Codex CLI. Focused on what changes how you work — not an exhaustive flag dump.

Current as of June 2026 (codex-cli ≥ 0.130) | gpt-5.5 / gpt-5.4 / gpt-5.4-mini / codex-spark | Terminal · Desktop · IDE · Cloud

01 Install & first session

Open-source Rust binary (openai/codex on GitHub). Sign in with your ChatGPT plan (Plus/Pro/Business/Enterprise) or an API key.

Install & launch [core]

# npm
npm install -g @openai/codex
# macOS Homebrew
brew install --cask codex

cd your-repo
codex             # interactive TUI in this repo
codex doctor      # diagnostic report
codex update      # check for updates
codex completion zsh   # shell completions

Run from the repo root — the working directory is the workspace the sandbox protects, and where AGENTS.md is read from.
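Since the working directory decides both the sandboxed workspace and which AGENTS.md is read, a small wrapper can make "always start at the repo root" automatic. This is a minimal sketch: `codex_here` is a hypothetical helper name, not a Codex command, and it assumes the repo is a git checkout.

```shell
# Hypothetical helper (not part of Codex): launch Codex from the top of the
# current git repository, whatever subdirectory you happen to be in.
codex_here() {
  local root
  if ! root="$(git rev-parse --show-toplevel 2>/dev/null)"; then
    echo "codex_here: not inside a git repository" >&2
    return 1
  fi
  # Subshell, so the caller's working directory is left untouched.
  ( cd "$root" && codex "$@" )
}
```

Arguments pass straight through, so `codex_here "fix the flaky retry test"` behaves like the plain command, just rooted correctly.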

First 10 minutes in a new repo

codex login       # ChatGPT plan, or login --api-key
codex
> /init          # scaffold AGENTS.md from the codebase
> /model         # pick model + reasoning effort
> /permissions   # tune what runs without asking
> /status        # model, sandbox, token usage

Then ask a real question before asking for code: “give me an overview of this codebase”. Codex explores with its own tools — don’t paste files in; @-mention them if you must.

Billing: plan usage by default; /status shows limits. API-key auth bills per token.

02 Coming from Claude Code

Same genre, different philosophy: Codex leans on an OS-level sandbox (Seatbelt/Landlock) plus an approval policy, where Claude Code leans on per-tool permission rules. Codex defaults to more autonomy inside the workspace, less outside it.

You know (Claude Code) | Codex equivalent | Notes
claude | codex | Interactive TUI in cwd.
claude -p "…" | codex exec "…" | Headless mode; --json for machine-readable events.
CLAUDE.md | AGENTS.md | Repo root + nested dirs + ~/.codex/AGENTS.md global. Same idea, open standard.
Shift+Tab permission modes | /permissions · --sandbox · -a | Read Only / Auto / Full Access presets; fine-grained via approval_policy + sandbox_mode in config.toml.
/clear | /new | Gotcha: Codex’s /clear only resets the visible UI. /new starts a fresh conversation.
/rewind · Esc Esc checkpoints | Esc (backtrack) · /fork | Esc steps back through the transcript to edit an earlier turn; /fork branches the thread. No file-state restore — lean on git.
claude -c / -r | codex resume --last / codex resume | Picker or by session ID; codex fork branches from an old one.
Tab thinking · “ultrathink” | model_reasoning_effort | minimal → low → medium → high → xhigh, set via /model or -c model_reasoning_effort=high.
~/.claude/skills | ~/.codex/skills | Same SKILL.md format — skills are portable across both tools.
settings.json hooks | [[hooks.*]] in config.toml | Same lifecycle-hook concept, TOML syntax. Not yet Windows-compatible.
WebSearch tool | built-in (cached) · --search for live | Web search is on by default against a cached index.

The big behavioral difference: Codex’s default “Auto” mode edits and runs commands inside the workspace without asking, but the sandbox blocks network and out-of-tree writes until you approve. Trust the sandbox; review the diff with /diff and /review.

03 Slash commands that matter

~40 built-ins; these are the ones that earn their keystrokes. Type / to fuzzy-search them all.

Session hygiene [daily]

  • /new — fresh conversation between unrelated tasks. The highest-leverage habit. (/clear only wipes the screen.)
  • /compact — summarize history to free tokens; Codex also auto-compacts on long-horizon work.
  • /resume — reopen a past thread (also codex resume --last from shell).
  • /fork — branch the conversation to explore an alternative without losing the original.
  • /side — ephemeral side conversation for a quick tangent; main thread stays clean.
  • /status — model, sandbox, approvals, token usage at a glance.

Setup & steering [daily]

  • /init — scaffold AGENTS.md for the repo.
  • /model — switch model and reasoning effort mid-session.
  • /fast — toggle the Fast service tier (same model, faster tokens) when available.
  • /permissions — adjust what runs without asking, in-session.
  • /plan — demand an execution plan before any implementation.
  • /goal — pin a persistent objective the agent keeps optimizing for.
  • /mention / /ide — attach files / pull editor context into the next prompt.

Review & repo [power]

  • /review — dedicated reviewer over uncommitted changes, a base branch, or a specific commit. Run it before every push.
  • /diff — git diff including untracked files, without leaving the TUI.
  • /approve — retry something the auto-reviewer denied.

Extending [power]

  • /mcp — list configured MCP servers and tools.
  • /skills — inspect local skills (~/.codex/skills).
  • /plugins — installed + discoverable plugins (skills + MCP + apps bundles).
  • /hooks — review lifecycle hooks.
  • /apps — insert ChatGPT apps into a prompt as $app-slug.
  • /memories — toggle memory injection/generation.
  • /experimental — opt into feature flags (also codex features).

Parallel & background [2026]

  • /agent — switch between agent threads (multi-agent sessions).
  • /ps — background terminals + recent output (dev servers, tails).
  • /stop — kill all background terminal work.

TUI quality-of-life

  • /statusline / /title — model/branch/tokens in the footer & terminal title.
  • /theme — syntax-highlight theme; /keymap — remap any shortcut; /vim — vim composer.
  • /copy — copy last response; /raw — raw scrollback for clean terminal selection.
  • /personality — communication style; /debug-config — which config layer won; /feedback — file an issue with diagnostics.

04 Keyboard & input tricks

Everything is remappable via /keymap (persisted under [tui.keymap]). Defaults below.

While Codex is working

  • Esc: interrupt; press again to backtrack the transcript and edit an earlier message. Use early and often.
  • Tab: queue a follow-up instruction while a turn runs.
  • Approve / reject inline — plan steps and command requests get y/n prompts in default approval modes.
  • Ctrl+O — copy the latest output (same as /copy).
  • Alt+R — raw scrollback mode for clean copy/paste.

Composing prompts

  • @ — fuzzy file search (@src/auth/jwt.ts) to attach files.
  • $app-slug — mention a connected app (via /apps).
  • Ctrl+R — search prompt history.
  • Ctrl+G — open $EDITOR for long prompts.
  • Paste images straight into the composer (or -i mock.png from the shell) for UI work.
  • ↑ / ↓ — draft/message history; Ctrl+C ×2 or /quit — exit.

05 Approvals & sandbox — pick your autonomy

Two orthogonal dials: sandbox (what the OS lets Codex touch) and approval policy (when it asks you). The presets combine both.

Preset | Behavior | Use when
Read Only (-s read-only) | Browse and answer; no edits, no commands with side effects. | Code spelunking, audits, “explain this” sessions.
Auto (default) | Reads, edits, and runs commands freely inside the workspace; asks before network access or touching anything outside it (sandbox_mode = "workspace-write" + approval_policy = "on-request"). | The 90% case. The sandbox is the safety net.
Full Access (-s danger-full-access + -a never) | No sandbox, no prompts, machine-wide. | Containers/CI only. The flag name is honest.

Fine-tune instead of clicking: /permissions in-session, or in ~/.codex/config.toml: approval_policy = "untrusted" | "on-request" | "never", sandbox_mode = "read-only" | "workspace-write" | "danger-full-access", and [sandbox_workspace_write] network_access = true if your tests need the network. Save combos as profiles (codex --profile ci).
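To make the two dials concrete, the sketch below maps each preset name to the `-s` / `-a` flags listed for it in this section. `codex_preset_flags` is a hypothetical helper, and the short names `read-only`, `auto`, and `full` are labels chosen for illustration, not Codex identifiers.

```shell
# Hypothetical helper: print the sandbox/approval flags behind each preset.
# The flag values are the ones this section lists for the three presets.
codex_preset_flags() {
  case "$1" in
    read-only) echo "-s read-only" ;;
    auto)      echo "-s workspace-write -a on-request" ;;
    full)      echo "-s danger-full-access -a never" ;;
    *)
      echo "codex_preset_flags: unknown preset: $1" >&2
      return 1
      ;;
  esac
}

# Usage (the unquoted substitution is intentional, so the flags word-split):
#   codex $(codex_preset_flags auto) "fix the flaky retry test"
```

Spelling the flags out this way is mainly useful in scripts, where a named preset reads better than a pair of raw flags.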

06 CLI flags, exec & scripting

Codex is also a Unix tool: pipe data in, get JSON events out, wire it into CI. codex exec (alias codex e) is the workhorse.

Flags you’ll actually use

codex "fix the flaky retry test"     # prompt as arg
codex resume --last                  # continue last session
codex resume                         # session picker
codex fork                           # branch an old session
codex -m gpt-5.4                     # model override
codex -c model_reasoning_effort=high # any config key inline
codex -s workspace-write -a on-request
codex --profile ci                   # named config layer
codex -i mock.png "build this screen"
codex --search "…"                   # live web search
codex sandbox -- npm test            # run any cmd under the sandbox

Headless / exec mode [CI]

# one-shot, non-interactive
codex exec "update the changelog for the next release"

# pipe anything in
cat error.log | codex exec "root-cause this stack trace"

# machine-readable event stream
codex exec --json "lint my commit messages"

# CI: full autonomy inside a container
codex exec -s danger-full-access -a never \
  "fix the failing tests and commit"

Building something bigger? The Codex SDK (TypeScript/Python) and codex app-server expose the same agent programmatically; codex cloud manages cloud tasks from the terminal, and codex remote-control runs a daemon you can drive remotely.
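The stdin-piping pattern from the exec examples works best when the input is bounded. As a rough sketch, the hypothetical `triage_log` helper below sends only the tail of a log file to `codex exec`; the 200-line default is an arbitrary assumption, not a Codex limit.

```shell
# Hypothetical helper: pipe just the end of a log into `codex exec`,
# so a huge file does not flood the prompt.
triage_log() {
  local log="$1" lines="${2:-200}"
  if [ ! -r "$log" ]; then
    echo "triage_log: cannot read $log" >&2
    return 1
  fi
  tail -n "$lines" "$log" | codex exec "root-cause this stack trace"
}

# Usage:
#   triage_log error.log        # last 200 lines
#   triage_log error.log 50     # last 50 lines
```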

07 AGENTS.md, memories & context

Four complementary layers: AGENTS.md shapes behavior, memories carry context forward between sessions, skills package repeatable processes, MCP connects external systems.

AGENTS.md hierarchy

~/.codex/AGENTS.md      # you, everywhere (style, tools)
./AGENTS.md             # repo root — commit it
./packages/api/AGENTS.md # nested — closest file wins

What belongs there: build/test commands, conventions the model gets wrong, repo etiquette (“never commit to main”), gotchas. What doesn’t: anything derivable from the code. It’s loaded every session — bloat costs context and attention.

It’s an open standard (agents.md) — the same file serves Codex, Claude Code, Cursor, and friends.
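To see which instruction files would be in play for a given directory, the hierarchy above can be walked by hand. The sketch below is an approximation of that layering, not Codex's actual discovery code: `agents_chain` is a hypothetical helper that prints every AGENTS.md from a directory up to the repo root, nearest first, followed by the global file.

```shell
# Hypothetical helper: list the AGENTS.md files that apply to a directory,
# nearest first, ending with the global ~/.codex/AGENTS.md if present.
# Usage: agents_chain <start-dir> <repo-root>
agents_chain() {
  local dir root
  dir="$(cd "${1:-.}" && pwd -P)" || return 1
  root="$(cd "${2:-.}" && pwd -P)" || return 1
  while :; do
    if [ -f "$dir/AGENTS.md" ]; then echo "$dir/AGENTS.md"; fi
    # Stop at the repo root (or at / if the root was never reached).
    if [ "$dir" = "$root" ] || [ "$dir" = "/" ]; then break; fi
    dir="$(dirname "$dir")"
  done
  if [ -f "$HOME/.codex/AGENTS.md" ]; then echo "$HOME/.codex/AGENTS.md"; fi
}
```

Running `agents_chain packages/api .` from the repo root is a quick way to spot a stale nested file that is quietly overriding the root one.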

Memories & the context window [2026]

  • Memories — Codex distills useful context from past sessions and injects it into new ones. Toggle with /memories; enable via [features] memories = true.
  • /new between tasks. Stale context degrades quality more than anything else.
  • Auto-compaction keeps long-horizon work alive, but a deliberate /compact at a natural breakpoint beats an automatic one mid-thought.
  • /side and /fork keep tangents and experiments out of the main thread.
  • /status shows token usage when things feel sluggish; prune unused MCP servers — tool schemas are a silent context hog.

08 Workflows that actually work

Same field-tested patterns as any serious agent CLI: give the model a target it can verify itself against — tests, a screenshot, a plan.

Plan → implement → review [default]

For anything non-trivial: /plan (or start in -s read-only) and have Codex research the change. Read the plan, push back, approve. Implement in Auto mode, then /review against the base branch before committing.

Skipping the plan step is the #1 cause of “it confidently did the wrong thing.”

TDD with an agent [default]

Ask for failing tests first, from the spec — explicitly not implementation. Commit them. Then: “make these pass; don’t modify the tests.” The red/green loop gives Codex an objective target it will grind against autonomously.
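"Don't modify the tests" is easy to check mechanically once the failing tests are committed. A minimal sketch, assuming the tests live under a `tests` directory (adjust the path for your repo); `tests_untouched` is a hypothetical helper built on plain `git diff`.

```shell
# Hypothetical helper: succeed only if nothing under the test path differs
# from the commit that introduced the failing tests.
# Usage: tests_untouched <commit> [path]
tests_untouched() {
  local base="$1" path="${2:-tests}"
  # --quiet makes git diff exit non-zero when the working tree differs.
  # Caveat: brand-new untracked test files are not detected by git diff.
  git diff --quiet "$base" -- "$path"
}

# Usage after Codex reports green:
#   tests_untouched HEAD~1 tests || echo "tests were edited; review before trusting the pass"
```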

Visual iteration

For UI: paste a mock (or codex -i mock.png), give it eyes — a Playwright/Chrome DevTools MCP server for screenshots — and say “iterate until it matches.” Agents with visual feedback typically converge in 2–3 rounds; agents without it rarely converge at all.

Parallel sessions with worktrees [power]

git worktree add ../proj-auth feature/auth
cd ../proj-auth && codex

One Codex per worktree = zero collisions. In-session, /agent switches between agent threads, and gpt-5.4-mini makes cheap subagents.
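The two commands above fold naturally into one step. A sketch under stated assumptions: `codex_worktree` is a hypothetical helper, it must be run from the repo root, the branch must already exist, and the `../<repo>-<branch-suffix>` naming simply mirrors the `../proj-auth` example.

```shell
# Hypothetical helper: create a sibling worktree for an existing branch and
# start Codex inside it. Run from the repo root.
codex_worktree() {
  local branch="$1" dir
  # e.g. branch feature/auth in repo "proj" -> ../proj-auth
  dir="../$(basename "$PWD")-${branch##*/}"
  # Send git's progress messages to stderr so stdout stays clean.
  git worktree add "$dir" "$branch" >&2 || return 1
  ( cd "$dir" && codex )
}

# Usage:
#   codex_worktree feature/auth
```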

Local ↔ cloud handoff [2026]

Start locally, delegate the long grind: codex cloud manages Codex Cloud tasks from the terminal, and the /apps→Desktop handoff moves a CLI thread into the Codex app. @codex on a GitHub PR or issue triggers cloud work and review there.

Safe full-auto loops

Mechanical sweeps (lint, codemods, dependency bumps): run inside a container with -s danger-full-access -a never and let it grind. Outside a container, prefer Auto mode — workspace-write with no network is already “safe YOLO.” Checkpoint with git; there’s no /rewind, so commits are your undo.
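One way to keep full-access runs inside containers is to refuse to start without a container marker. The sketch below is a heuristic, not a security boundary: it checks for `/.dockerenv`, which Docker creates inside containers, and `CONTAINER_MARKER` is a hypothetical override variable for other runtimes. `full_auto` is likewise a made-up helper name.

```shell
# Hypothetical guard: only run the no-sandbox, no-prompt combination when a
# container marker file is present.
full_auto() {
  # /.dockerenv exists inside Docker containers; CONTAINER_MARKER is a
  # hypothetical override for runtimes that use a different marker.
  local marker="${CONTAINER_MARKER:-/.dockerenv}"
  if [ ! -e "$marker" ]; then
    echo "full_auto: $marker not found; refusing to run without a container" >&2
    return 1
  fi
  codex exec -s danger-full-access -a never "$@"
}

# Usage:
#   full_auto "apply the lint autofixes and commit"
```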

Branch the conversation

Two plausible designs? /fork and try both, then keep the winner. Quick tangent (“what does this env var do?”) — /side so the main thread’s context stays on task. Wrong direction mid-turn — Esc, edit the earlier message, resend.

Review before merge

/review runs a dedicated reviewer model over your diff — uncommitted work, a branch delta, or one commit — and it doesn’t modify files, so it’s safe to run constantly. Pair with /diff for your own pass. You still own the merge.

CI & automation

codex exec --json in a pipeline step covers triage, changelog generation, failure analysis. Use a ci profile pinning model, sandbox, and approval policy so behavior is reproducible. GitHub-side, @codex mentions and Codex code review handle PR flow.
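A pipeline step usually wants two things: the event stream kept as an artifact, and a clear failure when the run itself fails. The sketch below shows one way to get both. `codex_ci_step` is a hypothetical helper, and passing `--profile ci` to `codex exec` is an assumption based on the `codex --profile ci` form shown earlier; confirm the flag placement against your installed version.

```shell
# Hypothetical CI wrapper: run a headless task under the "ci" profile, keep
# the JSON event stream in a file, and fail the step if codex fails.
# Usage: codex_ci_step <events-file> <prompt>
codex_ci_step() {
  local out="$1"
  shift
  # Assumption: `codex exec` accepts --profile the same way `codex` does.
  if ! codex exec --profile ci --json "$@" > "$out"; then
    echo "codex_ci_step: codex exec failed; events kept in $out" >&2
    return 1
  fi
}

# Usage:
#   codex_ci_step codex-events.jsonl "summarize why the build failed"
```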

09 Skills & plugins — extend the agent

A skill is a folder with a SKILL.md (frontmatter + instructions, optional scripts/resources) that loads only when relevant. Plugins (first-class since v0.117, Mar 2026) bundle skills, MCP servers, and app connectors into one versioned, installable unit — skills are the authoring primitive, plugins the distribution primitive.

Anatomy & install

~/.codex/skills/deploy-checks/SKILL.md

# SKILL.md
---
name: deploy-checks
description: Pre-deploy validation for our k8s services.
  Use before any production deploy.
---
1. Run ./scripts/preflight.sh
2. Verify migrations are reversible…

codex plugin           # install / manage plugins
/plugins               # browse discoverable plugins in-session
/skills                # inspect what's loaded

The description is the trigger — write it as a “when to use” sentence. Restart Codex after installing or editing a skill so metadata reloads.
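Scaffolding a skill is just creating the folder and a SKILL.md with the two frontmatter fields shown above. A minimal sketch: `new_skill` is a hypothetical helper, and `SKILLS_DIR` is a made-up override so it can be pointed somewhere other than `~/.codex/skills`.

```shell
# Hypothetical scaffolder: create <skills-dir>/<name>/SKILL.md with the
# name/description frontmatter and a placeholder first step.
# Usage: new_skill <name> <description>
new_skill() {
  local name="$1" description="$2"
  local dir="${SKILLS_DIR:-$HOME/.codex/skills}/$name"
  if [ -e "$dir/SKILL.md" ]; then
    echo "new_skill: $dir/SKILL.md already exists" >&2
    return 1
  fi
  mkdir -p "$dir" || return 1
  {
    echo "---"
    echo "name: $name"
    echo "description: $description"
    echo "---"
    echo "1. TODO: describe the first step"
  } > "$dir/SKILL.md"
  echo "$dir/SKILL.md"
}

# Usage:
#   new_skill deploy-checks "Pre-deploy validation. Use before any production deploy."
```

The helper refuses to overwrite an existing SKILL.md, so rerunning it by accident cannot clobber a skill you have already edited.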

Third-party picks & why [curated]

  • ComposioHQ/awesome-codex-skills — the curated community index; the fastest way to see what exists before writing your own.
  • Agensi marketplace — security-scanned skill downloads (unzip into ~/.codex/skills/). Why it matters: skills ship executable scripts that run with your permissions — provenance is the whole game.
  • anthropics/skills document suite (pdf, docx, xlsx, pptx) — the SKILL.md format is a cross-tool standard, so the best Claude-ecosystem skills work in Codex too. Real Office files via scripts instead of hallucinated markup.
  • Your own — the highest-ROI skill encodes your release process, migration checklist, or test conventions. Explained it twice? Make it a skill.

Audit GitHub-sourced skills before installing — read the SKILL.md and every script it calls.
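A first pass at that audit is simply listing what a skill could execute. The sketch below is a starting point, not a scanner: `skill_scripts` is a hypothetical helper, and the extension list is an assumption about what skills commonly ship. It will not catch scripts that SKILL.md asks the agent to write or download at run time.

```shell
# Hypothetical audit helper: list files in a skill folder that are executable
# or look like scripts, so each one can be read before installing.
# Usage: skill_scripts <skill-dir>
skill_scripts() {
  local dir="$1"
  find "$dir" -type f \
    \( -perm -u+x -o -name '*.sh' -o -name '*.py' -o -name '*.js' \) | sort
}

# Usage:
#   skill_scripts ~/Downloads/some-skill
```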

10 MCP & apps — connect your stack

MCP servers give Codex tools beyond the filesystem; ChatGPT apps (mentioned as $app-slug) bring connected services straight into prompts. Web search needs neither — it’s built in.

Setup

# CLI
codex mcp add github -- npx -y @modelcontextprotocol/server-github
codex mcp list

# or config.toml — with per-tool control
[mcp_servers.playwright]
command = "npx"
args = ["-y", "@playwright/mcp@latest"]

[mcp_servers.github]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
disabled_tools = ["delete_repository"]
default_tools_approval_mode = "auto"

/mcp in-session lists servers and tools. enabled_tools/disabled_tools trim context cost and risk per server.

Worth the context cost

  • Playwright / Chrome DevTools — lets Codex drive a browser and see your app. The single biggest upgrade for frontend work.
  • GitHub — PRs, issues, CI runs as tools (plain gh via shell is often enough).
  • Context7 — version-correct library docs on demand; kills “trained on the old API” bugs.
  • Sentry — the actual stack trace and breadcrumbs for the bug you’re fixing.
  • Postgres (read-only) — schema-aware queries while debugging data issues.
  • Apps (/apps) — Figma, Linear, and other ChatGPT app connectors as $mentions, no MCP config needed.

Every server’s tool schemas consume context. Connect what you use; prune when /status shows bloat.

11 Hooks & config.toml

One TOML file rules everything. Precedence: CLI flags → profile → project .codex/config.toml → user ~/.codex/config.toml → system /etc/codex/config.toml. /debug-config shows which layer won.
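The file part of that precedence chain amounts to "first layer that sets the key wins". The sketch below illustrates only that idea: `winning_layer` is a hypothetical helper that greps for top-level `key =` lines, ignores CLI flags, profiles, and TOML tables, and is no substitute for /debug-config.

```shell
# Hypothetical illustration of layer precedence: print the first file, in the
# order given, that sets a top-level key. Pass files highest-precedence first.
# Usage: winning_layer <key> <file>...
winning_layer() {
  local key="$1" f
  shift
  for f in "$@"; do
    if [ -f "$f" ] && grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$f"; then
      echo "$f"
      return 0
    fi
  done
  return 1
}

# Usage:
#   winning_layer model .codex/config.toml ~/.codex/config.toml /etc/codex/config.toml
```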

Hooks [power]

# ~/.codex/config.toml
[[hooks.PreToolUse]]
matcher = { tool_name = "shell" }
[[hooks.PreToolUse.hooks]]
command = "./scripts/block-prod-db.sh"

[[hooks.PostCompact]]
[[hooks.PostCompact.hooks]]
command = "notify-send 'History compacted'"

Deterministic shell commands at lifecycle points — guarantees, not suggestions. Classic uses: auto-format after edits, block dangerous commands, notify on long-running turns. /hooks to inspect. Caveat: not yet Windows-compatible.

A useful baseline config

model = "gpt-5.5"
model_reasoning_effort = "medium"   # minimal…xhigh
approval_policy = "on-request"      # untrusted|on-request|never
sandbox_mode = "workspace-write"
web_search = "cached"               # disabled|cached|live
file_opener = "vscode"
notify = ["terminal-notifier", "-title", "Codex", "-message"]

[sandbox_workspace_write]
network_access = false

[features]
memories = true

[tui]
notifications = true
theme = "github-light"

Profiles: ~/.codex/ci.config.toml overlays the base when you pass --profile ci — pin a model/sandbox/approval combo per context.

12 Models & reasoning effort (Jun 2026)

Model | ID | Use for
GPT-5.5 | gpt-5.5 | Current flagship and recommended default — complex coding, computer use, research workflows.
GPT-5.4 | gpt-5.4 | Strong agentic workhorse; slightly cheaper/faster than 5.5.
GPT-5.4 mini | gpt-5.4-mini | Fast + cheap: responsive edits, subagents, headless pipelines.
Codex Spark | gpt-5.3-codex-spark | Near-instant real-time iteration (research preview, Pro plans).

Reasoning effort is the thinking dial: minimal → low → medium → high → xhigh, set per-session via /model or per-run with -c model_reasoning_effort=high. More effort = slower + costlier — keep it at medium for routine edits, crank it for architecture. /fast toggles the Fast service tier where available. The older gpt-5.2-codex / gpt-5.3-codex models are deprecated for ChatGPT sign-in — don’t pin them in configs.

13 Ten habits that separate power users

  • /plan for anything non-trivial. Cheap insurance against confident nonsense.
  • /new between tasks — and remember /clear doesn’t do what you think.
  • Interrupt early. Esc in the first minute saves ten later.
  • Give it a feedback loop. Tests, a browser, a compiler — verifiable targets make the agent self-correcting.
  • Commit constantly. Git is your rewind button; there isn’t another one.
  • Invest in AGENTS.md like onboarding a new hire — and keep it short.
  • Encode repeated explanations as skills; distribute team setups as plugins.
  • Tune the sandbox, don’t fight it: network_access, writable_roots, profiles. Approval fatigue means your config is wrong.
  • Match effort to the task: xhigh for architecture, minimal/mini/Spark for chores.
  • /review like it’s a teammate’s PR — then read the diff yourself. You still own the merge.