Architecture

red-run has two layers: a platform layer that provides capabilities, and a strategy layer that decides how to use them.

Platform vs Strategy

Platform layer (stable)

The platform is the set of reusable components that any engagement can use:

Teammates — persistent domain teammates (enum/ops pairs) spawned by the orchestrator
Skills — 67+ technique-specific methodology files loaded on demand
MCP servers — nmap scanning, shell management, browser automation, skill routing, state tracking
Engagement state — SQLite database tracking targets, credentials, access, vulns, and pivot paths
Dashboard — Real-time engagement monitoring with access chain graph

These components don't change based on engagement type. A CTF lab and a client engagement use the same teammates, skills, and servers.

Strategy layer (swappable)

The orchestrator is the strategy layer. It reads engagement state, decides which skill to invoke next, assigns it to the right teammate, and records findings. The default orchestrator (/red-run-ctf) is a CTF/lab orchestrator — it chains aggressively, routes to technique skills, and treats everything in scope as fair game.

A different orchestrator could use the same platform with different decision logic:

Client engagement orchestrator — mandatory operator approval before technique execution, stricter scope gates, OPSEC-first routing
Red team orchestrator — stealth-focused, avoids detection signatures, operates within rules of engagement windows
Training orchestrator — explains each decision, pauses for student input, provides hints

The orchestrator contract is simple: read state, pick a skill, assign it to a teammate, record findings. Everything else is implementation choice.
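
The contract can be sketched as a small interface. This is an illustration only, not red-run's implementation: red-run's orchestrators are prompts, and every name here (`Assignment`, `pick`, `step`, the `"open"` status, the shape of `state`) is a hypothetical stand-in. It shows how two orchestrators can share the same platform and differ only in decision logic.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Protocol


@dataclass
class Assignment:
    skill: str      # hypothetical skill name
    teammate: str   # teammate that should run it
    target: str


class Orchestrator(Protocol):
    def pick(self, state: dict) -> Optional[Assignment]: ...


class CtfOrchestrator:
    """Chains aggressively: takes the first open vuln in state."""

    def pick(self, state: dict) -> Optional[Assignment]:
        for vuln in state["vulns"]:
            if vuln["status"] == "open":  # assumed status value
                return Assignment(vuln["skill"], vuln["teammate"], vuln["target"])
        return None


class ApprovalOrchestrator:
    """Same routing, gated on an operator approval callback."""

    def __init__(self, inner: Orchestrator, approve: Callable[[Assignment], bool]):
        self.inner = inner
        self.approve = approve

    def pick(self, state: dict) -> Optional[Assignment]:
        choice = self.inner.pick(state)
        return choice if choice is not None and self.approve(choice) else None


def step(orch: Orchestrator, state: dict, run: Callable[[Assignment], list]) -> bool:
    """One pass of the contract: read state, pick, assign, record findings."""
    choice = orch.pick(state)
    if choice is None:
        return False
    state["findings"].extend(run(choice))
    return True
```

Wrapping rather than subclassing keeps the approval gate independent of the routing logic it guards, which mirrors the platform/strategy split described above.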

Architecture Overview

Architecture diagram: Operator → Orchestrator → Agents → MCP Servers → engagement/

Prompt Architecture

red-run controls behavior through layered prompts, not code. Each layer adds specificity:

Layer	File	Loaded When	What It Provides
Project	`CLAUDE.md`	Every conversation	Architecture rules, conventions, skill routing mandate
Teammate	`teammates/<name>.md`	Teammate spawns	Role definition, scope constraints, hard stops, state-mgr messaging protocol
Skill	`skills/<cat>/<name>/SKILL.md`	`get_skill()` call	Technique methodology, payloads, troubleshooting
Dynamic	Lead's task assignment	Each task	Target info, credential/access IDs, engagement-specific context

The project layer sets universal rules. The teammate layer constrains to a domain (web-enum only discovers, web-ops only executes techniques). The skill layer provides technique depth. The dynamic prompt carries live context from the lead.
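
The ordering in the table, from most general to most specific, can be made concrete with a short sketch. The function name and the example category/skill values are hypothetical. The file paths follow the table, but how the layers are actually loaded is handled by Claude Code and the skill-router, not by code like this.

```python
from pathlib import PurePosixPath


def prompt_layers(teammate: str, category: str, skill: str, task: str) -> list:
    """Return (layer, source) pairs ordered from most general to most specific."""
    return [
        ("Project", str(PurePosixPath("CLAUDE.md"))),
        ("Teammate", str(PurePosixPath("teammates") / f"{teammate}.md")),
        ("Skill", str(PurePosixPath("skills") / category / skill / "SKILL.md")),
        ("Dynamic", task),  # the lead's task assignment, not a file
    ]


# "web" and "sqli" are made-up category/skill names for illustration.
for layer, source in prompt_layers("web-ops", "web", "sqli", "Test the login form"):
    print(f"{layer:9} {source}")
```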

Teammate → MCP Access

All teammates inherit MCP servers from the lead session. In agent teams, MCP servers are shared — a shell session created by one teammate is visible to all others (shell-server runs as a shared SSE service).

Teammate	Domain	MCP Servers Used
state-mgr	State management	state (sole writer)
net-enum	Network recon	skill-router, nmap-server, shell-server, state
web-enum	Web discovery	skill-router, shell-server, browser-server, state
web-ops	Web techniques	skill-router, shell-server, browser-server, state
ad-enum	AD discovery	skill-router, shell-server, state
ad-ops	AD techniques	skill-router, shell-server, state
lin-enum / lin-ops	Linux host	skill-router, shell-server, state
win-enum / win-ops	Windows host	skill-router, shell-server, rdp-server, state
pivot, bypass, spray, recover, research	On-demand specialists	varies

All state writes are centralized through state-mgr — the sole writer to state.db. Other teammates message state-mgr with structured [action] messages instead of calling write tools directly. State reads are direct (any teammate, any time).
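
A minimal sketch of the single-writer pattern, assuming an in-process queue in place of teammate messaging and a made-up one-table schema. The message fields (`action`, `target`, `title`) and the `add-vuln` action name are hypothetical; the real [action] message format is not shown in this section.

```python
import queue
import sqlite3


class StateManager:
    """Sole writer: every INSERT goes through drain()."""

    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn
        self.inbox: queue.Queue = queue.Queue()
        # Hypothetical schema, for illustration only.
        conn.execute("CREATE TABLE IF NOT EXISTS vulns (target TEXT, title TEXT)")

    def drain(self) -> int:
        """Apply all queued messages and return the number of rows written."""
        written = 0
        while True:
            try:
                msg = self.inbox.get_nowait()
            except queue.Empty:
                break
            if msg["action"] == "add-vuln":
                self.conn.execute(
                    "INSERT INTO vulns (target, title) VALUES (?, ?)",
                    (msg["target"], msg["title"]),
                )
                written += 1
        self.conn.commit()
        return written


def read_vulns(conn: sqlite3.Connection) -> list:
    """Reads are direct: any teammate can query without going through state-mgr."""
    return conn.execute("SELECT target, title FROM vulns").fetchall()
```

A teammate would call `mgr.inbox.put({...})` where it might otherwise write to the database, and use `read_vulns(conn)` whenever it needs current context.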

Task Lifecycle

What happens when the lead assigns a task to a teammate:

Lead assigns a skill and target to a specific teammate via messaging
Teammate loads the skill via get_skill() from the skill-router MCP
Teammate reads state via get_state_summary() for current context
Teammate executes the skill methodology, messaging state-mgr with findings as they occur
Teammate messages lead with a structured summary on completion
Lead runs post-task checkpoint — audits state, updates vuln statuses, routes next actions
Hard stops fire when applicable (new access → execution achieved, new creds → credential enum, etc.)
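
The last step can be read as a lookup from finding type to mandatory follow-up. The sketch below assumes that "new access" and "new creds" correspond to bracketed `[new-access]` and `[new-cred]` notification tags. The function name is hypothetical, and only the two follow-ups named above are included.

```python
# Follow-ups named in the lifecycle above; other hard stops are omitted.
HARD_STOPS = {
    "new-access": "execution achieved",
    "new-cred": "credential enum",
}


def hard_stops(notifications: list) -> list:
    """Return the follow-ups triggered by a batch of notification tags, in order."""
    fired = []
    for tag in notifications:
        follow_up = HARD_STOPS.get(tag.strip("[]"))
        if follow_up and follow_up not in fired:
            fired.append(follow_up)
    return fired
```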

Engagement Directory

engagement/
├── config.yaml       # Operator preferences (scan type, proxy, spray, cracking)
├── scope.md          # Target scope, credentials, rules of engagement
├── state.db          # SQLite engagement state (managed via state-server MCP)
├── dump-state.sh     # Export state.db as markdown
├── web-proxy.json    # Machine-readable web proxy config
├── web-proxy.sh      # Shell env vars for web proxy
└── evidence/         # Saved output, responses, dumps
    └── logs/         # Teammate JSONL transcripts

The lead creates this directory during engagement setup. State-mgr is the sole writer to state.db. Teammates write evidence files to evidence/. The TeammateIdle hook captures teammate transcripts to evidence/logs/.
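
As a rough sketch of that setup step, the skeleton could be created as follows. The function name is hypothetical, and the sketch deliberately leaves `state.db` alone, on the assumption that the state-server creates and manages it.

```python
from pathlib import Path


def init_engagement(root: Path) -> Path:
    """Create the engagement/ skeleton; state.db is left to the state-server."""
    eng = root / "engagement"
    # evidence/logs/ is where teammate transcripts end up.
    (eng / "evidence" / "logs").mkdir(parents=True, exist_ok=True)
    for name in ("config.yaml", "scope.md"):
        (eng / name).touch(exist_ok=True)
    return eng
```

Because both `mkdir` and `touch` are called with `exist_ok=True`, running it again against an existing engagement leaves existing files untouched.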

See Engagement State for the database schema and Running an Engagement for the full workflow.

Data Flow

State flows through the system via state-mgr:

Teammates discover findings during skill execution
Teammates message state-mgr with structured [action] messages (not direct DB writes)
State-mgr applies LLM-level dedup, validates provenance links, writes to state.db
State-mgr notifies lead of new findings ([new-vuln], [new-cred], [new-access])
Lead runs decision logic — routes findings to the right teammate
Any teammate can read state directly via get_state_summary() at any time

Centralizing writes through state-mgr provides dedup judgment that DB-level constraints can't (e.g., "LFI file read" vs "LFI via absolute path" are the same vuln with different wording). It also enforces the technique-vuln linkage rule: credentials from active techniques must have a corresponding vuln record.
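
The dedup step relies on LLM judgment and does not reduce to code, but the linkage rule is mechanical. A minimal sketch, assuming hypothetical field names (`via_technique`, `vuln_id`, `username`) that may not match the real schema:

```python
def check_linkage(cred: dict, vuln_ids: set) -> None:
    """Reject a technique-derived credential that has no matching vuln record."""
    if cred.get("via_technique") and cred.get("vuln_id") not in vuln_ids:
        raise ValueError(
            f"credential {cred.get('username')!r} has no linked vuln record"
        )


known_vulns = {3}

# Obtained through an active technique and linked to vuln 3: accepted.
check_linkage({"username": "svc_sql", "via_technique": True, "vuln_id": 3}, known_vulns)

# Not obtained through a technique (for example, supplied in scope): no link required.
check_linkage({"username": "admin", "via_technique": False}, known_vulns)
```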

Privilege Boundaries

Claude Code never gets sudo. This is a deliberate design decision — an LLM with root access to your machine is an unnecessary risk, and red-run is architected so it's never needed.

The tools that require elevated privileges are isolated behind MCP servers and Docker containers:

What needs privilege	How red-run handles it	Why not just sudo
`nmap` SYN scans	nmap-server runs nmap inside a Docker container with `--network=host` and minimal capabilities	SYN scans need raw sockets, but Claude doesn't need root — Docker provides the capability isolation
Responder, mitm6, tcpdump	shell-server's `privileged=True` runs commands in the `red-run-shell` Docker container with `NET_RAW`/`NET_ADMIN` capabilities	These daemons need raw sockets for poisoning/sniffing, but the privilege stays inside the container
`/etc/hosts` changes	Orchestrator hits a hard stop — presents the hostnames and asks the operator to add them manually	DNS resolution changes affect the entire system, not just the engagement
Clock skew correction	Orchestrator hits a hard stop — shows the required `ntpdate` or `faketime` command for the operator to run	System clock changes affect every process on the machine

The pattern is consistent: if something needs elevated privilege, either it runs inside a container that has the specific capability, or the orchestrator stops and asks the operator to do it. Claude never runs `sudo` itself.

This also means red-run works without adding the account that runs Claude Code to sudoers or creating NOPASSWD entries on the host. The attack surface is the target, not your machine.
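
The table can be summarized as a routing decision with two outcomes. The sketch below is illustrative: the keys, the `Handling` enum, and the fallback for unlisted needs are assumptions, not part of red-run.

```python
from enum import Enum


class Handling(Enum):
    CONTAINER = "runs inside a container"
    OPERATOR = "hard stop, operator acts"


# Keys are hypothetical labels for the rows of the table above.
ROUTES = {
    "nmap-syn-scan": (Handling.CONTAINER, "nmap-server"),
    "responder": (Handling.CONTAINER, "shell-server privileged=True"),
    "mitm6": (Handling.CONTAINER, "shell-server privileged=True"),
    "tcpdump": (Handling.CONTAINER, "shell-server privileged=True"),
    "etc-hosts": (Handling.OPERATOR, "operator adds the hostnames manually"),
    "clock-skew": (Handling.OPERATOR, "operator runs ntpdate or faketime"),
}


def route(need: str) -> tuple:
    """Pick a handling for a privileged need; there is no sudo branch."""
    # Assumption: anything not listed falls back to asking the operator.
    return ROUTES.get(need, (Handling.OPERATOR, "ask the operator"))
```

The point of the shape is what is missing: no code path ends in running `sudo` on the host.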

You can enforce this at the Claude Code level by adding Bash(sudo *) to the deny list in ~/.claude/settings.json. This makes Claude Code refuse any Bash command starting with sudo, regardless of what an agent or skill tries to do:

{
  "permissions": {
    "deny": [
      "Bash(sudo *)"
    ]
  }
}

This rule comes from the Trail of Bits Claude Code hardening guide, which has other useful deny rules for destructive commands (rm -rf, git push --force, dd, etc.). See Installation for the recommended setup.
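
To confirm the rule is in place, a small check along these lines could be run against the settings file. The function name is hypothetical, and the check only looks for the exact string `Bash(sudo *)` under `permissions.deny`, so equivalent rules written differently would not be detected.

```python
import json


def denies_sudo(settings_text: str) -> bool:
    """Return True if the settings JSON lists Bash(sudo *) under permissions.deny."""
    settings = json.loads(settings_text)
    return "Bash(sudo *)" in settings.get("permissions", {}).get("deny", [])
```

A typical call would read the file first, for example `denies_sudo(Path.home().joinpath(".claude", "settings.json").read_text())` with `Path` imported from `pathlib`.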