Architecture
red-run has two layers: a platform layer that provides capabilities, and a strategy layer that decides how to use them.
Platform vs Strategy
Platform layer (stable)
The platform is the set of reusable components that any engagement can use:
- Teammates — persistent domain teammates (enum/ops pairs) spawned by the orchestrator
- Skills — 67+ technique-specific methodology files loaded on demand
- MCP servers — nmap scanning, shell management, browser automation, skill routing, state tracking
- Engagement state — SQLite database tracking targets, credentials, access, vulns, and pivot paths
- Dashboard — Real-time engagement monitoring with access chain graph
These components don't change based on engagement type. A CTF lab and a client engagement use the same teammates, skills, and servers.
Strategy layer (swappable)
The orchestrator is the strategy layer. It reads engagement state, decides which skill to invoke next, assigns it to the right teammate, and records findings. The default orchestrator (/red-run-ctf) is a CTF/lab orchestrator — it chains aggressively, routes to technique skills, and treats everything in scope as fair game.
A different orchestrator could use the same platform with different decision logic:
- Client engagement orchestrator — mandatory operator approval before technique execution, stricter scope gates, OPSEC-first routing
- Red team orchestrator — stealth-focused, avoids detection signatures, operates within rules of engagement windows
- Training orchestrator — explains each decision, pauses for student input, provides hints
The orchestrator contract is simple: read state, pick a skill, assign it to a teammate, record findings. Everything else is implementation choice.
Architecture Overview
Prompt Architecture
red-run controls behavior through layered prompts, not code. Each layer adds specificity:
| Layer | File | Loaded When | What It Provides |
|---|---|---|---|
| Project | CLAUDE.md |
Every conversation | Architecture rules, conventions, skill routing mandate |
| Teammate | teammates/<name>.md |
Teammate spawns | Role definition, scope constraints, hard stops, state-mgr messaging protocol |
| Skill | skills/<cat>/<name>/SKILL.md |
get_skill() call |
Technique methodology, payloads, troubleshooting |
| Dynamic | Lead's task assignment | Each task | Target info, credential/access IDs, engagement-specific context |
The project layer sets universal rules. The teammate layer constrains to a domain (web-enum only discovers, web-ops only executes techniques). The skill layer provides technique depth. The dynamic prompt carries live context from the lead.
Teammate → MCP Access
All teammates inherit MCP servers from the lead session. In agent teams, MCP servers are shared — a shell session created by one teammate is visible to all others (shell-server runs as a shared SSE service).
| Teammate | Domain | MCP Servers Used |
|---|---|---|
| state-mgr | State management | state (sole writer) |
| net-enum | Network recon | skill-router, nmap-server, shell-server, state |
| web-enum | Web discovery | skill-router, shell-server, browser-server, state |
| web-ops | Web techniques | skill-router, shell-server, browser-server, state |
| ad-enum | AD discovery | skill-router, shell-server, state |
| ad-ops | AD techniques | skill-router, shell-server, state |
| lin-enum / lin-ops | Linux host | skill-router, shell-server, state |
| win-enum / win-ops | Windows host | skill-router, shell-server, rdp-server, state |
| pivot, bypass, spray, recover, research | On-demand specialists | varies |
All state writes are centralized through state-mgr — the sole writer to state.db. Other teammates message state-mgr with structured [action] messages instead of calling write tools directly. State reads are direct (any teammate, any time).
Task Lifecycle
What happens when the lead assigns a task to a teammate:
- Lead assigns a skill and target to a specific teammate via messaging
- Teammate loads the skill via
get_skill()from the skill-router MCP - Teammate reads state via
get_state_summary()for current context - Teammate executes the skill methodology, messaging state-mgr with findings as they occur
- Teammate messages lead with a structured summary on completion
- Lead runs post-task checkpoint — audits state, updates vuln statuses, routes next actions
- Hard stops fire when applicable (new access → execution achieved, new creds → credential enum, etc.)
Engagement Directory
engagement/
├── config.yaml # Operator preferences (scan type, proxy, spray, cracking)
├── scope.md # Target scope, credentials, rules of engagement
├── state.db # SQLite engagement state (managed via state-server MCP)
├── dump-state.sh # Export state.db as markdown
├── web-proxy.json # Machine-readable web proxy config
├── web-proxy.sh # Shell env vars for web proxy
└── evidence/ # Saved output, responses, dumps
└── logs/ # Teammate JSONL transcripts
The lead creates this directory during engagement setup. State-mgr is the sole writer to state.db. Teammates write evidence files to evidence/. The TeammateIdle hook captures teammate transcripts to evidence/logs/.
See Engagement State for the database schema and Running an Engagement for the full workflow.
Data Flow
State flows through the system via state-mgr:
- Teammates discover findings during skill execution
- Teammates message state-mgr with structured
[action]messages (not direct DB writes) - State-mgr applies LLM-level dedup, validates provenance links, writes to state.db
- State-mgr notifies lead of new findings (
[new-vuln],[new-cred],[new-access]) - Lead runs decision logic — routes findings to the right teammate
- Any teammate can read state directly via
get_state_summary()at any time
Centralizing writes through state-mgr provides dedup judgment that DB-level constraints can't (e.g., "LFI file read" vs "LFI via absolute path" are the same vuln with different wording). It also enforces the technique-vuln linkage rule: credentials from active techniques must have a corresponding vuln record.
Privilege Boundaries
Claude Code never gets sudo. This is a deliberate design decision — an LLM with root access to your machine is an unnecessary risk, and red-run is architected so it's never needed.
The tools that require elevated privileges are isolated behind MCP servers and Docker containers:
| What needs privilege | How red-run handles it | Why not just sudo |
|---|---|---|
nmap SYN scans |
nmap-server runs nmap inside a Docker container with --network=host and minimal capabilities |
SYN scans need raw sockets, but Claude doesn't need root — Docker provides the capability isolation |
| Responder, mitm6, tcpdump | shell-server's privileged=True runs commands in the red-run-shell Docker container with NET_RAW/NET_ADMIN capabilities |
These daemons need raw sockets for poisoning/sniffing, but the privilege stays inside the container |
/etc/hosts changes |
Orchestrator hits a hard stop — presents the hostnames and asks the operator to add them manually | DNS resolution changes affect the entire system, not just the engagement |
| Clock skew correction | Orchestrator hits a hard stop — shows the required ntpdate or faketime command for the operator to run |
System clock changes affect every process on the machine |
The pattern is consistent: if something needs elevated privilege, either it runs inside a container that has the specific capability, or the orchestrator stops and asks the operator to do it. Claude never runs sudo itself. |
This also means red-run works without adding Claude Code to sudoers or NOPASSWD entries for privilege escalation on the host. The attack surface is the target, not your machine.
You can enforce this at the Claude Code level by adding Bash(sudo *) to the deny list in ~/.claude/settings.json. This makes Claude Code refuse any Bash command starting with sudo, regardless of what an agent or skill tries to do:
{
"permissions": {
"deny": [
"Bash(sudo *)"
]
}
}
This comes from the Trail of Bits Claude Code hardening guide, which has other useful deny rules for destructive commands (rm -rf, git push --force, dd, etc.). See Installation for the recommended setup.