Architecture
How model slots, the orchestrator, bridges, and episodes work together.
Overview
Iron Rain is organized in four layers:
```
┌─────────────────────────────────────────────────┐
│ CLI / TUI           User interface layer        │
│   @howlerops/iron-rain-cli                      │
│   @howlerops/iron-rain-tui                      │
├─────────────────────────────────────────────────┤
│ Orchestrator        Dispatch + routing layer    │
│   OrchestratorKernel                            │
│   ├── ModelSlotManager                          │
│   ├── ToolRouter                                │
│   └── SlotWorker (per slot)                     │
├─────────────────────────────────────────────────┤
│ Context             @ references + injection    │
│   parseReferences │ DispatchController          │
├─────────────────────────────────────────────────┤
│ Bridges             Provider abstraction layer  │
│   AnthropicBridge │ OllamaBridge │ OpenAICompat │
│   ClaudeCodeBridge │ CodexBridge │ GeminiBridge │
└─────────────────────────────────────────────────┘
```
Model Slots
The ModelSlotManager holds three named slots, each assigned to a provider + model pair:
- Cortex (main) — Strategy, planning, high-level reasoning
- Scout (explore) — Search, read, codebase understanding
- Forge (execute) — Code edits, bash commands, file writes
Slots can be reassigned at runtime via slots.setSlot() or by modifying the config. The manager is the single source of truth for which model handles which category of work.
Key methods
```typescript
class ModelSlotManager {
  getSlot(name: SlotName): SlotConfig;
  setSlot(name: SlotName, config: SlotConfig): void;
  cycleSlot(name: SlotName, available: SlotConfig[]): SlotConfig;
  getSlotForTool(toolType: ToolType): SlotName;
  serialize(): SlotAssignment;
}
```
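The cycling behavior can be sketched as follows. This is an illustrative sketch with simplified `SlotName`/`SlotConfig` shapes, not the real package types:

```typescript
type SlotName = "main" | "explore" | "execute";

interface SlotConfig {
  provider: string;
  model: string;
}

class SlotManagerSketch {
  private slots = new Map<SlotName, SlotConfig>();

  getSlot(name: SlotName): SlotConfig | undefined {
    return this.slots.get(name);
  }

  setSlot(name: SlotName, config: SlotConfig): void {
    this.slots.set(name, config);
  }

  // Advance the slot to the next available config, wrapping around the list.
  cycleSlot(name: SlotName, available: SlotConfig[]): SlotConfig {
    const current = this.getSlot(name);
    const idx = available.findIndex((c) => c.model === current?.model);
    const next = available[(idx + 1) % available.length];
    this.setSlot(name, next);
    return next;
  }
}
```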
Tool Router
The tool router automatically maps tool types to the correct slot:
| Tool type | Routed to | Rationale |
|---|---|---|
| edit, write, bash | Forge | Needs precise instruction following |
| grep, glob, read, search | Scout | Read-only, can use a cheaper model |
| strategy, plan, conversation | Cortex | Needs the most capable model |
You can override routing by setting targetSlot explicitly on an OrchestratorTask.
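The routing table above can be sketched as a plain lookup with `targetSlot` taking precedence. This sketch uses the internal `main`/`explore`/`execute` identifiers and is not the actual ToolRouter source:

```typescript
type SlotName = "main" | "explore" | "execute";
type ToolType =
  | "edit" | "write" | "bash"
  | "grep" | "glob" | "read" | "search"
  | "strategy" | "plan" | "conversation";

// Default routing, mirroring the table above.
const TOOL_ROUTES: Record<ToolType, SlotName> = {
  edit: "execute", write: "execute", bash: "execute",
  grep: "explore", glob: "explore", read: "explore", search: "explore",
  strategy: "main", plan: "main", conversation: "main",
};

// An explicit targetSlot on the task overrides the default route.
function routeTask(toolType: ToolType, targetSlot?: SlotName): SlotName {
  return targetSlot ?? TOOL_ROUTES[toolType];
}
```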
Orchestrator Kernel
The OrchestratorKernel is the central dispatch system. It receives tasks, routes them to the correct slot worker, and records episode summaries.
Task dispatch flow
1. Task comes in with a toolType (e.g., "edit")
2. Tool router maps "edit" → execute slot
3. Kernel finds the SlotWorker for execute
4. Worker calls the bridge (e.g., OpenAICompatBridge)
5. Bridge makes the API call and returns a BridgeResult
6. Worker wraps result into a WorkerResult
7. Kernel converts it to an EpisodeSummary and logs it
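The seven steps above can be compressed into a minimal sketch, using local stand-ins for the real bridge and worker types:

```typescript
type SlotName = "main" | "explore" | "execute";

interface BridgeResult { output: string; tokens: number; }

// Stand-in for a bridge call (step 5 would be a real API request).
type Bridge = (prompt: string) => Promise<BridgeResult>;

async function dispatchSketch(
  task: { toolType: string; prompt: string },
  route: (toolType: string) => SlotName,      // step 2: tool router
  workers: Record<SlotName, Bridge>,          // step 3: one worker per slot
) {
  const slot = route(task.toolType);          // steps 1-2: map toolType to a slot
  const result = await workers[slot](task.prompt); // steps 4-5: bridge call
  return {                                    // steps 6-7: wrap into an episode summary
    slot,
    task: task.prompt,
    result: result.output,
    tokens: result.tokens,
    status: "success" as const,
  };
}
```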
Key methods
```typescript
class OrchestratorKernel {
  // Dispatch a single task to the appropriate slot
  dispatch(task: OrchestratorTask): Promise<EpisodeSummary>;

  // Dispatch multiple tasks in parallel
  orchestrate(tasks: OrchestratorTask[]): Promise<EpisodeSummary[]>;

  // Merge external episodes into the log
  integrate(episodes: EpisodeSummary[]): void;

  // Get all recorded episodes
  getEpisodes(): ReadonlyArray<EpisodeSummary>;

  // Re-initialize workers after config change
  refreshWorkers(): void;
}
```
Parallel orchestration
The orchestrate() method dispatches all tasks in parallel using Promise.allSettled, so tasks on different slots run concurrently. Failed tasks still produce an episode with status: 'failure'.
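The allSettled pattern can be sketched like this, with failures converted into episodes rather than rejected promises. The `EpisodeLike` shape here is a simplification of the real summary type:

```typescript
interface EpisodeLike {
  id: string;
  status: "success" | "failure";
  result: string;
}

async function orchestrateSketch(
  tasks: Array<() => Promise<string>>,
): Promise<EpisodeLike[]> {
  // allSettled never rejects: each task settles independently.
  const settled = await Promise.allSettled(tasks.map((t) => t()));
  return settled.map((s, i) =>
    s.status === "fulfilled"
      ? { id: String(i), status: "success", result: s.value }
      : { id: String(i), status: "failure", result: String(s.reason) },
  );
}
```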
Bridges
Bridges are the abstraction layer between slots and providers. Every bridge implements the CLIBridge interface:
```typescript
interface CLIBridge {
  name: string;
  available(): Promise<boolean>;
  execute(prompt: string, options?: BridgeOptions): Promise<BridgeResult>;
  stream(prompt: string, options?: BridgeOptions): AsyncIterable<BridgeChunk>;
}
```
Streaming chunks
The stream() method yields BridgeChunk objects with structured tool call data:
```typescript
interface BridgeChunk {
  type: "text" | "thinking" | "tool_use" | "error" | "done";
  content: string;
  tokens?: { input: number; output: number };
  toolCall?: { id: string; name: string; status: "start" | "end" };
}
```
Tool call chunks carry enriched display names (e.g. Read schema.ts instead of just Read), derived from tool input arguments such as file_path, pattern, or command. Chunks flow through the pipeline Bridge → SlotWorker.stream() → OrchestratorKernel.dispatchStreaming() → DispatchController → UI signals.
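A consumer of this chunk stream can be sketched as follows. The fake stream and `collect` helper here are illustrative, not part of the package:

```typescript
interface ChunkSketch {
  type: "text" | "thinking" | "tool_use" | "error" | "done";
  content: string;
  toolCall?: { id: string; name: string; status: "start" | "end" };
}

// A fake bridge stream with one enriched tool call and a text chunk.
async function* fakeStream(): AsyncIterable<ChunkSketch> {
  yield { type: "tool_use", content: "", toolCall: { id: "1", name: "Read schema.ts", status: "start" } };
  yield { type: "tool_use", content: "", toolCall: { id: "1", name: "Read schema.ts", status: "end" } };
  yield { type: "text", content: "The schema defines..." };
  yield { type: "done", content: "" };
}

// Accumulate text and record each tool call once, at its "start" chunk.
async function collect(stream: AsyncIterable<ChunkSketch>) {
  let text = "";
  const tools: string[] = [];
  for await (const chunk of stream) {
    if (chunk.type === "text") text += chunk.content;
    if (chunk.type === "tool_use" && chunk.toolCall?.status === "start") {
      tools.push(chunk.toolCall.name);
    }
    if (chunk.type === "done") break;
  }
  return { text, tools };
}
```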
Bridge types
| Bridge | Type | How it works |
|---|---|---|
| AnthropicBridge | API | Direct HTTP to Anthropic Messages API |
| OpenAICompatBridge | API | HTTP to any OpenAI-compatible endpoint |
| OllamaBridge | API | HTTP to local Ollama server |
| GeminiBridge | API | HTTP to Google Generative Language API |
| ClaudeCodeBridge | CLI | Spawns claude -p subprocess |
| CodexBridge | CLI | Spawns codex exec subprocess |
| GeminiCLIBridge | CLI | Spawns gemini -p subprocess |
Episodes
Every task dispatch produces an EpisodeSummary — the unit of observability in Iron Rain.
```typescript
interface EpisodeSummary {
  id: string;                // Unique episode ID
  slot: SlotName;            // Which slot ran this task
  task: string;              // The original prompt
  result: string;            // The model's response
  tokens: number;            // Total tokens (input + output)
  duration: number;          // Time in milliseconds
  filesModified?: string[];  // Files changed
  status: 'success' | 'failure' | 'partial';
}
```
Episodes accumulate in the kernel and can be inspected for debugging, cost tracking, or fed into context management systems.
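For example, token usage can be aggregated per slot from the accumulated episodes. This is a sketch over a reduced episode shape, not a package API:

```typescript
interface EpisodeTokens {
  slot: string;
  tokens: number;
}

// Sum total tokens per slot, e.g. as an input to cost tracking.
function tokensBySlot(episodes: EpisodeTokens[]): Record<string, number> {
  return episodes.reduce<Record<string, number>>((acc, e) => {
    acc[e.slot] = (acc[e.slot] ?? 0) + e.tokens;
    return acc;
  }, {});
}
```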
Context References
The context reference system lets users inject external context into prompts using @ prefixes. This is handled by the parseReferences() function in the core package.
```
┌─────────────────────────────────────────────────┐
│ User Input                                      │
│   "@./file.ts @git:diff explain changes"        │
├─────────────────────────────────────────────────┤
│ parseReferences()                               │
│   ├── Detect slot routing (@cortex, @scout...)  │
│   ├── Resolve file refs → read + wrap in tags   │
│   ├── Resolve dir refs → listing                │
│   ├── Resolve git refs → exec whitelisted cmds  │
│   └── Resolve image refs → base64 encode        │
├─────────────────────────────────────────────────┤
│ DispatchController.buildTask()                  │
│   Injects resolved content into system prompt   │
└─────────────────────────────────────────────────┘
```
Reference types
| Type | Syntax | Resolved as |
|---|---|---|
| File | @./path or @file:path | File contents in <file> tags (100KB max) |
| Directory | @dir:path | Directory listing in <directory> tags |
| Git | @git:cmd | Git command output in <git> tags (5s timeout) |
| Image | @image:path or @./img.png | Base64 data for multimodal models (20MB max) |
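Reference detection alone can be sketched with a regex. This is illustrative only: the real parseReferences() also resolves content, handles slot routing, and treats image extensions specially:

```typescript
// Matches "@type:target" prefixed refs and bare "@./path" file refs.
const REF_RE = /@(?:(git|dir|file|image):(\S+)|(\.\/\S+))/g;

function findReferences(input: string): Array<{ type: string; target: string }> {
  const refs: Array<{ type: string; target: string }> = [];
  for (const m of input.matchAll(REF_RE)) {
    // Group 3 is the bare "@./path" form; groups 1-2 are the "@type:target" form.
    refs.push(m[3] ? { type: "file", target: m[3] } : { type: m[1], target: m[2] });
  }
  return refs;
}
```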
Multimodal Support
Bridges support multimodal ChatMessage content. The MessageContent type is a union of string and Array<TextContent | ImageContent>. Image references are encoded as base64 and passed through to providers that support vision (Anthropic, OpenAI, Gemini). Other providers receive the text-only portion via the getTextContent() helper.
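The text-only fallback shape can be sketched like this, with simplified content part types standing in for the real ones:

```typescript
type TextPart = { type: "text"; text: string };
type ImagePart = { type: "image"; data: string; mediaType: string };
type MessageContentSketch = string | Array<TextPart | ImagePart>;

// Keep only the text parts for providers without vision support.
function getTextContentSketch(content: MessageContentSketch): string {
  if (typeof content === "string") return content;
  return content
    .filter((p): p is TextPart => p.type === "text")
    .map((p) => p.text)
    .join("\n");
}
```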
Planner
The planner subsystem (packages/core/src/planner/) provides two high-level execution patterns:
Plan Generation & Execution
1. /plan "Add auth" → PlanGenerator
2. Cortex generates PRD (system prompt pass)
3. Cortex breaks PRD into tasks (JSON array)
4. User reviews: approve / reject / edit
5. PlanExecutor runs tasks sequentially via Forge
6. Each task gets prior results as context
7. Optional auto-commit after each task
Plans are stored in .iron-rain/plans/<id>/ and can be paused, resumed, or listed.
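The sequential execution in steps 5-6 can be sketched as follows; the `runTask` callback is a hypothetical stand-in for a Forge dispatch:

```typescript
// Each task receives all prior results as context (step 6 above).
async function executePlanSketch(
  tasks: string[],
  runTask: (task: string, priorResults: string[]) => Promise<string>,
): Promise<string[]> {
  const results: string[] = [];
  for (const task of tasks) {
    // Copy so the callback cannot mutate the accumulated history.
    results.push(await runTask(task, [...results]));
  }
  return results;
}
```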
Ralph Loop (iterative execution)
The RalphLoop runs a task repeatedly until a completion condition is met. Each iteration includes prior actions as context, enabling progressive refinement. If stuck for 3+ iterations, it auto-suggests a different approach.
1. /loop "Fix tests" --until "ALL TESTS PASSING"
2. Iteration 1: Forge attempts fix → check condition
3. Iteration 2: Forge sees prior result → tries again
4. ... until condition is met or max iterations reached
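The loop-until pattern above can be sketched like this. Names and signatures are illustrative, not the real RalphLoop API, and the stuck-detection heuristic is omitted:

```typescript
async function loopUntil(
  run: (priorActions: string[]) => Promise<string>,   // one iteration, with history as context
  isDone: (result: string) => boolean,                // the completion condition
  maxIterations = 10,
): Promise<{ result: string; iterations: number }> {
  const history: string[] = [];
  for (let i = 1; i <= maxIterations; i++) {
    const result = await run(history);  // each iteration sees prior actions
    history.push(result);
    if (isDone(result)) return { result, iterations: i };
  }
  return { result: history[history.length - 1] ?? "", iterations: maxIterations };
}
```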
Context Compaction (RLM)
Long conversations are automatically managed using a Retrieval-augmented Language Model pattern:
- Hot window — last 6 messages kept verbatim
- Cold archive — older messages summarized into a compact form
- RLM retrieval — keyword-based relevance scoring pulls relevant old messages back into context
Compaction triggers when the message count exceeds 8 (configurable). Token estimation uses ~4 chars = 1 token.
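The hot/cold split and the ~4-chars-per-token estimate described above can be sketched as:

```typescript
// Last 6 messages stay verbatim; older ones are candidates for summarization.
const HOT_WINDOW = 6;

// Rough token estimate: ~4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function splitForCompaction<T>(messages: T[]): { cold: T[]; hot: T[] } {
  const cut = Math.max(0, messages.length - HOT_WINDOW);
  return { cold: messages.slice(0, cut), hot: messages.slice(cut) };
}
```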
Error Handling & Resilience
All bridges use a shared error handling layer:
- BridgeError — typed errors with status code and provider name
- Exponential backoff — with jitter (50-100% of computed delay), retries on 429 and 5xx errors
- Circuit breaker — opens after 5 consecutive failures, auto-resets after 60 seconds
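The retry-delay math can be sketched as exponential growth capped at a ceiling, then scaled by 50-100% jitter. The base and cap values here are illustrative assumptions, not the library's actual defaults:

```typescript
// Returns a retry delay for the given attempt number (0-based).
function backoffDelay(attempt: number, baseMs = 500, capMs = 30_000): number {
  const computed = Math.min(capMs, baseMs * 2 ** attempt);
  const jitter = 0.5 + Math.random() * 0.5; // keep 50-100% of the computed delay
  return computed * jitter;
}
```

Jitter spreads out retries from many concurrent callers so they do not hammer a recovering provider in lockstep.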
Skills System
Skills are markdown files with YAML frontmatter that extend Iron Rain with custom slash commands.
- Auto-discovery — scans .iron-rain/skills/, ~/.iron-rain/skills/, and .claude/skills/
- Execution — skill instructions are injected into the system prompt for the dispatch
- Registration — skills appear in the slash command menu and in the /skills list
Session Persistence
The TUI uses a SQLite database (~/.iron-rain/sessions.db) for persistence:
- Sessions — stored with model, timestamps, and message history
- Lessons — persistent cross-session memory with optional TTL and full-text search
- Activities — subagent activities per message (slot, task, status, duration, tokens)
Falls back to a no-op NullSessionDB in non-Bun environments.
Mid-Stream Context Injection
The DispatchController supports pausing a streaming response to inject additional user context:
- User submits text while the agent is streaming
- injectAndContinue() aborts the current stream
- The partial response is saved as an assistant message
- A continuation prompt is dispatched with the full updated history
This enables iterative refinement without losing the agent's partial work.
Subagent Display
When tasks run across multiple slots, the TUI shows a responsive SubagentGrid — a flex-wrap grid of cards showing:
- Status dot (running/done/error/interrupted) with slot label
- Task title and tool-call tree with checkmark/error markers
- Footer with duration, token count, and cost
Cards adapt to terminal width via flexGrow + minWidth.
Streaming Agent Card
During active streaming, a StreamingAgentCard replaces the simple spinner with a live-updating bordered card that shows:
- Animated spinner with slot label and truncated task description
- Tool call tree with ├/└ connectors and ✓/✗/→ status indicators
- Lifecycle phases: "System prompt loaded", "Thinking..." entries
- Truncated content preview as text streams in
- Live elapsed timer with cancel/inject hints
The DispatchController accumulates tool calls from tool_use chunks during streaming and copies them into the final SlotActivity.toolCalls for the completed message display.
Packages
| Package | Purpose |
|---|---|
| @howlerops/iron-rain | Core library — slots, orchestrator, bridges, context references, config. Zero UI deps. |
| @howlerops/iron-rain-tui | Terminal UI components built with SolidJS — session view, subagent grid, settings |
| @howlerops/iron-rain-cli | CLI binary — launches TUI or runs headless |
| @howlerops/iron-rain-plugin | Plugin SDK — hooks and type definitions |