Architecture

How model slots, the orchestrator, bridges, and episodes work together.

Overview

Iron Rain is organized in four layers:

┌─────────────────────────────────────────────────┐
  CLI / TUI           User interface layer         
  @howlerops/iron-rain-cli                        
  @howlerops/iron-rain-tui                        
├─────────────────────────────────────────────────┤
  Orchestrator        Dispatch + routing layer    
  OrchestratorKernel                              
  ├── ModelSlotManager                            
  ├── ToolRouter                                  
  └── SlotWorker (per slot)                       
├─────────────────────────────────────────────────┤
  Context             @ references + injection   
  parseReferences │ DispatchController            
├─────────────────────────────────────────────────┤
  Bridges             Provider abstraction layer  
  AnthropicBridge │ OllamaBridge │ OpenAICompat   
  ClaudeCodeBridge │ CodexBridge │ GeminiBridge   
└─────────────────────────────────────────────────┘

Model Slots

The ModelSlotManager holds three named slots, each assigned to a provider + model pair:

Slot     Role
Cortex   Strategy, planning, and conversation
Forge    Edits, writes, and shell execution
Scout    Read-only search and retrieval

Slots can be reassigned at runtime via slots.setSlot() or by modifying the config. The manager is the single source of truth for which model handles which category of work.

Key methods

class ModelSlotManager {
  getSlot(name: SlotName): SlotConfig;
  setSlot(name: SlotName, config: SlotConfig): void;
  cycleSlot(name: SlotName, available: SlotConfig[]): SlotConfig;
  getSlotForTool(toolType: ToolType): SlotName;
  serialize(): SlotAssignment;
}
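A minimal sketch of how setSlot() and cycleSlot() might behave. The SlotConfig fields (provider, model) and the wrap-around cycling are assumptions for illustration; the real class carries more logic:

```typescript
// Illustrative sketch only; real fields and defaults may differ.
type SlotName = "cortex" | "forge" | "scout";
interface SlotConfig { provider: string; model: string; }

class ModelSlotManager {
  private slots = new Map<SlotName, SlotConfig>();

  getSlot(name: SlotName): SlotConfig | undefined {
    return this.slots.get(name);
  }

  setSlot(name: SlotName, config: SlotConfig): void {
    this.slots.set(name, config);
  }

  // Advance the slot to the next entry in `available`, wrapping around.
  cycleSlot(name: SlotName, available: SlotConfig[]): SlotConfig {
    const current = this.slots.get(name);
    const i = available.findIndex(
      (c) => c.provider === current?.provider && c.model === current?.model,
    );
    const next = available[(i + 1) % available.length];
    this.slots.set(name, next);
    return next;
  }
}

// Reassign Scout to a cheaper local model at runtime.
const slots = new ModelSlotManager();
slots.setSlot("scout", { provider: "ollama", model: "llama3.1:8b" });
```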

Tool Router

The tool router automatically maps tool types to the correct slot:

Tool type                      Routed to   Rationale
edit, write, bash              Forge       Needs precise instruction following
grep, glob, read, search       Scout       Read-only, can use a cheaper model
strategy, plan, conversation   Cortex      Needs the most capable model

You can override routing by setting targetSlot explicitly on an OrchestratorTask.
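The override semantics can be sketched as follows. The routing table mirrors the one above; resolveSlot and the fallback to Cortex are illustrative, not the real implementation:

```typescript
// Illustrative slot resolution with an explicit targetSlot override.
type SlotName = "cortex" | "forge" | "scout";

const toolRouting: Record<string, SlotName> = {
  edit: "forge", write: "forge", bash: "forge",
  grep: "scout", glob: "scout", read: "scout", search: "scout",
  strategy: "cortex", plan: "cortex", conversation: "cortex",
};

interface OrchestratorTask {
  toolType: string;
  prompt: string;
  targetSlot?: SlotName; // explicit override wins over the routing table
}

function resolveSlot(task: OrchestratorTask): SlotName {
  // Assumed fallback to Cortex for unknown tool types.
  return task.targetSlot ?? toolRouting[task.toolType] ?? "cortex";
}
```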

Orchestrator Kernel

The OrchestratorKernel is the central dispatch system. It receives tasks, routes them to the correct slot worker, and records episode summaries.

Task dispatch flow

1. Task comes in with a toolType (e.g., "edit")
2. Tool router maps "edit" → Forge slot
3. Kernel finds the SlotWorker for Forge
4. Worker calls the bridge (e.g., OpenAICompatBridge)
5. Bridge makes the API call and returns a BridgeResult
6. Worker wraps result into a WorkerResult
7. Kernel converts it to an EpisodeSummary and logs it

Key methods

class OrchestratorKernel {
  // Dispatch a single task to the appropriate slot
  dispatch(task: OrchestratorTask): Promise<EpisodeSummary>;

  // Dispatch multiple tasks in parallel
  orchestrate(tasks: OrchestratorTask[]): Promise<EpisodeSummary[]>;

  // Merge external episodes into the log
  integrate(episodes: EpisodeSummary[]): void;

  // Get all recorded episodes
  getEpisodes(): ReadonlyArray<EpisodeSummary>;

  // Re-initialize workers after config change
  refreshWorkers(): void;
}

Parallel orchestration

The orchestrate() method dispatches all tasks in parallel using Promise.allSettled, so tasks on different slots run concurrently. Failed tasks still produce an episode with status: 'failure'.
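A sketch of these semantics with Promise.allSettled, using simplified episode fields (the real EpisodeSummary carries more, and the failure-episode shape here is an assumption):

```typescript
// Illustrative: parallel dispatch where a rejected task still yields an episode.
interface EpisodeSummary { id: string; status: "success" | "failure"; result: string; }

async function orchestrate(
  tasks: string[],
  dispatch: (task: string) => Promise<EpisodeSummary>,
): Promise<EpisodeSummary[]> {
  // allSettled never short-circuits, so tasks on different slots run concurrently
  // and one failure does not abort the others.
  const settled = await Promise.allSettled(tasks.map(dispatch));
  return settled.map((s, i): EpisodeSummary =>
    s.status === "fulfilled"
      ? s.value
      : { id: `task-${i}`, status: "failure", result: String(s.reason) },
  );
}
```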

Bridges

Bridges are the abstraction layer between slots and providers. Every bridge implements the CLIBridge interface:

interface CLIBridge {
  name: string;
  available(): Promise<boolean>;
  execute(prompt: string, options?: BridgeOptions): Promise<BridgeResult>;
  stream(prompt: string, options?: BridgeOptions): AsyncIterable<BridgeChunk>;
}
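A toy bridge that satisfies this interface shows the contract in action. EchoBridge is illustrative only, and the BridgeResult/BridgeChunk fields here are simplified assumptions rather than the real types:

```typescript
// Simplified shapes for illustration; the real types carry more fields.
interface BridgeOptions { model?: string; }
interface BridgeResult { output: string; tokens: { input: number; output: number }; }
interface BridgeChunk { type: "text" | "done"; content: string; }

interface CLIBridge {
  name: string;
  available(): Promise<boolean>;
  execute(prompt: string, options?: BridgeOptions): Promise<BridgeResult>;
  stream(prompt: string, options?: BridgeOptions): AsyncIterable<BridgeChunk>;
}

class EchoBridge implements CLIBridge {
  name = "echo";

  async available(): Promise<boolean> {
    return true; // a real bridge would probe the API or check for the binary on PATH
  }

  async execute(prompt: string): Promise<BridgeResult> {
    return { output: prompt, tokens: { input: prompt.length, output: prompt.length } };
  }

  async *stream(prompt: string): AsyncIterable<BridgeChunk> {
    for (const word of prompt.split(" ")) {
      yield { type: "text", content: word };
    }
    yield { type: "done", content: "" };
  }
}
```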

Streaming chunks

The stream() method yields BridgeChunk objects with structured tool call data:

interface BridgeChunk {
  type: "text" | "thinking" | "tool_use" | "error" | "done";
  content: string;
  tokens?: { input: number; output: number };
  toolCall?: { id: string; name: string; status: "start" | "end" };
}

Tool call chunks carry enriched display names (e.g. Read schema.ts instead of just Read) derived from tool input arguments like file_path, pattern, or command. The pipeline flows: Bridge → SlotWorker.stream() → OrchestratorKernel.dispatchStreaming() → DispatchController → UI signals.
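The name enrichment might look like this. The field names (file_path, pattern, command) come from the text above; the helper itself and the basename shortening are assumptions:

```typescript
// Hypothetical helper: derive an enriched display name from tool-call input.
function displayName(tool: string, input: Record<string, string | undefined>): string {
  const detail = input.file_path ?? input.pattern ?? input.command;
  if (!detail) return tool;
  // Show only the basename of a path to keep the display compact.
  const short = detail.includes("/") ? detail.split("/").pop()! : detail;
  return `${tool} ${short}`;
}
```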

Bridge types

Bridge               Type   How it works
AnthropicBridge      API    Direct HTTP to the Anthropic Messages API
OpenAICompatBridge   API    HTTP to any OpenAI-compatible endpoint
OllamaBridge         API    HTTP to a local Ollama server
GeminiBridge         API    HTTP to the Google Generative Language API
ClaudeCodeBridge     CLI    Spawns a claude -p subprocess
CodexBridge          CLI    Spawns a codex exec subprocess
GeminiCLIBridge      CLI    Spawns a gemini -p subprocess

Episodes

Every task dispatch produces an EpisodeSummary — the unit of observability in Iron Rain.

interface EpisodeSummary {
  id: string;            // Unique episode ID
  slot: SlotName;        // Which slot ran this task
  task: string;          // The original prompt
  result: string;        // The model's response
  tokens: number;        // Total tokens (input + output)
  duration: number;      // Time in milliseconds
  filesModified?: string[];  // Files changed
  status: 'success' | 'failure' | 'partial';
}

Episodes accumulate in the kernel and can be inspected for debugging, cost tracking, or fed into context management systems.
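For example, a cost-tracking pass over the episode log reduces token counts per slot (a minimal sketch; only the fields it touches are declared):

```typescript
// Aggregate total tokens per slot from recorded episodes.
interface EpisodeSummary { slot: string; tokens: number; duration: number; }

function tokensBySlot(episodes: EpisodeSummary[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const ep of episodes) {
    totals.set(ep.slot, (totals.get(ep.slot) ?? 0) + ep.tokens);
  }
  return totals;
}
```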

Context References

The context reference system lets users inject external context into prompts using @ prefixes. This is handled by the parseReferences() function in the core package.

┌─────────────────────────────────────────────────┐
  User Input                                      
  "@./file.ts @git:diff explain changes"          
├─────────────────────────────────────────────────┤
  parseReferences()                               
  ├── Detect slot routing (@cortex, @scout...)    
  ├── Resolve file refs → read + wrap in tags     
  ├── Resolve dir refs → listing                  
  ├── Resolve git refs → exec whitelisted cmds    
  └── Resolve image refs → base64 encode          
├─────────────────────────────────────────────────┤
  DispatchController.buildTask()                  
  Injects resolved content into system prompt     
└─────────────────────────────────────────────────┘

Reference types

Type        Syntax                      Resolved as
File        @./path or @file:path       File contents in <file> tags (100KB max)
Directory   @dir:path                   Directory listing in <directory> tags
Git         @git:cmd                    Git command output in <git> tags (5s timeout)
Image       @image:path or @./img.png   Base64 data for multimodal models (20MB max)
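The classification step behind the table above can be sketched as follows. This is a hypothetical detector, not parseReferences() itself (whose real signature the doc does not show); the prefix rules mirror the syntax column:

```typescript
// Hypothetical reference classifier mirroring the syntax table.
type RefKind = "file" | "dir" | "git" | "image" | "slot";

function classifyRef(token: string): RefKind | null {
  if (!token.startsWith("@")) return null;
  const body = token.slice(1);
  if (["cortex", "forge", "scout"].includes(body)) return "slot";
  if (body.startsWith("dir:")) return "dir";
  if (body.startsWith("git:")) return "git";
  // Image extensions are checked before the generic file rule so @./img.png
  // resolves as an image, per the table.
  if (body.startsWith("image:") || /\.(png|jpe?g|gif|webp)$/i.test(body)) return "image";
  if (body.startsWith("file:") || body.startsWith("./") || body.startsWith("/")) return "file";
  return null;
}
```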

Multimodal Support

Bridges support multimodal ChatMessage content. The MessageContent type is a union of string and Array<TextContent | ImageContent>. Image references are encoded as base64 and passed through to providers that support vision (Anthropic, OpenAI, Gemini). Other providers receive the text-only portion via the getTextContent() helper.
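A getTextContent-style helper might look like this. The union shape follows the description above, but the exact field names on TextContent/ImageContent are assumptions:

```typescript
// Sketch of extracting the text-only portion for non-vision providers.
interface TextContent { type: "text"; text: string; }
interface ImageContent { type: "image"; data: string; mediaType: string; }
type MessageContent = string | Array<TextContent | ImageContent>;

function getTextContent(content: MessageContent): string {
  if (typeof content === "string") return content;
  return content
    .filter((part): part is TextContent => part.type === "text")
    .map((part) => part.text)
    .join("\n");
}
```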

Planner

The planner subsystem (packages/core/src/planner/) provides two high-level execution patterns:

Plan Generation & Execution

1. /plan "Add auth" → PlanGenerator
2. Cortex generates PRD (system prompt pass)
3. Cortex breaks PRD into tasks (JSON array)
4. User reviews: approve / reject / edit
5. PlanExecutor runs tasks sequentially via Forge
6. Each task gets prior results as context
7. Optional auto-commit after each task

Plans are stored in .iron-rain/plans/<id>/ and can be paused, resumed, or listed.

Ralph Loop (iterative execution)

The RalphLoop runs a task repeatedly until a completion condition is met. Each iteration includes prior actions as context, enabling progressive refinement. If stuck for 3+ iterations, it auto-suggests a different approach.

1. /loop "Fix tests" --until "ALL TESTS PASSING"
2. Iteration 1: Forge attempts fix → check condition
3. Iteration 2: Forge sees prior result → tries again
4. ... until condition is met or max iterations reached
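The loop above can be sketched as follows. The callback shapes and the return value are illustrative; the real RalphLoop also handles stuck detection and approach suggestions, which are omitted here:

```typescript
// Simplified Ralph-style loop: run until a completion condition or max iterations.
async function ralphLoop(
  run: (history: string[]) => Promise<string>,
  done: (result: string) => boolean,
  maxIterations = 10,
): Promise<{ result: string; iterations: number }> {
  const history: string[] = [];
  for (let i = 1; i <= maxIterations; i++) {
    const result = await run(history); // prior results flow back in as context
    history.push(result);
    if (done(result)) return { result, iterations: i };
  }
  return { result: history[history.length - 1] ?? "", iterations: maxIterations };
}
```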

Context Compaction (RLM)

Long conversations are automatically managed using a Retrieval-augmented Language Model (RLM) pattern. Compaction triggers when the message count exceeds 8 (configurable); token estimation assumes roughly 4 characters per token.
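The two heuristics amount to a couple of lines (the function names here are illustrative; the constants come from the text above):

```typescript
// Compaction trigger and the ~4-chars-per-token estimate described above.
const COMPACTION_THRESHOLD = 8; // message count, configurable per the doc

function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function shouldCompact(messageCount: number, threshold = COMPACTION_THRESHOLD): boolean {
  return messageCount > threshold; // "exceeds", so strictly greater
}
```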

Error Handling & Resilience

All bridges use a shared error-handling layer.

Skills System

Skills are markdown files with YAML frontmatter that extend Iron Rain with custom slash commands.

Session Persistence

The TUI uses a SQLite database (~/.iron-rain/sessions.db) for persistence.

Falls back to a no-op NullSessionDB in non-Bun environments.

Mid-Stream Context Injection

The DispatchController supports pausing a streaming response to inject additional user context:

  1. User submits text while the agent is streaming
  2. injectAndContinue() aborts the current stream
  3. The partial response is saved as an assistant message
  4. A continuation prompt is dispatched with the full updated history

This enables iterative refinement without losing the agent's partial work.
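One plausible shape for this flow, assuming an AbortController-backed stream and simplified message types (StreamSession and its fields are hypothetical, not the real DispatchController API):

```typescript
// Illustrative mid-stream injection: abort, preserve partial output, rebuild history.
interface ChatMessage { role: "user" | "assistant"; content: string; }

class StreamSession {
  private controller = new AbortController();
  partial = ""; // accumulated text from the interrupted stream

  get signal(): AbortSignal {
    return this.controller.signal;
  }

  // Abort the live stream, save the partial response as an assistant message,
  // and append the injected user text for the continuation dispatch.
  injectAndContinue(history: ChatMessage[], userText: string): ChatMessage[] {
    this.controller.abort();
    return [
      ...history,
      { role: "assistant", content: this.partial },
      { role: "user", content: userText },
    ];
  }
}
```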

Subagent Display

When tasks run across multiple slots, the TUI shows a responsive SubagentGrid — a flex-wrap grid of per-slot status cards.

Cards adapt to terminal width via flexGrow + minWidth.

Streaming Agent Card

During active streaming, a StreamingAgentCard replaces the simple spinner with a live-updating bordered card.

The DispatchController accumulates tool calls from tool_use chunks during streaming and copies them into the final SlotActivity.toolCalls for the completed message display.

Packages

Package                       Purpose
@howlerops/iron-rain          Core library — slots, orchestrator, bridges, context references, config. Zero UI deps.
@howlerops/iron-rain-tui      Terminal UI components built with SolidJS — session view, subagent grid, settings
@howlerops/iron-rain-cli      CLI binary — launches the TUI or runs headless
@howlerops/iron-rain-plugin   Plugin SDK — hooks and type definitions