
Runtime Wrapper

Standardised metric collection across pi, claude, and codex runtimes.

Each runtime emits structured JSONL when invoked in non-interactive mode, but with different event schemas. The runtime wrapper normalises these into a common SessionResult type (defined in packages/core/src/runtime.ts) so the daemon, skill-evolve, and any other consumer can work uniformly.

Interfaces

See packages/core/src/runtime.ts for the full TypeScript types.

The key types:

  • SessionResult — the main output: usage, cost, tool calls, file operations, turns, status
  • RuntimeAdapter — interface each runtime implements: parse(lines, meta) → SessionResult
  • TokenUsage / CostBreakdown — normalised token and cost accounting
  • ToolCall — name, args, result, status, duration
  • Turn — one model round-trip with per-turn usage breakdown
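
As a rough orientation, the types listed above might look something like the sketch below. Field names are inferred from the field mapping summary in this document. The exact shapes, optionality, and helpers such as `emptyUsage` and `addUsage` are assumptions and may differ from what packages/core/src/runtime.ts actually defines.

```typescript
// Hypothetical sketch only; the authoritative types live in packages/core/src/runtime.ts.
interface TokenUsage {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
  reasoning: number; // assumed to be 0 for runtimes that do not report it
}

interface CostBreakdown {
  total: number;
}

interface ToolCall {
  name: string;
  args: unknown;
  result?: unknown;
  status: "ok" | "error"; // assumed status values
  durationMs?: number;
}

interface Turn {
  usage: TokenUsage;
}

interface SessionResult {
  sessionId: string;
  model?: string;
  status: "success" | "error"; // assumed status values
  exitCode: number | null;
  usage: TokenUsage;
  cost: CostBreakdown;
  durationMs?: number;
  turns: Turn[];
  toolCalls: ToolCall[];
  fileChanges: string[];
  text: string;
}

// Hypothetical metadata passed alongside the raw lines.
interface RuntimeMeta {
  exitCode: number | null;
  model?: string;
}

interface RuntimeAdapter {
  parse(lines: string[], meta: RuntimeMeta): SessionResult;
}

// Helper (assumption): a zeroed usage record to start accumulating from.
function emptyUsage(): TokenUsage {
  return { input: 0, output: 0, cacheRead: 0, cacheWrite: 0, reasoning: 0 };
}

// Helper (assumption): field-wise sum of two usage records.
function addUsage(a: TokenUsage, b: TokenUsage): TokenUsage {
  return {
    input: a.input + b.input,
    output: a.output + b.output,
    cacheRead: a.cacheRead + b.cacheRead,
    cacheWrite: a.cacheWrite + b.cacheWrite,
    reasoning: a.reasoning + b.reasoning,
  };
}
```

Keeping `reasoning` on the shared usage type lets Codex report it without a runtime-specific extension, at the cost of a field that stays zero for the other runtimes.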

Invocation Modes

| Runtime | Command | Output mode flag | Event format |
|---|---|---|---|
| pi | pi -p | --mode json | Newline-delimited JSON events |
| claude | claude -p | --output-format json (summary) or --output-format stream-json (streaming) | Single JSON object or newline-delimited JSON events |
| codex | codex exec --experimental-json | Built-in | Newline-delimited JSON events |
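
The invocation table could be captured in code roughly as follows. This is a minimal sketch: `invocationFor` and the `Invocation` shape are hypothetical names, and the sketch deliberately leaves out the prompt argument and any additional flags a particular CLI version may require.

```typescript
type Runtime = "pi" | "claude" | "codex";

interface Invocation {
  argv: string[];           // command plus output-mode flags, without the prompt
  format: "json" | "jsonl"; // single JSON object vs newline-delimited events
}

// Returns the command and output format for a runtime, per the table above.
// `streaming` only affects claude, which offers both a summary and a streaming mode.
function invocationFor(runtime: Runtime, streaming = true): Invocation {
  switch (runtime) {
    case "pi":
      return { argv: ["pi", "-p", "--mode", "json"], format: "jsonl" };
    case "claude":
      return streaming
        ? { argv: ["claude", "-p", "--output-format", "stream-json"], format: "jsonl" }
        : { argv: ["claude", "-p", "--output-format", "json"], format: "json" };
    case "codex":
      return { argv: ["codex", "exec", "--experimental-json"], format: "jsonl" };
  }
}
```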

Event Schemas

Pi (--mode json)

Pi emits a rich event stream with lifecycle events and per-message usage:

{type: "session", version, id, timestamp, cwd}
{type: "agent_start"}
{type: "turn_start"}
{type: "message_start", message: {role: "user", content: [...]}}
{type: "message_end",   message: {role: "user", content: [...]}}
{type: "message_start", message: {role: "assistant", content: [...], model, usage, stopReason}}
  ... message_update events (text deltas) ...
{type: "message_end",   message: {role: "assistant", ..., usage: {input, output, cacheRead, cacheWrite, totalTokens, cost: {input, output, cacheRead, cacheWrite, total}}}}
{type: "tool_execution_start", toolCallId, toolName, args}
{type: "tool_execution_end",   toolCallId, toolName, result, isError}
{type: "message_start", message: {role: "toolResult", content: [...]}}
{type: "message_end",   message: {role: "toolResult", ...}}
{type: "turn_end", message: {role: "assistant", ..., usage}, toolResults: [...]}
{type: "agent_end", messages: [...]}

Key fields for metric extraction:

| Metric | Location |
|---|---|
| Session ID | session.id |
| Model | message.model (on assistant messages) |
| Input tokens | message.usage.input |
| Output tokens | message.usage.output |
| Cache read tokens | message.usage.cacheRead |
| Cache write tokens | message.usage.cacheWrite |
| Total tokens | message.usage.totalTokens |
| Cost (total) | message.usage.cost.total |
| Cost (breakdown) | message.usage.cost.{input,output,cacheRead,cacheWrite} |
| Stop reason | message.stopReason |
| Tool name | tool_execution_start.toolName |
| Tool args | tool_execution_start.args |
| Tool result | tool_execution_end.result |
| Tool error | tool_execution_end.isError |
| Turns | Count turn_start / turn_end pairs |
| Text output | Concatenate text content blocks from assistant messages |
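
A minimal sketch of how these Pi fields might be extracted from the event stream. `parsePi` and `PiMetrics` are hypothetical names, and the sketch assumes text content blocks have the shape `{type: "text", text}`. It reads usage only from assistant `message_end` events so that the copy carried on `turn_end` is not counted twice, and it keeps the last reported `cost.total`, following the field mapping summary.

```typescript
interface PiToolCall {
  id: string;
  name: string;
  args: unknown;
  result?: unknown;
  isError?: boolean;
}

interface PiMetrics {
  sessionId?: string;
  model?: string;
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
  costTotal: number;
  turns: number;
  toolCalls: PiToolCall[];
  text: string;
}

function parsePi(lines: string[]): PiMetrics {
  const m: PiMetrics = {
    input: 0, output: 0, cacheRead: 0, cacheWrite: 0,
    costTotal: 0, turns: 0, toolCalls: [], text: "",
  };
  const open = new Map<string, PiToolCall>(); // tool calls awaiting their end event

  for (const line of lines) {
    if (line.trim() === "") continue;
    const ev = JSON.parse(line);

    if (ev.type === "session") {
      m.sessionId = ev.id;
    } else if (ev.type === "message_end" && ev.message?.role === "assistant") {
      const usage = ev.message.usage ?? {};
      m.model = ev.message.model ?? m.model;
      m.input += usage.input ?? 0;
      m.output += usage.output ?? 0;
      m.cacheRead += usage.cacheRead ?? 0;
      m.cacheWrite += usage.cacheWrite ?? 0;
      // Keep the last reported total rather than summing.
      if (usage.cost?.total !== undefined) m.costTotal = usage.cost.total;
      for (const block of ev.message.content ?? []) {
        if (block.type === "text") m.text += block.text; // assumed block shape
      }
    } else if (ev.type === "tool_execution_start") {
      const call: PiToolCall = { id: ev.toolCallId, name: ev.toolName, args: ev.args };
      open.set(call.id, call);
      m.toolCalls.push(call);
    } else if (ev.type === "tool_execution_end") {
      const call = open.get(ev.toolCallId);
      if (call) {
        call.result = ev.result;
        call.isError = ev.isError;
      }
    } else if (ev.type === "turn_end") {
      m.turns += 1;
    }
  }
  return m;
}
```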

Claude (--output-format json)

Claude's JSON mode returns a single summary object after completion:

{
  "type": "result",
  "subtype": "success",
  "is_error": false,
  "duration_ms": 2956,
  "duration_api_ms": 2890,
  "num_turns": 1,
  "result": "Hello!",
  "stop_reason": "end_turn",
  "session_id": "...",
  "total_cost_usd": 0.07,
  "usage": {
    "input_tokens": 5,
    "output_tokens": 8,
    "cache_creation_input_tokens": 9984,
    "cache_read_input_tokens": 14857,
    "server_tool_use": {"web_search_requests": 0}
  },
  "modelUsage": {
    "claude-opus-4-7[1m]": {
      "inputTokens": 5,
      "outputTokens": 8,
      "cacheReadInputTokens": 14857,
      "cacheCreationInputTokens": 9984,
      "costUSD": 0.07
    }
  }
}

Key fields for metric extraction:

| Metric | Location |
|---|---|
| Session ID | session_id |
| Model | First key in modelUsage |
| Input tokens | usage.input_tokens |
| Output tokens | usage.output_tokens |
| Cache read tokens | usage.cache_read_input_tokens |
| Cache write tokens | usage.cache_creation_input_tokens |
| Cost (total) | total_cost_usd |
| Cost (per-model) | modelUsage[model].costUSD |
| Duration | duration_ms |
| API duration | duration_api_ms |
| Turns | num_turns |
| Stop reason | stop_reason |
| Text output | result |
| Tool calls | Not in JSON mode; use stream-json for per-tool detail |

Note: Claude's --output-format json gives aggregated metrics only. For per-tool-call detail (what meta-harness uses), --output-format stream-json emits per-event JSONL with assistant and user (tool_result) events, similar to the Anthropic Messages API format.
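
Because JSON mode returns one object, extraction is mostly a field rename. The sketch below illustrates this under a few assumptions: `parseClaudeJson` and `ClaudeSummary` are hypothetical names, and treating either `is_error` or a non-"success" `subtype` as an error is a guess at how status might be derived.

```typescript
interface ClaudeSummary {
  sessionId: string;
  model: string | undefined;
  status: "success" | "error";
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
  costTotal: number;
  durationMs: number;
  turns: number;
  stopReason: string | undefined;
  text: string;
}

function parseClaudeJson(stdout: string): ClaudeSummary {
  const r = JSON.parse(stdout);
  if (r.type !== "result") {
    throw new Error(`expected a result object, got type=${String(r.type)}`);
  }
  const usage = r.usage ?? {};
  return {
    sessionId: r.session_id,
    // The model name is only available as a key of modelUsage.
    model: Object.keys(r.modelUsage ?? {})[0],
    // Assumption: is_error or any non-"success" subtype counts as an error.
    status: r.is_error === true || r.subtype !== "success" ? "error" : "success",
    input: usage.input_tokens ?? 0,
    output: usage.output_tokens ?? 0,
    cacheRead: usage.cache_read_input_tokens ?? 0,
    cacheWrite: usage.cache_creation_input_tokens ?? 0,
    costTotal: r.total_cost_usd ?? 0,
    durationMs: r.duration_ms ?? 0,
    turns: r.num_turns ?? 0,
    stopReason: r.stop_reason,
    text: r.result ?? "",
  };
}
```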

Claude (--output-format stream-json)

Streaming mode emits events matching the Anthropic Messages API shape:

{type: "system", ...}
{type: "assistant", message: {content: [{type: "text", text: "..."} | {type: "tool_use", id, name, input}], usage: {input_tokens, output_tokens, ...}}}
{type: "user", message: {content: [{type: "tool_result", tool_use_id, content, is_error}]}}
{type: "result", session_id, total_cost_usd, usage: {...}}

This is richer than JSON mode — each assistant message has individual content blocks that can be text or tool_use, and the following user message carries the tool_result. The final result event has the summary.
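
Per-tool detail can be recovered by matching each `tool_use` block to the `tool_result` that references its id. The following is a sketch of that pairing, not the adapter's actual logic; `pairToolCalls` and `StreamToolCall` are hypothetical names.

```typescript
interface StreamToolCall {
  id: string;
  name: string;
  input: unknown;
  result?: unknown;
  isError?: boolean;
}

function pairToolCalls(lines: string[]): StreamToolCall[] {
  const calls: StreamToolCall[] = [];
  const byId = new Map<string, StreamToolCall>();

  for (const line of lines) {
    if (line.trim() === "") continue;
    const ev = JSON.parse(line);
    const blocks = ev.message?.content;
    if (!Array.isArray(blocks)) continue; // system and result events carry no content blocks

    for (const block of blocks) {
      if (ev.type === "assistant" && block.type === "tool_use") {
        const call: StreamToolCall = { id: block.id, name: block.name, input: block.input };
        byId.set(call.id, call);
        calls.push(call);
      } else if (ev.type === "user" && block.type === "tool_result") {
        // Attach the result to the call that requested it, if we saw one.
        const call = byId.get(block.tool_use_id);
        if (call) {
          call.result = block.content;
          call.isError = block.is_error === true;
        }
      }
    }
  }
  return calls;
}
```

A call that never receives a `tool_result` keeps `result` undefined, which a consumer could treat as an interrupted tool execution.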

Codex (exec --experimental-json)

Codex uses a thread/item model via its TypeScript SDK:

{type: "thread.started", thread_id: "..."}
{type: "turn.started"}
{type: "item.started", item: {id, type: "agent_message" | "command_execution" | "file_change" | "mcp_tool_call" | ...}}
{type: "item.updated", item: {...}}
{type: "item.completed", item: {...}}
{type: "turn.completed", usage: {input_tokens, cached_input_tokens, output_tokens, reasoning_output_tokens}}

Item types (from codex SDK items.ts):

| Item type | Description | Key fields |
|---|---|---|
| agent_message | Text response | text |
| reasoning | Chain-of-thought | text |
| command_execution | Shell command | command, aggregated_output, exit_code, status |
| file_change | File patch | changes: [{path, kind}], status |
| mcp_tool_call | MCP tool invocation | server, tool, arguments, result, error, status |
| web_search | Web search | query |
| todo_list | Agent's task list | items: [{text, completed}] |
| error | Non-fatal error | message |

Key fields for metric extraction:

| Metric | Location |
|---|---|
| Session ID | thread.started.thread_id |
| Model | Not in events (passed as config) |
| Input tokens | turn.completed.usage.input_tokens |
| Output tokens | turn.completed.usage.output_tokens |
| Cache tokens | turn.completed.usage.cached_input_tokens |
| Reasoning tokens | turn.completed.usage.reasoning_output_tokens |
| Cost | Not in events (must compute from token counts + pricing) |
| Turns | Count turn.started / turn.completed pairs |
| Stop reason | Infer from last event type |
| Tool calls | item.completed where item.type === "command_execution" or "mcp_tool_call" |
| File changes | item.completed where item.type === "file_change" |
| Text output | Concatenate item.completed where item.type === "agent_message" |
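
A sketch of how these Codex fields might be aggregated from the event stream. `parseCodex` and `CodexMetrics` are hypothetical names, and two choices are assumptions rather than documented behaviour: agent messages are joined with a newline, and an MCP call is labelled as `server/tool`.

```typescript
interface CodexMetrics {
  threadId?: string;
  input: number;
  cachedInput: number;
  output: number;
  reasoning: number;
  turns: number;
  toolCalls: { kind: string; name: string; status?: string }[];
  fileChanges: { path: string; kind: string }[];
  text: string;
}

function parseCodex(lines: string[]): CodexMetrics {
  const m: CodexMetrics = {
    input: 0, cachedInput: 0, output: 0, reasoning: 0,
    turns: 0, toolCalls: [], fileChanges: [], text: "",
  };
  const messages: string[] = [];

  for (const line of lines) {
    if (line.trim() === "") continue;
    const ev = JSON.parse(line);

    if (ev.type === "thread.started") {
      m.threadId = ev.thread_id;
    } else if (ev.type === "turn.completed") {
      // Usage is reported per turn, so sum across turns.
      const u = ev.usage ?? {};
      m.input += u.input_tokens ?? 0;
      m.cachedInput += u.cached_input_tokens ?? 0;
      m.output += u.output_tokens ?? 0;
      m.reasoning += u.reasoning_output_tokens ?? 0;
      m.turns += 1;
    } else if (ev.type === "item.completed" && ev.item) {
      const item = ev.item;
      if (item.type === "command_execution") {
        m.toolCalls.push({ kind: item.type, name: item.command, status: item.status });
      } else if (item.type === "mcp_tool_call") {
        m.toolCalls.push({ kind: item.type, name: `${item.server}/${item.tool}`, status: item.status });
      } else if (item.type === "file_change") {
        for (const change of item.changes ?? []) {
          m.fileChanges.push({ path: change.path, kind: change.kind });
        }
      } else if (item.type === "agent_message") {
        messages.push(item.text);
      }
    }
  }
  m.text = messages.join("\n"); // separator is an assumption
  return m;
}
```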
Codex gaps:
  • No cost data in events — must be computed externally from token counts and model pricing
  • No explicit model field in events — passed as config, not echoed back
  • reasoning_output_tokens is a distinct field (OpenAI counts reasoning separately)
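
Since cost has to be derived externally, one way to estimate it is sketched below. Everything here is an assumption to verify against the provider's billing documentation: the `Pricing` shape and `estimateCodexCost` are hypothetical, rates are expressed per million tokens, `cached_input_tokens` is treated as a subset of `input_tokens`, and reasoning tokens are treated as already included in `output_tokens`.

```typescript
interface CodexUsage {
  input_tokens: number;
  cached_input_tokens: number;
  output_tokens: number;
}

// Hypothetical pricing table entry, in USD per million tokens.
interface Pricing {
  inputPerMTok: number;
  cachedInputPerMTok: number;
  outputPerMTok: number;
}

function estimateCodexCost(usage: CodexUsage, pricing: Pricing): number {
  // Assumption: cached tokens are counted inside input_tokens, so bill the
  // uncached remainder at the full input rate and the cached part at its own rate.
  const uncached = Math.max(0, usage.input_tokens - usage.cached_input_tokens);
  const micros =
    uncached * pricing.inputPerMTok +
    usage.cached_input_tokens * pricing.cachedInputPerMTok +
    usage.output_tokens * pricing.outputPerMTok;
  return micros / 1_000_000;
}
```

If the subset assumption turns out to be wrong for a given Codex version, the `uncached` line is the only place that needs to change.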

Field Mapping Summary

How each runtime field maps to the common SessionResult:

| SessionResult field | Pi | Claude (json) | Codex |
|---|---|---|---|
| sessionId | session.id | session_id | thread.started.thread_id |
| model | message.model | modelUsage key | config input |
| status | infer from agent_end | subtype | infer from exit |
| exitCode | process exit | process exit | process exit |
| usage.input | Σ message.usage.input | usage.input_tokens | Σ turn.completed.usage.input_tokens |
| usage.output | Σ message.usage.output | usage.output_tokens | Σ turn.completed.usage.output_tokens |
| usage.cacheRead | Σ message.usage.cacheRead | usage.cache_read_input_tokens | Σ turn.completed.usage.cached_input_tokens |
| usage.cacheWrite | Σ message.usage.cacheWrite | usage.cache_creation_input_tokens | |
| usage.reasoning | | | Σ turn.completed.usage.reasoning_output_tokens |
| cost.total | last message.usage.cost.total | total_cost_usd | computed |
| durationMs | compute from timestamps | duration_ms | compute from timestamps |
| turns | count turn_end events | num_turns | count turn.completed events |
| toolCalls | tool_execution_{start,end} | stream-json tool_use blocks | command_execution + mcp_tool_call items |
| fileChanges | infer from tool args | infer from tool args | file_change items (native) |
| text | assistant text blocks | result | agent_message items |

Implementation Plan

Each runtime gets an adapter that implements RuntimeAdapter:

packages/core/src/
  runtime.ts          ← interfaces (done)
  adapters/
    pi.ts             ← parses pi --mode json output
    claude.ts         ← parses claude --output-format json (or stream-json)
    codex.ts          ← parses codex exec --experimental-json output

The daemon's executePipeline currently pipes the raw Bun.spawn output to stdout.log. With the adapters in place, it would instead:

  1. Spawn the runtime with the appropriate JSON output flag
  2. Collect JSONL lines
  3. Pass them through the runtime's adapter → SessionResult
  4. Store the SessionResult for querying, logging, and skill-evolve
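
Steps 2 to 4 above can be sketched as a pure function over the captured stdout. This is an illustration under assumptions, not the daemon's code: `splitJsonl`, `processOutput`, the trimmed-down `SessionResult`, and the `store` callback are hypothetical, and spawning the process (step 1) is left to the caller.

```typescript
type Runtime = "pi" | "claude" | "codex";

// Trimmed-down stand-ins for the real types in packages/core/src/runtime.ts.
interface SessionResult {
  sessionId: string;
  exitCode: number | null;
}

interface RuntimeMeta {
  runtime: Runtime;
  exitCode: number | null;
}

interface RuntimeAdapter {
  parse(lines: string[], meta: RuntimeMeta): SessionResult;
}

// Step 2: split captured stdout into non-empty JSONL lines.
function splitJsonl(stdout: string): string[] {
  return stdout.split(/\r?\n/).filter((line) => line.trim() !== "");
}

// Steps 3 and 4: run the matching adapter, then hand the result to a store callback.
function processOutput(
  runtime: Runtime,
  stdout: string,
  exitCode: number | null,
  adapters: Record<Runtime, RuntimeAdapter>,
  store: (result: SessionResult) => void,
): SessionResult {
  const lines = splitJsonl(stdout);
  const result = adapters[runtime].parse(lines, { runtime, exitCode });
  store(result);
  return result;
}
```

Keeping the parsing step free of process handling means adapters can be exercised against recorded JSONL fixtures without spawning any runtime.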
