paperclip/doc/spec/agent-runs.md

# Agent Runs Subsystem Spec

Status: Draft
Date: 2026-02-17
Audience: Product + Engineering
Scope: Agent execution runtime, adapter protocol, wakeup orchestration, and live status delivery

## 1. Document Role

This spec defines how Paperclip actually runs agents while staying runtime-agnostic.

- `doc/SPEC-implementation.md` remains the V1 baseline contract.
- This document adds concrete subsystem detail for agent execution, including local CLI adapters, runtime state persistence, wakeup scheduling, and browser live updates.
- If this doc conflicts with current runtime behavior in code, this doc is the target behavior for upcoming implementation.

## 2. Captured Intent (From Request)

The following intentions are explicitly preserved in this spec:

1. Paperclip is adapter-agnostic. The key is a protocol, not a specific runtime.
2. We still need default built-ins to make the system useful immediately.
3. First two built-ins are `claude-local` and `codex-local`.
4. Those adapters run local CLIs directly on the host machine, unsandboxed.
5. Agent config includes working directory and initial/default prompt.
6. Heartbeats run the configured adapter process, Paperclip manages lifecycle, and on exit Paperclip parses JSON output and updates state.
7. Session IDs and token usage must be persisted so later heartbeats can resume.
8. Adapters should support status updates (short message + color) and optional streaming logs.
9. UI should support prompt template "pills" for variable insertion.
10. CLI errors must be visible in full (or as much as possible) in the UI.
11. Status changes must live-update across task and agent views via server push.
12. Wakeup triggers should be centralized by a heartbeat/wakeup service with at least:
   - timer interval
   - wake on task assignment
   - explicit ping/request

## 3. Goals and Non-Goals

### 3.1 Goals

1. Define a stable adapter protocol that supports multiple runtimes.
2. Ship production-usable local adapters for Claude CLI and Codex CLI.
3. Persist adapter runtime state (session IDs, token/cost usage, last errors).
4. Centralize wakeup decisions and queueing in one service.
5. Provide realtime run/task/agent updates to the browser.
6. Support deployment-specific full-log storage without bloating Postgres.
7. Preserve company scoping and existing governance invariants.

### 3.2 Non-Goals (for this subsystem phase)

1. Distributed execution workers across multiple hosts.
2. Third-party adapter marketplace/plugin SDK.
3. Perfect cost accounting for providers that do not emit cost.
4. Long-term log archival strategy beyond basic retention.

## 4. Baseline and Gaps (As of 2026-02-17)

Current code already has:

- `agents` with `adapterType` + `adapterConfig`.
- `heartbeat_runs` with basic status tracking.
- in-process `heartbeatService` that invokes `process` and `http`.
- cancellation endpoints for active runs.

Current gaps this spec addresses:

1. No persistent per-agent runtime state for session resume.
2. No queue/wakeup abstraction (invoke is immediate).
3. No assignment-triggered or timer-triggered centralized wakeups.
4. No websocket/SSE push path to browser.
5. No persisted run event timeline or external full-log storage contract.
6. No typed local adapter contracts for Claude/Codex session and usage extraction.
7. No prompt-template variable/pill system in agent setup.
8. No deployment-aware adapter for full run log storage (disk/object store/etc).

## 5. Architecture Overview

The subsystem introduces six cooperating components:

1. `Adapter Registry`
   - Maps `adapter_type` to implementation.
   - Exposes capability metadata and config validation.

2. `Wakeup Coordinator`
   - Single entrypoint for all wakeups (`timer`, `assignment`, `on_demand`, `automation`).
   - Applies dedupe/coalescing and queue rules.

3. `Run Executor`
   - Claims queued wakeups.
   - Creates `heartbeat_runs`.
   - Spawns/monitors child processes for local adapters.
   - Handles timeout/cancel/graceful kill.

4. `Runtime State Store`
   - Persists resumable adapter state per agent.
   - Persists run usage summaries and lightweight run-event timeline.

5. `Run Log Store`
   - Persists full stdout/stderr streams via pluggable storage adapter.
   - Returns stable `logRef` for retrieval (local path, object key, or DB reference).

6. `Realtime Event Hub`
   - Publishes run/agent/task updates over websocket.
   - Supports selective subscription by company.

Control flow (happy path):

1. Trigger arrives (`timer`, `assignment`, `on_demand`, or `automation`).
2. Wakeup coordinator enqueues/merges wake request.
3. Executor claims request, creates run row, marks agent `running`.
4. Adapter executes, emits status/log/usage events.
5. Full logs stream to `RunLogStore`; metadata/events are persisted to DB and pushed to websocket subscribers.
6. Process exits, output parser updates run result + runtime state.
7. Agent returns to `idle` or `error`; UI updates in real time.

## 6. Agent Run Protocol (Version `agent-run/v1`)

This protocol is runtime-agnostic and implemented by all adapters.

```ts
type RunOutcome = "succeeded" | "failed" | "cancelled" | "timed_out";
type StatusColor = "neutral" | "blue" | "green" | "yellow" | "red";

interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
  cachedInputTokens?: number;
  cachedOutputTokens?: number;
}

interface AdapterInvokeInput {
  protocolVersion: "agent-run/v1";
  companyId: string;
  agentId: string;
  runId: string;
  wakeupSource: "timer" | "assignment" | "on_demand" | "automation";
  triggerDetail?: "manual" | "ping" | "callback" | "system";
  cwd: string;
  prompt: string;
  adapterConfig: Record<string, unknown>;
  runtimeState: Record<string, unknown>;
  env: Record<string, string>;
  timeoutSec: number;
}

interface AdapterHooks {
  status?: (update: { message: string; color?: StatusColor }) => Promise<void>;
  log?: (event: { stream: "stdout" | "stderr" | "system"; chunk: string }) => Promise<void>;
  usage?: (usage: TokenUsage) => Promise<void>;
  event?: (eventType: string, payload: Record<string, unknown>) => Promise<void>;
}

interface AdapterInvokeResult {
  outcome: RunOutcome;
  exitCode: number | null;
  errorMessage?: string | null;
  summary?: string | null;
  sessionId?: string | null;
  usage?: TokenUsage | null;
  provider?: string | null;
  model?: string | null;
  costUsd?: number | null;
  runtimeStatePatch?: Record<string, unknown>;
  rawResult?: Record<string, unknown> | null;
}

interface AgentRunAdapter {
  type: string;
  protocolVersion: "agent-run/v1";
  capabilities: {
    resumableSession: boolean;
    statusUpdates: boolean;
    logStreaming: boolean;
    tokenUsage: boolean;
  };
  validateConfig(config: unknown): { ok: true } | { ok: false; errors: string[] };
  invoke(input: AdapterInvokeInput, hooks: AdapterHooks, signal: AbortSignal): Promise<AdapterInvokeResult>;
}
```

### 6.1 Required Behavior

1. `validateConfig` runs before saving or invoking.
2. `invoke` must be deterministic for a given config + runtime state + prompt.
3. Adapter must not mutate DB directly; it returns data via result/events only.
4. Adapter must emit enough context for errors to be debuggable.
5. If `invoke` throws, executor records run as `failed` with captured error text.

### 6.2 Optional Behavior

Adapters may omit status/log hooks. If omitted, runtime still emits system lifecycle statuses (`queued`, `running`, `finished`).

### 6.3 Run log storage protocol

Full run logs are managed by a separate pluggable store (not by the agent adapter).

```ts
type RunLogStoreType = "local_file" | "object_store" | "postgres";

interface RunLogHandle {
  store: RunLogStoreType;
  logRef: string; // opaque provider reference (path, key, uri, row id)
}

interface RunLogStore {
  begin(input: { companyId: string; agentId: string; runId: string }): Promise<RunLogHandle>;
  append(
    handle: RunLogHandle,
    event: { stream: "stdout" | "stderr" | "system"; chunk: string; ts: string },
  ): Promise<void>;
  finalize(
    handle: RunLogHandle,
    summary: { bytes: number; sha256?: string; compressed: boolean },
  ): Promise<void>;
  read(
    handle: RunLogHandle,
    opts?: { offset?: number; limitBytes?: number },
  ): Promise<{ content: string; nextOffset?: number }>;
  delete?(handle: RunLogHandle): Promise<void>;
}
```

V1 deployment defaults:

1. Dev/local default: `local_file` (write to `data/run-logs/...`).
2. Cloud/serverless default: `object_store` (S3/R2/GCS compatible).
3. Optional fallback: `postgres` with strict size caps.

### 6.4 Adapter identity and compatibility

For V1 rollout, adapter identity is explicit:

- `claude_local`
- `codex_local`
- `process` (generic existing behavior)
- `http` (generic existing behavior)

`claude_local` and `codex_local` are not wrappers around arbitrary `process`; they are typed adapters with known parser/resume semantics.

## 7. Built-in Adapters (Phase 1)

## 7.1 `claude-local`

Runs local `claude` CLI directly.

### Config

```json
{
  "cwd": "/absolute/or/relative/path",
  "promptTemplate": "You are agent {{agent.id}} ...",
  "model": "optional-model-id",
  "maxTurnsPerRun": 300,
  "dangerouslySkipPermissions": true,
  "env": {"KEY": "VALUE"},
  "extraArgs": [],
  "timeoutSec": 1800,
  "graceSec": 20
}
```

### Invocation

- Base command: `claude --print <prompt> --output-format json`
- Resume: add `--resume <sessionId>` when runtime state has session ID
- Unsandboxed mode: add `--dangerously-skip-permissions` when enabled

### Output parsing

1. Parse stdout JSON object.
2. Extract `session_id` for resume.
3. Extract usage fields:
   - `usage.input_tokens`
   - `usage.cache_read_input_tokens` (if present)
   - `usage.output_tokens`
4. Extract `total_cost_usd` when present.
5. On non-zero exit: still attempt parse; if parse succeeds keep extracted state and mark run failed unless adapter explicitly reports success.

## 7.2 `codex-local`

Runs local `codex` CLI directly.

### Config

```json
{
  "cwd": "/absolute/or/relative/path",
  "promptTemplate": "You are agent {{agent.id}} ...",
  "model": "optional-model-id",
  "search": false,
  "dangerouslyBypassApprovalsAndSandbox": true,
  "env": {"KEY": "VALUE"},
  "extraArgs": [],
  "timeoutSec": 1800,
  "graceSec": 20
}
```

### Invocation

- Base command: `codex exec --json <prompt>`
- Resume form: `codex exec --json resume <sessionId> <prompt>`
- Unsandboxed mode: add `--dangerously-bypass-approvals-and-sandbox` when enabled
- Optional search mode: add `--search`

### Output parsing

Codex emits JSONL events. Parse line-by-line and extract:

1. `thread.started.thread_id` -> session ID
2. `item.completed` where item type is `agent_message` -> output text
3. `turn.completed.usage`:
   - `input_tokens`
   - `cached_input_tokens`
   - `output_tokens`

Codex JSONL currently may not include cost; store token usage and leave cost null/unknown unless available.

## 7.3 Common local adapter process handling

Both local adapters must:

1. Use `spawn(command, args, { shell: false, stdio: "pipe" })`.
2. Capture stdout/stderr in stream chunks and forward to `RunLogStore`.
3. Maintain rolling stdout/stderr tail excerpts in memory for DB diagnostic fields.
4. Emit live log events to websocket subscribers (optional to throttle/chunk).
5. Support graceful cancel: `SIGTERM`, then `SIGKILL` after `graceSec`.
6. Enforce timeout using adapter `timeoutSec`.
7. Return exit code + parsed result + diagnostic stderr.

## 8. Heartbeat and Wakeup Coordinator

## 8.1 Wakeup sources

Supported sources:

1. `timer`: periodic heartbeat per agent.
2. `assignment`: issue assigned/reassigned to agent.
3. `on_demand`: explicit wake request path (board/manual click or API ping).
4. `automation`: non-interactive wake path (external callback or internal system automation).

## 8.2 Central API

All sources call one internal service:

```ts
enqueueWakeup({
  companyId,
  agentId,
  source,
  triggerDetail, // optional: manual|ping|callback|system
  reason,
  payload,
  requestedBy,
  idempotencyKey?
})
```

No source invokes adapters directly.

## 8.3 Queue semantics

1. Max active run per agent remains `1`.
2. If agent already has `queued`/`running` run:
   - coalesce duplicate wakeups
   - increment `coalescedCount`
   - preserve latest reason/source metadata
3. Queue is DB-backed for restart safety.
4. Coordinator uses FIFO by `requested_at`, with optional priority:
   - `on_demand` > `assignment` > `timer`/`automation`

## 8.4 Agent heartbeat policy fields

Agent-level control-plane settings (not adapter-specific):

```json
{
  "heartbeat": {
    "enabled": true,
    "intervalSec": 300,
    "wakeOnAssignment": true,
    "wakeOnOnDemand": true,
    "wakeOnAutomation": true,
    "cooldownSec": 10
  }
}
```

Defaults:

- `enabled: true`
- `intervalSec: null` (no timer until explicitly set) or product default `300` if desired globally
- `wakeOnAssignment: true`
- `wakeOnOnDemand: true`
- `wakeOnAutomation: true`

## 8.5 Trigger integration rules

1. Timer checks run on server worker interval and enqueue due agents.
2. Issue assignment mutation enqueues wakeup when assignee changes and target agent has `wakeOnAssignment=true`.
3. On-demand endpoint enqueues wakeup with `source=on_demand` and `triggerDetail=manual|ping` when `wakeOnOnDemand=true`.
4. Callback/system automations enqueue wakeup with `source=automation` and `triggerDetail=callback|system` when `wakeOnAutomation=true`.
5. Paused/terminated agents do not receive new wakeups.
6. Hard budget-stopped agents do not receive new wakeups.

## 9. Persistence Model

All tables remain company-scoped.

## 9.0 Changes to `agents`

1. Extend `adapter_type` domain to include `claude_local` and `codex_local` (alongside existing `process`, `http`).
2. Keep `adapter_config` as adapter-owned config (CLI flags, cwd, prompt templates, env overrides).
3. Add `runtime_config` jsonb for control-plane scheduling policy:
   - heartbeat enable/interval
   - wake-on-assignment
   - wake-on-on-demand
   - wake-on-automation
   - cooldown

This separation keeps adapter config runtime-agnostic while allowing the heartbeat service to apply consistent scheduling logic.

## 9.1 New table: `agent_runtime_state`

One row per agent for aggregate runtime counters and legacy compatibility.

- `agent_id` uuid pk fk `agents.id`
- `company_id` uuid fk not null
- `adapter_type` text not null
- `session_id` text null
- `state_json` jsonb not null default `{}`
- `last_run_id` uuid fk `heartbeat_runs.id` null
- `last_run_status` text null
- `total_input_tokens` bigint not null default `0`
- `total_output_tokens` bigint not null default `0`
- `total_cached_input_tokens` bigint not null default `0`
- `total_cost_cents` bigint not null default `0`
- `last_error` text null
- `updated_at` timestamptz not null

Invariant: exactly one runtime state row per agent.

## 9.1.1 New table: `agent_task_sessions`

One row per `(company_id, agent_id, adapter_type, task_key)` for resumable session state.

- `id` uuid pk
- `company_id` uuid fk not null
- `agent_id` uuid fk not null
- `adapter_type` text not null
- `task_key` text not null
- `session_params_json` jsonb null (adapter-defined shape)
- `session_display_id` text null (for UI/debug)
- `last_run_id` uuid fk `heartbeat_runs.id` null
- `last_error` text null
- `created_at` timestamptz not null
- `updated_at` timestamptz not null

Invariant: unique `(company_id, agent_id, adapter_type, task_key)`.

## 9.2 New table: `agent_wakeup_requests`

Queue + audit for wakeups.

- `id` uuid pk
- `company_id` uuid fk not null
- `agent_id` uuid fk not null
- `source` text not null (`timer|assignment|on_demand|automation`)
- `trigger_detail` text null (`manual|ping|callback|system`)
- `reason` text null
- `payload` jsonb null
- `status` text not null (`queued|claimed|coalesced|skipped|completed|failed|cancelled`)
- `coalesced_count` int not null default `0`
- `requested_by_actor_type` text null (`user|agent|system`)
- `requested_by_actor_id` text null
- `idempotency_key` text null
- `run_id` uuid fk `heartbeat_runs.id` null
- `requested_at` timestamptz not null
- `claimed_at` timestamptz null
- `finished_at` timestamptz null
- `error` text null

## 9.3 New table: `heartbeat_run_events`

Append-only per-run lightweight event timeline (no full raw log chunks).

- `id` bigserial pk
- `company_id` uuid fk not null
- `run_id` uuid fk `heartbeat_runs.id` not null
- `agent_id` uuid fk `agents.id` not null
- `seq` int not null
- `event_type` text not null (`lifecycle|status|usage|error|structured`)
- `stream` text null (`system|stdout|stderr`) (summarized events only, not full stream chunks)
- `level` text null (`info|warn|error`)
- `color` text null
- `message` text null
- `payload` jsonb null
- `created_at` timestamptz not null

## 9.4 Changes to `heartbeat_runs`

Add fields required for result and diagnostics:

- `wakeup_request_id` uuid fk `agent_wakeup_requests.id` null
- `exit_code` int null
- `signal` text null
- `usage_json` jsonb null
- `result_json` jsonb null
- `session_id_before` text null
- `session_id_after` text null
- `log_store` text null (`local_file|object_store|postgres`)
- `log_ref` text null (opaque provider reference; path/key/uri/row id)
- `log_bytes` bigint null
- `log_sha256` text null
- `log_compressed` boolean not null default false
- `stderr_excerpt` text null
- `stdout_excerpt` text null
- `error_code` text null

This keeps per-run diagnostics queryable without storing full logs in Postgres.

## 9.5 Log storage adapter configuration

Runtime log storage is deployment-configured (not per-agent by default).

```json
{
  "runLogStore": {
    "type": "local_file | object_store | postgres",
    "basePath": "./data/run-logs",
    "bucket": "paperclip-run-logs",
    "prefix": "runs/",
    "compress": true,
    "maxInlineExcerptBytes": 32768
  }
}
```

Rules:

1. `log_ref` must be opaque and provider-neutral at API boundaries.
2. UI/API must not assume local filesystem semantics.
3. Provider-specific secrets/credentials stay in server config, never in agent config.

## 10. Prompt Template and Pill System

## 10.1 Template format

- Mustache-style placeholders: `{{path.to.value}}`
- No arbitrary code execution.
- Unknown variable on save = validation error.

## 10.2 Initial variable catalog

- `company.id`
- `company.name`
- `agent.id`
- `agent.name`
- `agent.role`
- `agent.title`
- `run.id`
- `run.source`
- `run.startedAt`
- `heartbeat.reason`
- `paperclip.skill` (shared Paperclip skill text block)
- `credentials.apiBaseUrl`
- `credentials.apiKey` (optional, sensitive)

## 10.3 Prompt fields

1. `promptTemplate`
   - Used on every wakeup (first run and resumed runs).
   - Can include run source/reason pills.

## 10.4 UI requirements

1. Agent setup/edit form includes prompt editors with pill insertion.
2. Variables are shown as clickable pills for fast insertion.
3. Save-time validation indicates unknown/missing variables.
4. Sensitive pills (`credentials.*`) show explicit warning badge.

## 10.5 Security notes for credentials

1. Credentials in prompt are allowed for initial simplicity but discouraged.
2. Preferred transport is env vars (`PAPERCLIP_*`) injected at runtime.
3. Prompt preview and logs must redact sensitive values.

## 11. Realtime Status Delivery

## 11.1 Transport

Primary transport: websocket channel per company.

- Endpoint: `GET /api/companies/:companyId/events/ws`
- Auth: board session or agent API key (company-bound)

## 11.2 Event envelope

```json
{
  "eventId": "uuid-or-monotonic-id",
  "companyId": "uuid",
  "type": "heartbeat.run.status",
  "entityType": "heartbeat_run",
  "entityId": "uuid",
  "occurredAt": "2026-02-17T12:00:00Z",
  "payload": {}
}
```

## 11.3 Required event types

1. `agent.status.changed`
2. `heartbeat.run.queued`
3. `heartbeat.run.started`
4. `heartbeat.run.status` (short color+message updates)
5. `heartbeat.run.log` (optional live chunk stream; full persistence handled by `RunLogStore`)
6. `heartbeat.run.finished`
7. `issue.updated`
8. `issue.comment.created`
9. `activity.appended`

## 11.4 UI behavior

1. Agent detail view updates run timeline live.
2. Task board reflects assignment/status/comment changes from agent activity without refresh.
3. Org/agent list reflects status changes live.
4. If websocket disconnects, client falls back to short polling until reconnect.

## 12. Error Handling and Diagnostics

## 12.1 Error classes

- `adapter_not_installed`
- `invalid_working_directory`
- `spawn_failed`
- `timeout`
- `cancelled`
- `nonzero_exit`
- `output_parse_error`
- `resume_session_invalid`
- `budget_blocked`

## 12.2 Logging requirements

1. Persist full stdout/stderr stream to configured `RunLogStore`.
2. Persist only lightweight run metadata/events in Postgres (`heartbeat_runs`, `heartbeat_run_events`).
3. Persist bounded `stdout_excerpt` and `stderr_excerpt` in Postgres for quick diagnostics.
4. Mark truncation explicitly when excerpts are capped.
5. Redact secrets from logs, excerpts, and websocket payloads.

## 12.3 Log retention and lifecycle

1. `RunLogStore` retention is configurable by deployment (for example 7/30/90 days).
2. Postgres run metadata can outlive full log objects.
3. Deletion/pruning jobs must handle orphaned metadata/log-object references safely.
4. If full log object is gone, APIs still return metadata and excerpts with `log_unavailable` status.

## 12.4 Restart recovery

On server startup:

1. Find stale `queued`/`running` runs.
2. Mark as `failed` with `error_code=control_plane_restart`.
3. Set affected non-paused/non-terminated agents to `error` (or `idle` based on policy).
4. Emit recovery events to websocket and activity log.

## 13. API Surface Changes

## 13.1 New/updated endpoints

1. `POST /agents/:agentId/wakeup`
   - enqueue wakeup with source/reason
2. `POST /agents/:agentId/heartbeat/invoke`
   - backward-compatible alias to wakeup API
3. `GET /agents/:agentId/runtime-state`
   - board-only debug view
4. `GET /agents/:agentId/task-sessions`
   - board-only list of task-scoped adapter sessions
5. `POST /agents/:agentId/runtime-state/reset-session`
   - clears all task sessions for the agent, or one when `taskKey` is provided
6. `GET /heartbeat-runs/:runId/events?afterSeq=:n`
   - fetch persisted lightweight timeline
7. `GET /heartbeat-runs/:runId/log`
   - reads full log stream via `RunLogStore` (or redirects/presigned URL for object store)
8. `GET /api/companies/:companyId/events/ws`
   - websocket stream

## 13.2 Mutation logging

All wakeup/run state mutations must create `activity_log` entries:

- `wakeup.requested`
- `wakeup.coalesced`
- `heartbeat.started`
- `heartbeat.finished`
- `heartbeat.failed`
- `heartbeat.cancelled`
- `runtime_state.updated`

## 14. Heartbeat Service Implementation Plan

## Phase 1: Contracts and schema

1. Add new DB tables/columns (`agent_runtime_state`, `agent_wakeup_requests`, `heartbeat_run_events`, `heartbeat_runs.log_*` fields).
2. Add `RunLogStore` interface and configuration wiring.
3. Add shared types/constants/validators.
4. Keep existing routes functional during migration.

## Phase 2: Wakeup coordinator

1. Implement DB-backed wakeup queue.
2. Convert invoke/wake routes to enqueue with `source=on_demand` and appropriate `triggerDetail`.
3. Add worker loop to claim and execute queued wakeups.

## Phase 3: Local adapters

1. Implement `claude-local` adapter.
2. Implement `codex-local` adapter.
3. Parse and persist session IDs and token usage.
4. Wire cancel/timeout/grace behavior.

## Phase 4: Realtime push

1. Implement company websocket hub.
2. Publish run/agent/issue events.
3. Update UI pages to subscribe and invalidate/update relevant data.

## Phase 5: Prompt pills and config UX

1. Add adapter-specific config editor with prompt templates.
2. Add pill insertion and variable validation.
3. Add sensitive-variable warnings and redaction.

## Phase 6: Hardening

1. Add failure/restart recovery sweeps.
2. Add metadata/full-log retention policies and pruning jobs.
3. Add integration/e2e coverage for wakeup triggers and live updates.

## 15. Acceptance Criteria

1. Agent with `claude-local` or `codex-local` can run, exit, and persist run result.
2. Session parameters are persisted per task scope and reused automatically for same-task resumes.
3. Token usage is persisted per run and accumulated per agent runtime state.
4. Timer, assignment, on-demand, and automation wakeups all enqueue through one coordinator.
5. Pause/terminate interrupts running local process and prevents new wakeups.
6. Browser receives live websocket updates for run status/logs and task/agent changes.
7. Failed runs expose rich CLI diagnostics in UI with excerpts immediately available and full log retrievable via `RunLogStore`.
8. All actions remain company-scoped and auditable.

## 16. Open Questions

1. Should timer default be `null` (off until enabled) or `300` seconds by default?
2. What should the default retention policy be for full log objects vs Postgres metadata?
3. Should agent API credentials be allowed in prompt templates by default, or require explicit opt-in toggle?
4. Should websocket be the only realtime channel, or should we also expose SSE for simpler clients?