Agent Runs Subsystem Spec

Status: Draft
Date: 2026-02-17
Audience: Product + Engineering
Scope: Agent execution runtime, adapter protocol, wakeup orchestration, and live status delivery

1. Document Role

This spec defines how Paperclip actually runs agents while staying runtime-agnostic.

  • doc/SPEC-implementation.md remains the V1 baseline contract.
  • This document adds concrete subsystem detail for agent execution, including local CLI adapters, runtime state persistence, wakeup scheduling, and browser live updates.
  • If this doc conflicts with current runtime behavior in code, this doc describes the target behavior for the upcoming implementation.

2. Captured Intent (From Request)

The following intentions are explicitly preserved in this spec:

  1. Paperclip is adapter-agnostic. The key is a protocol, not a specific runtime.
  2. We still need default built-ins to make the system useful immediately.
  3. First two built-ins are claude-local and codex-local.
  4. Those adapters run local CLIs directly on the host machine, unsandboxed.
  5. Agent config includes working directory and initial/default prompt.
  6. Heartbeats run the configured adapter process, Paperclip manages lifecycle, and on exit Paperclip parses JSON output and updates state.
  7. Session IDs and token usage must be persisted so later heartbeats can resume.
  8. Adapters should support status updates (short message + color) and optional streaming logs.
  9. UI should support prompt template "pills" for variable insertion.
  10. CLI errors must be visible in full (or as much as possible) in the UI.
  11. Status changes must live-update across task and agent views via server push.
  12. Wakeup triggers should be centralized by a heartbeat/wakeup service with at least:
    • timer interval
    • wake on task assignment
    • explicit ping/request

3. Goals and Non-Goals

3.1 Goals

  1. Define a stable adapter protocol that supports multiple runtimes.
  2. Ship production-usable local adapters for Claude CLI and Codex CLI.
  3. Persist adapter runtime state (session IDs, token/cost usage, last errors).
  4. Centralize wakeup decisions and queueing in one service.
  5. Provide realtime run/task/agent updates to the browser.
  6. Support deployment-specific full-log storage without bloating Postgres.
  7. Preserve company scoping and existing governance invariants.

3.2 Non-Goals (for this subsystem phase)

  1. Distributed execution workers across multiple hosts.
  2. Third-party adapter marketplace/plugin SDK.
  3. Perfect cost accounting for providers that do not emit cost.
  4. Long-term log archival strategy beyond basic retention.

4. Baseline and Gaps (As of 2026-02-17)

Current code already has:

  • agents with adapterType + adapterConfig.
  • heartbeat_runs with basic status tracking.
  • in-process heartbeatService that invokes process and http.
  • cancellation endpoints for active runs.

Current gaps this spec addresses:

  1. No persistent per-agent runtime state for session resume.
  2. No queue/wakeup abstraction (invoke is immediate).
  3. No assignment-triggered or timer-triggered centralized wakeups.
  4. No websocket/SSE push path to browser.
  5. No persisted run event timeline or external full-log storage contract.
  6. No typed local adapter contracts for Claude/Codex session and usage extraction.
  7. No prompt-template variable/pill system in agent setup.
  8. No deployment-aware adapter for full run log storage (disk/object store/etc).

5. Architecture Overview

The subsystem introduces six cooperating components:

  1. Adapter Registry

    • Maps adapter_type to implementation.
    • Exposes capability metadata and config validation.
  2. Wakeup Coordinator

    • Single entrypoint for all wakeups (timer, assignment, on_demand, automation).
    • Applies dedupe/coalescing and queue rules.
  3. Run Executor

    • Claims queued wakeups.
    • Creates heartbeat_runs.
    • Spawns/monitors child processes for local adapters.
    • Handles timeout/cancel/graceful kill.
  4. Runtime State Store

    • Persists resumable adapter state per agent.
    • Persists run usage summaries and lightweight run-event timeline.
  5. Run Log Store

    • Persists full stdout/stderr streams via pluggable storage adapter.
    • Returns stable logRef for retrieval (local path, object key, or DB reference).
  6. Realtime Event Hub

    • Publishes run/agent/task updates over websocket.
    • Supports selective subscription by company.

Control flow (happy path):

  1. Trigger arrives (timer, assignment, on_demand, or automation).
  2. Wakeup coordinator enqueues/merges wake request.
  3. Executor claims request, creates run row, marks agent running.
  4. Adapter executes, emits status/log/usage events.
  5. Full logs stream to RunLogStore; metadata/events are persisted to DB and pushed to websocket subscribers.
  6. Process exits, output parser updates run result + runtime state.
  7. Agent returns to idle or error; UI updates in real time.

6. Agent Run Protocol (Version agent-run/v1)

This protocol is runtime-agnostic and implemented by all adapters.

type RunOutcome = "succeeded" | "failed" | "cancelled" | "timed_out";
type StatusColor = "neutral" | "blue" | "green" | "yellow" | "red";

interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
  cachedInputTokens?: number;
  cachedOutputTokens?: number;
}

interface AdapterInvokeInput {
  protocolVersion: "agent-run/v1";
  companyId: string;
  agentId: string;
  runId: string;
  wakeupSource: "timer" | "assignment" | "on_demand" | "automation";
  triggerDetail?: "manual" | "ping" | "callback" | "system";
  cwd: string;
  prompt: string;
  adapterConfig: Record<string, unknown>;
  runtimeState: Record<string, unknown>;
  env: Record<string, string>;
  timeoutSec: number;
}

interface AdapterHooks {
  status?: (update: { message: string; color?: StatusColor }) => Promise<void>;
  log?: (event: { stream: "stdout" | "stderr" | "system"; chunk: string }) => Promise<void>;
  usage?: (usage: TokenUsage) => Promise<void>;
  event?: (eventType: string, payload: Record<string, unknown>) => Promise<void>;
}

interface AdapterInvokeResult {
  outcome: RunOutcome;
  exitCode: number | null;
  errorMessage?: string | null;
  summary?: string | null;
  sessionId?: string | null;
  usage?: TokenUsage | null;
  provider?: string | null;
  model?: string | null;
  costUsd?: number | null;
  runtimeStatePatch?: Record<string, unknown>;
  rawResult?: Record<string, unknown> | null;
}

interface AgentRunAdapter {
  type: string;
  protocolVersion: "agent-run/v1";
  capabilities: {
    resumableSession: boolean;
    statusUpdates: boolean;
    logStreaming: boolean;
    tokenUsage: boolean;
  };
  validateConfig(config: unknown): { ok: true } | { ok: false; errors: string[] };
  invoke(input: AdapterInvokeInput, hooks: AdapterHooks, signal: AbortSignal): Promise<AdapterInvokeResult>;
}

6.1 Required Behavior

  1. validateConfig runs before saving or invoking.
  2. invoke must be deterministic for a given config + runtime state + prompt.
  3. Adapter must not mutate DB directly; it returns data via result/events only.
  4. Adapter must emit enough context for errors to be debuggable.
  5. If invoke throws, executor records run as failed with captured error text.

6.2 Optional Behavior

Adapters may omit status/log hooks. If omitted, runtime still emits system lifecycle statuses (queued, running, finished).

6.3 Run log storage protocol

Full run logs are managed by a separate pluggable store (not by the agent adapter).

type RunLogStoreType = "local_file" | "object_store" | "postgres";

interface RunLogHandle {
  store: RunLogStoreType;
  logRef: string; // opaque provider reference (path, key, uri, row id)
}

interface RunLogStore {
  begin(input: { companyId: string; agentId: string; runId: string }): Promise<RunLogHandle>;
  append(
    handle: RunLogHandle,
    event: { stream: "stdout" | "stderr" | "system"; chunk: string; ts: string },
  ): Promise<void>;
  finalize(
    handle: RunLogHandle,
    summary: { bytes: number; sha256?: string; compressed: boolean },
  ): Promise<void>;
  read(
    handle: RunLogHandle,
    opts?: { offset?: number; limitBytes?: number },
  ): Promise<{ content: string; nextOffset?: number }>;
  delete?(handle: RunLogHandle): Promise<void>;
}

V1 deployment defaults:

  1. Dev/local default: local_file (write to data/run-logs/...).
  2. Cloud/serverless default: object_store (S3/R2/GCS compatible).
  3. Optional fallback: postgres with strict size caps.

6.4 Adapter identity and compatibility

For V1 rollout, adapter identity is explicit:

  • claude_local
  • codex_local
  • process (generic existing behavior)
  • http (generic existing behavior)

claude_local and codex_local (written claude-local and codex-local in section headings and prose) are not wrappers around the generic process adapter; they are typed adapters with known parser/resume semantics.

7. Built-in Adapters (Phase 1)

7.1 claude-local

Runs local claude CLI directly.

Config

{
  "cwd": "/absolute/or/relative/path",
  "promptTemplate": "You are agent {{agent.id}} ...",
  "bootstrapPromptTemplate": "Initial setup instructions (optional)",
  "model": "optional-model-id",
  "maxTurnsPerRun": 80,
  "dangerouslySkipPermissions": true,
  "env": {"KEY": "VALUE"},
  "extraArgs": [],
  "timeoutSec": 1800,
  "graceSec": 20
}
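
As an illustration of the validateConfig contract from section 6 applied to this config shape, here is a minimal sketch. The function name validateClaudeLocalConfig is hypothetical, and the choice of which keys are required (cwd and promptTemplate) and which numeric bounds apply is an assumption; the spec does not state them.

```typescript
type ConfigValidation = { ok: true } | { ok: false; errors: string[] };

// Hypothetical validator for the claude-local config shape shown above.
function validateClaudeLocalConfig(config: unknown): ConfigValidation {
  if (typeof config !== "object" || config === null || Array.isArray(config)) {
    return { ok: false, errors: ["config must be a JSON object"] };
  }
  const fields = config as Record<string, unknown>;
  const errors: string[] = [];

  // ASSUMPTION: cwd and promptTemplate are mandatory; the spec does not say so.
  for (const key of ["cwd", "promptTemplate"]) {
    const value = fields[key];
    if (typeof value !== "string" || value.length === 0) {
      errors.push(`${key} must be a non-empty string`);
    }
  }

  // ASSUMPTION: numeric limits must be positive when present.
  for (const key of ["maxTurnsPerRun", "timeoutSec", "graceSec"]) {
    const value = fields[key];
    if (value !== undefined && (typeof value !== "number" || !Number.isFinite(value) || value <= 0)) {
      errors.push(`${key} must be a positive number when set`);
    }
  }

  const extraArgs = fields["extraArgs"];
  if (extraArgs !== undefined && !(Array.isArray(extraArgs) && extraArgs.every((arg) => typeof arg === "string"))) {
    errors.push("extraArgs must be an array of strings");
  }

  const env = fields["env"];
  if (env !== undefined) {
    const isStringMap =
      typeof env === "object" &&
      env !== null &&
      !Array.isArray(env) &&
      Object.keys(env).every((k) => typeof (env as Record<string, unknown>)[k] === "string");
    if (!isStringMap) errors.push("env must map string keys to string values");
  }

  return errors.length === 0 ? { ok: true } : { ok: false, errors };
}
```

Collecting every error instead of stopping at the first one lets the agent setup form show all problems in a single save attempt.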

Invocation

  • Base command: claude --print <prompt> --output-format json
  • Resume: add --resume <sessionId> when runtime state has session ID
  • Unsandboxed mode: add --dangerously-skip-permissions when enabled
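
The invocation rules above can be sketched as a small argument builder. The function and interface names are hypothetical; only the flags listed above are used, and the position of extraArgs at the end is an assumption.

```typescript
// Hypothetical input shape; the field names mirror the adapter config and runtime state.
interface ClaudeArgsInput {
  prompt: string;
  sessionId?: string | null;
  dangerouslySkipPermissions?: boolean;
  extraArgs?: string[];
}

// Builds the argument vector for the `claude` binary.
function buildClaudeArgs(input: ClaudeArgsInput): string[] {
  const args = ["--print", input.prompt, "--output-format", "json"];
  // Resume only when runtime state carries a session ID.
  if (input.sessionId) args.push("--resume", input.sessionId);
  if (input.dangerouslySkipPermissions) args.push("--dangerously-skip-permissions");
  // ASSUMPTION: extraArgs are appended last, after the flags defined by this spec.
  if (input.extraArgs) args.push(...input.extraArgs);
  return args;
}
```

Because the result is an argument array meant for a shell-free spawn, the prompt text is passed as a single argument and never interpreted by a shell.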

Output parsing

  1. Parse stdout JSON object.
  2. Extract session_id for resume.
  3. Extract usage fields:
    • usage.input_tokens
    • usage.cache_read_input_tokens (if present)
    • usage.output_tokens
  4. Extract total_cost_usd when present.
  5. On non-zero exit, still attempt to parse stdout. If parsing succeeds, keep the extracted state (session ID, usage, cost) and mark the run failed unless the adapter explicitly reports success.
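
A minimal sketch of these parsing steps follows. The names parseClaudeOutput and ClaudeParseResult are hypothetical. Treating an is_error field equal to false as the "explicit success" signal is an assumption made for illustration; the spec does not define what explicit success looks like.

```typescript
interface ClaudeUsage {
  inputTokens: number;
  outputTokens: number;
  cachedInputTokens?: number;
}

interface ClaudeParseResult {
  outcome: "succeeded" | "failed";
  errorCode: "nonzero_exit" | "output_parse_error" | null;
  sessionId: string | null;
  usage: ClaudeUsage | null;
  costUsd: number | null;
}

function parseClaudeOutput(stdout: string, exitCode: number | null): ClaudeParseResult {
  let raw: Record<string, unknown>;
  try {
    const parsed: unknown = JSON.parse(stdout);
    if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
      throw new Error("stdout is not a JSON object");
    }
    raw = parsed as Record<string, unknown>;
  } catch {
    return { outcome: "failed", errorCode: "output_parse_error", sessionId: null, usage: null, costUsd: null };
  }

  // Steps 2-4: session ID, usage fields, and cost when present.
  const sessionId = raw["session_id"];
  const cost = raw["total_cost_usd"];
  let usage: ClaudeUsage | null = null;
  const rawUsage = raw["usage"];
  if (typeof rawUsage === "object" && rawUsage !== null) {
    const fields = rawUsage as Record<string, unknown>;
    const input = fields["input_tokens"];
    const output = fields["output_tokens"];
    const cached = fields["cache_read_input_tokens"];
    if (typeof input === "number" && typeof output === "number") {
      const built: ClaudeUsage = { inputTokens: input, outputTokens: output };
      if (typeof cached === "number") built.cachedInputTokens = cached;
      usage = built;
    }
  }

  // Step 5. ASSUMPTION: `is_error === false` counts as an explicit success report.
  const succeeded = exitCode === 0 || raw["is_error"] === false;
  return {
    outcome: succeeded ? "succeeded" : "failed",
    errorCode: succeeded ? null : "nonzero_exit",
    sessionId: typeof sessionId === "string" ? sessionId : null,
    usage,
    costUsd: typeof cost === "number" ? cost : null,
  };
}
```

Note that a failed run still returns the session ID and usage it managed to extract, so the next heartbeat can resume and token accounting stays complete.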

7.2 codex-local

Runs local codex CLI directly.

Config

{
  "cwd": "/absolute/or/relative/path",
  "promptTemplate": "You are agent {{agent.id}} ...",
  "bootstrapPromptTemplate": "Initial setup instructions (optional)",
  "model": "optional-model-id",
  "search": false,
  "dangerouslyBypassApprovalsAndSandbox": true,
  "env": {"KEY": "VALUE"},
  "extraArgs": [],
  "timeoutSec": 1800,
  "graceSec": 20
}

Invocation

  • Base command: codex exec --json <prompt>
  • Resume form: codex exec --json resume <sessionId> <prompt>
  • Unsandboxed mode: add --dangerously-bypass-approvals-and-sandbox when enabled
  • Optional search mode: add --search

Output parsing

Codex emits JSONL events. Parse line-by-line and extract:

  1. thread.started.thread_id -> session ID
  2. item.completed where item type is agent_message -> output text
  3. turn.completed.usage:
    • input_tokens
    • cached_input_tokens
    • output_tokens

Codex JSONL may not currently include cost; store token usage and leave cost null/unknown unless it is available.
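
The line-by-line extraction described above could look like the sketch below. The exact event layout is an assumption: each line is taken to be a JSON object with a type field, the agent message text is assumed to live at item.text, and only the last turn.completed usage seen is kept. The names parseCodexJsonl and CodexParseResult are hypothetical.

```typescript
interface CodexParseResult {
  sessionId: string | null;
  messages: string[];
  usage: { inputTokens: number; cachedInputTokens: number; outputTokens: number } | null;
  costUsd: null; // cost is left unknown, as described above
}

function asRecord(value: unknown): Record<string, unknown> | null {
  return typeof value === "object" && value !== null && !Array.isArray(value)
    ? (value as Record<string, unknown>)
    : null;
}

function asCount(value: unknown): number {
  return typeof value === "number" ? value : 0;
}

function parseCodexJsonl(stdout: string): CodexParseResult {
  const result: CodexParseResult = { sessionId: null, messages: [], usage: null, costUsd: null };
  for (const line of stdout.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    let event: Record<string, unknown> | null;
    try {
      event = asRecord(JSON.parse(trimmed));
    } catch {
      continue; // tolerate non-JSON noise between events
    }
    if (event === null) continue;

    const type = event["type"];
    if (type === "thread.started") {
      const threadId = event["thread_id"];
      if (typeof threadId === "string") result.sessionId = threadId;
    } else if (type === "item.completed") {
      const item = asRecord(event["item"]);
      // ASSUMPTION: the message body is carried in `item.text`.
      const text = item ? item["text"] : undefined;
      if (item && item["type"] === "agent_message" && typeof text === "string") {
        result.messages.push(text);
      }
    } else if (type === "turn.completed") {
      const usage = asRecord(event["usage"]);
      if (usage) {
        // ASSUMPTION: the last turn's usage replaces earlier values instead of being summed.
        result.usage = {
          inputTokens: asCount(usage["input_tokens"]),
          cachedInputTokens: asCount(usage["cached_input_tokens"]),
          outputTokens: asCount(usage["output_tokens"]),
        };
      }
    }
  }
  return result;
}
```

Skipping unparseable lines rather than failing keeps a single stray log line from discarding an otherwise usable session ID.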

7.3 Common local adapter process handling

Both local adapters must:

  1. Use spawn(command, args, { shell: false, stdio: "pipe" }).
  2. Capture stdout/stderr in stream chunks and forward to RunLogStore.
  3. Maintain rolling stdout/stderr tail excerpts in memory for DB diagnostic fields.
  4. Emit live log events to websocket subscribers (optional to throttle/chunk).
  5. Support graceful cancel: SIGTERM, then SIGKILL after graceSec.
  6. Enforce timeout using adapter timeoutSec.
  7. Return exit code + parsed result + diagnostic stderr.

8. Heartbeat and Wakeup Coordinator

8.1 Wakeup sources

Supported sources:

  1. timer: periodic heartbeat per agent.
  2. assignment: issue assigned/reassigned to agent.
  3. on_demand: explicit wake request path (board/manual click or API ping).
  4. automation: non-interactive wake path (external callback or internal system automation).

8.2 Central API

All sources call one internal service:

enqueueWakeup({
  companyId,
  agentId,
  source,
  triggerDetail, // optional: manual|ping|callback|system
  reason,
  payload,
  requestedBy,
  idempotencyKey?
})

No source invokes adapters directly.

8.3 Queue semantics

  1. At most one active run per agent (unchanged from current behavior).
  2. If agent already has queued/running run:
    • coalesce duplicate wakeups
    • increment coalescedCount
    • preserve latest reason/source metadata
  3. Queue is DB-backed for restart safety.
  4. Coordinator uses FIFO by requested_at, with optional priority:
    • on_demand > assignment > timer/automation
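
The coalescing and ordering rules can be illustrated with an in-memory sketch. It is not the DB-backed queue the spec calls for, and it models only queued requests, not running ones. The class and field names are hypothetical, and keeping the original requestedAt when a wakeup is coalesced is an assumption.

```typescript
type WakeSource = "timer" | "assignment" | "on_demand" | "automation";

interface QueuedWakeup {
  agentId: string;
  source: WakeSource;
  reason: string;
  requestedAt: number; // epoch milliseconds
  coalescedCount: number;
}

// Lower number claims first: on_demand > assignment > timer/automation.
const SOURCE_PRIORITY: Record<WakeSource, number> = {
  on_demand: 0,
  assignment: 1,
  timer: 2,
  automation: 2,
};

class WakeQueueSketch {
  private pending = new Map<string, QueuedWakeup>();

  enqueue(agentId: string, source: WakeSource, reason: string, requestedAt: number): QueuedWakeup {
    const existing = this.pending.get(agentId);
    if (existing) {
      // Coalesce: bump the counter and keep the latest source/reason metadata.
      existing.coalescedCount += 1;
      existing.source = source;
      existing.reason = reason;
      return existing;
    }
    const created: QueuedWakeup = { agentId, source, reason, requestedAt, coalescedCount: 0 };
    this.pending.set(agentId, created);
    return created;
  }

  // Priority first, then FIFO by requestedAt within the same priority.
  claimNext(): QueuedWakeup | undefined {
    const ordered = Array.from(this.pending.values()).sort(
      (a, b) => SOURCE_PRIORITY[a.source] - SOURCE_PRIORITY[b.source] || a.requestedAt - b.requestedAt,
    );
    const next = ordered[0];
    if (next) this.pending.delete(next.agentId);
    return next;
  }
}
```

Keying pending requests by agent makes the one-queued-request-per-agent rule structural rather than something each caller has to check.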

8.4 Agent heartbeat policy fields

Agent-level control-plane settings (not adapter-specific):

{
  "heartbeat": {
    "enabled": true,
    "intervalSec": 300,
    "wakeOnAssignment": true,
    "wakeOnOnDemand": true,
    "wakeOnAutomation": true,
    "cooldownSec": 10
  }
}

Defaults:

  • enabled: true
  • intervalSec: null (no timer until explicitly set) or product default 300 if desired globally
  • wakeOnAssignment: true
  • wakeOnOnDemand: true
  • wakeOnAutomation: true

8.5 Trigger integration rules

  1. Timer checks run on server worker interval and enqueue due agents.
  2. Issue assignment mutation enqueues wakeup when assignee changes and target agent has wakeOnAssignment=true.
  3. On-demand endpoint enqueues wakeup with source=on_demand and triggerDetail=manual|ping when wakeOnOnDemand=true.
  4. Callback/system automations enqueue wakeup with source=automation and triggerDetail=callback|system when wakeOnAutomation=true.
  5. Paused/terminated agents do not receive new wakeups.
  6. Hard budget-stopped agents do not receive new wakeups.
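
These rules amount to a gate that every trigger passes through before enqueueing. The sketch below uses hypothetical names (decideWakeup, WakeGateInput, budgetHardStopped, skipReason). It also assumes that heartbeat.enabled gates timer wakeups only and that a null intervalSec means no timer, neither of which the spec settles.

```typescript
type WakeupSource = "timer" | "assignment" | "on_demand" | "automation";
type AgentStatus = "idle" | "running" | "error" | "paused" | "terminated";

interface HeartbeatPolicy {
  enabled: boolean;
  intervalSec: number | null;
  wakeOnAssignment: boolean;
  wakeOnOnDemand: boolean;
  wakeOnAutomation: boolean;
}

interface WakeGateInput {
  status: AgentStatus;
  budgetHardStopped: boolean;
  policy: HeartbeatPolicy;
}

type WakeDecision = { allowed: true } | { allowed: false; skipReason: string };

function decideWakeup(agent: WakeGateInput, source: WakeupSource): WakeDecision {
  // Rules 5 and 6: agent-level blocks apply to every source.
  if (agent.status === "paused" || agent.status === "terminated") {
    return { allowed: false, skipReason: `agent_${agent.status}` };
  }
  if (agent.budgetHardStopped) {
    return { allowed: false, skipReason: "budget_blocked" };
  }

  // Rules 1-4: per-source policy flags.
  const policy = agent.policy;
  const allowedBySource: Record<WakeupSource, boolean> = {
    // ASSUMPTION: `enabled` and a non-null interval are both needed for timer wakeups.
    timer: policy.enabled && policy.intervalSec !== null,
    assignment: policy.wakeOnAssignment,
    on_demand: policy.wakeOnOnDemand,
    automation: policy.wakeOnAutomation,
  };
  return allowedBySource[source] ? { allowed: true } : { allowed: false, skipReason: `${source}_disabled` };
}
```

Returning a skip reason instead of a bare boolean gives the coordinator something to record on a skipped wakeup request.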

9. Persistence Model

All tables remain company-scoped.

9.0 Changes to agents

  1. Extend adapter_type domain to include claude_local and codex_local (alongside existing process, http).
  2. Keep adapter_config as adapter-owned config (CLI flags, cwd, prompt templates, env overrides).
  3. Add runtime_config jsonb for control-plane scheduling policy:
    • heartbeat enable/interval
    • wake-on-assignment
    • wake-on-on-demand
    • wake-on-automation
    • cooldown

This separation keeps adapter config runtime-agnostic while allowing the heartbeat service to apply consistent scheduling logic.

9.1 New table: agent_runtime_state

One row per agent for aggregate runtime counters and legacy compatibility.

  • agent_id uuid pk fk agents.id
  • company_id uuid fk not null
  • adapter_type text not null
  • session_id text null
  • state_json jsonb not null default {}
  • last_run_id uuid fk heartbeat_runs.id null
  • last_run_status text null
  • total_input_tokens bigint not null default 0
  • total_output_tokens bigint not null default 0
  • total_cached_input_tokens bigint not null default 0
  • total_cost_cents bigint not null default 0
  • last_error text null
  • updated_at timestamptz not null

Invariant: exactly one runtime state row per agent.

9.1.1 New table: agent_task_sessions

One row per (company_id, agent_id, adapter_type, task_key) for resumable session state.

  • id uuid pk
  • company_id uuid fk not null
  • agent_id uuid fk not null
  • adapter_type text not null
  • task_key text not null
  • session_params_json jsonb null (adapter-defined shape)
  • session_display_id text null (for UI/debug)
  • last_run_id uuid fk heartbeat_runs.id null
  • last_error text null
  • created_at timestamptz not null
  • updated_at timestamptz not null

Invariant: unique (company_id, agent_id, adapter_type, task_key).

9.2 New table: agent_wakeup_requests

Queue + audit for wakeups.

  • id uuid pk
  • company_id uuid fk not null
  • agent_id uuid fk not null
  • source text not null (timer|assignment|on_demand|automation)
  • trigger_detail text null (manual|ping|callback|system)
  • reason text null
  • payload jsonb null
  • status text not null (queued|claimed|coalesced|skipped|completed|failed|cancelled)
  • coalesced_count int not null default 0
  • requested_by_actor_type text null (user|agent|system)
  • requested_by_actor_id text null
  • idempotency_key text null
  • run_id uuid fk heartbeat_runs.id null
  • requested_at timestamptz not null
  • claimed_at timestamptz null
  • finished_at timestamptz null
  • error text null

9.3 New table: heartbeat_run_events

Append-only per-run lightweight event timeline (no full raw log chunks).

  • id bigserial pk
  • company_id uuid fk not null
  • run_id uuid fk heartbeat_runs.id not null
  • agent_id uuid fk agents.id not null
  • seq int not null
  • event_type text not null (lifecycle|status|usage|error|structured)
  • stream text null (system|stdout|stderr) (summarized events only, not full stream chunks)
  • level text null (info|warn|error)
  • color text null
  • message text null
  • payload jsonb null
  • created_at timestamptz not null

9.4 Changes to heartbeat_runs

Add fields required for result and diagnostics:

  • wakeup_request_id uuid fk agent_wakeup_requests.id null
  • exit_code int null
  • signal text null
  • usage_json jsonb null
  • result_json jsonb null
  • session_id_before text null
  • session_id_after text null
  • log_store text null (local_file|object_store|postgres)
  • log_ref text null (opaque provider reference; path/key/uri/row id)
  • log_bytes bigint null
  • log_sha256 text null
  • log_compressed boolean not null default false
  • stderr_excerpt text null
  • stdout_excerpt text null
  • error_code text null

This keeps per-run diagnostics queryable without storing full logs in Postgres.

9.5 Log storage adapter configuration

Runtime log storage is deployment-configured (not per-agent by default).

{
  "runLogStore": {
    "type": "local_file | object_store | postgres",
    "basePath": "./data/run-logs",
    "bucket": "paperclip-run-logs",
    "prefix": "runs/",
    "compress": true,
    "maxInlineExcerptBytes": 32768
  }
}

Rules:

  1. log_ref must be opaque and provider-neutral at API boundaries.
  2. UI/API must not assume local filesystem semantics.
  3. Provider-specific secrets/credentials stay in server config, never in agent config.

10. Prompt Template and Pill System

10.1 Template format

  • Mustache-style placeholders: {{path.to.value}}
  • No arbitrary code execution.
  • Unknown variable on save = validation error.

10.2 Initial variable catalog

  • company.id
  • company.name
  • agent.id
  • agent.name
  • agent.role
  • agent.title
  • run.id
  • run.source
  • run.startedAt
  • heartbeat.reason
  • paperclip.skill (shared Paperclip skill text block)
  • credentials.apiBaseUrl
  • credentials.apiKey (optional, sensitive)
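
Save-time validation against this catalog, plus plain substitution at run time, can be sketched as follows. The function names are hypothetical, values are assumed to arrive as a flat map keyed by the dotted variable path, and allowing whitespace inside the braces is an assumption about the placeholder syntax.

```typescript
// The initial variable catalog listed above.
const KNOWN_VARIABLES: readonly string[] = [
  "company.id", "company.name",
  "agent.id", "agent.name", "agent.role", "agent.title",
  "run.id", "run.source", "run.startedAt",
  "heartbeat.reason", "paperclip.skill",
  "credentials.apiBaseUrl", "credentials.apiKey",
];

// Matches {{path.to.value}}; ASSUMPTION: optional whitespace inside the braces is accepted.
const PLACEHOLDER = /\{\{\s*([A-Za-z0-9_.]+)\s*\}\}/g;

// Returns each unknown variable once; a non-empty result is a save-time validation error.
function findUnknownVariables(template: string): string[] {
  const unknown: string[] = [];
  template.replace(PLACEHOLDER, (whole: string, name: string) => {
    if (KNOWN_VARIABLES.indexOf(name) === -1 && unknown.indexOf(name) === -1) {
      unknown.push(name);
    }
    return whole;
  });
  return unknown;
}

// Pure string substitution: no code execution, and unresolved placeholders stay as written.
function renderTemplate(template: string, values: Record<string, string>): string {
  return template.replace(PLACEHOLDER, (whole: string, name: string) => {
    const value = values[name];
    return value !== undefined ? value : whole;
  });
}
```

For example, renderTemplate("You are agent {{agent.id}}", { "agent.id": "a-1" }) yields "You are agent a-1".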

10.3 Prompt fields

  1. bootstrapPromptTemplate
    • Used when no session exists.
  2. promptTemplate
    • Used on every wakeup.
    • Can include run source/reason pills.

If bootstrapPromptTemplate is omitted, promptTemplate is used for the first run.

10.4 UI requirements

  1. Agent setup/edit form includes prompt editors with pill insertion.
  2. Variables are shown as clickable pills for fast insertion.
  3. Save-time validation indicates unknown/missing variables.
  4. Sensitive pills (credentials.*) show explicit warning badge.

10.5 Security notes for credentials

  1. Credentials in prompt are allowed for initial simplicity but discouraged.
  2. Preferred transport is env vars (PAPERCLIP_*) injected at runtime.
  3. Prompt preview and logs must redact sensitive values.

11. Realtime Status Delivery

11.1 Transport

Primary transport: websocket channel per company.

  • Endpoint: GET /api/companies/:companyId/events/ws
  • Auth: board session or agent API key (company-bound)

11.2 Event envelope

{
  "eventId": "uuid-or-monotonic-id",
  "companyId": "uuid",
  "type": "heartbeat.run.status",
  "entityType": "heartbeat_run",
  "entityId": "uuid",
  "occurredAt": "2026-02-17T12:00:00Z",
  "payload": {}
}

11.3 Required event types

  1. agent.status.changed
  2. heartbeat.run.queued
  3. heartbeat.run.started
  4. heartbeat.run.status (short color+message updates)
  5. heartbeat.run.log (optional live chunk stream; full persistence handled by RunLogStore)
  6. heartbeat.run.finished
  7. issue.updated
  8. issue.comment.created
  9. activity.appended
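
On the client side, the envelope and the required event types can be captured in a typed parser. This is a sketch: the names LiveEvent and parseLiveEvent are hypothetical, and dropping messages with an unknown type or a mismatched companyId is an assumption about client policy.

```typescript
const LIVE_EVENT_TYPES = [
  "agent.status.changed",
  "heartbeat.run.queued",
  "heartbeat.run.started",
  "heartbeat.run.status",
  "heartbeat.run.log",
  "heartbeat.run.finished",
  "issue.updated",
  "issue.comment.created",
  "activity.appended",
] as const;

type LiveEventType = (typeof LIVE_EVENT_TYPES)[number];

interface LiveEvent {
  eventId: string;
  companyId: string;
  type: LiveEventType;
  entityType: string;
  entityId: string;
  occurredAt: string;
  payload: Record<string, unknown>;
}

// Parses one websocket message; returns null for anything that does not match the envelope.
function parseLiveEvent(raw: string, expectedCompanyId: string): LiveEvent | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) return null;
  const e = parsed as Record<string, unknown>;

  const eventId = e["eventId"];
  const companyId = e["companyId"];
  const type = e["type"];
  const entityType = e["entityType"];
  const entityId = e["entityId"];
  const occurredAt = e["occurredAt"];
  const payload = e["payload"];

  if (typeof eventId !== "string" || typeof entityType !== "string") return null;
  if (typeof entityId !== "string" || typeof occurredAt !== "string") return null;
  // ASSUMPTION: the client ignores events addressed to another company.
  if (companyId !== expectedCompanyId) return null;
  if (typeof type !== "string" || (LIVE_EVENT_TYPES as readonly string[]).indexOf(type) === -1) return null;

  const payloadRecord =
    typeof payload === "object" && payload !== null && !Array.isArray(payload)
      ? (payload as Record<string, unknown>)
      : {};
  return { eventId, companyId: expectedCompanyId, type: type as LiveEventType, entityType, entityId, occurredAt, payload: payloadRecord };
}
```

A UI handler can then switch on event.type to decide which cached query or view to refresh.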

11.4 UI behavior

  1. Agent detail view updates run timeline live.
  2. Task board reflects assignment/status/comment changes from agent activity without refresh.
  3. Org/agent list reflects status changes live.
  4. If websocket disconnects, client falls back to short polling until reconnect.

12. Error Handling and Diagnostics

12.1 Error classes

  • adapter_not_installed
  • invalid_working_directory
  • spawn_failed
  • timeout
  • cancelled
  • nonzero_exit
  • output_parse_error
  • resume_session_invalid
  • budget_blocked

12.2 Logging requirements

  1. Persist full stdout/stderr stream to configured RunLogStore.
  2. Persist only lightweight run metadata/events in Postgres (heartbeat_runs, heartbeat_run_events).
  3. Persist bounded stdout_excerpt and stderr_excerpt in Postgres for quick diagnostics.
  4. Mark truncation explicitly when excerpts are capped.
  5. Redact secrets from logs, excerpts, and websocket payloads.
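
Items 3 to 5 interact: redaction should happen before capping, otherwise a cut could leave part of a secret in the excerpt. The sketch below shows that ordering. The function names and the [REDACTED] marker are hypothetical, secrets are assumed to be known literal strings (for example injected env values), and keeping the tail of the output is an assumption.

```typescript
// Replaces every literal occurrence of each secret with a fixed marker.
function redactSecrets(text: string, secrets: string[]): string {
  let out = text;
  for (const secret of secrets) {
    if (secret.length === 0) continue; // an empty string would match everywhere
    out = out.split(secret).join("[REDACTED]");
  }
  return out;
}

// Redact first, then cap, and report truncation explicitly.
function buildExcerpt(
  fullText: string,
  secrets: string[],
  maxChars: number,
): { text: string; truncated: boolean } {
  const redacted = redactSecrets(fullText, secrets);
  if (redacted.length <= maxChars) {
    return { text: redacted, truncated: false };
  }
  // ASSUMPTION: the end of the output is the most useful part for diagnostics.
  return { text: redacted.slice(redacted.length - maxChars), truncated: true };
}
```

The same redactSecrets pass would apply to websocket log payloads before they are published.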

12.3 Log retention and lifecycle

  1. RunLogStore retention is configurable by deployment (for example 7/30/90 days).
  2. Postgres run metadata can outlive full log objects.
  3. Deletion/pruning jobs must handle orphaned metadata/log-object references safely.
  4. If full log object is gone, APIs still return metadata and excerpts with log_unavailable status.

12.4 Restart recovery

On server startup:

  1. Find stale queued/running runs.
  2. Mark as failed with error_code=control_plane_restart.
  3. Set affected non-paused/non-terminated agents to error (or idle based on policy).
  4. Emit recovery events to websocket and activity log.

13. API Surface Changes

13.1 New/updated endpoints

  1. POST /agents/:agentId/wakeup
    • enqueue wakeup with source/reason
  2. POST /agents/:agentId/heartbeat/invoke
    • backward-compatible alias to wakeup API
  3. GET /agents/:agentId/runtime-state
    • board-only debug view
  4. GET /agents/:agentId/task-sessions
    • board-only list of task-scoped adapter sessions
  5. POST /agents/:agentId/runtime-state/reset-session
    • clears all task sessions for the agent, or one when taskKey is provided
  6. GET /heartbeat-runs/:runId/events?afterSeq=:n
    • fetch persisted lightweight timeline
  7. GET /heartbeat-runs/:runId/log
    • reads full log stream via RunLogStore (or redirects/presigned URL for object store)
  8. GET /api/companies/:companyId/events/ws
    • websocket stream

13.2 Mutation logging

All wakeup/run state mutations must create activity_log entries:

  • wakeup.requested
  • wakeup.coalesced
  • heartbeat.started
  • heartbeat.finished
  • heartbeat.failed
  • heartbeat.cancelled
  • runtime_state.updated

14. Heartbeat Service Implementation Plan

Phase 1: Contracts and schema

  1. Add new DB tables/columns (agent_runtime_state, agent_task_sessions, agent_wakeup_requests, heartbeat_run_events, heartbeat_runs.log_* fields).
  2. Add RunLogStore interface and configuration wiring.
  3. Add shared types/constants/validators.
  4. Keep existing routes functional during migration.

Phase 2: Wakeup coordinator

  1. Implement DB-backed wakeup queue.
  2. Convert invoke/wake routes to enqueue with source=on_demand and appropriate triggerDetail.
  3. Add worker loop to claim and execute queued wakeups.

Phase 3: Local adapters

  1. Implement claude-local adapter.
  2. Implement codex-local adapter.
  3. Parse and persist session IDs and token usage.
  4. Wire cancel/timeout/grace behavior.

Phase 4: Realtime push

  1. Implement company websocket hub.
  2. Publish run/agent/issue events.
  3. Update UI pages to subscribe and invalidate/update relevant data.

Phase 5: Prompt pills and config UX

  1. Add adapter-specific config editor with prompt templates.
  2. Add pill insertion and variable validation.
  3. Add sensitive-variable warnings and redaction.

Phase 6: Hardening

  1. Add failure/restart recovery sweeps.
  2. Add metadata/full-log retention policies and pruning jobs.
  3. Add integration/e2e coverage for wakeup triggers and live updates.

15. Acceptance Criteria

  1. Agent with claude-local or codex-local can run, exit, and persist run result.
  2. Session parameters are persisted per task scope and reused automatically for same-task resumes.
  3. Token usage is persisted per run and accumulated per agent runtime state.
  4. Timer, assignment, on-demand, and automation wakeups all enqueue through one coordinator.
  5. Pause/terminate interrupts running local process and prevents new wakeups.
  6. Browser receives live websocket updates for run status/logs and task/agent changes.
  7. Failed runs expose rich CLI diagnostics in UI with excerpts immediately available and full log retrievable via RunLogStore.
  8. All actions remain company-scoped and auditable.

16. Open Questions

  1. Should timer default be null (off until enabled) or 300 seconds by default?
  2. What should the default retention policy be for full log objects vs Postgres metadata?
  3. Should agent API credentials be allowed in prompt templates by default, or require explicit opt-in toggle?
  4. Should websocket be the only realtime channel, or should we also expose SSE for simpler clients?