Reorganize docs: move specs and plans to doc/ subdirectories

Move doc/specs/ui.md to doc/spec/ui.md. Move plans/module-system.md to doc/plans/. Add doc/spec/agents-runtime.md and docs/ reference specs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 12:24:31 -06:00
parent 774d74bcba
commit 13ef123026
5 changed files with 628 additions and 0 deletions
--- a/doc/plans/module-system.md
+++ b/doc/plans/module-system.md
--- a/doc/spec/agents-runtime.md
+++ b/doc/spec/agents-runtime.md
@@ -0,0 +1,171 @@
+# Agent Runtime Guide
+
+Status: User-facing guide  
+Last updated: 2026-02-17  
+Audience: Operators setting up and running agents in Paperclip
+
+## 1. What this system does
+
+Agents in Paperclip do not run continuously.  
+They run in **heartbeats**: short execution windows triggered by a wakeup.
+
+Each heartbeat:
+
+1. Starts the configured agent adapter (for example, Claude CLI or Codex CLI)
+2. Gives it the current prompt/context
+3. Lets it work until it exits, times out, or is cancelled
+4. Stores results (status, token usage, errors, logs)
+5. Updates the UI live
+
+## 2. When an agent wakes up
+
+An agent can be woken up in four ways:
+
+- `timer`: scheduled interval (for example every 5 minutes)
+- `assignment`: when work is assigned/checked out to that agent
+- `on_demand`: manual wakeup (button/API)
+- `automation`: system-triggered wakeup for future automations
+
+If an agent is already running, new wakeups are merged (coalesced) instead of launching duplicate runs.
+
+## 3. What to configure per agent
+
+## 3.1 Adapter choice
+
+Common choices:
+
+- `claude_local`: runs your local `claude` CLI
+- `codex_local`: runs your local `codex` CLI
+- `process`: generic shell command adapter
+- `http`: calls an external HTTP endpoint
+
+For `claude_local` and `codex_local`, Paperclip assumes the CLI is already installed and authenticated on the host machine.
+
+## 3.2 Runtime behavior
+
+In agent runtime settings, configure heartbeat policy:
+
+- `enabled`: allow scheduled heartbeats
+- `intervalSec`: timer interval (0 = disabled)
+- `wakeOnAssignment`: wake when assigned work
+- `wakeOnOnDemand`: allow ping-style on-demand wakeups
+- `wakeOnAutomation`: allow system automation wakeups
+
+## 3.3 Working directory and execution limits
+
+For local adapters, set:
+
+- `cwd` (working directory)
+- `timeoutSec` (max runtime per heartbeat)
+- `graceSec` (time before force-kill after timeout/cancel)
+- optional env vars and extra CLI args
+
+## 3.4 Prompt templates
+
+You can set:
+
+- `bootstrapPromptTemplate`: used for first run/new session
+- `promptTemplate`: used for subsequent resumed runs
+
+Templates support variables like `{{agent.id}}`, `{{agent.name}}`, and run context values.
+
+## 4. Session resume behavior
+
+Paperclip stores session IDs for resumable adapters.
+
+- Next heartbeat reuses the saved session automatically.
+- This gives continuity across heartbeats.
+- You can reset a session if context gets stale or confused.
+
+Use session reset when:
+
+- you significantly changed prompt strategy
+- the agent is stuck in a bad loop
+- you want a clean restart
+
+## 5. Logs, status, and run history
+
+For each heartbeat run you get:
+
+- run status (`queued`, `running`, `succeeded`, `failed`, `timed_out`, `cancelled`)
+- error text and stderr/stdout excerpts
+- token usage/cost when available from the adapter
+- full logs (stored outside core run rows, optimized for large output)
+
+In local/dev setups, full logs are stored on disk under the configured run-log path.
+
+## 6. Live updates in the UI
+
+Paperclip pushes runtime/activity updates to the browser in real time.
+
+You should see live changes for:
+
+- agent status
+- heartbeat run status
+- task/activity updates caused by agent work
+- dashboard/cost/activity panels as relevant
+
+If the connection drops, the UI reconnects automatically.
+
+## 7. Common operating patterns
+
+## 7.1 Simple autonomous loop
+
+1. Enable timer wakeups (for example every 300s)
+2. Keep assignment wakeups on
+3. Use a focused prompt template
+4. Watch run logs and adjust prompt/config over time
+
+## 7.2 Event-driven loop (less constant polling)
+
+1. Disable timer or set a long interval
+2. Keep wake-on-assignment enabled
+3. Use on-demand wakeups for manual nudges
+
+## 7.3 Safety-first loop
+
+1. Short timeout
+2. Conservative prompt
+3. Monitor errors + cancel quickly when needed
+4. Reset sessions when drift appears
+
+## 8. Troubleshooting
+
+If runs fail repeatedly:
+
+1. Check adapter command availability (`claude`/`codex` installed and logged in).
+2. Verify `cwd` exists and is accessible.
+3. Inspect run error + stderr excerpt, then full log.
+4. Confirm timeout is not too low.
+5. Reset session and retry.
+6. Pause agent if it is causing repeated bad updates.
+
+Typical failure causes:
+
+- CLI not installed/authenticated
+- bad working directory
+- malformed adapter args/env
+- prompt too broad or missing constraints
+- process timeout
+
+## 9. Security and risk notes
+
+Local CLI adapters run unsandboxed on the host machine.
+
+That means:
+
+- prompt instructions matter
+- configured credentials/env vars are sensitive
+- working directory permissions matter
+
+Start with least privilege where possible, and avoid exposing secrets in broad reusable prompts unless intentionally required.
+
+## 10. Minimal setup checklist
+
+1. Choose adapter (`claude_local` or `codex_local`).
+2. Set `cwd` to the target workspace.
+3. Add bootstrap + normal prompt templates.
+4. Configure heartbeat policy (timer and/or assignment wakeups).
+5. Trigger a manual wakeup.
+6. Confirm run succeeds and session/token usage is recorded.
+7. Watch live updates and iterate prompt/config.
--- a/doc/specs/ui.md
+++ b/doc/specs/ui.md
--- a/docs/agents-runtime.md
+++ b/docs/agents-runtime.md
@@ -0,0 +1,171 @@
+# Agent Runtime Guide
+
+Status: User-facing guide  
+Last updated: 2026-02-17  
+Audience: Operators setting up and running agents in Paperclip
+
+## 1. What this system does
+
+Agents in Paperclip do not run continuously.  
+They run in **heartbeats**: short execution windows triggered by a wakeup.
+
+Each heartbeat:
+
+1. Starts the configured agent adapter (for example, Claude CLI or Codex CLI)
+2. Gives it the current prompt/context
+3. Lets it work until it exits, times out, or is cancelled
+4. Stores results (status, token usage, errors, logs)
+5. Updates the UI live
+
+## 2. When an agent wakes up
+
+An agent can be woken up in four ways:
+
+- `timer`: scheduled interval (for example every 5 minutes)
+- `assignment`: when work is assigned/checked out to that agent
+- `on_demand`: manual wakeup (button/API)
+- `automation`: system-triggered wakeup for future automations
+
+If an agent is already running, new wakeups are merged (coalesced) instead of launching duplicate runs.
+
+## 3. What to configure per agent
+
+## 3.1 Adapter choice
+
+Common choices:
+
+- `claude_local`: runs your local `claude` CLI
+- `codex_local`: runs your local `codex` CLI
+- `process`: generic shell command adapter
+- `http`: calls an external HTTP endpoint
+
+For `claude_local` and `codex_local`, Paperclip assumes the CLI is already installed and authenticated on the host machine.
+
+## 3.2 Runtime behavior
+
+In agent runtime settings, configure heartbeat policy:
+
+- `enabled`: allow scheduled heartbeats
+- `intervalSec`: timer interval (0 = disabled)
+- `wakeOnAssignment`: wake when assigned work
+- `wakeOnOnDemand`: allow ping-style on-demand wakeups
+- `wakeOnAutomation`: allow system automation wakeups
+
+## 3.3 Working directory and execution limits
+
+For local adapters, set:
+
+- `cwd` (working directory)
+- `timeoutSec` (max runtime per heartbeat)
+- `graceSec` (time before force-kill after timeout/cancel)
+- optional env vars and extra CLI args
+
+## 3.4 Prompt templates
+
+You can set:
+
+- `bootstrapPromptTemplate`: used for first run/new session
+- `promptTemplate`: used for subsequent resumed runs
+
+Templates support variables like `{{agent.id}}`, `{{agent.name}}`, and run context values.
+
+## 4. Session resume behavior
+
+Paperclip stores session IDs for resumable adapters.
+
+- Next heartbeat reuses the saved session automatically.
+- This gives continuity across heartbeats.
+- You can reset a session if context gets stale or confused.
+
+Use session reset when:
+
+- you significantly changed prompt strategy
+- the agent is stuck in a bad loop
+- you want a clean restart
+
+## 5. Logs, status, and run history
+
+For each heartbeat run you get:
+
+- run status (`queued`, `running`, `succeeded`, `failed`, `timed_out`, `cancelled`)
+- error text and stderr/stdout excerpts
+- token usage/cost when available from the adapter
+- full logs (stored outside core run rows, optimized for large output)
+
+In local/dev setups, full logs are stored on disk under the configured run-log path.
+
+## 6. Live updates in the UI
+
+Paperclip pushes runtime/activity updates to the browser in real time.
+
+You should see live changes for:
+
+- agent status
+- heartbeat run status
+- task/activity updates caused by agent work
+- dashboard/cost/activity panels as relevant
+
+If the connection drops, the UI reconnects automatically.
+
+## 7. Common operating patterns
+
+## 7.1 Simple autonomous loop
+
+1. Enable timer wakeups (for example every 300s)
+2. Keep assignment wakeups on
+3. Use a focused prompt template
+4. Watch run logs and adjust prompt/config over time
+
+## 7.2 Event-driven loop (less constant polling)
+
+1. Disable timer or set a long interval
+2. Keep wake-on-assignment enabled
+3. Use on-demand wakeups for manual nudges
+
+## 7.3 Safety-first loop
+
+1. Short timeout
+2. Conservative prompt
+3. Monitor errors + cancel quickly when needed
+4. Reset sessions when drift appears
+
+## 8. Troubleshooting
+
+If runs fail repeatedly:
+
+1. Check adapter command availability (`claude`/`codex` installed and logged in).
+2. Verify `cwd` exists and is accessible.
+3. Inspect run error + stderr excerpt, then full log.
+4. Confirm timeout is not too low.
+5. Reset session and retry.
+6. Pause agent if it is causing repeated bad updates.
+
+Typical failure causes:
+
+- CLI not installed/authenticated
+- bad working directory
+- malformed adapter args/env
+- prompt too broad or missing constraints
+- process timeout
+
+## 9. Security and risk notes
+
+Local CLI adapters run unsandboxed on the host machine.
+
+That means:
+
+- prompt instructions matter
+- configured credentials/env vars are sensitive
+- working directory permissions matter
+
+Start with least privilege where possible, and avoid exposing secrets in broad reusable prompts unless intentionally required.
+
+## 10. Minimal setup checklist
+
+1. Choose adapter (`claude_local` or `codex_local`).
+2. Set `cwd` to the target workspace.
+3. Add bootstrap + normal prompt templates.
+4. Configure heartbeat policy (timer and/or assignment wakeups).
+5. Trigger a manual wakeup.
+6. Confirm run succeeds and session/token usage is recorded.
+7. Watch live updates and iterate prompt/config.
--- a/docs/specs/agent-config-ui.md
+++ b/docs/specs/agent-config-ui.md
@@ -0,0 +1,286 @@
+# Agent Configuration & Activity UI
+
+## Context
+
+Agents are the employees of a Paperclip company. Each agent has an adapter type (`claude_local`, `codex_local`, `process`, `http`) that determines how it runs, a position in the org chart (who it reports to), a heartbeat policy (how/when it wakes up), and a budget. The UI at `/agents` needs to support creating and configuring agents, viewing their org hierarchy, and inspecting what they've been doing -- their run history, live logs, and accumulated costs.
+
+This spec covers three surfaces:
+
+1. **Agent Creation Dialog** -- the "New Agent" flow
+2. **Agent Detail Page** -- configuration, activity, and logs
+3. **Agents List Page** -- improvements to the existing list
+
+---
+
+## 1. Agent Creation Dialog
+
+Follows the existing `NewIssueDialog` / `NewProjectDialog` pattern: a `Dialog` component with expand/minimize toggle, company badge breadcrumb, and Cmd+Enter submit.
+
+### Fields
+
+**Identity (always visible):**
+
+| Field | Control | Required | Default | Notes |
+|-------|---------|----------|---------|-------|
+| Name | Text input (large, auto-focused) | Yes | -- | e.g. "Alice", "Build Bot" |
+| Title | Text input (subtitle style) | No | -- | e.g. "VP of Engineering" |
+| Role | Chip popover (select) | No | `general` | Values from `AGENT_ROLES`: ceo, cto, cmo, cfo, engineer, designer, pm, qa, devops, researcher, general |
+| Reports To | Chip popover (agent select) | No | -- | Dropdown of existing agents in the company. If this is the first agent, auto-set role to `ceo` and gray out Reports To. Otherwise required unless role is `ceo`. |
+| Capabilities | Text input | No | -- | Free-text description of what this agent can do |
+
+**Adapter (collapsible section, default open):**
+
+| Field | Control | Default | Notes |
+|-------|---------|---------|-------|
+| Adapter Type | Chip popover (select) | `claude_local` | `claude_local`, `codex_local`, `process`, `http` |
+| CWD | Text input | -- | Working directory for local adapters |
+| Prompt Template | Textarea | -- | Supports `{{ agent.id }}`, `{{ agent.name }}` etc. |
+| Bootstrap Prompt | Textarea | -- | Optional, used for first run (no existing session) |
+| Model | Text input | -- | Optional model override |
+
+**Adapter-specific fields (shown/hidden based on adapter type):**
+
+*claude_local:*
+| Field | Control | Default |
+|-------|---------|---------|
+| Max Turns Per Run | Number input | 80 |
+| Skip Permissions | Toggle | true |
+
+*codex_local:*
+| Field | Control | Default |
+|-------|---------|---------|
+| Search | Toggle | false |
+| Bypass Sandbox | Toggle | true |
+
+*process:*
+| Field | Control | Default |
+|-------|---------|---------|
+| Command | Text input | -- |
+| Args | Text input (comma-separated) | -- |
+
+*http:*
+| Field | Control | Default |
+|-------|---------|---------|
+| URL | Text input | -- |
+| Method | Select | POST |
+| Headers | Key-value pairs | -- |
+
+**Runtime (collapsible section, default collapsed):**
+
+| Field | Control | Default |
+|-------|---------|---------|
+| Context Mode | Chip popover | `thin` |
+| Monthly Budget (cents) | Number input | 0 |
+| Timeout (sec) | Number input | 900 |
+| Grace Period (sec) | Number input | 15 |
+| Extra Args | Text input | -- |
+| Env Vars | Key-value pair editor | -- |
+
+**Heartbeat Policy (collapsible section, default collapsed):**
+
+| Field | Control | Default |
+|-------|---------|---------|
+| Enabled | Toggle | true |
+| Interval (sec) | Number input | 300 |
+| Wake on Assignment | Toggle | true |
+| Wake on On-Demand | Toggle | true |
+| Wake on Automation | Toggle | true |
+| Cooldown (sec) | Number input | 10 |
+
+### Behavior
+
+- On submit, calls `agentsApi.create(companyId, data)` where `data` packs identity fields at the top level and adapter-specific fields into `adapterConfig` and heartbeat/runtime into `runtimeConfig`.
+- After creation, navigate to the new agent's detail page.
+- If the company has zero agents, pre-fill role as `ceo` and disable Reports To.
+- The adapter config section updates its visible fields when adapter type changes, preserving any shared field values (cwd, promptTemplate, etc.).
+
+---
+
+## 2. Agent Detail Page
+
+Restructure the existing tabbed layout. Keep the header (name, role, title, status badge, action buttons) and add richer tabs.
+
+### Header
+
+```
+[StatusBadge]  Agent Name                    [Invoke] [Pause/Resume] [...]
+               Role / Title
+```
+
+The `[...]` overflow menu contains: Terminate, Reset Session, Create API Key.
+
+### Tabs
+
+#### Overview Tab
+
+Two-column layout: left column is a summary card, right column is the org position.
+
+**Summary card:**
+- Adapter type + model (if set)
+- Heartbeat interval (e.g. "every 5 min") or "Disabled"
+- Last heartbeat time (relative, e.g. "3 min ago")
+- Session status: "Active (session abc123...)" or "No session"
+- Current month spend / budget with progress bar
+
+**Org position card:**
+- Reports to: clickable agent name (links to their detail page)
+- Direct reports: list of agents who report to this agent (clickable)
+
+#### Configuration Tab
+
+Editable form with the same sections as the creation dialog (Adapter, Runtime, Heartbeat Policy) but pre-populated with current values. Uses inline editing -- click a value to edit, press Enter or blur to save via `agentsApi.update()`.
+
+Sections:
+- **Identity**: name, title, role, reports to, capabilities
+- **Adapter Config**: all adapter-specific fields for the current adapter type
+- **Heartbeat Policy**: enable/disable, interval, wake-on triggers, cooldown
+- **Runtime**: context mode, budget, timeout, grace, env vars, extra args
+
+Each section is a collapsible card. Save happens per-field (PATCH on blur/enter), not a single form submit. Validation errors show inline.
+
+#### Runs Tab
+
+This is the primary activity/history view. Shows a paginated list of heartbeat runs, most recent first.
+
+**Run list item:**
+```
+[StatusIcon] #run-id-short   source: timer     2 min ago     1.2k tokens   $0.03
+             "Reviewed 3 PRs and filed 2 issues"
+```
+
+Fields per row:
+- Status icon (green check = succeeded, red X = failed, yellow spinner = running, gray clock = queued, orange timeout = timed_out, slash = cancelled)
+- Run ID (short, first 8 chars)
+- Invocation source chip (timer, assignment, on_demand, automation)
+- Relative timestamp
+- Token usage summary (total input + output)
+- Cost
+- Result summary (first line of result or error)
+
+**Clicking a run** opens a run detail inline (accordion expand) or a slide-over panel showing:
+
+- Full status timeline (queued -> running -> outcome) with timestamps
+- Session before/after
+- Token breakdown: input, output, cached input
+- Cost breakdown
+- Error message and error code (if failed)
+- Exit code and signal (if applicable)
+
+**Log viewer** within the run detail:
+- Streams `heartbeat_run_events` for the run, ordered by `seq`
+- Each event rendered as a log line with timestamp, level (color-coded), and message
+- Events of type `stdout`/`stderr` shown in monospace
+- System events shown with distinct styling
+- For running runs, auto-scrolls and appends live via WebSocket events (`heartbeat.run.event`, `heartbeat.run.log`)
+- "View full log" link fetches from `heartbeatsApi.log(runId)` and shows in a scrollable monospace container
+- Truncation: show last 200 events by default, "Load more" button to fetch earlier events
+
+#### Issues Tab
+
+Keep as-is: list of issues assigned to this agent with status, clickable to navigate to issue detail.
+
+#### Costs Tab
+
+Expand the existing costs tab:
+
+- **Cumulative totals** from `agent_runtime_state`: total input tokens, total output tokens, total cached tokens, total cost
+- **Monthly budget** progress bar (current month spend vs budget)
+- **Per-run cost table**: date, run ID, tokens in/out/cached, cost -- sortable by date or cost
+- **Chart** (stretch): simple bar chart of daily spend over last 30 days
+
+### Properties Panel (Right Sidebar)
+
+The existing `AgentProperties` panel continues to show the quick-glance info. Add:
+- Session ID (truncated, with copy button)
+- Last error (if any, in red)
+- Link to "View Configuration" (scrolls to / switches to Configuration tab)
+
+---
+
+## 3. Agents List Page
+
+### Current state
+
+Shows a flat list of agents with status badge, name, role, title, and budget bar.
+
+### Improvements
+
+**Add "New Agent" button** in the header (Plus icon + "New Agent"), opens the creation dialog.
+
+**Add view toggle**: List view (current) and Org Chart view.
+
+**Org Chart view:**
+- Tree layout showing reporting hierarchy
+- Each node shows: agent name, role, status badge
+- CEO at the top, direct reports below, etc.
+- Uses the `agentsApi.org(companyId)` endpoint which already returns `OrgNode[]`
+- Clicking a node navigates to agent detail
+
+**List view improvements:**
+- Add adapter type as a small chip/tag on each row
+- Add "last active" relative timestamp
+- Add running indicator (animated dot) if agent currently has a running heartbeat
+
+**Filtering:**
+- Tab filters: All, Active, Paused, Error (similar to Issues page pattern)
+
+---
+
+## 4. Component Inventory
+
+New components needed:
+
+| Component | Purpose |
+|-----------|---------|
+| `NewAgentDialog` | Agent creation form dialog |
+| `AgentConfigForm` | Shared form sections for create + edit (adapter, heartbeat, runtime) |
+| `AdapterConfigFields` | Conditional fields based on adapter type |
+| `HeartbeatPolicyFields` | Heartbeat configuration fields |
+| `EnvVarEditor` | Key-value pair editor for environment variables |
+| `RunListItem` | Single run row in the runs list |
+| `RunDetail` | Expanded run detail with log viewer |
+| `LogViewer` | Streaming log viewer with auto-scroll |
+| `OrgChart` | Tree visualization of agent hierarchy |
+| `AgentSelect` | Reusable agent picker (for Reports To, etc.) |
+
+Reused existing components:
+- `StatusBadge`, `EntityRow`, `EmptyState`, `PropertyRow`
+- shadcn: `Dialog`, `Tabs`, `Button`, `Popover`, `Command`, `Separator`, `Toggle`
+
+---
+
+## 5. API Surface
+
+All endpoints already exist. No new server work needed for V1.
+
+| Action | Endpoint | Used by |
+|--------|----------|---------|
+| List agents | `GET /companies/:id/agents` | List page |
+| Get org tree | `GET /companies/:id/org` | Org chart view |
+| Create agent | `POST /companies/:id/agents` | Creation dialog |
+| Update agent | `PATCH /agents/:id` | Configuration tab |
+| Pause/Resume/Terminate | `POST /agents/:id/{action}` | Header actions |
+| Reset session | `POST /agents/:id/runtime-state/reset-session` | Overflow menu |
+| Create API key | `POST /agents/:id/keys` | Overflow menu |
+| Get runtime state | `GET /agents/:id/runtime-state` | Overview tab, properties panel |
+| Invoke/Wakeup | `POST /agents/:id/heartbeat/invoke` | Header invoke button |
+| List runs | `GET /companies/:id/heartbeat-runs?agentId=X` | Runs tab |
+| Cancel run | `POST /heartbeat-runs/:id/cancel` | Run detail |
+| Run events | `GET /heartbeat-runs/:id/events` | Log viewer |
+| Run log | `GET /heartbeat-runs/:id/log` | Full log view |
+
+---
+
+## 6. Implementation Order
+
+1. **New Agent Dialog** -- unblocks agent creation from the UI
+2. **Agents List improvements** -- add New Agent button, tab filters, adapter chip, running indicator
+3. **Agent Detail: Configuration tab** -- editable adapter/heartbeat/runtime config
+4. **Agent Detail: Runs tab** -- run history list with status, tokens, cost
+5. **Agent Detail: Run Detail + Log Viewer** -- expandable run detail with streaming logs
+6. **Agent Detail: Overview tab** -- summary card, org position
+7. **Agent Detail: Costs tab** -- expanded cost breakdown
+8. **Org Chart view** -- tree visualization on list page
+9. **Properties panel updates** -- session ID, last error
+
+Steps 1-5 are the core. Steps 6-9 are polish.