Reorganize docs and add implementation spec
Move GOAL.md, PRODUCT.md, SPEC.md from repo root into doc/. Add AGENTS.md (contributor guidance), doc/DEVELOPING.md (dev setup), doc/SPEC-implementation.md (V1 implementation contract), and doc/specs/ui.md (UI design spec). Update ClipHub doc.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -8,7 +8,7 @@ ClipHub is the public registry where people share, discover, and download Paperc

## What It Is

-ClipHub is to Paperclip what a package registry is to a programming language. Paperclip already supports exportable org configs (see SPEC.md §2). ClipHub is the public directory where those exports live.
+ClipHub is to Paperclip what a package registry is to a programming language. Paperclip already supports exportable org configs (see [SPEC.md](./SPEC.md) §2). ClipHub is the public directory where those exports live.

A user builds a working company in Paperclip (a dev shop, a marketing agency, a research lab, a content studio), exports the template, and publishes it to ClipHub. Anyone can browse, search, download, and spin up that company on their own Paperclip instance.
58
doc/DEVELOPING.md
Normal file
@@ -0,0 +1,58 @@
# Developing

This project can run fully in local dev without setting up PostgreSQL manually.

## Prerequisites

- Node.js 20+
- pnpm 9+

## Start Dev

From repo root:

```sh
pnpm install
pnpm dev
```

This starts:

- API server: `http://localhost:3100`
- UI: `http://localhost:5173`

## Database in Dev (Auto-Handled)

For local development, leave `DATABASE_URL` unset.
The server will automatically use embedded PGlite and persist data at:

- `./data/pglite`

No Docker or external database is required for this mode.

## Quick Health Checks

In another terminal:

```sh
curl http://localhost:3100/api/health
curl http://localhost:3100/api/companies
```

Expected:

- `/api/health` returns `{"status":"ok"}`
- `/api/companies` returns a JSON array

## Reset Local Dev Database

To wipe local dev data and start fresh:

```sh
rm -rf data/pglite
pnpm dev
```

## Optional: Use External Postgres

If you set `DATABASE_URL`, the server will use that instead of embedded PGlite.
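
The selection rule described in this file can be sketched as a small pure function. This is only an illustration: `resolveDbMode` and `DbMode` are hypothetical names rather than identifiers from the repo, and treating an empty or whitespace-only value as unset is an assumption.

```ts
// Hypothetical helper; only DATABASE_URL and ./data/pglite come from this document.
type DbMode =
  | { kind: "postgres"; url: string }
  | { kind: "pglite"; dataDir: string };

function resolveDbMode(env: { [name: string]: string | undefined }): DbMode {
  const raw = env["DATABASE_URL"];
  // Assumption: an empty or whitespace-only value counts as unset.
  const url = raw === undefined ? "" : raw.trim();
  if (url.length > 0) {
    // An explicit URL selects external Postgres.
    return { kind: "postgres", url: url };
  }
  // No URL: fall back to embedded PGlite at the documented path.
  return { kind: "pglite", dataDir: "./data/pglite" };
}
```

Calling `resolveDbMode({})` selects the embedded PGlite mode, while passing a `DATABASE_URL` value selects Postgres with that URL.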
47
doc/GOAL.md
Normal file
@@ -0,0 +1,47 @@
# Paperclip

Build software to manage a zero-human company where all employees are AI agents.

## The Problem

Task management software doesn't go far enough. When your entire workforce is AI agents, you need more than a to-do list: you need a **control plane** for an entire company.

## What This Is

Paperclip is the command, communication, and control plane for a company of AI agents. It is the single place where you:

- **Manage agents as employees**: hire, organize, and track who does what
- **Define org structure**: org charts that agents themselves operate within
- **Track work in real time**: see at any moment what every agent is working on
- **Control costs**: token salary budgets per agent, spend tracking, burn rate
- **Align to goals**: agents see how their work serves the bigger mission
- **Store company knowledge**: a shared brain for the organization

## Architecture

Two layers:

### 1. Control Plane (this software)

The central nervous system. Manages:

- Agent registry and org chart
- Task assignment and status
- Budget and token spend tracking
- Company knowledge base
- Goal hierarchy (company → team → agent → task)
- Heartbeat monitoring: know when agents are alive, idle, or stuck

### 2. Execution Services (adapters)

Agents run externally and report into the control plane. An agent is just Python code that gets kicked off and does work. Adapters connect different execution environments:

- **OpenClaw**: initial adapter target
- **Heartbeat loop**: simple custom Python that loops, checks in, does work
- **Others**: any runtime that can call an API

The control plane doesn't run agents. It orchestrates them. Agents run wherever they run and phone home.

## Core Principle

You should be able to look at Paperclip and understand your entire company at a glance: who's doing what, how much it costs, and whether it's working.
85
doc/PRODUCT.md
Normal file
@@ -0,0 +1,85 @@
# Paperclip: Product Definition

## What It Is

Paperclip is the control plane for autonomous AI companies. One instance of Paperclip can run multiple companies. A **company** is a first-order object.

## Core Concepts

### Company

A company has:
- A **goal**: the reason it exists ("Create the #1 AI note-taking app that does $1M MRR within 3 months")
- **Employees**: every employee is an AI agent
- **Org structure**: who reports to whom
- **Revenue & expenses**: tracked at the company level
- **Task hierarchy**: all work traces back to the company goal

### Employees & Agents

Every employee is an agent. When you create a company, you start by defining the CEO, then build out from there.

Each employee has:
- **Adapter type + config**: how this agent runs and what defines its identity/behavior. This is adapter-specific (e.g., an OpenClaw agent might use SOUL.md and HEARTBEAT.md files; a Claude Code agent might use CLAUDE.md; a bare script might use CLI args). Paperclip doesn't prescribe the format; the adapter does.
- **Role & reporting**: their title, who they report to, who reports to them
- **Capabilities description**: a short paragraph on what this agent does and when they're relevant (helps other agents discover who can help with what)

Example: A CEO agent's adapter config tells it to "review what your executives are doing, check company metrics, reprioritize if needed, assign new strategic initiatives" on each heartbeat. An engineer's config tells it to "check assigned tasks, pick the highest priority, and work it."

Then you define who reports to the CEO: a CTO managing programmers, a CMO managing the marketing team, and so on. Every agent in the tree gets their own adapter configuration.

### Agent Execution

There are two fundamental modes for running an agent's heartbeat:

1. **Run a command**: Paperclip kicks off a process (shell command, Python script, etc.) and tracks it. The heartbeat is "execute this and monitor it."
2. **Fire and forget a request**: Paperclip sends a webhook/API call to an externally running agent. The heartbeat is "notify this agent to wake up." (OpenClaw hooks work this way.)

We provide sensible defaults: a default agent that shells out to Claude Code or Codex with your configuration, remembers session IDs, runs basic scripts. But you can plug in anything.

### Task Management

Task management is hierarchical. At any moment, every piece of work must trace back to the company's top-level goal through a chain of parent tasks:

```
I am researching the Facebook ads Granola uses (current task)
because → I need to create Facebook ads for our software (parent)
because → I need to grow new signups by 100 users (parent)
because → I need to get revenue to $2,000 this week (parent)
because → ...
because → We're building the #1 AI note-taking app to $1M MRR in 3 months
```

Tasks have parentage. Every task exists in service of a parent task, all the way up to the company goal. This is what keeps autonomous agents aligned: they can always answer "why am I doing this?"
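
As a rough sketch of how an agent could answer "why am I doing this?", the function below walks the parent chain from a task up to the root. The `TaskNode` shape and the name `explainWhy` are assumptions made for illustration, since the detailed task structure is not defined here.

```ts
// Hypothetical task shape; the real data model is specified elsewhere.
interface TaskNode {
  id: string;
  title: string;
  parentId: string | null; // null marks the top-level company goal
}

// Returns titles from the current task up to the root goal.
function explainWhy(tasks: { [id: string]: TaskNode }, taskId: string): string[] {
  const chain: string[] = [];
  const seen: { [id: string]: boolean } = {};
  let currentId: string | null = taskId;
  while (currentId !== null) {
    const task: TaskNode | undefined = tasks[currentId];
    if (task === undefined) throw new Error("unknown task: " + currentId);
    // Guard against malformed data so the walk always terminates.
    if (seen[currentId]) throw new Error("cycle in task parentage at: " + currentId);
    seen[currentId] = true;
    chain.push(task.title);
    currentId = task.parentId;
  }
  return chain;
}

const exampleTasks: { [id: string]: TaskNode } = {
  goal: { id: "goal", title: "Build the #1 AI note-taking app", parentId: null },
  ads: { id: "ads", title: "Create Facebook ads for our software", parentId: "goal" },
  research: { id: "research", title: "Research the Facebook ads Granola uses", parentId: "ads" },
};
console.log(explainWhy(exampleTasks, "research").join("\n  because -> "));
```

A task whose chain does not end at the company goal would be a candidate for rejection under the "all work traces to the goal" principle.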

More detailed task structure TBD.

## Principles

1. **Unopinionated about how you run your agents.** Your agents could be OpenClaw bots, Python scripts, Node scripts, Claude Code sessions, Codex instances; we don't care. Paperclip defines the control plane for communication and provides utility infrastructure for heartbeats. It does not mandate an agent runtime.

2. **Company is the unit of organization.** Everything lives under a company. One Paperclip instance, many companies.

3. **Adapter config defines the agent.** Every agent has an adapter type and configuration that controls its identity and behavior. The minimum contract is just "be callable."

4. **All work traces to the goal.** Hierarchical task management means nothing exists in isolation. If you can't explain why a task matters to the company goal, it shouldn't exist.

5. **Control plane, not execution plane.** Paperclip orchestrates. Agents run wherever they run and phone home.

## User Flow (Dream Scenario)

1. Open Paperclip, create a new company
2. Define the company's goal: "Create the #1 AI note-taking app, $1M MRR in 3 months"
3. Create the CEO
   - Choose an adapter (e.g., process adapter for Claude Code, HTTP adapter for OpenClaw)
   - Configure the adapter (agent identity, loop behavior, execution settings)
   - CEO proposes strategic breakdown → board approves
4. Define the CEO's reports: CTO, CMO, CFO, etc.
   - Each gets their own adapter config and role definition
5. Define their reports: engineers under CTO, marketers under CMO, etc.
6. Set budgets, define initial strategic tasks
7. Hit go: agents start their heartbeats and the company runs

## Further Detail

See [SPEC.md](./SPEC.md) for the full technical specification and [TASKS.md](./TASKS.md) for the task management data model.
765
doc/SPEC-implementation.md
Normal file
@@ -0,0 +1,765 @@
# Paperclip V1 Implementation Spec

Status: Implementation contract for first release (V1)
Date: 2026-02-17
Audience: Product, engineering, and agent-integration authors
Source inputs: `GOAL.md`, `PRODUCT.md`, `SPEC.md`, `DATABASE.md`, current monorepo code

## 1. Document Role

`SPEC.md` remains the long-horizon product spec.
This document is the concrete, build-ready V1 contract.
When there is a conflict, `SPEC-implementation.md` controls V1 behavior.

## 2. V1 Outcomes

Paperclip V1 must provide a full control-plane loop for autonomous agents:

1. A human board creates a company and defines goals.
2. The board creates and manages agents in an org tree.
3. Agents receive and execute tasks via heartbeat invocations.
4. All work is tracked through tasks/comments with audit visibility.
5. Token/cost usage is reported and budget limits can stop work.
6. The board can intervene anywhere (pause agents/tasks, override decisions).

Success means one operator can run a small AI-native company end-to-end with clear visibility and control.

## 3. Explicit V1 Product Decisions

These decisions close open questions from `SPEC.md` for V1.

| Topic | V1 Decision |
|---|---|
| Tenancy | Single-tenant deployment, multi-company data model |
| Company model | Company is first-order; all business entities are company-scoped |
| Board | Single human board operator per deployment |
| Org graph | Strict tree (`reports_to` nullable root); no multi-manager reporting |
| Visibility | Full visibility to board and all agents in same company |
| Communication | Tasks + comments only (no separate chat system) |
| Task ownership | Single assignee; atomic checkout required for `in_progress` transition |
| Recovery | No automatic reassignment; stale work is surfaced, not silently fixed |
| Agent adapters | Built-in `process` and `http` adapters |
| Auth | Session auth for board, API keys for agents |
| Budget period | Monthly UTC calendar window |
| Budget enforcement | Soft alerts + hard limit auto-pause |
| Deployment modes | Embedded PGlite default; Docker/hosted Postgres supported |

## 4. Current Baseline (Repo Snapshot)

As of 2026-02-17, the repo already includes:

- Node + TypeScript backend with REST CRUD for `agents`, `projects`, `goals`, `issues`, `activity`
- React UI pages for dashboard/agents/projects/goals/issues lists
- PostgreSQL schema via Drizzle with embedded PGlite fallback when `DATABASE_URL` is unset

V1 implementation extends this baseline into a company-centric, governance-aware control plane.

## 5. V1 Scope

## 5.1 In Scope

- Company lifecycle (create/list/get/update/archive)
- Goal hierarchy linked to company mission
- Agent lifecycle with org structure and adapter configuration
- Task lifecycle with parent/child hierarchy and comments
- Atomic task checkout and explicit task status transitions
- Board approvals for hires and CEO strategy proposal
- Heartbeat invocation, status tracking, and cancellation
- Cost event ingestion and rollups (agent/task/project/company)
- Budget settings and hard-stop enforcement
- Board web UI for dashboard, org chart, tasks, agents, approvals, costs
- Agent-facing API contract (task read/write, heartbeat report, cost report)
- Auditable activity log for all mutating actions

## 5.2 Out of Scope (V1)

- Plugin framework and third-party extension SDK
- Revenue/expense accounting beyond model/token costs
- Knowledge base subsystem
- Public marketplace (ClipHub)
- Multi-board governance or role-based human permission granularity
- Automatic self-healing orchestration (auto-reassign/retry planners)

## 6. Architecture

## 6.1 Runtime Components

- `server/`: REST API, auth, orchestration services
- `ui/`: Board operator interface
- `packages/db/`: Drizzle schema, migrations, DB clients (Postgres and PGlite)
- `packages/shared/`: Shared API types, validators, constants

## 6.2 Data Stores

- Primary: PostgreSQL
- Local default: embedded PGlite at `./data/pglite`
- Optional local prod-like: Docker Postgres
- Optional hosted: Supabase/Postgres-compatible

## 6.3 Background Processing

A lightweight scheduler/worker in the server process handles:

- heartbeat trigger checks
- stuck run detection
- budget threshold checks
- stale-task report generation

Separate queue infrastructure is not required for V1.

## 7. Canonical Data Model (V1)

All core tables include `id`, `created_at`, `updated_at` unless noted.

## 7.0 Auth Tables

Human auth tables (`users`, `sessions`, and provider-specific auth artifacts) are managed by the selected auth library. This spec treats them as required dependencies and references `users.id` where user attribution is needed.

## 7.1 `companies`

- `id` uuid pk
- `name` text not null
- `description` text null
- `status` enum: `active | paused | archived`

Invariant: every business record belongs to exactly one company.

## 7.2 `agents`

- `id` uuid pk
- `company_id` uuid fk `companies.id` not null
- `name` text not null
- `role` text not null
- `title` text null
- `status` enum: `active | paused | idle | running | error | terminated`
- `reports_to` uuid fk `agents.id` null
- `capabilities` text null
- `adapter_type` enum: `process | http`
- `adapter_config` jsonb not null
- `context_mode` enum: `thin | fat` default `thin`
- `budget_monthly_cents` int not null default 0
- `spent_monthly_cents` int not null default 0
- `last_heartbeat_at` timestamptz null

Invariants:

- agent and manager must be in same company
- no cycles in reporting tree
- `terminated` agents cannot be resumed
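
The first two invariants can be checked before a `reports_to` update is written. The sketch below is one possible way to do it over an in-memory map: `AgentRef` and `checkManagerChange` are hypothetical names, and a real implementation would read the rows from the database instead.

```ts
// Hypothetical in-memory view of the agents table (camelCase mirrors of the columns).
interface AgentRef {
  id: string;
  companyId: string;
  reportsTo: string | null;
}

// Returns null when the change is valid, otherwise a reason string.
function checkManagerChange(
  agents: { [id: string]: AgentRef },
  agentId: string,
  newManagerId: string | null
): string | null {
  const agent: AgentRef | undefined = agents[agentId];
  if (agent === undefined) return "unknown agent";
  if (newManagerId === null) return null; // becoming a root cannot create a cycle
  const manager: AgentRef | undefined = agents[newManagerId];
  if (manager === undefined) return "unknown manager";
  if (manager.companyId !== agent.companyId) return "manager is in a different company";

  // Walk up from the proposed manager; reaching the agent itself means a cycle.
  const visited: { [id: string]: boolean } = {};
  let cursor: string | null = newManagerId;
  while (cursor !== null && !visited[cursor]) {
    if (cursor === agentId) return "change would create a reporting cycle";
    visited[cursor] = true;
    const node: AgentRef | undefined = agents[cursor];
    cursor = node === undefined ? null : node.reportsTo;
  }
  return null;
}
```

The walk is linear in the depth of the tree, which should be acceptable for org charts of the size V1 targets.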

## 7.3 `agent_api_keys`

- `id` uuid pk
- `agent_id` uuid fk `agents.id` not null
- `company_id` uuid fk `companies.id` not null
- `name` text not null
- `key_hash` text not null
- `last_used_at` timestamptz null
- `revoked_at` timestamptz null

Invariant: plaintext key shown once at creation; only hash stored.

## 7.4 `goals`

- `id` uuid pk
- `company_id` uuid fk not null
- `title` text not null
- `description` text null
- `level` enum: `company | team | agent | task`
- `parent_id` uuid fk `goals.id` null
- `owner_agent_id` uuid fk `agents.id` null
- `status` enum: `planned | active | achieved | cancelled`

Invariant: at least one root `company` level goal per company.

## 7.5 `projects`

- `id` uuid pk
- `company_id` uuid fk not null
- `goal_id` uuid fk `goals.id` null
- `name` text not null
- `description` text null
- `status` enum: `backlog | planned | in_progress | completed | cancelled`
- `lead_agent_id` uuid fk `agents.id` null
- `target_date` date null

## 7.6 `issues` (core task entity)

- `id` uuid pk
- `company_id` uuid fk not null
- `project_id` uuid fk `projects.id` null
- `goal_id` uuid fk `goals.id` null
- `parent_id` uuid fk `issues.id` null
- `title` text not null
- `description` text null
- `status` enum: `backlog | todo | in_progress | in_review | done | blocked | cancelled`
- `priority` enum: `critical | high | medium | low`
- `assignee_agent_id` uuid fk `agents.id` null
- `created_by_agent_id` uuid fk `agents.id` null
- `created_by_user_id` uuid fk `users.id` null
- `request_depth` int not null default 0
- `billing_code` text null
- `started_at` timestamptz null
- `completed_at` timestamptz null
- `cancelled_at` timestamptz null

Invariants:

- single assignee only
- task must trace to company goal chain via `goal_id`, `parent_id`, or project-goal linkage
- `in_progress` requires assignee
- terminal states: `done | cancelled`

## 7.7 `issue_comments`

- `id` uuid pk
- `company_id` uuid fk not null
- `issue_id` uuid fk `issues.id` not null
- `author_agent_id` uuid fk `agents.id` null
- `author_user_id` uuid fk `users.id` null
- `body` text not null

## 7.8 `heartbeat_runs`

- `id` uuid pk
- `company_id` uuid fk not null
- `agent_id` uuid fk not null
- `invocation_source` enum: `scheduler | manual | callback`
- `status` enum: `queued | running | succeeded | failed | cancelled | timed_out`
- `started_at` timestamptz null
- `finished_at` timestamptz null
- `error` text null
- `external_run_id` text null
- `context_snapshot` jsonb null

## 7.9 `cost_events`

- `id` uuid pk
- `company_id` uuid fk not null
- `agent_id` uuid fk `agents.id` not null
- `issue_id` uuid fk `issues.id` null
- `project_id` uuid fk `projects.id` null
- `goal_id` uuid fk `goals.id` null
- `billing_code` text null
- `provider` text not null
- `model` text not null
- `input_tokens` int not null default 0
- `output_tokens` int not null default 0
- `cost_cents` int not null
- `occurred_at` timestamptz not null

Invariant: each event must attach to agent and company; rollups are aggregation, never manually edited.

## 7.10 `approvals`

- `id` uuid pk
- `company_id` uuid fk not null
- `type` enum: `hire_agent | approve_ceo_strategy`
- `requested_by_agent_id` uuid fk `agents.id` null
- `requested_by_user_id` uuid fk `users.id` null
- `status` enum: `pending | approved | rejected | cancelled`
- `payload` jsonb not null
- `decision_note` text null
- `decided_by_user_id` uuid fk `users.id` null
- `decided_at` timestamptz null

## 7.11 `activity_log`

- `id` uuid pk
- `company_id` uuid fk not null
- `actor_type` enum: `agent | user | system`
- `actor_id` uuid/text not null
- `action` text not null
- `entity_type` text not null
- `entity_id` uuid/text not null
- `details` jsonb null
- `created_at` timestamptz not null default now()

## 7.12 Required Indexes

- `agents(company_id, status)`
- `agents(company_id, reports_to)`
- `issues(company_id, status)`
- `issues(company_id, assignee_agent_id, status)`
- `issues(company_id, parent_id)`
- `issues(company_id, project_id)`
- `cost_events(company_id, occurred_at)`
- `cost_events(company_id, agent_id, occurred_at)`
- `heartbeat_runs(company_id, agent_id, started_at desc)`
- `approvals(company_id, status, type)`
- `activity_log(company_id, created_at desc)`

## 8. State Machines

## 8.1 Agent Status

Allowed transitions:

- `idle -> running`
- `running -> idle`
- `running -> error`
- `error -> idle`
- `idle -> paused`
- `running -> paused` (requires cancel flow)
- `paused -> idle`
- `* -> terminated` (board only, irreversible)

## 8.2 Issue Status

Allowed transitions:

- `backlog -> todo | cancelled`
- `todo -> in_progress | blocked | cancelled`
- `in_progress -> in_review | blocked | done | cancelled`
- `in_review -> in_progress | done | cancelled`
- `blocked -> todo | in_progress | cancelled`
- terminal: `done`, `cancelled`

Side effects:

- entering `in_progress` sets `started_at` if null
- entering `done` sets `completed_at`
- entering `cancelled` sets `cancelled_at`
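
The issue transitions and side effects listed above translate directly into a lookup table plus one guarded function. This is a minimal sketch: `transitionIssue`, `IssueState`, and the camelCase field names are assumptions for illustration, and timestamps are passed in as ISO strings to keep the example deterministic.

```ts
type IssueStatus =
  | "backlog" | "todo" | "in_progress" | "in_review"
  | "blocked" | "done" | "cancelled";

// Transition table copied from the list in 8.2; terminal states allow nothing.
const ALLOWED_ISSUE_TRANSITIONS: Record<IssueStatus, IssueStatus[]> = {
  backlog: ["todo", "cancelled"],
  todo: ["in_progress", "blocked", "cancelled"],
  in_progress: ["in_review", "blocked", "done", "cancelled"],
  in_review: ["in_progress", "done", "cancelled"],
  blocked: ["todo", "in_progress", "cancelled"],
  done: [],
  cancelled: [],
};

// Hypothetical subset of the issues row, in camelCase.
interface IssueState {
  status: IssueStatus;
  startedAt: string | null;
  completedAt: string | null;
  cancelledAt: string | null;
}

function transitionIssue(issue: IssueState, next: IssueStatus, now: string): IssueState {
  if (ALLOWED_ISSUE_TRANSITIONS[issue.status].indexOf(next) === -1) {
    throw new Error("invalid transition: " + issue.status + " -> " + next);
  }
  return {
    status: next,
    // started_at is only set the first time the issue enters in_progress.
    startedAt: next === "in_progress" && issue.startedAt === null ? now : issue.startedAt,
    completedAt: next === "done" ? now : issue.completedAt,
    cancelledAt: next === "cancelled" ? now : issue.cancelledAt,
  };
}
```

An API handler could catch the thrown error and map it to the `409` response used for invalid transitions.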

## 8.3 Approval Status

- `pending -> approved | rejected | cancelled`
- terminal after decision

## 9. Auth and Permissions

## 9.1 Board Auth

- Session-based auth for human operator
- Board has full read/write across all companies in deployment
- Every board mutation writes to `activity_log`

## 9.2 Agent Auth

- Bearer API key mapped to one agent and company
- Agent key scope:
  - read org/task/company context for own company
  - read/write own assigned tasks and comments
  - create tasks/comments for delegation
  - report heartbeat status
  - report cost events
- Agent cannot:
  - bypass approval gates
  - modify company-wide budgets directly
  - mutate auth/keys

## 9.3 Permission Matrix (V1)

| Action | Board | Agent |
|---|---|---|
| Create company | yes | no |
| Hire/create agent | yes (direct) | request via approval |
| Pause/resume agent | yes | no |
| Create/update task | yes | yes |
| Force reassign task | yes | limited |
| Approve strategy/hire requests | yes | no |
| Report cost | yes | yes |
| Set company budget | yes | no |
| Set subordinate budget | yes | yes (manager subtree only) |

## 10. API Contract (REST)

All endpoints are under `/api` and return JSON.

## 10.1 Companies

- `GET /companies`
- `POST /companies`
- `GET /companies/:companyId`
- `PATCH /companies/:companyId`
- `POST /companies/:companyId/archive`

## 10.2 Goals

- `GET /companies/:companyId/goals`
- `POST /companies/:companyId/goals`
- `GET /goals/:goalId`
- `PATCH /goals/:goalId`
- `DELETE /goals/:goalId` (soft delete optional, hard delete board-only)

## 10.3 Agents

- `GET /companies/:companyId/agents`
- `POST /companies/:companyId/agents`
- `GET /agents/:agentId`
- `PATCH /agents/:agentId`
- `POST /agents/:agentId/pause`
- `POST /agents/:agentId/resume`
- `POST /agents/:agentId/terminate`
- `POST /agents/:agentId/keys` (create API key)
- `POST /agents/:agentId/heartbeat/invoke`

## 10.4 Tasks (Issues)

- `GET /companies/:companyId/issues`
- `POST /companies/:companyId/issues`
- `GET /issues/:issueId`
- `PATCH /issues/:issueId`
- `POST /issues/:issueId/checkout`
- `POST /issues/:issueId/release`
- `POST /issues/:issueId/comments`
- `GET /issues/:issueId/comments`

### 10.4.1 Atomic Checkout Contract

`POST /issues/:issueId/checkout` request:

```json
{
  "agentId": "uuid",
  "expectedStatuses": ["todo", "backlog", "blocked"]
}
```

Server behavior:

1. single SQL update with `WHERE id = :issueId AND status IN (:expectedStatuses) AND (assignee_agent_id IS NULL OR assignee_agent_id = :agentId)`
2. if updated row count is 0, return `409` with current owner/status
3. successful checkout sets `assignee_agent_id`, `status = in_progress`, and `started_at`
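
To make the compare-and-set semantics concrete, here is an in-memory stand-in for the conditional update. It is not the server implementation: in the real system atomicity comes from the single SQL `UPDATE ... WHERE`, whereas this sketch only mirrors its predicate and result handling. `IssueRow`, `CheckoutResult`, and `checkout` are hypothetical names.

```ts
// Hypothetical camelCase mirror of the columns the checkout touches.
interface IssueRow {
  id: string;
  status: string;
  assigneeAgentId: string | null;
  startedAt: string | null;
}

type CheckoutResult =
  | { ok: true; issue: IssueRow }
  | { ok: false; httpStatus: 409; currentStatus: string; currentAssignee: string | null };

function checkout(row: IssueRow, agentId: string, expectedStatuses: string[], now: string): CheckoutResult {
  // Same predicate as the WHERE clause: expected status AND (unassigned OR already ours).
  const statusMatches = expectedStatuses.indexOf(row.status) !== -1;
  const ownerMatches = row.assigneeAgentId === null || row.assigneeAgentId === agentId;
  if (!(statusMatches && ownerMatches)) {
    // Equivalent to "updated row count is 0": report the current owner and status.
    return { ok: false, httpStatus: 409, currentStatus: row.status, currentAssignee: row.assigneeAgentId };
  }
  row.assigneeAgentId = agentId;
  row.status = "in_progress";
  // Assumption: keep an existing started_at, matching the state-machine side effect.
  if (row.startedAt === null) row.startedAt = now;
  return { ok: true, issue: row };
}
```

If two agents race for the same issue, the first call succeeds and the second receives the `409` payload naming the winner.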

## 10.5 Projects

- `GET /companies/:companyId/projects`
- `POST /companies/:companyId/projects`
- `GET /projects/:projectId`
- `PATCH /projects/:projectId`

## 10.6 Approvals

- `GET /companies/:companyId/approvals?status=pending`
- `POST /companies/:companyId/approvals`
- `POST /approvals/:approvalId/approve`
- `POST /approvals/:approvalId/reject`

## 10.7 Cost and Budgets

- `POST /companies/:companyId/cost-events`
- `GET /companies/:companyId/costs/summary`
- `GET /companies/:companyId/costs/by-agent`
- `GET /companies/:companyId/costs/by-project`
- `PATCH /companies/:companyId/budgets`
- `PATCH /agents/:agentId/budgets`

## 10.8 Activity and Dashboard

- `GET /companies/:companyId/activity`
- `GET /companies/:companyId/dashboard`

Dashboard payload must include:

- active/running/paused/error agent counts
- open/in-progress/blocked/done issue counts
- month-to-date spend and budget utilization
- pending approvals count
- stale task count

## 10.9 Error Semantics

- `400` validation error
- `401` unauthenticated
- `403` forbidden (authenticated but not permitted)
- `404` not found
- `409` state conflict (checkout conflict, invalid transition)
- `422` semantic rule violation
- `500` server error

## 11. Heartbeat and Adapter Contract

## 11.1 Adapter Interface

```ts
interface AgentAdapter {
  invoke(agent: Agent, context: InvocationContext): Promise<InvokeResult>;
  status(run: HeartbeatRun): Promise<RunStatus>;
  cancel(run: HeartbeatRun): Promise<void>;
}
```

## 11.2 Process Adapter

Config shape:

```json
{
  "command": "string",
  "args": ["string"],
  "cwd": "string",
  "env": {"KEY": "VALUE"},
  "timeoutSec": 900,
  "graceSec": 15
}
```

Behavior:

- spawn child process
- stream stdout/stderr to run logs
- mark run status on exit code/timeout
- cancel sends SIGTERM then SIGKILL after grace

## 11.3 HTTP Adapter

Config shape:

```json
{
  "url": "https://...",
  "method": "POST",
  "headers": {"Authorization": "Bearer ..."},
  "timeoutMs": 15000,
  "payloadTemplate": {"agentId": "{{agent.id}}", "runId": "{{run.id}}"}
}
```

Behavior:

- invoke by outbound HTTP request
- 2xx means accepted
- non-2xx marks failed invocation
- optional callback endpoint allows asynchronous completion updates
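
The `payloadTemplate` in the config above uses `{{object.field}}` placeholders. The spec shows only the two placeholders in the example, so the rendering rules below are assumptions: a flat template of string values, placeholders of the form `{{name.field}}`, and unknown placeholders left untouched. `renderPayload` and `TemplateScope` are hypothetical names.

```ts
// Values available to placeholders, e.g. { agent: { id: "..." }, run: { id: "..." } }.
type TemplateScope = { [objectName: string]: { [field: string]: string } };

function renderTemplateString(template: string, scope: TemplateScope): string {
  return template.replace(
    /\{\{\s*(\w+)\.(\w+)\s*\}\}/g,
    function (match: string, objectName: string, field: string): string {
      const group = scope[objectName];
      const value = group === undefined ? undefined : group[field];
      // Assumption: unknown placeholders are kept verbatim rather than erased.
      return value === undefined ? match : value;
    }
  );
}

// Renders every string value of a flat payload template.
function renderPayload(
  template: { [key: string]: string },
  scope: TemplateScope
): { [key: string]: string } {
  const out: { [key: string]: string } = {};
  for (const key of Object.keys(template)) {
    const raw = template[key];
    out[key] = renderTemplateString(raw === undefined ? "" : raw, scope);
  }
  return out;
}
```

The rendered object would then be serialized as the body of the outbound request described under Behavior.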

## 11.4 Context Delivery

- `thin`: send IDs and pointers only; agent fetches context via API
- `fat`: include current assignments, goal summary, budget snapshot, and recent comments

## 11.5 Scheduler Rules

Per-agent schedule fields in `adapter_config`:

- `enabled` boolean
- `intervalSec` integer (minimum 30)
- `maxConcurrentRuns` fixed at `1` for V1

Scheduler must skip invocation when:

- agent is paused/terminated
- an existing run is active
- hard budget limit has been hit
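
The skip rules can be expressed as one pure decision function, sketched below. The three documented skip conditions are taken from the list above; the disabled-schedule check, the interval check, and clamping `intervalSec` up to the minimum are assumptions about how the schedule fields would be applied. `SchedulerCheck` and `skipReason` are hypothetical names.

```ts
type AgentStatus = "active" | "paused" | "idle" | "running" | "error" | "terminated";

// Hypothetical snapshot the scheduler would assemble per agent on each tick.
interface SchedulerCheck {
  status: AgentStatus;
  enabled: boolean;
  intervalSec: number;
  lastInvokedAtMs: number | null;
  hasActiveRun: boolean;
  hardBudgetLimitHit: boolean;
  nowMs: number;
}

const MIN_INTERVAL_SEC = 30;

// Returns null when the agent should be invoked, otherwise the reason to skip.
function skipReason(c: SchedulerCheck): string | null {
  if (!c.enabled) return "schedule disabled";
  if (c.status === "paused" || c.status === "terminated") return "agent is " + c.status;
  if (c.hasActiveRun) return "existing run is active"; // maxConcurrentRuns is 1 in V1
  if (c.hardBudgetLimitHit) return "hard budget limit hit";
  // Assumption: intervals below the documented minimum are clamped up to it.
  const intervalMs = Math.max(c.intervalSec, MIN_INTERVAL_SEC) * 1000;
  if (c.lastInvokedAtMs !== null && c.nowMs - c.lastInvokedAtMs < intervalMs) {
    return "interval not elapsed";
  }
  return null;
}
```

Returning a reason string rather than a boolean makes it straightforward to record why an invocation was skipped.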

## 12. Governance and Approval Flows

## 12.1 Hiring

1. Agent or board creates `approval(type=hire_agent, status=pending, payload=agent draft)`.
2. Board approves or rejects.
3. On approval, server creates agent row and initial API key (optional).
4. Decision is logged in `activity_log`.

Board can bypass request flow and create agents directly via UI; direct create is still logged as a governance action.

## 12.2 CEO Strategy Approval

1. CEO posts strategy proposal as `approval(type=approve_ceo_strategy)`.
2. Board reviews payload (plan text, initial structure, high-level tasks).
3. Approval unlocks execution state for CEO-created delegated work.

Before first strategy approval, CEO may only draft tasks, not transition them to active execution states.

## 12.3 Board Override

Board can at any time:

- pause/resume/terminate any agent
- reassign or cancel any task
- edit budgets and limits
- approve/reject/cancel pending approvals

## 13. Cost and Budget System

## 13.1 Budget Layers

- company monthly budget
- agent monthly budget
- optional project budget (if configured)

## 13.2 Enforcement Rules

- soft alert default threshold: 80%
- hard limit: at 100%, trigger:
  - set agent status to `paused`
  - block new checkout/invocation for that agent
  - emit high-priority activity event

Board may override by raising budget or explicitly resuming agent.
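
A minimal sketch of the threshold evaluation follows, using integer cents throughout. Two points are assumptions rather than spec: a budget of 0 (the schema default) is treated as "no limit configured", and the action names returned by `actionsFor` are hypothetical labels for the three hard-limit effects listed above.

```ts
type BudgetState = "ok" | "soft_alert" | "hard_limit";

function evaluateBudget(
  spentCents: number,
  budgetCents: number,
  softAlertPercent: number = 80
): BudgetState {
  // Assumption: a zero budget means no limit has been configured yet.
  if (budgetCents <= 0) return "ok";
  if (spentCents >= budgetCents) return "hard_limit";
  // Integer comparison avoids floating-point rounding at the threshold.
  if (spentCents * 100 >= budgetCents * softAlertPercent) return "soft_alert";
  return "ok";
}

// Hypothetical action labels mirroring the enforcement list.
function actionsFor(state: BudgetState): string[] {
  if (state === "hard_limit") {
    return ["pause_agent", "block_checkout_and_invocation", "emit_high_priority_event"];
  }
  if (state === "soft_alert") return ["emit_soft_alert"];
  return [];
}
```

For a budget of 10,000 cents, spend of 8,000 cents reaches the soft alert and 10,000 cents reaches the hard limit.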
|
||||
|
||||
## 13.3 Cost Event Ingestion
|
||||
|
||||
`POST /companies/:companyId/cost-events` body:
|
||||
|
||||
```json
{
  "agentId": "uuid",
  "issueId": "uuid",
  "provider": "openai",
  "model": "gpt-5",
  "inputTokens": 1234,
  "outputTokens": 567,
  "costCents": 89,
  "occurredAt": "2026-02-17T20:25:00Z",
  "billingCode": "optional"
}
```
|
||||
|
||||
Validation:
|
||||
|
||||
- non-negative token counts
|
||||
- `costCents >= 0`
|
||||
- company ownership checks for all linked entities
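
The validation rules could look roughly like the sketch below. The `ownerOf` map stands in for database lookups, and requiring token counts to be integers is an extra assumption beyond "non-negative".

```typescript
interface CostEventInput {
  agentId: string;
  issueId: string;
  inputTokens: number;
  outputTokens: number;
  costCents: number;
}

// ownerOf maps an entity id to the company that owns it (a stand-in for database lookups).
function validateCostEvent(
  companyId: string,
  event: CostEventInput,
  ownerOf: Map<string, string>,
): string[] {
  const errors: string[] = [];
  const isCount = (n: number) => Number.isInteger(n) && n >= 0;
  if (!isCount(event.inputTokens)) errors.push("inputTokens must be a non-negative integer");
  if (!isCount(event.outputTokens)) errors.push("outputTokens must be a non-negative integer");
  if (!(event.costCents >= 0)) errors.push("costCents must be >= 0");
  // Every linked entity must belong to the company in the request path.
  for (const id of [event.agentId, event.issueId]) {
    if (ownerOf.get(id) !== companyId) errors.push(`${id} does not belong to company ${companyId}`);
  }
  return errors;
}
```

Returning a list of messages rather than throwing lets the endpoint report all problems in a single response.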
|
||||
|
||||
## 13.4 Rollups
|
||||
|
||||
Read-time aggregate queries are acceptable for V1.
|
||||
Materialized rollups can be added later if query latency exceeds targets.
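
As an illustration of a read-time aggregate, the sketch below sums spend per agent for one UTC month directly from the raw events. The `monthlySpendByAgent` name and the `"YYYY-MM"` month key are assumptions; a real query would most likely run as SQL against the cost events table.

```typescript
interface CostEvent { agentId: string; costCents: number; occurredAt: string }

// Sums costCents per agent for one UTC month ("YYYY-MM"), computed at read time.
// Assumes occurredAt is a valid ISO 8601 timestamp, as in the request body example.
function monthlySpendByAgent(events: CostEvent[], month: string): Map<string, number> {
  const totals = new Map<string, number>();
  for (const e of events) {
    const eventMonth = new Date(e.occurredAt).toISOString().slice(0, 7);
    if (eventMonth !== month) continue;
    totals.set(e.agentId, (totals.get(e.agentId) ?? 0) + e.costCents);
  }
  return totals;
}
```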
|
||||
|
||||
## 14. UI Requirements (Board App)
|
||||
|
||||
V1 UI routes:
|
||||
|
||||
- `/` dashboard
|
||||
- `/companies` company list/create
|
||||
- `/companies/:id/org` org chart and agent status
|
||||
- `/companies/:id/tasks` task list/kanban
|
||||
- `/companies/:id/agents/:agentId` agent detail
|
||||
- `/companies/:id/costs` cost and budget dashboard
|
||||
- `/companies/:id/approvals` pending/history approvals
|
||||
- `/companies/:id/activity` audit/event stream
|
||||
|
||||
Required UX behaviors:
|
||||
|
||||
- global company selector
|
||||
- quick actions: pause/resume agent, create task, approve/reject request
|
||||
- conflict toasts on atomic checkout failure
|
||||
- clear stale-task indicators
|
||||
- no silent background failures; every failed run visible in UI
|
||||
|
||||
## 15. Operational Requirements
|
||||
|
||||
## 15.1 Environment
|
||||
|
||||
- Node 20+
|
||||
- `DATABASE_URL` optional
  - if unset, auto-use PGlite and push schema
|
||||
|
||||
## 15.2 Migrations
|
||||
|
||||
- Drizzle migrations are source of truth
|
||||
- no destructive migration in-place for V1 upgrade path
|
||||
- provide migration script from existing minimal tables to company-scoped schema
|
||||
|
||||
## 15.3 Logging and Audit
|
||||
|
||||
- structured logs (JSON in production)
|
||||
- request ID per API call
|
||||
- every mutation writes `activity_log`
|
||||
|
||||
## 15.4 Reliability Targets
|
||||
|
||||
- API p95 latency under 250 ms for standard CRUD at 1k tasks/company
|
||||
- heartbeat invoke acknowledgement under 2 s for process adapter
|
||||
- no lost approval decisions (transactional writes)
|
||||
|
||||
## 16. Security Requirements
|
||||
|
||||
- store only hashed agent API keys
|
||||
- redact secrets in logs (`adapter_config`, auth headers, env vars)
|
||||
- CSRF protection for board session endpoints
|
||||
- rate limit auth and key-management endpoints
|
||||
- strict company boundary checks on every entity fetch/mutation
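
Log redaction, for example, might be handled by a recursive helper like the one below. The key list and the `[REDACTED]` marker are assumptions; the real set of sensitive keys would depend on the logger and adapter schemas.

```typescript
// Keys are compared case-insensitively; this list is illustrative, not exhaustive.
const SENSITIVE_KEYS = new Set(["adapter_config", "adapterconfig", "authorization", "env"]);

// Returns a deep copy of a log payload with sensitive values replaced.
function redact(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(redact);
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] = SENSITIVE_KEYS.has(key.toLowerCase()) ? "[REDACTED]" : redact(inner);
    }
    return out;
  }
  return value;
}
```

Redacting whole values under a sensitive key, rather than pattern-matching secrets inside strings, errs on the side of hiding too much.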
|
||||
|
||||
## 17. Testing Strategy
|
||||
|
||||
## 17.1 Unit Tests
|
||||
|
||||
- state transition guards (agent, issue, approval)
|
||||
- budget enforcement rules
|
||||
- adapter invocation/cancel semantics
|
||||
|
||||
## 17.2 Integration Tests
|
||||
|
||||
- atomic checkout conflict behavior
|
||||
- approval-to-agent creation flow
|
||||
- cost ingestion and rollup correctness
|
||||
- pause while run is active (graceful cancel then force kill)
|
||||
|
||||
## 17.3 End-to-End Tests
|
||||
|
||||
- board creates company -> hires CEO -> approves strategy -> CEO receives work
|
||||
- agent reports cost -> budget threshold reached -> auto-pause occurs
|
||||
- task delegation across teams with request depth increment
|
||||
|
||||
## 17.4 Regression Suite Minimum
|
||||
|
||||
A release candidate is blocked unless these pass:
|
||||
|
||||
1. auth boundary tests
|
||||
2. checkout race test
|
||||
3. hard budget stop test
|
||||
4. agent pause/resume test
|
||||
5. dashboard summary consistency test
|
||||
|
||||
## 18. Delivery Plan
|
||||
|
||||
## Milestone 1: Company Core and Auth
|
||||
|
||||
- add `companies` and company scoping to existing entities
|
||||
- add board session auth and agent API keys
|
||||
- migrate existing API routes to company-aware paths
|
||||
|
||||
## Milestone 2: Task and Governance Semantics
|
||||
|
||||
- implement atomic checkout endpoint
|
||||
- implement issue comments and lifecycle guards
|
||||
- implement approvals table and hire/strategy workflows
|
||||
|
||||
## Milestone 3: Heartbeat and Adapter Runtime
|
||||
|
||||
- implement adapter interface
|
||||
- ship `process` adapter with cancel semantics
|
||||
- ship `http` adapter with timeout/error handling
|
||||
- persist heartbeat runs and statuses
|
||||
|
||||
## Milestone 4: Cost and Budget Controls
|
||||
|
||||
- implement cost events ingestion
|
||||
- implement monthly rollups and dashboards
|
||||
- enforce hard limit auto-pause
|
||||
|
||||
## Milestone 5: Board UI Completion
|
||||
|
||||
- add company selector and org chart view
|
||||
- add approvals and cost pages
|
||||
- add operational dashboard and stale-task surfacing
|
||||
|
||||
## Milestone 6: Hardening and Release
|
||||
|
||||
- full integration/e2e suite
|
||||
- seed/demo company templates for local testing
|
||||
- release checklist and docs update
|
||||
|
||||
## 19. Acceptance Criteria (Release Gate)
|
||||
|
||||
V1 is complete only when all criteria are true:
|
||||
|
||||
1. A board user can create multiple companies and switch between them.
|
||||
2. A company can run at least one active heartbeat-enabled agent.
|
||||
3. Task checkout is conflict-safe with `409` on concurrent claims.
|
||||
4. Agents can update tasks/comments and report costs with API keys only.
|
||||
5. Board can approve/reject hire and CEO strategy requests in UI.
|
||||
6. Budget hard limit auto-pauses an agent and prevents new invocations.
|
||||
7. Dashboard shows accurate counts/spend from live DB data.
|
||||
8. Every mutation is auditable in activity log.
|
||||
9. App runs with embedded PGlite by default and with external Postgres via `DATABASE_URL`.
|
||||
|
||||
## 20. Post-V1 Backlog (Explicitly Deferred)
|
||||
|
||||
- plugin architecture
|
||||
- richer workflow-state customization per team
|
||||
- milestones/labels/dependency graph depth beyond V1 minimum
|
||||
- realtime transport optimization (SSE/WebSockets)
|
||||
- public template marketplace integration (ClipHub)
|
||||
525
doc/SPEC.md
Normal file
@@ -0,0 +1,525 @@
|
||||
# Paperclip Specification
|
||||
|
||||
Target specification for the Paperclip control plane. Living document — updated incrementally during spec interviews.
|
||||
|
||||
---
|
||||
|
||||
## 1. Company Model [DRAFT]
|
||||
|
||||
A Company is a first-order object. One Paperclip instance runs multiple Companies. A Company does not have a standalone "goal" field — its direction is defined by its set of Initiatives (see Task Hierarchy Mapping).
|
||||
|
||||
### Fields (Draft)
|
||||
|
||||
| Field       | Type      | Notes        |
| ----------- | --------- | ------------ |
| `id`        | uuid      | Primary key  |
| `name`      | string    | Company name |
| `createdAt` | timestamp |              |
| `updatedAt` | timestamp |              |
|
||||
|
||||
### Board Governance [DRAFT]
|
||||
|
||||
Every Company has a **Board** that governs high-impact decisions. The Board is the human oversight layer.
|
||||
|
||||
**V1: Single human Board.** One human operator.
|
||||
|
||||
#### Board Approval Gates (V1)
|
||||
|
||||
- New Agent hires (creating new Agents)
|
||||
- CEO's initial strategic breakdown (CEO proposes, Board approves before execution begins)
|
||||
- [TBD: other governance-gated actions — goal changes, firing Agents?]
|
||||
|
||||
#### Board Powers (Always Available)
|
||||
|
||||
The Board has **unrestricted access** to the entire system at all times:
|
||||
|
||||
- **Set and modify Company budgets** — the Board sets top-level token/LLM cost budgets
|
||||
- **Pause/resume any Agent** — stop an Agent's heartbeat immediately
|
||||
- **Pause/resume any work item** — pause a task, project, subtask tree, milestone. Paused items are not picked up by Agents.
|
||||
- **Full project management access** — create, edit, comment on, modify, delete, reassign any task/project/milestone through the UI
|
||||
- **Override any Agent decision** — reassign tasks, change priorities, modify descriptions
|
||||
- **Manually change any budget** at any level
|
||||
|
||||
The Board is not just an approval gate — it's a live control surface. The human can intervene at any level at any time.
|
||||
|
||||
#### Budget Delegation
|
||||
|
||||
The Board sets Company-level budgets. The CEO can set budgets for Agents below them, and every manager Agent can do the same for their reports. How this cascading budget delegation works in practice is TBD, but the permission structure supports it. The Board can manually override any budget at any level.
|
||||
|
||||
**Future governance models** (not V1):
|
||||
|
||||
- Hiring budgets (auto-approve hires within $X/month)
|
||||
- Multi-member boards
|
||||
- Delegated authority (CEO can hire within limits)
|
||||
|
||||
### Open Questions
|
||||
|
||||
- External revenue/expense tracking — future plugin. Token/LLM cost budgeting is core.
|
||||
- Company-level settings and configuration?
|
||||
- Company lifecycle (pause, archive, delete)?
|
||||
- What governance-gated actions exist beyond hiring and CEO strategy approval?
|
||||
|
||||
---
|
||||
|
||||
## 2. Agent Model [DRAFT]
|
||||
|
||||
Every employee is an agent. Agents are the workforce.
|
||||
|
||||
### Agent Identity (Adapter-Level)
|
||||
|
||||
Concepts like SOUL.md (identity/mission) and HEARTBEAT.md (loop definition) are **not part of the Paperclip protocol**. They are adapter-specific configurations. For example, an OpenClaw adapter might use SOUL.md and HEARTBEAT.md files. A Claude Code adapter might use CLAUDE.md. A bare Python script might use command-line args.
|
||||
|
||||
Paperclip doesn't prescribe how an agent defines its identity or behavior. It provides the control plane; the adapter defines the agent's inner workings.
|
||||
|
||||
### Agent Configuration [DRAFT]
|
||||
|
||||
Each agent has an **adapter type** and an **adapter-specific configuration blob**. The adapter defines what config fields exist.
|
||||
|
||||
#### Paperclip Protocol (What Paperclip Knows)
|
||||
|
||||
At the protocol level, Paperclip tracks:
|
||||
|
||||
- Agent identity (id, name, role, title)
|
||||
- Org position (who they report to, who reports to them)
|
||||
- Adapter type + adapter config
|
||||
- Status (active, paused, terminated)
|
||||
- Cost tracking data (if the agent reports it)
|
||||
|
||||
#### Adapter Configuration (Agent-Specific)
|
||||
|
||||
Each adapter type defines its own config schema. Examples:
|
||||
|
||||
- **OpenClaw adapter**: SOUL.md content, HEARTBEAT.md content, OpenClaw-specific settings
|
||||
- **Process adapter**: command to run, environment variables, working directory
|
||||
- **HTTP adapter**: endpoint URL, auth headers, payload template
|
||||
|
||||
#### Exportable Org Configs
|
||||
|
||||
A key goal: **the entire org's agent configurations are exportable.** You can export a company's complete agent setup — every agent, their adapter configs, org structure — as a portable artifact. This enables:
|
||||
|
||||
- Sharing company templates ("here's a pre-built marketing agency org")
|
||||
- Version controlling your company configuration
|
||||
- Duplicating/forking companies
|
||||
|
||||
#### Context Delivery
|
||||
|
||||
Configurable per agent. Two ends of the spectrum:
|
||||
|
||||
- **Fat payload** — Paperclip bundles relevant context (current tasks, messages, company state, metrics) into the heartbeat invocation. Suited for simple/stateless agents that can't call back to Paperclip.
|
||||
- **Thin ping** — Heartbeat is just a wake-up signal. Agent calls Paperclip's API to fetch whatever context it needs. Suited for sophisticated agents that manage their own state.
|
||||
|
||||
#### Minimum Contract
|
||||
|
||||
The minimum requirement to be a Paperclip agent: **be callable.** That's it. Paperclip can invoke you via command or webhook. No requirement to report back — Paperclip infers basic status from process liveness when it can.
|
||||
|
||||
#### Integration Levels
|
||||
|
||||
Beyond the minimum, Paperclip provides progressively richer integration:
|
||||
|
||||
1. **Callable** (minimum) — Paperclip can start you. That's the only contract.
|
||||
2. **Status reporting** — Agent reports back success/failure/in-progress after execution.
|
||||
3. **Fully instrumented** — Agent reports status, cost/token usage, task updates, and logs. Bidirectional integration with the control plane.
|
||||
|
||||
Paperclip ships **default agents** that demonstrate full integration: progress tracking, cost instrumentation, and a **Paperclip skill** (a Claude Code skill for interacting with the Paperclip API) for task management. These serve as both useful defaults and reference implementations for adapter authors.
|
||||
|
||||
#### Export Formats
|
||||
|
||||
Two export modes:
|
||||
|
||||
1. **Template export** (default) — structure only: agent definitions, org chart, adapter configs, role descriptions. Optionally includes a few seed tasks to help get started. This is the blueprint for spinning up a new company.
|
||||
2. **Snapshot export** — full state: structure + current tasks, progress, agent status. A complete picture you could restore or fork.
|
||||
|
||||
The usual workflow: export a template, create a new company from it, add a couple of initial tasks, and go.
|
||||
|
||||
---
|
||||
|
||||
## 3. Org Structure [DRAFT]
|
||||
|
||||
Hierarchical reporting structure. CEO at top, reports cascade down.
|
||||
|
||||
### Agent Visibility
|
||||
|
||||
**Full visibility across the org.** Every agent can see the entire org chart, all tasks, all agents. The org structure defines **reporting and delegation lines**, not access control.
|
||||
|
||||
Each agent publishes a short description of their responsibilities and capabilities — almost like skills ("when I'm relevant"). This lets other agents discover who can help with what.
|
||||
|
||||
### Cross-Team Work
|
||||
|
||||
Agents can create tasks and assign them to agents outside their reporting line. This is the mechanism for cross-team collaboration. The rules below are primarily encoded in the Paperclip SKILL.md, which is recommended for all agents. The Paperclip app enforces the tooling and some light governance, but the cross-team rules are mainly carried out through agent decisions.
|
||||
|
||||
#### Task Acceptance Rules
|
||||
|
||||
When an agent receives a task from outside their team:
|
||||
|
||||
1. **Agrees it's appropriate + can do it** → complete it directly
|
||||
2. **Agrees it's appropriate + can't do it** → mark as blocked
|
||||
3. **Questions whether it's worth doing** → **cannot cancel it themselves.** Must reassign it to their own manager and explain the situation. The manager decides whether to accept, reassign, or escalate.
|
||||
|
||||
#### Manager Escalation Protocol
|
||||
|
||||
It's any manager's responsibility to understand why their subordinates are blocked and resolve it:
|
||||
|
||||
0. **Decide** — as a manager, is this work worth doing?
|
||||
1. **Delegate down** — ask someone under them to help unblock
|
||||
2. **Escalate up** — ask the manager above them for help
|
||||
|
||||
#### Request Depth Tracking
|
||||
|
||||
When a task originates from a cross-team request, track the **depth** as an integer — how many delegation hops from the original requester. This provides visibility into how far work cascades through the org.
|
||||
|
||||
#### Billing Codes
|
||||
|
||||
Tasks carry a **billing code** so that token spend during execution can be attributed upstream to the requesting task/agent. When Agent A asks Agent B to do work, the cost of B's work is tracked against A's request. This enables cost attribution across the org.
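
Request depth and billing codes can be illustrated together. In this sketch a delegated task inherits its parent's billing code and increments the depth by one; the `DelegatedTask` shape and both helper names are assumptions, and the billing code is treated as a plain string because its format is still an open question.

```typescript
interface DelegatedTask {
  id: string;
  assigneeId: string;
  requestDepth: number; // delegation hops from the original requester
  billingCode: string;  // plain string assumed; the format is not yet decided
}

// Creating a task on behalf of another task adds one hop and keeps the billing code.
function delegate(parent: DelegatedTask, newId: string, assigneeId: string): DelegatedTask {
  return { id: newId, assigneeId, requestDepth: parent.requestDepth + 1, billingCode: parent.billingCode };
}

interface SpendEntry { taskId: string; costCents: number }

// Attributes spend upstream by grouping task costs under their billing code.
function spendByBillingCode(tasks: DelegatedTask[], spend: SpendEntry[]): Map<string, number> {
  const codeByTask = new Map<string, string>();
  for (const t of tasks) codeByTask.set(t.id, t.billingCode);
  const totals = new Map<string, number>();
  for (const s of spend) {
    const code = codeByTask.get(s.taskId);
    if (code === undefined) continue; // spend for unknown tasks is ignored in this sketch
    totals.set(code, (totals.get(code) ?? 0) + s.costCents);
  }
  return totals;
}
```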
|
||||
|
||||
### Open Questions
|
||||
|
||||
- Is this a strict tree or can agents report to multiple managers?
|
||||
- Can org structure change at runtime? (agents reassigned, teams restructured)
|
||||
- Do agents inherit any configuration from their manager?
|
||||
- Billing code format — simple string? Hierarchical?
|
||||
|
||||
---
|
||||
|
||||
## 4. Heartbeat System [DRAFT]
|
||||
|
||||
The heartbeat is a protocol, not a runtime. Paperclip defines how to initiate an agent's cycle. What the agent does with that cycle — how long it runs, whether it's task-scoped or continuous — is entirely up to the agent.
|
||||
|
||||
### Execution Adapters
|
||||
|
||||
Agent configuration includes an **adapter** that defines how Paperclip invokes the agent. Initial adapters:
|
||||
|
||||
| Adapter   | Mechanism               | Example                                       |
| --------- | ----------------------- | --------------------------------------------- |
| `process` | Execute a child process | `python run_agent.py --agent-id {id}`         |
| `http`    | Send an HTTP request    | `POST https://openclaw.example.com/hook/{id}` |
|
||||
|
||||
The `process` and `http` adapters ship as defaults. Additional adapters can be added via the plugin system (see Plugin / Extension Architecture).
|
||||
|
||||
### Adapter Interface
|
||||
|
||||
Every adapter implements three methods:
|
||||
|
||||
```
invoke(agentConfig, context?) → void   // Start the agent's cycle
status(agentConfig) → AgentStatus      // Is it running? finished? errored?
cancel(agentConfig) → void             // Graceful stop signal (for pause/resume)
```
|
||||
|
||||
This is the full adapter contract. `invoke` starts the agent, `status` lets Paperclip check on it, `cancel` enables the board's pause functionality. Everything else (cost reporting, task updates) is optional and flows through the Paperclip REST API.
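
One way to express the contract in TypeScript is sketched below, with a toy in-memory adapter for tests. The `AgentConfig` shape, the `idle` status value, and the `InMemoryAdapter` class are assumptions; the spec only fixes the three method names.

```typescript
// "idle" is an assumed extra state for agents that have never been invoked.
type AgentStatus = "idle" | "running" | "finished" | "errored";

interface AgentConfig { agentId: string }

interface Adapter {
  invoke(agentConfig: AgentConfig, context?: unknown): void; // start the agent's cycle
  status(agentConfig: AgentConfig): AgentStatus;             // running, finished, or errored
  cancel(agentConfig: AgentConfig): void;                    // graceful stop signal
}

// Toy adapter that only tracks state in memory; it starts no real process.
class InMemoryAdapter implements Adapter {
  private runs = new Map<string, AgentStatus>();

  invoke(agentConfig: AgentConfig): void {
    this.runs.set(agentConfig.agentId, "running");
  }

  status(agentConfig: AgentConfig): AgentStatus {
    return this.runs.get(agentConfig.agentId) ?? "idle";
  }

  cancel(agentConfig: AgentConfig): void {
    // Cancelling is a no-op unless a run is in flight.
    if (this.runs.get(agentConfig.agentId) === "running") {
      this.runs.set(agentConfig.agentId, "finished");
    }
  }
}
```

A fake like this lets heartbeat scheduling and pause logic be tested without spawning processes or making HTTP calls.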
|
||||
|
||||
### What Paperclip Controls
|
||||
|
||||
- **When** to fire the heartbeat (schedule/frequency, per-agent)
|
||||
- **How** to fire it (adapter selection + config)
|
||||
- **What context** to include (thin ping vs. fat payload, per-agent)
|
||||
|
||||
### What Paperclip Does NOT Control
|
||||
|
||||
- How long the agent runs
|
||||
- What the agent does during its cycle
|
||||
- Whether the agent is task-scoped, time-windowed, or continuous
|
||||
|
||||
### Pause Behavior
|
||||
|
||||
When the board (or system) pauses an agent:
|
||||
|
||||
1. **Signal the current execution** — send a graceful termination signal to the running process/session
|
||||
2. **Grace period** — give the agent time to wrap up, save state, report final status
|
||||
3. **Force-kill after timeout** — if the agent doesn't stop within the grace period, terminate
|
||||
4. **Stop future heartbeats** — no new heartbeat cycles will fire until the agent is resumed
|
||||
|
||||
This is "graceful signal + stop future heartbeats." The current run gets a chance to land cleanly.
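
The pause sequence can be modeled as a small decision function that a scheduler polls. This is a sketch under assumptions: the `RunState` shape, the action names, and passing the grace period as a parameter are illustrative, since the grace period duration is still an open question.

```typescript
interface RunState { signaledAt: number | null; exited: boolean } // times in epoch ms

type PauseAction = "send_graceful_signal" | "wait" | "force_kill" | "none";

// Decides the next step for the run that was active when the agent was paused.
function nextPauseAction(run: RunState, nowMs: number, graceMs: number): PauseAction {
  if (run.exited) return "none";                               // the run landed cleanly
  if (run.signaledAt === null) return "send_graceful_signal";  // step 1
  return nowMs - run.signaledAt >= graceMs ? "force_kill" : "wait"; // steps 2 and 3
}

// Step 4: only active agents receive new heartbeats.
function shouldFireHeartbeat(agentStatus: "active" | "paused" | "terminated"): boolean {
  return agentStatus === "active";
}
```

Taking the clock as a parameter keeps the logic deterministic, so the graceful-then-force path can be tested without real timers.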
|
||||
|
||||
### Open Questions
|
||||
|
||||
- Heartbeat frequency — who controls it? Fixed? Per-agent? Cron-like?
|
||||
- What happens when a heartbeat invocation fails? (process crashes, HTTP 500)
|
||||
- Health monitoring — how does Paperclip distinguish "stuck" from "working on a long task"?
|
||||
- Can agents self-trigger their next heartbeat? ("I'm done, wake me again in 5 min")
|
||||
- Grace period duration — fixed? configurable per agent?
|
||||
|
||||
---
|
||||
|
||||
## 5. Inter-Agent Communication [DRAFT]
|
||||
|
||||
All agent communication flows through the **task system**.
|
||||
|
||||
### Model: Tasks + Comments
|
||||
|
||||
- **Delegation** = creating a task and assigning it to another agent
|
||||
- **Coordination** = commenting on tasks
|
||||
- **Status updates** = updating task status and fields
|
||||
|
||||
There is no separate messaging or chat system. Tasks are the communication channel. This keeps all context attached to the work it relates to and creates a natural audit trail.
|
||||
|
||||
### Implications
|
||||
|
||||
- An agent's "inbox" is: tasks assigned to them + comments on tasks they're involved in
|
||||
- The CEO delegates by creating tasks assigned to the CTO
|
||||
- The CTO breaks those down into sub-tasks assigned to engineers
|
||||
- Discussion happens in task comments, not a side channel
|
||||
- If an agent needs to escalate, they comment on the parent task or reassign
|
||||
|
||||
### Task Hierarchy Mapping
|
||||
|
||||
Full hierarchy: **Initiative** (company goal) → Projects → Milestones → Issues → Sub-issues. Everything traces back to an initiative, and the "company goal" is just the first/primary initiative.
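
The "everything traces back to an initiative" rule can be checked by walking parent links. The sketch below assumes each work item stores a single `parentId`; the `WorkNode` shape and the `traceToInitiative` name are hypothetical.

```typescript
type NodeKind = "initiative" | "project" | "milestone" | "issue" | "sub_issue";

interface WorkNode { id: string; kind: NodeKind; parentId: string | null }

// Returns the ids from the starting item up to its initiative, or throws if none is reachable.
function traceToInitiative(nodes: Map<string, WorkNode>, startId: string): string[] {
  const path: string[] = [];
  const seen = new Set<string>();
  let current = nodes.get(startId);
  while (current !== undefined) {
    if (seen.has(current.id)) throw new Error(`cycle detected at ${current.id}`);
    seen.add(current.id);
    path.push(current.id);
    if (current.kind === "initiative") return path;
    current = current.parentId === null ? undefined : nodes.get(current.parentId);
  }
  throw new Error(`${startId} does not trace back to an initiative`);
}
```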
|
||||
|
||||
---
|
||||
|
||||
## 6. Cost Tracking [DRAFT]
|
||||
|
||||
Token/LLM cost budgeting is a core part of Paperclip. External revenue and expense tracking is a future plugin.
|
||||
|
||||
### Cost Reporting
|
||||
|
||||
Fully-instrumented Agents report token/API usage back to Paperclip. Costs are tracked at every level:
|
||||
|
||||
- **Per Agent** — how much is this employee costing?
|
||||
- **Per task** — how much did this unit of work cost?
|
||||
- **Per project** — how much is this deliverable costing?
|
||||
- **Per Company** — total burn rate
|
||||
|
||||
Costs should be denominated in both **tokens and dollars**.
|
||||
|
||||
Billing codes on tasks (see Org Structure) enable cost attribution across teams — when Agent A requests work from Agent B, B's costs roll up to A's request.
|
||||
|
||||
### Budget Controls
|
||||
|
||||
Three tiers:
|
||||
|
||||
1. **Visibility** — dashboards showing spend at every level (Agent, task, project, Company)
|
||||
2. **Soft alerts** — configurable thresholds (e.g. warn at 80% of budget)
|
||||
3. **Hard ceiling** — auto-pause the Agent when budget is hit. Board notified. Board can override/raise the limit.
|
||||
|
||||
Budgets can be set to **unlimited** (no ceiling).
|
||||
|
||||
### Open Questions
|
||||
|
||||
- Cost reporting API — what's the schema for an agent to report costs?
|
||||
- Dashboard design — what metrics matter most at each level?
|
||||
- Budget period — per-day? per-week? per-month? rolling?
|
||||
|
||||
---
|
||||
|
||||
## 7. Default Agents & Bootstrap Flow [DRAFT]
|
||||
|
||||
### Bootstrap Sequence
|
||||
|
||||
How a Company goes from "created" to "running":
|
||||
|
||||
1. Human creates a Company and its initial Initiatives
|
||||
2. Human defines initial top-level tasks
|
||||
3. Human creates the CEO Agent (using the default CEO template or custom)
|
||||
4. CEO's first heartbeat: reviews the Initiatives and tasks, proposes a strategic breakdown (org structure, sub-tasks, hiring plan)
|
||||
5. **Board approves** the CEO's strategic plan
|
||||
6. CEO begins execution — creating tasks, proposing hires (Board-approved), delegating
|
||||
|
||||
### Default Agents
|
||||
|
||||
Paperclip ships default Agent templates:
|
||||
|
||||
- **Default Agent** — a basic Claude Code or Codex loop. Knows the **Paperclip Skill** (SKILL.md) so it can interact with the task system, read Company context, report status.
|
||||
- **Default CEO** — extends the Default Agent with CEO-specific behavior: strategic planning, delegation to reports, progress review, Board communication.
|
||||
|
||||
These are starting points. Users can customize or replace them entirely.
|
||||
|
||||
### Default Agent Behavior
|
||||
|
||||
The default agent's loop is **config-driven**. The adapter config contains the instructions that define what the agent does on each heartbeat cycle. There is no hardcoded standard loop — each agent's config determines its behavior.
|
||||
|
||||
This means the default CEO config tells the CEO to review strategy, check on reports, etc. The default engineer config tells the engineer to check assigned tasks, pick the highest priority, and work it. But these are config choices, not protocol requirements.
|
||||
|
||||
### Paperclip Skill (SKILL.md)
|
||||
|
||||
A skill definition that teaches agents how to interact with Paperclip. Provides:
|
||||
|
||||
- Task CRUD (create, read, update, complete tasks)
|
||||
- Status reporting (check in, report progress)
|
||||
- Company context (read goal, org chart, current state)
|
||||
- Cost reporting (log token/API usage)
|
||||
- Inter-agent communication rules
|
||||
|
||||
This skill is adapter-agnostic — it can be loaded into Claude Code, injected into prompts, or used as API documentation for custom agents.
|
||||
|
||||
---
|
||||
|
||||
## 8. Architecture & Deployment [DRAFT]
|
||||
|
||||
### Deployment Model
|
||||
|
||||
**Single-tenant, self-hostable.** Not a SaaS. One instance = one operator's companies.
|
||||
|
||||
#### Development Path (Progressive Deployment)
|
||||
|
||||
1. **Local dev** — One command to install and run. Embedded Postgres. Everything on your machine. Agents run locally.
|
||||
2. **Hosted** — Deploy to Vercel/Supabase/AWS/anywhere. Remote agents connect to your server with a shared database. The UI is accessible via the web.
|
||||
3. **Open company** — Optionally make parts public (e.g. a job board visible to the public for open companies).
|
||||
|
||||
The key constraint: it must be trivial to go from "I'm trying this on my machine" to "my agents are running on remote servers talking to my Paperclip instance."
|
||||
|
||||
#### Agent Authentication
|
||||
|
||||
When a user creates an Agent, Paperclip generates a **connection string** containing: the server URL, an API key, and instructions for how to authenticate. The Agent is assumed to be capable of figuring out how to call the API with its token/key from there.
|
||||
|
||||
Flow:
|
||||
|
||||
1. Human creates an Agent in the UI
|
||||
2. Paperclip generates a connection string (URL + key + instructions)
|
||||
3. Human provides this string to the Agent (e.g. in its adapter config, environment, etc.)
|
||||
4. Agent uses the key to authenticate API calls to the control plane
|
||||
|
||||
### Tech Stack
|
||||
|
||||
| Layer    | Technology |
| -------- | ---------- |
| Frontend | React + Vite |
| Backend  | TypeScript + Hono (REST API rather than tRPC, because non-TS clients must be supported) |
| Database | PostgreSQL (see [DATABASE.md](./DATABASE.md) for details; PGlite embedded for dev, Docker or hosted Supabase for production) |
| Auth     | [Better Auth](https://www.better-auth.com/) |
|
||||
|
||||
### Concurrency Model: Atomic Task Checkout
|
||||
|
||||
Tasks use **single assignment** (one agent per task) with **atomic checkout**:
|
||||
|
||||
1. Agent attempts to set a task to `in_progress` (claiming it)
|
||||
2. The API/database enforces this atomically — if another agent already claimed it, the request fails with an error identifying which agent has it
|
||||
3. If the task is already assigned to the requesting agent from a previous session, they can resume
|
||||
|
||||
No optimistic locking or CRDTs needed. The single-assignment model + atomic checkout prevents conflicts at the design level.
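
A minimal in-memory sketch of the checkout rule follows. The status names other than `in_progress`, the result shape, and the `checkout` helper are assumptions; the `409` mirrors the conflict status named in the V1 acceptance criteria. In a database-backed server the check and the write would have to happen in one atomic statement or transaction.

```typescript
interface Task { id: string; status: "todo" | "in_progress" | "done"; assigneeId: string | null }

type CheckoutResult =
  | { ok: true }
  | { ok: false; httpStatus: 409; heldBy: string };

// Claims a task for an agent, or reports which agent already holds it.
// Other lifecycle guards (for example, refusing finished tasks) are omitted here.
function checkout(task: Task, agentId: string): CheckoutResult {
  if (task.status === "in_progress" && task.assigneeId !== null && task.assigneeId !== agentId) {
    return { ok: false, httpStatus: 409, heldBy: task.assigneeId };
  }
  // Either the task is unclaimed or the same agent is resuming it.
  task.status = "in_progress";
  task.assigneeId = agentId;
  return { ok: true };
}
```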
|
||||
|
||||
### Human in the Loop
|
||||
|
||||
Agents can create tasks assigned to humans. The board member (or any human with access) can complete these tasks through the UI.
|
||||
|
||||
When a human completes a task, if the requesting agent's adapter supports **pingbacks** (e.g. OpenClaw hooks), Paperclip sends a notification to wake that agent. This keeps humans rare but possible participants in the workflow.
|
||||
|
||||
The Paperclip skill (SKILL.md) discourages agents from assigning tasks to humans, but sometimes it is unavoidable.
|
||||
|
||||
### API Design
|
||||
|
||||
**Single unified REST API.** The same API serves both the frontend UI and agents. Authentication determines permissions — board auth has full access, agent API keys have scoped access (their own tasks, cost reporting, company context).
|
||||
|
||||
No separate "agent API" vs. "board API." Same endpoints, different authorization levels.
|
||||
|
||||
### Work Artifacts
|
||||
|
||||
Paperclip does **not** manage work artifacts (code repos, file systems, deployments, documents). That's entirely the agent's domain. Paperclip tracks tasks and costs. Where and how work gets done is outside scope.
|
||||
|
||||
### Open Questions
|
||||
|
||||
- Real-time updates to the UI — WebSocket? SSE? Polling?
|
||||
- Agent API key scoping — what exactly can an Agent access? Only their own tasks? Their team's? The whole Company?
|
||||
|
||||
### Crash Recovery: Manual, Not Automatic
|
||||
|
||||
When an agent crashes or disappears mid-task, Paperclip does **not** auto-reassign or auto-release the task. Instead:
|
||||
|
||||
- Paperclip surfaces stale tasks (tasks in `in_progress` with no recent activity) through dashboards and reporting
|
||||
- Paperclip does not fail silently — the auditing and visibility tools make problems obvious
|
||||
- Recovery is handled by humans or by emergent processes (e.g. a project manager agent whose job is to monitor for stale work and surface it)
|
||||
|
||||
**Principle: Paperclip reports problems, it doesn't silently fix them.** Automatic recovery hides failures. Good visibility lets the right entity (human or agent) decide what to do.
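
Surfacing stale work can be a read-only query, as in the sketch below. The `TaskActivity` shape and the staleness threshold are assumptions; the spec does not define what counts as "recent" activity. Note that the function only reports tasks and never reassigns or releases them.

```typescript
interface TaskActivity { id: string; status: string; lastActivityAt: number } // epoch ms

// Returns in-progress tasks with no activity for longer than staleAfterMs.
function findStaleTasks(tasks: TaskActivity[], nowMs: number, staleAfterMs: number): TaskActivity[] {
  return tasks.filter(
    (t) => t.status === "in_progress" && nowMs - t.lastActivityAt > staleAfterMs,
  );
}
```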
|
||||
|
||||
### Plugin / Extension Architecture
|
||||
|
||||
The core Paperclip system must be extensible. Features like knowledge bases, external revenue tracking, and new Agent Adapters should be addable as **plugins** without modifying core. This means:
|
||||
|
||||
- Well-defined API boundaries that plugins can hook into
|
||||
- Event system or hooks for reacting to task/Agent lifecycle events
|
||||
- **Agent Adapter plugins** — new Adapter types can be registered via the plugin system
|
||||
- Plugin-registrable UI components (future)
|
||||
|
||||
This isn't a V1 deliverable (we're not building a plugin framework upfront), but the architecture should not paint us into a corner. Keep boundaries clean so extensions are possible.
|
||||
|
||||
---
|
||||
|
||||
## 9. Frontend / UI [DRAFT]
|
||||
|
||||
### Primary Views
|
||||
|
||||
Each is a distinct page/route:
|
||||
|
||||
1. **Org Chart** — the org tree with live status indicators (running/idle/paused/error) per agent. Real-time activity feed of what agents are doing.
|
||||
2. **Task Board** — Task management. Kanban and list views. Filter by team, agent, project, status.
|
||||
3. **Dashboard** — high-level metrics: agent count, active tasks, costs, goal progress, burn rate. The "glance" view from GOAL.md.
|
||||
4. **Agent Detail** — deep dive on a single agent: their tasks, activity, costs, configuration, status history.
|
||||
5. **Project/Initiative Views** — progress tracking against milestones and goals.
|
||||
6. **Cost Dashboard** — spend visualization at every level (agent, task, project, company).
|
||||
|
||||
### Board Controls (Available Everywhere)
|
||||
|
||||
- Pause/resume agents (any view)
|
||||
- Pause/resume tasks/projects (any view)
|
||||
- Approve/reject pending actions (hiring, strategy proposals)
|
||||
- Direct task creation, editing, commenting
|
||||
|
||||
---

## 10. V1 Scope (MVP) [DRAFT]

**Full loop with one adapter.** V1 must demonstrate the complete Paperclip cycle end-to-end, even if narrow.

### Must Have (V1)

- [ ] **Company CRUD** — create a Company with Initiatives
- [ ] **Agent CRUD** — create/edit/pause/resume Agents with Adapter config
- [ ] **Org chart** — define reporting structure, visualize it
- [ ] **Process adapter** — `invoke()`, `status()`, `cancel()` for local child processes
- [ ] **Task management** — full lifecycle with hierarchy (tasks trace to company goal)
- [ ] **Atomic task checkout** — single assignment, `in_progress` locking
- [ ] **Board governance** — human approves hires, pauses Agents, sets budgets, full PM access
- [ ] **Cost tracking** — Agents report token usage, per-Agent/task/Company visibility
- [ ] **Budget controls** — soft alerts + hard ceiling with auto-pause
- [ ] **Default agent** — basic Claude Code/Codex loop with Paperclip skill
- [ ] **Default CEO** — strategic planning, delegation, board communication
- [ ] **Paperclip skill (SKILL.md)** — teaches agents to interact with the API
- [ ] **REST API** — full API for agent interaction (Hono)
- [ ] **Web UI** — React/Vite: org chart, task board, dashboard, cost views
- [ ] **Agent auth** — connection string generation with URL + key + instructions
- [ ] **One-command dev setup** — embedded PGlite, everything local
- [ ] **Multiple Adapter types** (HTTP Adapter, OpenClaw Adapter)
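
The "soft alerts + hard ceiling with auto-pause" item above could work roughly as follows. This is a minimal sketch under stated assumptions: the `Budget` fields, the `AgentState` shape, and the `checkBudget` and `recordSpend` helpers are hypothetical names, not part of the spec.

```typescript
// Hypothetical budget policy; field names and units are assumptions.
interface Budget {
  limitCents: number;     // hard ceiling
  softAlertRatio: number; // fraction of the ceiling that triggers a soft alert
}

type BudgetDecision = "ok" | "soft_alert" | "hard_stop";

// Classify current spend against the budget.
function checkBudget(spentCents: number, budget: Budget): BudgetDecision {
  if (spentCents >= budget.limitCents) return "hard_stop";
  if (spentCents >= budget.limitCents * budget.softAlertRatio) return "soft_alert";
  return "ok";
}

// Hypothetical minimal agent record for this example.
interface AgentState {
  id: string;
  status: "active" | "paused";
  spentCents: number;
}

// Record new spend and auto-pause the agent once the hard ceiling is reached.
function recordSpend(agent: AgentState, costCents: number, budget: Budget): BudgetDecision {
  agent.spentCents += costCents;
  const decision = checkBudget(agent.spentCents, budget);
  if (decision === "hard_stop") agent.status = "paused";
  return decision;
}
```

With an assumed `softAlertRatio` of `0.8` and a ceiling of 1000 cents, spend of 800 cents would raise a soft alert and spend of 1000 cents would pause the agent. Resuming stays a manual Board action in this sketch, in keeping with the governance items above.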

### Not V1

- Template export/import
- Knowledge base (a future plugin)
- Advanced governance models (hiring budgets, multi-member boards)
- Revenue/expense tracking beyond token costs (a future plugin)
- Public job board / open company features

---

## 11. Knowledge Base

**Anti-goal for core.** The knowledge base is not part of the Paperclip core — it will be a plugin. The task system + comments + agent descriptions provide sufficient shared context.

The architecture must support adding a knowledge base plugin later (clean API boundaries, hookable lifecycle events) but the core system explicitly does not include one.

---

## 12. Anti-Requirements

Things Paperclip explicitly does **not** do:

- **Not an Agent runtime** — Paperclip orchestrates, Agents run elsewhere
- **Not a knowledge base** — core has no wiki/docs/vector-DB (plugin territory)
- **Not a SaaS** — single-tenant, self-hosted
- **Not opinionated about Agent implementation** — any language, any framework, any runtime
- **Not automatically self-healing** — surfaces problems, doesn't silently fix them
- **Does not manage work artifacts** — no repo management, no deployment, no file systems
- **Does not auto-reassign work** — stale tasks are surfaced, not silently redistributed
- **Does not track external revenue/expenses** — that's a future plugin. Token/LLM cost budgeting is core.

---

## 13. Principles (Consolidated)

1. **Unopinionated about how you run your Agents.** Any language, any framework, any runtime. Paperclip is the control plane, not the execution plane.
2. **Company is the unit of organization.** Everything lives under a Company.
3. **Tasks are the communication channel.** All Agent communication flows through tasks + comments. No side channels.
4. **All work traces to the goal.** Hierarchical task management — nothing exists in isolation.
5. **Board governs.** Humans retain control through the Board. Conservative defaults (human approval required).
6. **Surface problems, don't hide them.** Good auditing and visibility. No silent auto-recovery.
7. **Atomic ownership.** Single assignee per task. Atomic checkout prevents conflicts.
8. **Progressive deployment.** Trivial to start local, straightforward to scale to hosted.
9. **Extensible core.** Clean boundaries so plugins can add capabilities (Adapters, knowledge base, revenue tracking) without modifying core.
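
To make the "Atomic ownership" principle concrete, here is a small in-memory sketch of a check-and-claim checkout. The `TaskStore` class, its method names, and the `todo` and `done` statuses are assumptions for illustration; only `in_progress` appears in the spec.

```typescript
// Hypothetical status set; only "in_progress" is named in the spec.
type TaskStatus = "todo" | "in_progress" | "done";

interface Task {
  id: string;
  status: TaskStatus;
  assigneeId: string | null;
}

class TaskStore {
  private tasks = new Map<string, Task>();

  add(id: string): void {
    this.tasks.set(id, { id, status: "todo", assigneeId: null });
  }

  // Check-and-claim in one synchronous step: succeeds only while the task
  // is still unclaimed, so a second caller always gets `false`.
  checkout(taskId: string, agentId: string): boolean {
    const task = this.tasks.get(taskId);
    if (!task || task.status !== "todo" || task.assigneeId !== null) return false;
    task.status = "in_progress";
    task.assigneeId = agentId;
    return true;
  }

  get(taskId: string): Task | undefined {
    return this.tasks.get(taskId);
  }
}
```

In a database-backed server the same guarantee would likely need a single conditional update or a transaction rather than an in-memory check, since two requests could otherwise read the task as unclaimed at the same time.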

1001
doc/specs/ui.md
Normal file
File diff suppressed because it is too large