Implements Phase 0 of the agent evals framework plan from discussion #808
and PR #817. Adds the evals/ directory scaffold with promptfoo config and
8 deterministic test cases covering core heartbeat behaviors.
Test cases:
- core.assignment_pickup: picks in_progress before todo
- core.progress_update: posts status comment before exiting
- core.blocked_reporting: sets blocked status with explanation
- governance.approval_required: reviews approval before acting
- governance.company_boundary: refuses cross-company actions
- core.no_work_exit: exits cleanly with no assignments
- core.checkout_before_work: always checks out before modifying
- core.conflict_handling: stops on 409, picks different task
Model matrix: claude-sonnet-4, gpt-4.1, codex-5.4, gemini-2.5-pro via
OpenRouter. Run with `pnpm evals:smoke`.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Paperclip <noreply@paperclip.ing>
When creating a new issue with a pre-filled assignee or project (e.g. from
a project page), Tab from the title field now skips over fields that already
have values, going directly to the next empty field or description.
Co-Authored-By: Paperclip <noreply@paperclip.ing>
When creating a new issue with a pre-filled assignee or project (e.g. from
a project page), Tab from the title field now skips over fields that already
have values, going directly to the next empty field or description.
Co-Authored-By: Paperclip <noreply@paperclip.ing>
Move plans from doc/plan/ into doc/plans/ and add YYYY-MM-DD date
prefixes to all undated plan files based on document headers or
earliest git commit dates.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The sidebar Documentation links were pointing to an internal /docs route.
Updated both mobile and desktop sidebar instances to link to
https://docs.paperclip.ing/ in a new tab instead.
Co-Authored-By: Paperclip <noreply@paperclip.ing>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add `worktree:cleanup <name>` command that safely removes a worktree,
its branch, and instance data. Idempotent and safe by default — refuses
to delete branches with unique unmerged commits or worktrees with
uncommitted changes unless --force is passed.
- Support PAPERCLIP_WORKTREES_DIR env var as default for --home across
all worktree commands.
- Support PAPERCLIP_WORKTREE_START_POINT env var as default for
--start-point on worktree:make.
- Auto-prefix worktree names with "paperclip-" if not already present,
so `worktree:make subissues` creates ~/paperclip-subissues.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix malformed try/catch/finally blocks in heartbeat executeRun
- Declare activeRunExecutions Set to track in-flight runs
- Add resumeQueuedRuns function and export from heartbeat service
- Add initdbFlags to EmbeddedPostgresCtor type
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- reapOrphanedRuns() now only scans running runs; queued runs are
legitimately absent from runningProcesses (waiting on concurrency
limits or issue locks) so including them caused false process_lost
failures (closes#90)
- Add module-level activeRunExecutions set so non-child-process adapters
(http, openclaw) are protected from the reaper during execution
- Add resumeQueuedRuns() to restart persisted queued runs after a server
restart, called at startup and each periodic tick
- Add outer catch in executeRun() so setup failures (ensureRuntimeState,
resolveWorkspaceForRun, etc.) are recorded as failed runs instead of
leaving them stuck in running state
- Guard resumeQueuedRuns() against paused/terminated/pending_approval agents
- Increase opencode models discovery timeout from 20s to 45s