Commit Graph

926 Commits

Author SHA1 Message Date
Matt Van Horn
cc40e1f8e9 refactor(evals): split test cases into tests/*.yaml files
Move inline test cases from promptfooconfig.yaml into separate files
organized by category (core.yaml, governance.yaml). Main config now
uses file://tests/*.yaml glob pattern per promptfoo best practices.

This makes it easier to add new test categories without bloating the
main config, and lets contributors add cases by dropping new YAML
files into tests/.
2026-03-15 12:15:51 -07:00
Matt Van Horn
a39579dad3 fix(evals): address Greptile review feedback
- Make company_boundary test adversarial with cross-company stimulus
- Replace fragile not-contains:retry with targeted JS assertion
- Replace not-contains:create with not-contains:POST /api/companies
- Pin promptfoo to 0.103.3 for reproducible eval runs
- Fix npm -> pnpm in README prerequisites
- Add trailing newline to system prompt

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-03-13 17:19:25 -07:00
Matt Van Horn
fbb8d10305 feat(evals): bootstrap promptfoo eval framework (Phase 0)
Implements Phase 0 of the agent evals framework plan from discussion #808
and PR #817. Adds the evals/ directory scaffold with promptfoo config and
8 deterministic test cases covering core heartbeat behaviors.

Test cases:
- core.assignment_pickup: picks in_progress before todo
- core.progress_update: posts status comment before exiting
- core.blocked_reporting: sets blocked status with explanation
- governance.approval_required: reviews approval before acting
- governance.company_boundary: refuses cross-company actions
- core.no_work_exit: exits cleanly with no assignments
- core.checkout_before_work: always checks out before modifying
- core.conflict_handling: stops on 409, picks different task

Model matrix: claude-sonnet-4, gpt-4.1, codex-5.4, gemini-2.5-pro via
OpenRouter. Run with `pnpm evals:smoke`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-03-13 17:09:51 -07:00
Dotta
bcce5b7ec2 Merge pull request #816 from paperclipai/fix/worktree-seed-and-env-quoting
fix(cli): preserve worktree seed source config and quote special env values
2026-03-13 15:18:03 -05:00
Dotta
8eacc9c697 Merge pull request #817 from paperclipai/docs/agent-evals-framework-plan
docs: add agent evals framework plan
2026-03-13 15:17:40 -05:00
Dotta
db81a06386 docs: add agent evals framework plan 2026-03-13 15:07:56 -05:00
Dotta
626a8f1976 fix(cli): quote env values with special characters 2026-03-13 15:07:49 -05:00
Dotta
aa799bba4c Fix worktree seed source selection
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-03-13 15:07:42 -05:00
Dotta
aaadbdc144 Merge pull request #790 from paperclipai/paperclip-token-optimization
Optimize heartbeat token usage
2026-03-13 15:01:45 -05:00
Dotta
a393db78b4 fix: address greptile follow-up
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-03-13 14:53:30 -05:00
Dotta
c1430e7b06 docs: add paperclip skill tightening plan
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-03-13 14:37:44 -05:00
Dotta
528505a04a fix: isolate codex home in worktrees 2026-03-13 11:53:56 -05:00
Dotta
e2a0347c6d Merge pull request #805 from paperclipai/fix/worktree-ui-branding
Add worktree UI branding
2026-03-13 11:15:11 -05:00
Dotta
cce9941464 Add worktree UI branding 2026-03-13 11:12:43 -05:00
Dotta
d51c4b1a4c fix: tighten token optimization edge cases
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-03-13 10:18:00 -05:00
Dotta
3b0d9a93f4 Merge pull request #802 from paperclipai/fix/ui-routing-and-assignee-polish
fix(ui): polish company switching, issue tab order, and assignee filters
2026-03-13 10:11:09 -05:00
Dotta
41eb8e51e3 Fix company switch remembered routes
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-03-13 10:01:32 -05:00
Dotta
32ab4f8e47 Add me and unassigned assignee options
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-03-13 09:47:01 -05:00
Dotta
6365e03731 feat: skip pre-filled assignee/project fields when tabbing in new issue dialog
When creating a new issue with a pre-filled assignee or project (e.g. from
a project page), Tab from the title field now skips over fields that already
have values, going directly to the next empty field or description.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-03-13 09:47:01 -05:00
Dotta
2b9de934e3 Fix manual company switch route sync
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-03-13 09:47:01 -05:00
Dotta
4a368f54d5 Delay onboarding starter task creation until launch
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-03-13 09:46:36 -05:00
Dotta
7d1748b3a7 feat: optimize heartbeat token usage
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-03-13 09:40:43 -05:00
Dotta
2246d5f1eb Add me and unassigned assignee options
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-03-13 09:40:43 -05:00
Dotta
575a2fd83f feat: skip pre-filled assignee/project fields when tabbing in new issue dialog
When creating a new issue with a pre-filled assignee or project (e.g. from
a project page), Tab from the title field now skips over fields that already
have values, going directly to the next empty field or description.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-03-13 09:40:43 -05:00
Dotta
c9259bbec0 Fix manual company switch route sync
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-03-13 09:40:43 -05:00
Dotta
f3c18db7dd Delay onboarding starter task creation until launch
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-03-13 09:40:43 -05:00
Dotta
43baf709dd Merge pull request #797 from paperclipai/fix/embedded-postgres-initdbflags
fix: align embedded postgres ctor types with initdbFlags usage
2026-03-13 09:39:27 -05:00
Dotta
24d6e3a543 fix: align embedded postgres ctor types with initdbFlags usage 2026-03-13 09:25:04 -05:00
Dotta
0b8223b8b9 Merge pull request #796 from paperclipai/docs/organize-and-date-plans
docs: consolidate dated plan docs and naming guidance
2026-03-13 09:20:25 -05:00
Dotta
e2f0241533 docs: add dated plan naming rule and align workspace plan 2026-03-13 09:16:28 -05:00
Dotta
89e247b410 Expand workspace plan for migration and cloud execution 2026-03-13 09:15:36 -05:00
Dotta
216cb3fb28 Add workspace product model plan 2026-03-13 09:15:36 -05:00
Dotta
84fc6d4a87 docs: add token optimization plan
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-03-13 09:15:36 -05:00
Dotta
9c7d9ded1e docs: organize plans into doc/plans with date prefixes
Move plans from doc/plan/ into doc/plans/ and add YYYY-MM-DD date
prefixes to all undated plan files based on document headers or
earliest git commit dates.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 09:11:56 -05:00
Dotta
dfe40ffcca Merge pull request #793 from paperclipai/split/docs-sidebar-link
ui: point Documentation sidebar link to docs.paperclip.ing
2026-03-13 09:09:30 -05:00
Dotta
f477f23738 Merge pull request #794 from paperclipai/split/agent-skill-resolution
fix(adapters): resolve local agent skill lookup after .agents migration
2026-03-13 09:09:22 -05:00
Dotta
29a743cb9e fix: keep runtime skills scoped to ./skills
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-03-13 08:49:25 -05:00
Dotta
4e759da070 fix: prefer .agents skills and repair codex symlink targets\n\nCo-Authored-By: Paperclip <noreply@paperclip.ing> 2026-03-13 08:49:25 -05:00
Dotta
69b9e45eaf Change sidebar Documentation link to external docs.paperclip.ing
The sidebar Documentation links were pointing to an internal /docs route.
Updated both mobile and desktop sidebar instances to link to
https://docs.paperclip.ing/ in a new tab instead.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 08:49:24 -05:00
Paperclip
5c7d2116e9 Fix local-cli skill install for moved .agents skills
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-03-13 08:49:24 -05:00
Dotta
c32d19415b Merge pull request #783 from paperclipai/docs/product-and-features-plan
docs: update PRODUCT.md and add features plan
2026-03-13 07:25:52 -05:00
Dotta
0a0d74eb94 Merge pull request #782 from paperclipai/feat/worktree-cleanup-and-env
feat(worktree): cleanup command, env var defaults, auto-prefix
2026-03-13 07:25:31 -05:00
Dotta
2c5e48993d docs: update PRODUCT.md and add 2026-03-13 features plan
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 07:25:23 -05:00
Dotta
77af1ae544 feat(worktree): add worktree:cleanup command, env var defaults, and auto-prefix
- Add `worktree:cleanup <name>` command that safely removes a worktree,
  its branch, and instance data. Idempotent and safe by default — refuses
  to delete branches with unique unmerged commits or worktrees with
  uncommitted changes unless --force is passed.
- Support PAPERCLIP_WORKTREES_DIR env var as default for --home across
  all worktree commands.
- Support PAPERCLIP_WORKTREE_START_POINT env var as default for
  --start-point on worktree:make.
- Auto-prefix worktree names with "paperclip-" if not already present,
  so `worktree:make subissues` creates ~/paperclip-subissues.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 07:24:39 -05:00
Dotta
872d2434a9 Merge pull request #251 from mjaverto/fix/process-lost-reaper
fix(heartbeat): prevent false process_lost failures on queued and non-child-process runs
2026-03-13 07:06:10 -05:00
Dotta
fe764cac75 fix: resolve type errors in process-lost-reaper PR
- Fix malformed try/catch/finally blocks in heartbeat executeRun
- Declare activeRunExecutions Set to track in-flight runs
- Add resumeQueuedRuns function and export from heartbeat service
- Add initdbFlags to EmbeddedPostgresCtor type

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 06:56:31 -05:00
Dotta
f81d37fbf7 fix(heartbeat): prevent false process_lost failures on queued and non-child-process runs
- reapOrphanedRuns() now only scans running runs; queued runs are
  legitimately absent from runningProcesses (waiting on concurrency
  limits or issue locks) so including them caused false process_lost
  failures (closes #90)
- Add module-level activeRunExecutions set so non-child-process adapters
  (http, openclaw) are protected from the reaper during execution
- Add resumeQueuedRuns() to restart persisted queued runs after a server
  restart, called at startup and each periodic tick
- Add outer catch in executeRun() so setup failures (ensureRuntimeState,
  resolveWorkspaceForRun, etc.) are recorded as failed runs instead of
  leaving them stuck in running state
- Guard resumeQueuedRuns() against paused/terminated/pending_approval agents
- Increase opencode models discovery timeout from 20s to 45s
2026-03-12 17:24:50 -04:00
Dotta
d14e656ec1 Merge pull request #645 from vkartaviy/fix/embedded-postgres-utf8
fix: ensure embedded PostgreSQL databases use UTF-8 encoding
2026-03-12 14:15:58 -05:00
Dotta
5201222ce7 Merge pull request #711 from paperclipai/revert-pr-707
Revert PR #707
2026-03-12 12:17:43 -05:00
Dotta
b888f92718 Revert "Merge pull request #707 from paperclipai/nm/premerge-lockfile-refresh"
This reverts commit 56df8d3cf0, reversing
changes made to ac82cae39a.
2026-03-12 12:13:39 -05:00