Commit Graph

453 Commits

Author SHA1 Message Date
Trevin Chow 1a67d0044e fix(ce-work-beta): tighten Codex CLI availability check to require absolute path
Same bug class as the resolved ce-setup feedback in this PR: the
availability gate accepted any "non-empty" output from the
`!`command -v codex 2>/dev/null`` pre-resolution, which on a
non-Claude harness that doesn't process `!` pre-resolution can be the
literal command text `command -v codex 2>/dev/null` — non-empty, but
not a real path. That produced a false positive where delegation
proceeded and `codex exec` failed downstream.

Tightened to require an absolute path (starts with `/`) for the "Codex
available" branch. Anything else — empty, unresolved command string, or
any non-path value — falls through to a runtime `command -v codex`
shell call that does the real availability check.
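
A minimal sketch of the tightened gate described above, not the actual skill text. The helper names `is_abs_path` and `codex_available` are hypothetical; the only behavior taken from this commit is "absolute path means available, anything else falls through to a runtime `command -v codex`".

```shell
# Hypothetical helper: succeeds only when the pre-resolved value starts with "/".
is_abs_path() {
  # Stripping a leading "/" changes the string only if it began with "/".
  [ "${1#/}" != "$1" ]
}

# Hypothetical gate: trust an absolute path, otherwise ask the shell at runtime.
codex_available() {
  if is_abs_path "$1"; then
    return 0
  fi
  # Empty, unresolved command text, or any non-path value lands here.
  command -v codex >/dev/null 2>&1
}
```

Called as `codex_available "$resolved"`, an unresolved literal such as `command -v codex 2>/dev/null` no longer counts as "available" on its own; the runtime check decides.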

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 00:13:36 -07:00
Trevin Chow 4826337c2d fix(ce-update,ce-setup): probe scripts self-locate; tighten ce-setup platform check
Addresses two PR review findings empirically confirmed in a Claude Code
session: `printenv CLAUDE_SKILL_DIR` from the Bash tool returns
`NOT_SET`, proving CLAUDE_SKILL_DIR (and CLAUDE_PLUGIN_ROOT) are
SKILL.md content substitutions only, not environment variables exported
to Bash subprocesses.

ce-update probes (P1, Codex thread):
  Previously, currently-loaded-version.sh and marketplace-name.sh read
  `${CLAUDE_SKILL_DIR:-}` directly from the environment. Since the env
  var is never set in subprocesses, both scripts always emitted
  __CE_UPDATE_NOT_MARKETPLACE__ — meaning ce-update would never actually
  perform version comparison even on real marketplace installs.

  Fix: derive skill_dir from BASH_SOURCE[0] (the script's own location).
  Adds regression tests that copy each script into a fake
  marketplace-shaped path and run it with CLAUDE_SKILL_DIR explicitly
  cleared from the env, asserting the correct version/marketplace
  segments are extracted.
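
  A sketch of the self-locating idea, under the assumption that each probe lives in a `scripts/` directory directly beneath the skill directory. The function name `skill_dir_from_script` is hypothetical and the real scripts may differ.

```shell
# Hypothetical helper: given the path of a script inside <skill>/scripts/,
# print the absolute path of <skill>, without reading any environment variable.
skill_dir_from_script() {
  script_path="$1"
  # Resolve the directory that contains the script to an absolute path.
  script_dir="$(cd "$(dirname "$script_path")" && pwd)"
  # Assumption: scripts/ sits one level below the skill directory.
  dirname "$script_dir"
}

# Inside a real bash script the argument would be the script's own location:
#   skill_dir="$(skill_dir_from_script "${BASH_SOURCE[0]}")"
```

  Because the path comes from the script's own location, the result is the same whether or not CLAUDE_SKILL_DIR is present in the environment.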

ce-setup platform check (P2, Codex thread):
  The check at line 47 keyed off "non-empty AND not literal
  ${CLAUDE_PLUGIN_ROOT}", which incorrectly accepted unresolved command
  strings like `echo "${CLAUDE_PLUGIN_ROOT}"` left in place by
  non-Claude harnesses that don't process `!` pre-resolution. Tightened
  to "starts with `/` and contains no `${`", which naturally rejects all
  unresolved forms while accepting real absolute paths.
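
  The rule "starts with `/` and contains no `${`" can be sketched as a small predicate. The function name `is_resolved_plugin_root` is hypothetical and this is an illustration of the rule, not the wording used in the skill.

```shell
# Hypothetical predicate: succeeds only for values that look like a resolved
# absolute path, i.e. start with "/" and contain no "${" sequence.
is_resolved_plugin_root() {
  value="$1"
  # Reject anything that does not begin with "/".
  case "$value" in
    /*) ;;
    *) return 1 ;;
  esac
  # Reject anything that still carries an unexpanded "${...}" reference.
  case "$value" in
    *'${'*) return 1 ;;
  esac
  return 0
}
```

  With this shape, `echo "${CLAUDE_PLUGIN_ROOT}"` and the bare literal `${CLAUDE_PLUGIN_ROOT}` both fail the first test, while a real absolute path passes both.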

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 00:03:17 -07:00
Trevin Chow f2c3d772c5 fix(ce-update): probe scripts at runtime via CLAUDE_SKILL_DIR + allowed-tools
The previous commit moved probes from `!` pre-resolution to the runtime
Bash tool, which sidestepped Claude Code's load-time permission gate.
But two empirical issues remained:

1. Bare `bash scripts/<name>.sh` failed with "No such file or directory"
   because the runtime Bash tool runs from the user's project CWD, not
   the skill directory. The AGENTS.md guidance "all platforms resolve
   script paths relative to the skill's directory" is aspirational; it
   does not hold for commands run through the runtime Bash tool.
2. Falling back to `bash <abs-path>` triggered a runtime permission
   prompt for users without `Bash(bash:*)` allow rules (most users have
   `Bash(bash -c:*)` at most).

Fix:
- Use `${CLAUDE_SKILL_DIR}/scripts/<name>.sh` in each runtime command.
  Claude Code sets that env var to the active skill's directory at
  runtime, so the path resolves correctly in both `--plugin-dir` and
  marketplace-cached installs.
- Declare narrow `allowed-tools` patterns pinned to each script
  filename. The skills docs explicitly state that `allowed-tools`
  grants permission for runtime tool calls "while the skill is active,
  so Claude can use them without prompting" — that coverage was murky
  for pre-resolution but is documented for runtime.

Tests:
- New regression guard fails if any probe lacks the `${CLAUDE_SKILL_DIR}`
  prefix (catching reverts to bare relative paths).
- New regression guard fails if `allowed-tools` is dropped or broadened
  to `Bash(bash *)`.
- Existing guard against `!` pre-resolution stays.

AGENTS.md updated to document both pieces (path form + allow-listing)
that the runtime-Bash pattern requires, with a concrete code-block
example.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 23:18:24 -07:00
Trevin Chow 02f2810b73 fix(ce-update): move probes from ! pre-resolution to runtime Bash tool
Empirical testing showed the previous approach was unreliable. With
`defaultMode: bypassPermissions` ON the `allowed-tools` frontmatter
appeared to work, but with bypass OFF (the configuration most users
have) the narrow `Bash(bash *<script>.sh)` patterns failed the
load-time permission check. The official docs are inconsistent on
whether `*` works as an internal-token glob, and `allowed-tools`
coverage of pre-resolution is undocumented. We can't ship a fix that
only works for users who have bypassPermissions on.

The reliable fix is to remove `!` pre-resolution entirely. Probes now
run from the skill body via the runtime Bash tool, which:
- Honors `defaultMode: bypassPermissions` (silent for those users)
- Falls back to a normal one-time approval prompt that Claude Code
  remembers (acceptable UX for users without bypass)

Skill body restructured with a "Step 1: Probe versions" section that
instructs the agent to run all three scripts in parallel, then "Step
2: Apply decision logic" that reads the captured outputs.

Tests updated:
- Drop the now-obsolete `allowed-tools` and pre-resolution-section
  assertions
- Add a regression guard that fails if `!`bash <path>`` pre-resolution
  is reintroduced
- Add a guard that ensures each of the three probe-script invocations
  appears in the skill body
- Refactor `runUpstreamCommand` -> `runUpstreamScript` since the test
  no longer extracts a pre-resolution command from SKILL.md and just
  runs the script directly

AGENTS.md updated: replace the now-incorrect "declare allowed-tools"
guidance with the correct pattern (invoke from skill body via runtime
Bash tool when the first token would be `bash <path>`).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 23:12:05 -07:00
Trevin Chow b875b4d476 fix(ce-update): pin allowed-tools to specific script filenames instead of broad Bash
Previous fix used `Bash(bash *)` to grant pre-resolution permission for
the three extracted scripts, but that surface is broader than necessary
— it allows any `bash <anything>` command, not just the three scripts
ce-update actually invokes.

Narrows to per-script patterns: `Bash(bash *upstream-version.sh)` etc.
Per Claude Code's permission docs, `*` works at any position and quotes
are stripped before matching, so each pattern matches both the local
checkout path and the marketplace cache path without granting blanket
Bash access.

Updates the regression test to assert each script-specific pattern is
present and to fail if `Bash(bash *)` is reintroduced. Updates AGENTS.md
guidance to recommend script-pinned patterns with an example.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 22:47:03 -07:00
Trevin Chow 74949a95d2 fix(ce-update): declare allowed-tools so pre-resolution scripts pass permission check
The prior fix (commit 8605e0f9) extracted ce-update's pre-resolution
logic into `bash "${CLAUDE_SKILL_DIR}/scripts/<name>.sh"` invocations
to clear Claude Code's safety check. That worked, but introduced a
new failure mode: the *permission* check rejects the resulting
`bash "/abs/path/script.sh"` form because pre-resolution `!` commands
do not honor `defaultMode: bypassPermissions`, and `Bash(bash:*)` is
not a rule most users allow-list (they have `Bash(bash -c:*)` at most).

Fix: declare `allowed-tools: Bash(bash *), Bash(echo *)` in ce-update's
frontmatter so the skill carries its own permission grant instead of
depending on user settings.

- Adds regression test in tests/skills/ce-update.test.ts that fails
  if allowed-tools is dropped or stops covering Bash.
- Updates AGENTS.md with a "Permission gate on extracted scripts"
  subsection so future skills using the same script-extraction pattern
  declare allowed-tools upfront.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 22:18:31 -07:00
Trevin Chow 8605e0f96c fix(skills): replace shell antipatterns blocked by permission check
Two safety-check rejections in `!` backtick pre-resolutions were
breaking skill-load in Claude Code:

- `[A] && B || C` shape ("ambiguous syntax with command separators",
  issue #710): ce-setup, ce-update, ce-work-beta/codex-delegation-workflow.
- `$()` containing a double-quoted string ("Unhandled node type:
  string", issue #709): ce-compound, ce-sessions, ce-update, ce-work-beta.

Replaces each with a safe shape: raw env-var emit, pure-pipe sed,
`${var%suffix}` parameter expansion, or an extracted script under the
skill's `scripts/`. ce-update gets three scripts (upstream-version,
currently-loaded-version, marketplace-name) since it's Claude-only
and has the most complex pre-resolution logic.
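
To illustrate one of the replacement shapes named above, here is a sketch of `${var%suffix}` parameter expansion standing in for a command substitution that contains a double-quoted string. The function name `strip_md` and the `.md` example are hypothetical and not taken from the affected skills.

```shell
# Shape of the kind reported as rejected (a "$()" wrapping a double-quoted string):
#   name="$(basename "$file" .md)"

# Hypothetical equivalent using only parameter expansion, no subshell:
strip_md() {
  base="${1##*/}"      # drop everything up to the last "/"
  echo "${base%.md}"   # drop a trailing ".md" if present
}
```

For example, `strip_md notes/plan.md` prints `plan`, and a name without the suffix passes through unchanged.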

Adds regression tests in tests/skill-shell-safety.test.ts that flag
both antipatterns at PR time, and updates AGENTS.md to enumerate the
rejected shapes alongside the existing `case`/`esac` rule.

Closes #709, #710.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 21:05:03 -07:00
Trevin Chow 41e7f72ab6 feat(ce-brainstorm,ce-plan): surface agent's scope synthesis before doc-write (#705)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 20:34:30 -07:00
Trevin Chow cd2fc67c3f fix(commit-push-pr): branch from fresh remote base to prevent stale-base contamination (#708)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 16:39:17 -07:00
Trevin Chow 4b5f28da97 fix(ce-work-beta): defer model and reasoning effort to Codex config (#704)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 21:32:13 -07:00
Trevin Chow dd080943e0 fix(ce-doc-review): tighten suggested_fix and why_it_matters rules (#702)
2026-04-26 14:39:01 -07:00
github-actions[bot] 179612039e chore: release main (#684)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-04-26 14:23:32 -07:00
Trevin Chow 5952b20d7f fix(skills): replace case statements blocked by permission check (#701)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 14:22:20 -07:00
Trevin Chow e8c118e28f refactor(ce-commit-push-pr): merge ce-pr-description into ce-commit-push-pr (#700)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 02:23:59 -07:00
Trevin Chow a91270ccd2 fix(session-historian): cap deep-dives, add keyword filter primitive, tighten dispatch (#699)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 23:37:30 -07:00
Trevin Chow 053c1db255 fix(ce-work): codify worktree isolation for parallel subagent dispatch (#698)
2026-04-25 23:22:49 -07:00
Trevin Chow 7eea2d1cfe feat(ce-compound): add frontmatter parser-safety validator (#697)
2026-04-25 21:37:57 -07:00
Trevin Chow ad9577e732 fix(ce-code-review): tighten autofix_class rubric for safe_auto/gated_auto boundary (#695)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 21:03:43 -07:00
Trevin Chow bd72818609 fix(ce-resolve-pr-feedback): add declined verdict for harmful suggestions (#694)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 20:04:39 -07:00
Trevin Chow e21156eeb7 fix(ce-debug): default to commit-and-PR and tighten learning offer (#693)
2026-04-25 19:27:09 -07:00
Trevin Chow 50bf65e88c fix(ce-doc-review): rename LFG path to best-judgment to avoid /lfg collision (#691)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 19:04:31 -07:00
Trevin Chow f30404e57b fix(ce-demo-reel): wait for network idle and reject blank frames (#692)
2026-04-25 18:59:40 -07:00
Trevin Chow 85e9a2073b fix(ce-code-review): move run artifacts from .context/ to /tmp per AGENTS.md (#690)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 18:19:40 -07:00
Trevin Chow 9ba41a14ca fix(ce-code-review): replace LFG with best-judgment auto-resolve (#685)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 18:12:09 -07:00
Trevin Chow 1284290af2 fix(ce-debug): delegate commit/PR and add branch check (#683)
2026-04-24 20:54:06 -07:00
github-actions[bot] ea8721eb21 chore: release main (#680)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-04-24 15:16:49 -07:00
Trevin Chow 304a975d02 feat(ce-brainstorm): probe rigor gaps with prose before Phase 2 (#677)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:13:35 -07:00
Trevin Chow bc8ae1a6b5 fix(main): recover version drift, fix stale test, document learnings (#678)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 15:07:49 -07:00
Kieran Klaassen 47350c3e4e fix(ce-test-browser): skip headed/headless question in pipeline mode
Agents spawned from LFG were blocking forever at the AskUserQuestion
prompt with no user present to respond. In mode:pipeline, default to
headless and skip step 2 entirely.

Bump 3.0.6 -> 3.0.7

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 13:17:07 -07:00
Kieran Klaassen 22d493b192 feat(ce-test-browser): gate port scan and auto-start on pipeline mode
- Port scan (find_free_port) only runs when PIPELINE_MODE=1
- Dev server auto-start only runs in pipeline mode; manual invocations
  print a help message and stop
- LFG step 6 now passes mode:pipeline to ce-test-browser so parallel
  agents claim non-colliding ports automatically
- Bump version 3.0.5 -> 3.0.6

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 12:59:50 -07:00
Kieran Klaassen f8720da3d1 feat(ce-test-browser): free-port scan and auto-server start
- Always verify preferred port is free; scan upward until finding one
- Auto-start dev server (bin/dev / rails server / npm run dev) on the
  claimed port if nothing is listening — no more "please start your server"
- Pass PORT= explicitly so parallel agents on the same machine never
  collide on 3000
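
A sketch of the "scan upward until finding one" behavior described in the list above. `find_free_port` is the name used elsewhere in this log, but its body here is an assumption; `port_in_use` and `BUSY_PORTS` are hypothetical stand-ins for a real probe of listening sockets.

```shell
# Hypothetical stand-in for a real probe: ports listed here count as busy.
BUSY_PORTS="3000 3001"

port_in_use() {
  for busy in $BUSY_PORTS; do
    if [ "$busy" = "$1" ]; then
      return 0
    fi
  done
  return 1
}

# Start at the preferred port and step upward until a free one is found.
find_free_port() {
  port="$1"
  while port_in_use "$port"; do
    port=$((port + 1))
    # Give up instead of looping past the valid port range.
    if [ "$port" -gt 65535 ]; then
      return 1
    fi
  done
  echo "$port"
}
```

The claimed port could then be handed to the dev server explicitly, for instance `PORT="$(find_free_port 3000)"`, so that parallel agents do not all assume 3000.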

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 12:56:31 -07:00
Kieran Klaassen 1f20c3842d feat(lfg): add ce-commit-push-pr step and remove ralph-loop
- Add ce-commit-push-pr as step 7 so LFG ends with a pushed branch and open PR
- Remove optional ralph-loop step (step 1) -- simplifies the pipeline
- Renumber all steps and fix cross-references accordingly
- Bump version 3.0.3 -> 3.0.4

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 12:18:26 -07:00
github-actions[bot] bc3709fc53 chore: release main (#675)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-04-24 07:24:56 -07:00
Trevin Chow f0433d9150 fix(ce-ideate): sharpen bug intent, surprise-me dispatch, and drop authoring refs (#672)
2026-04-24 02:21:21 -07:00
github-actions[bot] 6b5da46ccd chore: release main (#661)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-04-24 02:06:18 -07:00
Trevin Chow 6514b1fce5 feat(ce-ideate): subject gate, surprise-me, and warrant contract (#671)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 02:04:21 -07:00
Trevin Chow 494313e8eb fix(ce-brainstorm): enforce Interaction Rules in universal flow (#669)
2026-04-23 23:56:35 -07:00
Trevin Chow c33bf70f46 fix(skills): plan is a decision artifact; progress comes from git (#666)
2026-04-23 23:12:12 -07:00
Trevin Chow 9ddcd22aee fix(ce-demo-reel): prevent secrets in recorded demos (#664)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 21:52:52 -07:00
Trevin Chow 75cf4d603d feat(ce-commit-push-pr): skip evidence prompt when judgment allows (#663)
2026-04-23 16:59:53 -07:00
Trevin Chow 351d12ec5b fix(ce-update): compare against main plugin.json, not release tags (#660)
2026-04-23 14:36:58 -07:00
github-actions[bot] 5e6ec41b95 chore: release main (#657)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-04-22 20:15:57 -07:00
Trevin Chow a9fd8421f4 fix(ce-proof): correct op shapes and add retry/batch discipline (#658)
2026-04-22 19:52:02 -07:00
Trevin Chow b9ae6b758d fix(ce-update): replace cache sweep with claude plugin update (#656)
2026-04-22 18:23:15 -07:00
github-actions[bot] 7e83755acb chore: release main (#596)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-04-22 14:21:19 -07:00
Trevin Chow 5eb62a7d0e refactor(agents): restrict tools allowlist on research agents (#650)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 12:23:27 -07:00
Trevin Chow 23dc11b95a feat(ce-setup): check for ast-grep CLI and agent skill (#653)
2026-04-22 11:29:20 -07:00
Luca Henn fdf5fe4af5 feat(ce-demo-reel): add local save as alternative to catbox upload (#647)
2026-04-22 11:28:44 -07:00
Trevin Chow 7ddfbed33b feat(pi): first-class support via pi-subagents + pi-ask-user (#651)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 10:26:29 -07:00
Trevin Chow cce95fb814 feat(ce-debug): environment sanity, assumption audit, more techniques (#649)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 09:08:24 -07:00