Agent Instructions
This repository primarily houses the compound-engineering coding-agent plugin and the Claude Code marketplace/catalog metadata used to distribute it.
It also contains:
- the Bun/TypeScript CLI that converts Claude Code plugins into other agent platform formats
- shared release and metadata infrastructure for the CLI, marketplace, and plugin
AGENTS.md is the canonical repo instruction file. Root CLAUDE.md exists only as a compatibility shim for tools and conversions that still look for it.
Quick Start
bun install
bun test # full test suite
bun run release:validate # check plugin/marketplace consistency
Working Agreement
- Branching: Create a feature branch for any non-trivial change. If already on the correct branch for the task, keep using it; do not create additional branches or worktrees unless explicitly requested.
- Merge policy: All changes to main go through pull requests. Direct pushes and direct merges are not allowed; branch protection on main enforces this by requiring the test status check to pass. The direct path bypasses release:validate, the test suite, and PR title validation -- past direct merges have caused version drift requiring multi-PR recovery (see docs/solutions/workflow/release-please-version-drift-recovery.md).
- Safety: Do not delete or overwrite user data. Avoid destructive commands.
- Testing: Run bun test after changes that affect parsing, conversion, or output.
- Release versioning: Releases are prepared by release automation, not normal feature PRs. The repo now has multiple release components (cli, compound-engineering, marketplace, cursor-marketplace). GitHub release PRs and GitHub Releases are the canonical release-notes surface for new releases; root CHANGELOG.md is only a pointer to that history. Use conventional titles such as feat: and fix: so release automation can classify change intent, but do not hand-bump release-owned versions or hand-author release notes in routine PRs.
- Linked versions (cli + compound-engineering): The linked-versions release-please plugin keeps cli and compound-engineering at the same version. This is intentional -- it simplifies version tracking across the CLI and the plugin it ships. A consequence is that a release with only plugin changes will still bump the CLI version (and vice versa). The CLI changelog may also include commits that exclude-paths would normally filter, because linked-versions overrides exclusion logic when forcing a synced bump. This is a known upstream release-please limitation, not a misconfiguration. Do not flag linked-version bumps as unnecessary.
- Output Paths: Keep OpenCode output at opencode.json and .opencode/{agents,skills,plugins}. For OpenCode, commands go to ~/.config/opencode/commands/<name>.md; opencode.json is deep-merged (never overwritten wholesale).
- Scratch Space: Default to OS temp. Use .context/ only when explicitly justified by the rules below.
  - Default: OS temp -- covers most scratch, including per-run throwaway AND cross-invocation reusable, regardless of whether a repo is present or whether other skills may read the files. A stable OS-temp prefix handles cross-skill and cross-invocation coordination as well as an in-repo path does; repo-adjacency is rarely the relevant property.
    - Per-run throwaway: mktemp -d -t <prefix>-XXXXXX (OS handles cleanup). Use for files consumed once and discarded -- captured screenshots, stitched GIFs, intermediate build outputs, recordings, delegation prompts/results, single-run checkpoints. The resulting path is opaque (on macOS it resolves under $TMPDIR, i.e. /var/folders/...) -- that is appropriate for throwaway files users are not meant to access.
    - Cross-invocation reusable: stable path /tmp/compound-engineering/<skill-name>/<run-id>/ -- not mktemp -d -- so later invocations of the same skill can discover sibling run-ids. Use /tmp directly rather than $TMPDIR so paths stay accessible: $TMPDIR on macOS resolves to /var/folders/64/.../T/, which is hostile for users who want to inspect checkpoints, grep them, or copy them out. The per-user isolation $TMPDIR provides is not valuable for cross-invocation reusable scratch where users are the intended audience. Use for caches keyed by session, checkpoints meant to survive context compaction within a loose session, or any state where later runs of the same skill need to locate prior outputs.
  - Exception: .context/ -- use only when the artifact is genuinely bound to the CWD repo AND meets at least one of:
    - (a) User-curated: the user is expected to inspect, manipulate, or manually curate the artifact outside the skill (e.g., a per-repo TODO database, a per-spec optimization log that survives across sessions on the same checkout).
    - (b) Repo+branch-inseparable: the artifact's meaning is inseparable from this specific repo or branch (e.g., branch-specific resume state that a user expects to pick up again in the same checkout).
    - (c) Path is core UX: surfacing the artifact path back to the user is a core part of the skill's output and that path is easier to communicate as a repo-relative location than an OS-temp one.

    Namespace under .context/compound-engineering/<workflow-or-skill-name>/, add a per-run subdirectory when concurrent runs are plausible, and decide cleanup behavior per the artifact's lifecycle (per-run scratch clears on success; user-curated state persists). "Shared between skills" is not by itself sufficient -- OS temp handles that equally well.
  - Durable outputs (plans, specs, learnings, docs, final deliverables) belong in docs/ or another repo-tracked location, not in either scratch tier.
  - Cross-platform note: /tmp is writable on macOS (symlink to /private/tmp), Linux, and WSL. mktemp -d -t <prefix>-XXXXXX also works on all three. Skills authored here assume Unix-like shells; native Windows is not a current target.
- Character encoding:
  - Identifiers (file names, agent names, command names): ASCII only -- converters and regex patterns depend on it.
  - Markdown tables: Use pipe-delimited (| col | col |), never box-drawing characters.
  - Prose and skill content: Unicode is fine (emoji, punctuation, etc.). Prefer ASCII arrows (->, <-) over Unicode arrows in code blocks and terminal examples.
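As a rough illustration of the two OS-temp scratch tiers described in the Scratch Space rules above, the following sketch creates one directory of each kind. The skill name and the run-id scheme are hypothetical placeholders, not values used anywhere in this repo.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical names for illustration; substitute the real skill name and run-id scheme.
skill_name="example-skill"
run_id="run-$(date +%Y%m%d-%H%M%S)-$$"

# Tier 1: per-run throwaway. The path is opaque and is left for the OS to clean up.
throwaway_dir="$(mktemp -d -t "${skill_name}-XXXXXX")"

# Tier 2: cross-invocation reusable. A stable /tmp path (not $TMPDIR) lets later
# runs of the same skill, and the user, find earlier run-ids.
reusable_root="/tmp/compound-engineering/${skill_name}"
reusable_dir="${reusable_root}/${run_id}"
mkdir -p "${reusable_dir}"

echo "throwaway: ${throwaway_dir}"
echo "reusable:  ${reusable_dir}"

# A later invocation can discover sibling run-ids by listing the stable root.
ls -1 "${reusable_root}"
```

The throwaway path is never recorded anywhere, which is acceptable because nothing is expected to read it after the run; the reusable path is derived only from the skill name and run-id, so a later run can reconstruct it without shared state.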
Directory Layout
src/ CLI entry point, parsers, converters, target writers
plugins/ Plugin workspaces (compound-engineering)
.claude-plugin/ Claude marketplace catalog metadata
tests/ Converter, writer, and CLI tests + fixtures
docs/ Requirements, plans, solutions, and target specs
CONCEPTS.md Shared domain vocabulary (glossary of project-specific terms)
Repo Surfaces
Changes in this repo may affect one or more of these surfaces:
- compound-engineering under plugins/compound-engineering/
- the Claude marketplace catalog under .claude-plugin/
- the converter/install CLI in src/ and package.json
Do not assume a repo change is "just CLI" or "just plugin" without checking which surface owns the affected files.
Plugin Maintenance
When changing plugins/compound-engineering/ content:
- Update substantive docs like plugins/compound-engineering/README.md when the plugin behavior, inventory, or usage changes.
- Do not hand-bump release-owned versions in plugin or marketplace manifests.
- Do not hand-add release entries to CHANGELOG.md or treat it as the canonical source for new releases.
- Run bun run release:validate if agents, commands, skills, MCP servers, or release-owned descriptions/counts may have changed.
- When removing a skill, agent, or command, add its name to both cleanup registries so stale flat-install artifacts are swept on upgrade:
  - STALE_SKILL_DIRS / STALE_AGENT_NAMES / STALE_PROMPT_FILES in src/utils/legacy-cleanup.ts
  - EXTRA_LEGACY_ARTIFACTS_BY_PLUGIN["compound-engineering"] in src/data/plugin-legacy-artifacts.ts
Useful validation commands:
bun run release:validate
cat .claude-plugin/marketplace.json | jq .
cat plugins/compound-engineering/.claude-plugin/plugin.json | jq .
Validating Agent and Skill Changes
Behavioral changes to a plugin agent or skill (anything under plugins/*/agents/ or plugins/*/skills/) need a different validation path than mechanical code changes, because of how Claude Code loads plugins.
- Use the skill-creator skill to test changes. Skill-creator is purpose-built for this: it spawns a generic subagent and injects the agent or skill content into the subagent's prompt at dispatch time, so each run reads the current source from disk. Invoke /skill-creator and use its eval workflow rather than reaching for ad-hoc workarounds.
- Plugin agent and skill definitions both cache at session start. Once a Claude Code session is open, dispatching a typed agent (e.g., Agent({subagent_type: "compound-engineering:ce-session-historian"})) runs the in-memory copy that was loaded when the session began. The same applies to skills: invoking Skill ce-session-inventory goes through the cached skill loader, so edits to skill scripts are also not tested via that path. File edits to either layer after session start do not propagate within the same session. Any iteration loop built around typed-agent dispatch or Skill-tool invocation in the same session is testing pre-edit content, not your changes.
- Do NOT edit ~/.claude/plugins/cache/ or ~/.claude/plugins/marketplaces/ to try to force a reload. Those paths are user machine state, not repo-managed. Modifying them does not reliably bypass the in-session cache (it didn't, in observed behavior), risks being silently overwritten by plugin updates, and is the wrong layer to test from. The skill-creator pattern is the proper approach; if you genuinely need fresh-loaded behavior of the typed-agent dispatch path, restart the Claude Code session -- but skill-creator is preferred for fast iteration.
- Mechanical changes do not have this restriction. Skill scripts (e.g., extract-metadata.py), parser logic, conversion code, and anything bun test exercises always run the current source. The caching issue only affects LLM-driven agent or skill prose behavior dispatched through the plugin loader.
Coding Conventions
- Prefer explicit mappings over implicit magic when converting between platforms.
- Keep target-specific behavior in dedicated converters/writers instead of scattering conditionals across unrelated files.
- Preserve stable output paths and merge semantics for installed targets; do not casually change generated file locations.
- When adding or changing a target, update fixtures/tests alongside implementation rather than treating docs or examples as sufficient proof.
Commit Conventions
- Prefix is based on intent, not file type. Use conventional prefixes (feat:, fix:, docs:, refactor:, etc.) but classify by what the change does, not the file extension. Files under plugins/*/skills/, plugins/*/agents/, and .claude-plugin/ are product code even though they are Markdown or JSON. Reserve docs: for files whose sole purpose is documentation (README.md, docs/, CHANGELOG.md).
- Type selection -- classify by intent, not diff shape. Where fix: and feat: could both seem to fit, default to fix: -- a change that remedies broken or missing behavior is fix: even when implemented by adding code, and net additions do not turn a fix into a feat:. Reserve feat: for capabilities the user could not previously accomplish where nothing was broken. Other conventional types (chore:, refactor:, docs:, perf:, test:, ci:, build:, style:) remain primary when they describe the change more precisely than either. Heuristic: if a regression test you could write today would have failed before the change, it's fix:. The user may override this default for a specific change.
- Include a component scope. The scope appears verbatim in the changelog. Pick the narrowest useful label: skill/agent name (document-review, learnings-researcher), CLI or marketplace area (cli, marketplace), or shared area when cross-cutting (review, research, converters). Never use compound-engineering -- it's the entire plugin and tells the reader nothing. Omit scope only when no single label adds clarity.
- Never use ! or a BREAKING CHANGE: footer without explicit user confirmation. These markers trigger release-please's automatic major version bump -- a decision the user may not want even when a change is technically breaking. If a change appears breaking, surface that to the user and let them decide whether to apply the marker.
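The title-level rules above can be approximated with a small check. This is a minimal sketch, not the repo's actual PR title validation: check_title, its regular expression, and its messages are hypothetical, and it only looks at the title line, so a BREAKING CHANGE: footer in the body is out of scope.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: prints a verdict for a commit or PR title and returns
# non-zero when the title needs attention.
check_title() {
  local title="$1"
  local re='^(feat|fix|docs|refactor|chore|perf|test|ci|build|style)(\(([a-z0-9-]+)\))?(!)?: .+$'
  if [[ ! "${title}" =~ ${re} ]]; then
    echo "invalid: not a conventional title"
    return 1
  fi
  local scope="${BASH_REMATCH[3]:-}"
  local bang="${BASH_REMATCH[4]:-}"
  # The plugin name is too broad to be a useful changelog scope.
  if [[ "${scope}" == "compound-engineering" ]]; then
    echo "invalid: scope names the whole plugin"
    return 1
  fi
  # A "!" marker triggers a major bump, so it is surfaced instead of accepted.
  if [[ -n "${bang}" ]]; then
    echo "confirm: breaking-change marker needs explicit user approval"
    return 1
  fi
  echo "ok"
}

check_title "fix(cli): handle empty plugin manifest"
check_title "feat(compound-engineering): add new skill" || true
check_title "feat(converters)!: drop legacy output path" || true
```

The check cannot decide between fix: and feat:, since that depends on intent rather than on the shape of the title.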
Adding a New Target Provider
Only add a provider when the target format is stable, documented, and has a clear mapping for tools/permissions/hooks. Use this checklist:
- Define the target entry
  - Add a new handler in src/targets/index.ts with implemented: false until complete.
  - Use a dedicated writer module (e.g., src/targets/codex.ts).
- Define types and mapping
  - Add provider-specific types under src/types/.
  - Implement conversion logic in src/converters/ (from Claude → provider).
  - Keep mappings explicit: tools, permissions, hooks/events, model naming.
- Wire the CLI
  - Ensure convert and install support --to <provider> and --also.
  - Keep behavior consistent with OpenCode (write to a clean provider root).
- Tests (required)
  - Extend fixtures in tests/fixtures/sample-plugin.
  - Add spec coverage for mappings in tests/converter.test.ts.
  - Add a writer test for the new provider output tree.
  - Add a CLI test for the provider (similar to tests/cli.test.ts).
- Docs
  - Update README with the new --to option and output locations.
Agent References in Skills
When referencing agents from within skill SKILL.md files (e.g., via the Agent or Task tool), use the bare ce-<agent-name> form. The ce- prefix identifies the agent as a compound-engineering component and is sufficient for uniqueness across plugins.
Example:
- ce-learnings-researcher (correct)
- learnings-researcher (wrong -- the ce- prefix is required; it's what prevents collisions with agents from other plugins that might share a short name)
File References in Skills
Each skill directory is a self-contained unit. A SKILL.md file must only reference files within its own directory tree (e.g., references/, assets/, scripts/) using relative paths from the skill root. Never reference files outside the skill directory — whether by relative traversal or absolute path.
Broken patterns:
- ../other-skill/references/schema.yaml -- relative traversal into a sibling skill
- /home/user/plugins/compound-engineering/skills/other-skill/file.md -- absolute path to another skill
- ~/.claude/plugins/cache/marketplace/compound-engineering/1.0.0/skills/other-skill/file.md -- absolute path to an installed plugin location
Why this matters:
- Runtime resolution: Skills execute from the user's working directory, not the skill directory. Cross-directory paths and absolute paths will not resolve as expected.
- Unpredictable install paths: Plugins installed from the marketplace are cached at versioned paths. Absolute paths that worked in the source repo will not match the installed layout, and the version segment changes on every release.
- Converter portability: The CLI copies each skill directory as an isolated unit when converting to other agent platforms. Cross-directory references break because sibling directories are not included in the copy.
If two skills need the same supporting file, duplicate it into each skill's directory. Prefer small, self-contained reference files over shared dependencies.
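One way to spot the broken patterns listed above is a simple text scan of a SKILL.md. The sketch below is a heuristic under stated assumptions: find_escaping_refs is a hypothetical helper, not a tool in this repo, and its pattern only catches ../ traversal, ~/ paths, and /home/ or /Users/ absolute paths, so it can produce both false positives and misses.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: print the lines of a SKILL.md that mention ../ traversal,
# a ~/ path, or a /home/ or /Users/ absolute path. Heuristic only.
find_escaping_refs() {
  local skill_md="$1"
  # grep exits 1 when nothing matches; treat "no findings" as success.
  grep -nE '(\.\./|~/|/home/|/Users/)' "${skill_md}" || true
}

# Sample content: line 1 is a valid in-skill reference, lines 2-4 escape the skill.
sample="$(mktemp)"
cat > "${sample}" <<'EOF'
Read references/schema.yaml before continuing.
See ../other-skill/references/schema.yaml for the schema.
Load ~/.claude/plugins/cache/marketplace/compound-engineering/1.0.0/skills/other-skill/file.md
Open /home/user/plugins/compound-engineering/skills/other-skill/file.md
EOF

find_escaping_refs "${sample}"
```

Run against the sample, the scan reports lines 2 through 4 and leaves the in-skill reference on line 1 alone.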
Note (March 2026): This constraint reflects current Claude Code skill resolution behavior and known path-resolution bugs (#11011, #17741, #12541). If Anthropic introduces a shared-files mechanism or cross-skill imports in the future, this guidance should be revisited with supporting documentation.
Platform-Specific Variables in Skills
This plugin is authored once and converted for multiple agent platforms (Claude Code, Codex, Gemini CLI, etc.). Do not use platform-specific environment variables or string substitutions (e.g., ${CLAUDE_PLUGIN_ROOT}, ${CLAUDE_SKILL_DIR}, ${CLAUDE_SESSION_ID}, CODEX_SANDBOX, CODEX_SESSION_ID) in skill content without a graceful fallback that works when the variable is unavailable or unresolved.
Whether a relative path resolves against the skill directory depends on who resolves it, so the two cases below must be handled differently. Do not assume a bare scripts/… path works in both.
Read-time file references — resolve against the skill directory: When skill content points the agent at a co-located file to read (e.g., "read references/schema.yaml"), use a relative path from the skill root. The skill loader resolves these against the skill's own directory on all major platforms — no variable prefix needed. This is the rule in File References in Skills above.
Runtime script invocations via the Bash tool — resolve against the project CWD: When skill content tells the agent to execute a bundled script through the Bash tool, a bare relative path does not work on Claude Code. The Bash tool's working directory is the user's project, not the skill directory, so bash scripts/my-script.sh resolves to <project>/scripts/…, finds nothing, and the step is silently skipped. This is a recurring bug class — see #764 (ce-worktree), #811 (ce-code-review), and #898 (ce-compound). Wrap the invocation in a file-existence guard on ${CLAUDE_SKILL_DIR} so it runs on Claude Code and degrades visibly elsewhere:
if [ -n "${CLAUDE_SKILL_DIR}" ] && [ -f "${CLAUDE_SKILL_DIR}/scripts/my-script.sh" ]; then
  bash "${CLAUDE_SKILL_DIR}/scripts/my-script.sh" ARG
else
  echo "<this step's bundled script is unavailable on this platform; do X instead>"
fi
(The [ -n "${CLAUDE_SKILL_DIR}" ] guard keeps an unset variable from probing a root-level /scripts/… path.)
${CLAUDE_SKILL_DIR} is substituted into SKILL.md content by Claude Code, covering both marketplace-cached installs and claude --plugin-dir local dev; it resolves to the skill's own directory, so the then branch runs there. Note ${CLAUDE_SKILL_DIR} is a SKILL.md content substitution, not an environment variable available inside the executed process — a script that needs its own directory should derive it from BASH_SOURCE rather than reading $CLAUDE_SKILL_DIR (see ce-update/scripts/). ce-compound's validate-frontmatter.py invocation is the canonical example of this guard pattern.
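For the BASH_SOURCE approach mentioned above, a common Bash idiom looks roughly like the following. This is a generic sketch rather than a copy of what ce-update/scripts/ does; helper.py is a hypothetical sibling file used only to show how a path is built.

```shell
#!/usr/bin/env bash
set -eu

# Resolve the directory that contains this script, regardless of the caller's CWD.
# The ":-$0" fallback only matters when the code is not being run from a file.
script_dir="$(cd -- "$(dirname -- "${BASH_SOURCE[0]:-$0}")" && pwd)"

# Hypothetical sibling file, shown only to illustrate building paths from script_dir.
helper_path="${script_dir}/helper.py"

echo "script dir: ${script_dir}"
echo "helper:     ${helper_path}"
```

Because the directory is computed from the script's own location, the result is the same whether the script is launched from the user's project, from a marketplace cache, or from another runtime's skill store.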
Why the guard, and why not the old ${CLAUDE_SKILL_DIR:-.} shell default (issue #943). On other targets (Codex, Gemini CLI, etc.) ${CLAUDE_SKILL_DIR} is unset. The earlier :-. form degraded to a project-CWD-relative ./scripts/… — syntactically valid, but resolving to a path that does not exist: the bundled script lives in that runtime's own skill store (e.g. ~/.codex/skills/<plugin>/<skill>/scripts/…), so the call silently missed. The existence guard makes that case explicit — the then branch never fires, and the else branch tells the agent what to do instead of running a broken path or claiming success. Two facts make this a real product gap, not a converter bug:
- The converter does not rewrite these paths (
src/utils/codex-content.tshas noCLAUDE_SKILL_DIRcase), and in the default--to codexmode skills are not emitted by the converter at all. - This plugin also ships as a native Codex plugin (installed via Codex's
/pluginsTUI marketplace). That path never runs the converter — Codex loads the rawSKILL.mdverbatim. Thece_platformsfrontmatter is honored only by the converter'sfilterSkillsByPlatform, so it does not keep a Claude-only skill out of a native Codex install. The only protection both install paths respect is the SKILL.md content itself.
So: do not gate a skill's core behavior on a runtime bundled-script call when portability matters. The existence guard above suits an optional guard script (e.g. ce-compound's validate-frontmatter.py), where the else branch runs an equivalent inline check so the protection still fires off-Claude instead of silently skipping. For a skill whose entire behavior is a bundled script, a guard does not make it work off-Claude — prefer logic the agent can perform inline, or content the agent reads (read-time references resolve against the skill dir on all targets).
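The difference between the old default form and the existence guard can be reproduced in a plain shell by leaving the variable unset, which models a target where no substitution happens. This is a small demonstration under that assumption, reusing the my-script.sh placeholder from the guard above.

```shell
#!/usr/bin/env bash
set -e

# Model a non-Claude target: the variable is never substituted or exported.
unset CLAUDE_SKILL_DIR

# Old form: quietly falls back to a path relative to the project CWD.
old_path="${CLAUDE_SKILL_DIR:-.}/scripts/my-script.sh"
echo "old form resolves to: ${old_path}"

# Guarded form: the then-branch cannot fire while the variable is empty,
# so the else-branch is the one that tells the agent what to do instead.
if [ -n "${CLAUDE_SKILL_DIR}" ] && [ -f "${CLAUDE_SKILL_DIR}/scripts/my-script.sh" ]; then
  branch="then"
else
  branch="else"
fi
echo "guard takes the ${branch} branch"
```

With the variable unset, the old form yields ./scripts/my-script.sh, which points into the user's project rather than the skill, while the guard falls through to its else branch.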
Permission caveat (Claude Code). Claude Code's permission checker evaluates every subcommand of a compound command, and a bare [ -f … ] test is not pre-approved — so wrapping a pinned bash "…sh" call in an if … then … fi guard defeats a narrow Bash(bash *…sh) allow-rule and prompts on every run. If a bundled-script call must stay auto-approved via such a pin, keep it a single pinned command rather than guarding it inline.
When a platform variable is unavoidable: Use the pre-resolution pattern (! backtick syntax) and include explicit fallback instructions in the skill content, so the agent knows what to do if the value is empty, literal, or an error:
**Plugin version (pre-resolved):** !`jq -r .version "${CLAUDE_PLUGIN_ROOT}/.claude-plugin/plugin.json"`
If the line above resolved to a semantic version (e.g., `2.42.0`), use it.
Otherwise (empty, a literal command string, or an error), use the versionless fallback.
Do not attempt to resolve the version at runtime.
This applies equally to any platform's variables — a skill converted from Codex, Gemini, or any other platform will have the same problem if it assumes platform-only variables exist without a fallback.
Repository Docs Convention
- Requirements live in docs/brainstorms/ -- requirements exploration and ideation.
- Plans live in docs/plans/ -- implementation plans and progress tracking.
- Solutions live in docs/solutions/ -- documented solutions to past problems (bugs, best practices, workflow patterns), organized by category with YAML frontmatter (module, tags, problem_type). Relevant when implementing or debugging in documented areas.
- Specs live in docs/specs/ -- target platform format specifications.
Solution categories (docs/solutions/)
This repo builds a plugin for developers. Categorize solutions from the perspective of the end user (a developer using the plugin), not a contributor to this repo.
- developer-experience/ -- Issues with contributing to this repo: local dev setup, shell aliases, test ergonomics, CI friction. If the fix only matters to someone with a checkout of this repo, it belongs here.
- integrations/ -- Issues where plugin output doesn't work correctly on a target platform or OS. Cross-platform bugs, target writer output problems, and converter compatibility issues go here.
- workflow/, skill-design/ -- Plugin skill and agent design patterns, workflow improvements.
When in doubt: if the bug affects someone running bun install compound-engineering or bun convert, it's an integration or product issue, not developer-experience.