Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
13 KiB
title, category, date, created, severity, component, problem_type, tags
| title | category | date | created | severity | component | problem_type | tags | ||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Release-please version drift recovery | workflow | 2026-04-24 | 2026-04-24 | high | release-automation | workflow_issue |
|
Release-please version drift recovery
Problem
Manual edits to a release-managed version field (any plugin.json listed in extra-files, package.json, or the release-please manifest) cause drift that:
- Breaks
bun run release:validateon every PR's CI - Can cause version regression on the next release-please run if left uncorrected
- Is easy to introduce accidentally — a one-line edit during a feature commit
- Has at least three valid recovery paths, each with different user-impact trade-offs
This doc is the playbook when drift is detected. It exists because investigating from scratch takes significant effort and the wrong choice can make things worse.
File relationship map
The repo has five release components. Each owns one or more files. Release-please reads the manifest and writes the extra-files.
.github/.release-please-manifest.json (release-please memory: last released per component)
├── "." → cli component (v = X.Y.Z)
├── "plugins/compound-engineering" → compound-engineering (v = X.Y.Z)
├── "plugins/coding-tutor" → coding-tutor (v = A.B.C)
├── ".claude-plugin" → marketplace (v = M.N.O)
└── ".cursor-plugin" → cursor-marketplace (v = P.Q.R)
.github/release-please-config.json (component config: extra-files, plugins)
└── "plugins": [{ type: "linked-versions", components: ["cli", "compound-engineering"] }]
← forces cli and compound-engineering to bump together
Each component's extra-files get rewritten by release-please when a release is cut:
cli (".") compound-engineering
├── package.json ($.version) ├── .claude-plugin/plugin.json ($.version)
├── .cursor-plugin/plugin.json ($.version)
└── .codex-plugin/plugin.json ($.version)
coding-tutor marketplace / cursor-marketplace
├── .claude-plugin/plugin.json ├── marketplace.json ($.metadata.version)
├── .cursor-plugin/plugin.json
└── .codex-plugin/plugin.json
Key invariants:
- Every file in a component's
extra-fileslist must share the same version. - Because of
linked-versions,cliandcompound-engineeringmust always be at the same version. That meanspackage.json(cli) and all threecompound-engineering/*/plugin.jsonfiles move together. - Marketplace components (
.claude-plugin,.cursor-plugin) are independent — they have their own versions and don't move with plugin bumps. bun run release:validateenforces these invariants. Its error message names the file(s) that drifted.
How release-please tracks versions
Release-please treats the manifest as the source of truth for "last released version per component." Extra-files are outputs that release-please writes to during a release. The flow is:
- Push to
maintriggers therelease-prworkflow - Release-please reads the manifest to know what was last released
- Release-please walks conventional commits since the last release tag
- Release-please computes the next version (
feat:→ minor,fix:→ patch, etc.) - Release-please opens or updates a "chore: release main" PR that:
- Bumps the manifest to the new version
- Bumps every
extra-fileto the new version
- When the release PR merges, the release is cut (git tag, optional npm publish)
Under normal operation, humans never touch the manifest or the extra-files. Both are updated together by release-please during a release PR. Drift is the state where this guarantee has been violated.
Drift detection
bun run release:validate runs on every PR and on every push to main via .github/workflows/ci.yml. It fails when:
- Any two
extra-fileswithin the same component have different versions - A marketplace plugin-list is asymmetric across
.claude-plugin,.cursor-plugin, and.agents/plugins - A Codex manifest is missing required fields or its
skillsdirectory - A description has drifted across Claude/Cursor/Codex plugin.json files (auto-corrected with
write: true)
Important: the manifest's memory of last-released version is not directly validated against extra-files. This means a state where all extra-files agree at X.Y.Z but the manifest thinks the last release was W.X.Y (W<X) will pass release:validate today. It will fail the NEXT release-please run when release-please tries to bump back down from W.X.Y.
Recovery decision tree
When drift is detected, choose the recovery path before editing anything.
release:validate reports drift
↓
1. Identify which component(s) have drifted. Check:
- extra-files vs each other within the component
- extra-files vs the manifest entry for that component
- linked-versions: is cli in sync with compound-engineering?
↓
2. Is anyone installed at the drifted (higher) version?
├── YES (or unknown — developers using `/plugin install --dev` from main)
│ → Forward-sync: update all lower files UP to match the drifted high
│
└── NO (can verify no git tag, no npm publish, no marketplace cache entry)
→ Backward-revert: revert the drifted file DOWN to match the rest
Path A: Forward-sync
Use when any user may have installed the drifted version locally (most common case — developers running /plugin install from a main checkout will have whatever version .claude-plugin/plugin.json says).
Scope of changes:
- All extra-files within the affected component(s) → bump up to the drifted version
.github/.release-please-manifest.jsonentry for the affected component → bump to match- If the affected component is
compound-engineeringorcli: because oflinked-versions, bump BOTH:- manifest's
plugins/compound-engineeringand.entries together package.json(cli's extra-file) and all three compound-engineering plugin.json files together
- manifest's
Why the manifest edit is necessary: without it, the next release-please run reads "last released was W.X.Y" (the stale manifest value), computes next version as W.X.(Y+1), and writes W.X.(Y+1) to extra-files — regressing any user at the forward-synced version.
This is release-please's documented recovery pattern for out-of-band releases. The manifest file is in git precisely so it can be manually corrected when a release happened outside release-please's normal flow.
Path B: Backward-revert
Use when you can verify no user is installed at the drifted version. Requires:
- No git tag exists for the drifted version (e.g.,
git tag -l | grep <version>) - No npm publish exists (e.g.,
npm view @every-env/compound-plugin versions) - No marketplace release exists for the drifted version
- No team member has pulled main since the drift was introduced (hard to verify; assume YES if drift has existed longer than ~an hour)
Scope of changes:
- The drifted extra-file(s) → revert DOWN to the manifest's value
- Manifest and
package.json(cli) unchanged — they were already correct
This is simpler (fewer files changed, no manifest edit) but risks user regression if anyone was installed at the drifted high version. Their local cache dir (e.g., ~/.claude/plugins/cache/.../compound-engineering/<drifted>/) becomes orphaned, and tooling that treats the version field as monotonic may refuse to downgrade or emit warnings.
Path C: release-as pin
Use when you want release-please itself to perform the sync via a normal release PR, instead of manually editing the manifest.
Scope of changes:
- Forward-sync all extra-files UP to the drifted version (same as Path A)
- Add
"release-as": "<drifted+1>"to each affected component in.github/release-please-config.json - Do NOT edit the manifest
- Next release-please run produces a release PR bumping everything to
<drifted+1>, which is above the drifted version and avoids regression
Caveat: after the release PR merges, the release-as pin must be removed — otherwise every subsequent release will be pinned to that same version. The repo has been bitten by stale release-as pins before (see ab44d89b), so Path C is usually more overhead than Path A. Prefer Path A unless there's a specific reason release-please should drive the bump.
Summary
| Path | Files changed | When to use | Risk |
|---|---|---|---|
| A — forward-sync | 3–5 (extras + manifest + linked) | Anyone might be at drifted version (default) | None if execution is correct |
| B — backward-revert | 1–2 (just the drifted extras) | Verified no one is at drifted version | User regression if verification was wrong |
C — release-as pin |
3–5 + config change + cleanup after | Want release-please to drive the bump | Requires remembering to remove pin |
Manifest manual edits
Release-please normally maintains .github/.release-please-manifest.json. It's updated inside release PRs alongside the extra-files. Humans don't touch it under normal operation.
Manual edits are legitimate in exactly one case: recovery from out-of-band releases or version changes, as in Path A above. Release-please's own documentation calls this out — the manifest is tracked in git precisely so it can be corrected when reality diverges from what release-please remembers.
If you find yourself editing the manifest for any reason other than Path A recovery, stop and reconsider. You're probably doing something release-please is meant to own.
Worked example: 2026-04-24 incident
Between chore: release main (#675) (which cut 3.0.3) and PR #677, four direct-to-main merges (1f20c384, f8720da3, 22d493b1, 47350c3e) each bumped .claude-plugin/plugin.json by one patch version — 3.0.3 → 3.0.4 → 3.0.5 → 3.0.6 → 3.0.7 — without touching .cursor-plugin, .codex-plugin, the manifest, or package.json. The bumps were added inline with feature work, bypassing PR CI because they landed via direct merge to main.
The drift was invisible until PR #677 opened. That PR's CI ran release:validate on its merge commit, which inherited main's drifted state (.claude-plugin at 3.0.7, everything else at 3.0.3). The validator failed on .cursor-plugin/plugin.json and .codex-plugin/plugin.json for not matching .claude-plugin.
Recovery used Path A (forward-sync to 3.0.7) because:
- Developers installing
compound-engineeringfrom a local main checkout would have 3.0.7 in their plugin cache by then - Path B would have orphaned those caches and triggered version-regression warnings
- Path C would have reintroduced a
release-aspin that had just been cleaned up
PR #678 applied the full Path A fix across five fields in four files and updated a stale test assertion that was also hiding behind the release-validate failure (commit 1f20c384 renumbered steps in lfg/SKILL.md but didn't update tests/review-skill-contract.test.ts). See PR #678's commits for the exact diff.
Prevention
Direct-to-main merges are the root cause. They bypass the PR CI that runs release:validate, test suites, and semantic title validation.
Branch protection on main is the enforcement. The test status check must be required before merge, and admin bypass should be off (or used only for true emergencies). Without branch protection, the AGENTS.md "don't manually bump" rule is honor-system only — which this incident showed is insufficient.
Optional complementary guards:
- Dedicated CI job detecting manual version bumps on non-release-please PR authors. Attacks the cause rather than the symptom, with a clearer error message pointing at AGENTS.md.
- Pre-commit hook running
release:validatelocally viacore.hooksPath. Catches accidents before push. Bypassable with--no-verify, so it's a speed-bump, not a lock.
These are nice-to-haves. Branch protection is the real fix.
Related docs
docs/solutions/workflow/manual-release-please-github-releases.md— big-picture release modeldocs/solutions/plugin-versioning-requirements.md— plugin-scoped contributor rulesplugins/compound-engineering/AGENTS.md— "Versioning Requirements" section with the don't-manually-bump rules.github/release-please-config.json— the extra-files and linked-versions configuration this doc referencessrc/release/metadata.ts— thesyncReleaseMetadatafunction thatrelease:validateruns