Commit Graph

3882 Commits

Author SHA1 Message Date
Kazuki Yamada cdeb9563c5 Merge branch 'main' into perf/auto-perf-tuning 2026-05-11 09:50:43 +09:00
Kazuki Yamada c8560d1bc6 perf(core): Reduce metrics warmup to 1 worker on warm-cache path
Lower `EAGER_WARMUP_THREADS` from 2 to 1 when `tokenCountCacheFileExists()`
returns true. With the persistent token-count disk cache populated by a
prior run, `calculateFileMetrics` serves every per-file token count from
the in-memory map and dispatches zero worker tasks. The only worker work
that survives caching on a warm rerun is a small fixed set of dispatches:

  - the wrapper-token tokenization (cache hit after run #2)
  - git diff staged/worktree token counts (only when
    `output.git.includeDiffs` is enabled)
  - git log token count (only when `output.git.includeLogs` is enabled)

In the worst case that is 2-3 short tasks (a few KB each), which a single
warm worker runs serially in well under 30 ms. Spawning a second warm
worker means a redundant ~340 ms BPE table parse that contends with the
file-collection main thread for CPU and also extends the final
`pool.destroy()` blocking wait (BPE-loaded workers take ~21 ms to
terminate vs ~3 ms when idle).

Cold-cache (no cache file) behavior is preserved: the unscoped path keeps
3 warm workers and the explicit-scope path keeps 2, so the actual file
tokenizations still parallelise across the original worker counts.

The probe is a coarse heuristic — a cache file written by a previous run
that used a different `tokenCount.encoding` (e.g. cl100k_base instead of
the default o200k_base) yields no hits for the current run, so the metrics
phase pays one BPE parse sequentially on the critical path before
tokenizing files. This is a one-time cost on encoding switches; subsequent
runs rebuild the cache for the new encoding and hit again.

Benchmark (paired, n=25, repomix self-pack on 1068 files):

  WARM CACHE (cache file present)
    BASELINE  mean=968.9ms  median=976.0ms  sd=40.3ms
    AFTER     mean=883.2ms  median=875.0ms  sd=33.1ms
    DELTA     mean=85.6 ms (8.84%)  median=87.0 ms  sd=42.7
              t=10.02 (df=24)  faster=24/25

  COLD CACHE (cache file deleted before each run, n=12)
    BASELINE  mean=1606.3ms  median=1588.0ms  sd=58.6ms
    AFTER     mean=1593.2ms  median=1598.5ms  sd=58.6ms
    DELTA     mean=13.2 ms (0.82%)  t=0.62  faster=9/12  — within noise

Stacks on top of the existing warm-cache wins on this branch (token-count
disk cache, output-wrapper cache, prefetched template, native ignore-file
prescan, etc.); this single change pushes warm-cache wall-clock another
~86 ms below the previous floor.
2026-05-11 00:13:37 +09:00
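
The sizing rule described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `WarmupInputs` and `chooseWarmupThreads` are hypothetical names, the three-way rule is only my reading of the message (1 worker when the cache file exists, otherwise 2 for explicit scopes and 3 for the default scan), and the lower bound of 1 is an added assumption.

```typescript
// Hypothetical helper; names mirror the commit message, not the real source.
interface WarmupInputs {
  hasExplicitScope: boolean; // the run was narrowed to an explicit file set
  cacheFileExists: boolean; // result of the tokenCountCacheFileExists() probe
  processConcurrency: number; // CPUs available to the process
}

function chooseWarmupThreads(inputs: WarmupInputs): number {
  let eager: number;
  if (inputs.cacheFileExists) {
    eager = 1; // warm-cache-likely: only a few short tasks remain
  } else if (inputs.hasExplicitScope) {
    eager = 2; // cold cache, small explicit file set
  } else {
    eager = 3; // cold cache, default full scan
  }
  // Never warm more workers than the host has CPUs.
  // The floor of 1 is an assumption of this sketch.
  return Math.max(1, Math.min(inputs.processConcurrency, eager));
}
```

Keeping the rule in one pure function like this would make each branch easy to check without spawning any workers.
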
Kazuki Yamada e27d8be1c4 Merge pull request #1565 from yamadashy/chore/remove-agent-memory-skill
chore(skills): Remove agent-memory skill in favor of agent-carnet
2026-05-10 23:13:25 +09:00
Kazuki Yamada fe83afdcb0 chore(skills): Remove agent-memory skill in favor of agent-carnet
agent-carnet now serves as the project-local notebook for AI agents
(introduced in #1564). The agent-memory skill is no longer used in
this repository, so its bundled SKILL.md and memories/.gitignore are
removed. Note that auto-memory loaded by Claude Code itself is a
separate, built-in mechanism and is unaffected by this change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 23:09:53 +09:00
Kazuki Yamada c58db40761 Merge pull request #1564 from yamadashy/feat/agent-carnet-skill
chore(skills): Add agent-carnet skill for repository-local notebook
2026-05-10 23:01:30 +09:00
Kazuki Yamada 366e50dd2a chore(skills): Add agent-carnet skill for repository-local notebook
Introduce agent-carnet, a file-based markdown notebook for AI agents,
as an installable skill. Notes live under .carnet/ as personal,
git-ignored content (only .gitignore and README.md are tracked).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 22:19:54 +09:00
Kazuki Yamada ca44a74f3f perf(core): Wire warm-cache heuristic into packager and tests
Companion to the previous commit. Plumb `tokenCountCacheFileExists` into
the packager `defaultDeps` so the metrics warm-up sizing can be exercised
deterministically from tests, and add a paired test that asserts the
2-warm-up-worker branch is taken when the persistent disk cache exists.

Also rename the cold-cache test to make the new gating explicit and refresh
its docstring with the warm/cold distinction.

https://claude.ai/code/session_01TJqKkJ8n3r6Pa2JdW9Vp2w
2026-05-10 15:59:43 +09:00
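
The dependency-injection idea in the commit above might look something like the sketch below. `WarmupDeps` and `planUnscopedWarmup` are hypothetical names invented for illustration. The only detail taken from the message is that `tokenCountCacheFileExists` is supplied through a deps object, so tests can force either branch without touching the filesystem. The 2-versus-3 rule is the one described at this point in the history.

```typescript
// Hypothetical shapes for illustration; not the actual packager types.
interface WarmupDeps {
  tokenCountCacheFileExists: () => boolean;
}

function planUnscopedWarmup(deps: WarmupDeps): number {
  // Rule as described at this commit: 2 workers when warm, 3 when cold.
  return deps.tokenCountCacheFileExists() ? 2 : 3;
}

// Tests substitute stubs instead of depending on temp-directory state.
const warmDeps: WarmupDeps = { tokenCountCacheFileExists: () => true };
const coldDeps: WarmupDeps = { tokenCountCacheFileExists: () => false };
```
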
Kazuki Yamada dcdbdd5efc perf(core): Skip 3rd metrics warmup worker when token cache is warm
The metrics worker pool eagerly spawns N workers at pack startup so each
worker can parse gpt-tokenizer's o200k_base BPE table (~340 ms of pure-CPU
work) in parallel with file search and collection. Previously a fixed
EAGER_WARMUP_THREADS=3 was used on the unscoped (default-scan) path because
3 BPE parses amortize across the file tokenization that follows.

With the persistent token-count disk cache (introduced earlier on this
branch), warm-cache repeat runs serve almost every per-file token count
out of the in-memory cache and dispatch zero worker tasks for them. The
3rd worker's ~340 ms BPE parse becomes pure overhead that contends with
file collection (~360 ms) and security check (~140 ms) for the 4 cores
on a typical host.

Gate the 3rd warm-up worker on `tokenCountCacheFileExists()` (a sync
existsSync on the cache JSON in $TMPDIR). When the cache file exists from
a previous run we treat the run as warm-cache-likely and warm 2 workers;
when it is missing (true cold cache) we keep the original 3-worker warmup
so the actual tokenizations parallelise.

Inject tokenCountCacheFileExists via the packager `deps` object so the
test suite can deterministically exercise both branches without depending
on /tmp filesystem state. Keep the existing `hasExplicitScope` gate
intact — explicit scopes still warm only 2 workers regardless of cache
state, matching the prior tuning for shorter metrics phases on small
file sets.

Benchmark (n=30, paired, NODE_DISABLE_COMPILE_CACHE=1, repomix self-pack
on 1047 files, 4-core host):

  Warm cache (cache file present)
    BASELINE  median=1162.7  mean=1145.9  sd=62.8 ms
    AFTER     median=1033.7  mean=1035.3  sd=50.4 ms
    DELTA     mean=110.5 ms (9.65%)  median=110.3 ms
              t=11.97 (df=29)  faster=30/30

  Cold cache (cache file deleted before each run, n=20)
    BASELINE  median=1658.8  mean=1675.0  sd=91.1 ms
    AFTER     median=1632.0  mean=1652.3  sd=102.9 ms
    DELTA     mean=22.7 ms (1.36%)  median=42.2 ms
              t=1.29 (df=19)  faster=13/20  — within noise

Test plan:
- All 1261 tests pass (+1 new test for the warm-cache branch)
- Lint clean
- Hosts with `getProcessConcurrency() < 3` are unaffected: the
  `Math.min(processConcurrency, EAGER_WARMUP_THREADS)` clamp in
  `getWorkerThreadCount` already caps the count at the host CPU count.

https://claude.ai/code/session_01TJqKkJ8n3r6Pa2JdW9Vp2w
2026-05-10 15:56:41 +09:00
Kazuki Yamada fd82811b4e Merge pull request #1561 from yamadashy/dependabot/npm_and_yarn/npm_and_yarn-f124fd438d
chore(deps): Bump the npm_and_yarn group across 3 directories with 3 updates
2026-05-10 15:02:24 +09:00
dependabot[bot] 8623e4e80a chore(deps): Bump the npm_and_yarn group across 3 directories with 3 updates
Bumps the npm_and_yarn group with 2 updates in the / directory: [fast-xml-builder](https://github.com/NaturalIntelligence/fast-xml-builder) and [fast-uri](https://github.com/fastify/fast-uri).
Bumps the npm_and_yarn group with 2 updates in the /website/client directory: [fast-uri](https://github.com/fastify/fast-uri) and [@babel/plugin-transform-modules-systemjs](https://github.com/babel/babel/tree/HEAD/packages/babel-plugin-transform-modules-systemjs).
Bumps the npm_and_yarn group with 2 updates in the /website/server directory: [fast-xml-builder](https://github.com/NaturalIntelligence/fast-xml-builder) and [fast-uri](https://github.com/fastify/fast-uri).


Updates `fast-xml-builder` from 1.1.4 to 1.1.7
- [Changelog](https://github.com/NaturalIntelligence/fast-xml-builder/blob/main/CHANGELOG.md)
- [Commits](https://github.com/NaturalIntelligence/fast-xml-builder/compare/v1.1.4...V1.1.7)

Updates `fast-uri` from 3.1.0 to 3.1.2
- [Release notes](https://github.com/fastify/fast-uri/releases)
- [Commits](https://github.com/fastify/fast-uri/compare/v3.1.0...v3.1.2)

Updates `fast-uri` from 3.0.6 to 3.1.2
- [Release notes](https://github.com/fastify/fast-uri/releases)
- [Commits](https://github.com/fastify/fast-uri/compare/v3.1.0...v3.1.2)

Updates `@babel/plugin-transform-modules-systemjs` from 7.25.9 to 7.29.4
- [Release notes](https://github.com/babel/babel/releases)
- [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md)
- [Commits](https://github.com/babel/babel/commits/v7.29.4/packages/babel-plugin-transform-modules-systemjs)

Updates `fast-xml-builder` from 1.1.4 to 1.2.0
- [Changelog](https://github.com/NaturalIntelligence/fast-xml-builder/blob/main/CHANGELOG.md)
- [Commits](https://github.com/NaturalIntelligence/fast-xml-builder/compare/v1.1.4...V1.1.7)

Updates `fast-uri` from 3.1.0 to 3.1.2
- [Release notes](https://github.com/fastify/fast-uri/releases)
- [Commits](https://github.com/fastify/fast-uri/compare/v3.1.0...v3.1.2)

---
updated-dependencies:
- dependency-name: fast-xml-builder
  dependency-version: 1.1.7
  dependency-type: direct:production
  dependency-group: npm_and_yarn
- dependency-name: fast-uri
  dependency-version: 3.1.2
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: fast-uri
  dependency-version: 3.1.2
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: "@babel/plugin-transform-modules-systemjs"
  dependency-version: 7.29.4
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: fast-xml-builder
  dependency-version: 1.2.0
  dependency-type: indirect
  dependency-group: npm_and_yarn
- dependency-name: fast-uri
  dependency-version: 3.1.2
  dependency-type: indirect
  dependency-group: npm_and_yarn
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-05-10 05:49:37 +00:00
Kazuki Yamada 1eb0358548 Merge pull request #1549 from yamadashy/dependabot/npm_and_yarn/npm_and_yarn-789b1e4b46
chore(deps): Bump the npm_and_yarn group across 3 directories with 3 updates
2026-05-10 14:47:18 +09:00
dependabot[bot] 4492598a84 chore(deps): Bump the npm_and_yarn group across 3 directories with 3 updates
Bumps the npm_and_yarn group with 2 updates in the / directory: [hono](https://github.com/honojs/hono) and [ip-address](https://github.com/beaugunderson/ip-address).
Bumps the npm_and_yarn group with 1 update in the /website/client directory: [serialize-javascript](https://github.com/yahoo/serialize-javascript).
Bumps the npm_and_yarn group with 2 updates in the /website/server directory: [hono](https://github.com/honojs/hono) and [ip-address](https://github.com/beaugunderson/ip-address).


Updates `hono` from 4.12.14 to 4.12.18
- [Release notes](https://github.com/honojs/hono/releases)
- [Commits](https://github.com/honojs/hono/compare/v4.12.14...v4.12.18)

Updates `ip-address` from 10.1.0 to 10.2.0
- [Commits](https://github.com/beaugunderson/ip-address/commits)

Updates `hono` from 4.12.14 to 4.12.18
- [Release notes](https://github.com/honojs/hono/releases)
- [Commits](https://github.com/honojs/hono/compare/v4.12.14...v4.12.18)

Updates `ip-address` from 10.1.0 to 10.2.0
- [Commits](https://github.com/beaugunderson/ip-address/commits)

Updates `serialize-javascript` from 6.0.2 to 7.0.5
- [Release notes](https://github.com/yahoo/serialize-javascript/releases)
- [Commits](https://github.com/yahoo/serialize-javascript/compare/v6.0.2...v7.0.5)

Updates `hono` from 4.12.16 to 4.12.18
- [Release notes](https://github.com/honojs/hono/releases)
- [Commits](https://github.com/honojs/hono/compare/v4.12.14...v4.12.18)

Updates `ip-address` from 10.1.0 to 10.2.0
- [Commits](https://github.com/beaugunderson/ip-address/commits)

Updates `hono` from 4.12.16 to 4.12.18
- [Release notes](https://github.com/honojs/hono/releases)
- [Commits](https://github.com/honojs/hono/compare/v4.12.14...v4.12.18)

Updates `ip-address` from 10.1.0 to 10.2.0
- [Commits](https://github.com/beaugunderson/ip-address/commits)

---
updated-dependencies:
- dependency-name: hono
  dependency-version: 4.12.18
  dependency-type: indirect
- dependency-name: hono
  dependency-version: 4.12.18
  dependency-type: direct:production
- dependency-name: ip-address
  dependency-version: 10.2.0
  dependency-type: indirect
- dependency-name: ip-address
  dependency-version: 10.2.0
  dependency-type: indirect
- dependency-name: serialize-javascript
  dependency-version: 7.0.5
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-05-09 16:04:53 +00:00
Kazuki Yamada daa7ff3e2b Merge pull request #1558 from yamadashy/chore/renovate-group-github-actions
chore(renovate): Group GitHub Actions, Dockerfile, and Nix updates
2026-05-10 01:02:30 +09:00
Kazuki Yamada 58fd6ef403 Merge branch 'main' into perf/auto-perf-tuning 2026-05-09 23:56:44 +09:00
Kazuki Yamada cff212c5b9 chore(renovate): Include pin and digest updates in github-actions group
Actions in this repo are SHA-pinned via pinact, so Renovate classifies
SHA bumps as `digest` (and the initial pinning as `pin`). Without
adding them to matchUpdateTypes, those updates would skip the group and
land as individual PRs, defeating the grouping.
2026-05-09 21:22:54 +09:00
Kazuki Yamada effb229852 chore(renovate): Group Dockerfile and Nix updates
Extend the manager-based grouping to dockerfile and nix so base image
bumps across the four Dockerfiles and flake.nix updates each batch into
a single PR per update channel.
2026-05-09 21:21:20 +09:00
Kazuki Yamada 60c1721d04 Merge pull request #1555 from yamadashy/renovate/github-codeql-action-4.x
chore(deps): update github/codeql-action action to v4.35.3
2026-05-09 21:19:50 +09:00
Kazuki Yamada 0f3f9e71f6 Merge pull request #1554 from yamadashy/renovate/browser-non-major-dependencies
chore(deps): update browser non-major dependencies
2026-05-09 21:19:27 +09:00
Kazuki Yamada d443d16b72 Merge pull request #1553 from yamadashy/renovate/anthropics-claude-code-action-1.x
chore(deps): update anthropics/claude-code-action action to v1.0.111
2026-05-09 21:19:02 +09:00
Kazuki Yamada b5da20dc37 chore(renovate): Group GitHub Actions updates
Add packageRules for the github-actions manager so workflow dependency
bumps are grouped into one PR per update channel, mirroring how the
package.json updates are already batched.
2026-05-09 21:18:58 +09:00
Kazuki Yamada ad3c620cc1 Merge pull request #1552 from yamadashy/renovate/homebrew-actions-digest
chore(deps): update homebrew/actions digest to 503c7f3
2026-05-09 21:18:11 +09:00
Kazuki Yamada a27cf29a4e Merge pull request #1550 from yamadashy/renovate/npm-hono-vulnerability
chore(deps): update dependency hono to v4.12.16 [security]
2026-05-09 21:17:37 +09:00
Kazuki Yamada 4caea59b2a Merge pull request #1556 from yamadashy/feat/drop-node-20-add-26
chore(deps): Drop Node.js 20, add Node.js 26 support
2026-05-09 21:16:27 +09:00
Kazuki Yamada defd47d70b Merge pull request #1557 from yamadashy/perf/turnstile-isbot-pre-mint
perf(website): Skip Turnstile pre-mint for bot-shaped user agents
2026-05-09 20:18:54 +09:00
Kazuki Yamada b9388665d2 fix(website): Key isBot() cache on UA string for shared-process safety
The previous memoization stored a single boolean at module scope. In any
Node context where the same module instance might be reused across
requests (VitePress SSG, dev server, preview server with `navigator`
polyfilled per request), the result computed for the first request's UA
would silently be reused for later calls with a different UA.

In production this code is browser-only — Cloud Run's Hono server
doesn't import `botDetect.ts`, and Cloudflare Pages serves the bundle
as static files with one fresh module instance per browser tab — so
the bug was theoretical. But the UA-keyed memo costs nothing extra
and removes the foot-gun: a long-lived process now invalidates the
cache automatically when a different UA shows up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 20:12:15 +09:00
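
A UA-keyed memo of the kind the commit above describes could be sketched as follows. `BOT_PATTERN` is a deliberately tiny stand-in regex, not the detection list the site uses, and `isBotForUa` is a hypothetical name. The point is only that the cached result is tied to the UA string that produced it.

```typescript
// Stand-in detector: an illustrative regex, NOT the real bot pattern list.
const BOT_PATTERN = /bot|crawler|spider/i;

let cachedUa: string | undefined;
let cachedResult = false;

function isBotForUa(userAgent: string): boolean {
  // Recompute only when the UA differs from the one that filled the cache.
  if (cachedUa !== userAgent) {
    cachedUa = userAgent;
    cachedResult = BOT_PATTERN.test(userAgent);
  }
  return cachedResult;
}
```

In a browser tab the UA never changes, so this behaves like a plain memo. In a long-lived process, a different UA simply overwrites the single cached entry.
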
Kazuki Yamada 373c40ea25 fix(ci): Plumb node-version through repomix composite action
The composite action hard-coded `setup-node` to Node 24, so the
`test-action.yml` matrix `[22, 24, 26]` silently ran every job under
Node 24 — the 22 and 26 cells did not actually exercise those Node
versions.

Add a `node-version` input to the action (default `"24"` to preserve
current behavior for downstream consumers) and pass `${{
matrix.node-version }}` from each `test-action.yml` invocation so the
matrix tests what its name implies.
2026-05-09 20:08:02 +09:00
Kazuki Yamada e4a635c2f2 fix(website): Address PR review feedback on isBot pre-mint guard
Four items from gemini and claude reviews:

- botDetect.ts: Memoize isBot() result. navigator.userAgent is immutable
  for the page lifetime, so re-running the isbot regex on every Turnstile
  pre-mint debounce and post-submit re-mint check is wasted work. SSR
  fallback is intentionally not cached so a module instance reused across
  SSR/CSR still reaches the real UA check on first CSR call.
- usePackRequest.ts: Disambiguate the "submit-path NOT gated" comment —
  it was confusing because the new post-submit re-mint also lives inside
  submitRequest's finally. Reworded to "click-path acquireTurnstileToken"
  to make clear which call site is intentionally skipped.
- usePackRequest.ts: Update the userTouched comment to reflect autofill
  reality — modern Chromium/Firefox DO fire input events on autofill, so
  the rationale ("autofill doesn't trigger") was already stale. The new
  isBot() guard covers the gap for well-behaved crawler UAs.
- usePackRequest.ts: Add English glosses for the Japanese CF dashboard
  labels (提示チャレンジ / 未解決) so non-Japanese-reading maintainers can
  follow the comment.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 20:07:08 +09:00
Kazuki Yamada 9d34d6e9af fix(ci): Revert prepare script to npm run build for Bun compat
`bun install` fires the `prepare` lifecycle hook before `setup-node`
runs, so it executes against the runner's default Node (currently
< 22). `node --run` is unrecognized there and `bun install` aborts
with exit code 9, breaking every Bun matrix job.

Other inner `node --run` script changes (browser, website-client,
website-server lint, browser build-all) are kept as-is — they only
fire when explicitly invoked from CI steps that have already run
`setup-node`, so the Node-version prerequisite is guaranteed.

Reverts the `prepare` portion of 5a1423e.
2026-05-09 19:58:34 +09:00
Kazuki Yamada d94475ef01 perf(website): Skip Turnstile pre-mint for bot-shaped user agents
The CF Turnstile dashboard shows ~17k unsolved challenges over the past
7 days vs ~10k solved. That is roughly a 1.7:1 unsolved:solved ratio,
more than the post-submit auto re-mint can explain. Most of those are
JS-executing crawlers (Slackbot, Discord card validator, Twitter card
validator, X, Apple link preview, Googlebot, etc.) that render the page,
somehow trigger a DOM input event on the URL field (autofill /
accessibility tools / focus tricks), and incur a CF challenge they can't
solve.

Add an `isBot()` short-circuit at the two pre-mint call sites in
`usePackRequest`:
  - the debounced `onTrigger` after `markUserTouched` flips
  - the post-submit auto re-mint in `submitRequest`'s finally block

The actual security gate is the server-side `siteverify` in
`turnstileMiddleware` — that stays the only authoritative check, so a
crawler that spoofs UA past `isBot()` still gets blocked there. The
submit-path `takeToken()` is intentionally NOT gated to avoid
false-positive lockouts of legitimate users with unusual UAs (e.g. older
clients, accessibility tools).

Net effect:
  - "提示チャレンジ" / "未解決" CF dashboard counters drop sharply
    (well-behaved crawlers stop minting unsolvable tokens)
  - `pack_completed` server logs unaffected (legit users don't change
    paths; bots couldn't reach `/api/pack` either way)
  - server-side spend on siteverify unchanged

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 19:56:19 +09:00
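
The gating shape from the commit above (speculative pre-mints skipped for bot-shaped UAs, the click path left ungated) can be illustrated with a small decision function. `MintReason` and `shouldRequestToken` are hypothetical names. The real code lives inside `usePackRequest` and is not reproduced here.

```typescript
// Hypothetical flow names; only the gating shape follows the commit message.
type MintReason = "debounced-input" | "post-submit-remint" | "submit-click";

function shouldRequestToken(reason: MintReason, looksLikeBot: boolean): boolean {
  // The click path is never gated, so a misclassified human can still submit.
  if (reason === "submit-click") return true;
  // Speculative pre-mints are skipped for bot-shaped user agents.
  return !looksLikeBot;
}
```

Because the server-side verification remains the authoritative check, a wrong answer from this client-side guard affects only how many challenges are minted, not who gets through.
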
Kazuki Yamada ef190425c1 docs(website): Bump pinned action versions in GitHub Actions guide examples
The hi/vi/id translations of the GitHub Actions guide still showed
`actions/checkout@v3`, `actions/setup-node@v3`,
`actions/upload-artifact@v3`, and `softprops/action-gh-release@v1`,
which are stale relative to both the English doc (already on @v4) and
the project's own CI workflows.

Bump these illustrative examples to @v4 (and @v2 for action-gh-release)
to match the English source and avoid pointing readers at deprecated
action majors.
2026-05-09 19:55:06 +09:00
Kazuki Yamada 5a1423e118 chore: Replace npm run with node --run inside package.json scripts
Follow up to commit 042750c which only converted workflow-level
invocations. With Node.js 22 as the floor, the chained scripts inside
each package.json can also use `node --run` directly, dropping the
intermediate npm process when these scripts run.

- root `prepare`
- browser `build-all`, `lint`
- website/client `lint`
- website/server `lint`
2026-05-09 19:55:06 +09:00
Kazuki Yamada df881a7c57 fix(ci): Address PR review feedback
- Migrate the build-and-run job in ci.yml to `node --run build --
  --sourceMap --declaration` so the inline command matches the
  `node --run` style of the test job (claude review round 2 #1)
- Update the hi github-actions.md matrix example from `[22, 24]` to
  `[22, 24, 26]` so the doc mirrors the project's actual CI matrix
  (gemini-code-assist / coderabbitai inline comment)
- Bump the stale "(Node 20+)" baseline in reviewer-performance.md to
  "(Node 22+)" to track the new engines floor (claude review round 2
  minor)
2026-05-09 19:48:15 +09:00
Kazuki Yamada f61ea347f2 docs(website): Update Node.js version mentions from 20 to 22
Sweep the multilingual website docs for remaining Node.js 20 references
that survived the initial English-only update:

- installation.md: System Requirements `≥ 20.0.0` → `≥ 22.0.0` across
  all 12 languages that include this section (incl. en, which was
  missed earlier)
- development/index.md: Prerequisites `≥ 20.0.0` / `v20` / `versi 20`
  / `phiên bản 20` → 22 across all 14 languages
- faq.md: programming-language Q&A narrative "Node.js 20 or later" → 22
  across all 14 languages

The version digit substitutions are mechanical and identical across
locales, so updating them in this PR keeps the docs consistent with the
new minimum without requiring the usual translation handoff.
2026-05-09 19:08:19 +09:00
Kazuki Yamada 042750cb4e chore(ci): Replace npm run with node --run in workflows
Now that the minimum supported Node.js version is 22, `node --run` is
available everywhere. It avoids the npm process-spawn overhead and
matches the style already used in package.json scripts.

Affects all GitHub Actions workflows that invoke npm scripts and the
website/server Dockerfile bundle step. `npm ci` is left as-is since it
is npm-specific.
2026-05-09 19:08:10 +09:00
renovate[bot] 041ec193fa chore(deps): update browser non-major dependencies 2026-05-09 09:50:22 +00:00
Kazuki Yamada 9caf541368 chore(deps): Drop Node.js 20, add Node.js 26 support
Node.js 20 reached end-of-life on 2026-04-30, so raise the minimum
supported version to 22 (the next LTS line) and add Node.js 26 to the
CI matrix as the current release line.

- Bump engines.node to >=22.0.0 in package.json and scripts/memory
- Update CI matrix to [22.x, 24.x, 26.x] (drop 20.x and 25.x; 25.x EOL 2026-06)
- Update test-action.yml matrix to [22, 24, 26]
- Drop the obsolete `node --run` workaround comment in ci.yml since
  `node --run` is supported on all matrix versions
- Update Node.js version mentions in English docs, llms-install.md,
  configShard, bug report template, and code samples in hi/vi
  github-actions guides

Dockerfile (node:22-slim) is intentionally left at the minimum supported
version so the published image confirms Repomix runs on the floor.
2026-05-09 18:49:37 +09:00
Kazuki Yamada 005c1238aa perf(metrics): Cache output-wrapper token count via existing disk cache
Reuse the content-addressed disk cache used for per-file token counts to
also memoize the "output wrapper" (output minus all file contents) token
count. The wrapper string is byte-stable across runs when neither the
rootDir tree nor headers/instructions change, so the second-run wrapper
tokenization is a guaranteed cache hit.

Why this is on the critical path:
- calculateMetrics blocks on the wrapper tokenization Promise alongside
  per-file metrics; with a warm per-file cache, file metrics resolve in
  ~5 ms and the wrapper worker dispatch (~30 ms on the ~120 KB wrapper)
  becomes the longest task in calculateMetrics, which itself is on the
  pipeline's critical path after produceOutput.
- A cache hit replaces the worker round-trip with an MD5(wrapper) +
  Map.get(), each <1 ms.

Behavior preservation:
- Cache key = `${encoding}:MD5(wrapper)[0:16]`, so any wrapper change
  (headers, file set, sort order, template format) automatically misses.
- On miss, the original runTokenCount runs and the result is written
  back via setCached so subsequent runs hit. No behavioral difference.
- Falls under the same CACHE_VERSION guard as per-file entries.

Benchmark (paired, warm cache, n=30, repomix self-pack):
  BASE  mean 1097.2 ms  sd 33.5
  AFTER mean 1063.7 ms  sd 34.7
  DELTA mean 33.6 ms (3.06%)  median 32.8 ms
        t = 4.761 (df=29)  faster = 23/30
        95% CI [19.1, 48.0] ms

Cold cache (paired, n=15, file deleted before each run):
  DELTA mean -1.5 ms (-0.10%)  t = -0.12  -- within noise

Tests: all 1260 pass; npm run lint is clean (the two pre-existing
biome warnings in cliSpinner.ts are unrelated).
2026-05-09 15:54:12 +09:00
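
The cache-key shape quoted in the commit above can be sketched with Node's built-in `crypto` module. `wrapperCacheKey` and `countWrapperTokens` are hypothetical names, and `tokenize` stands in for the worker round-trip. Reading `MD5(wrapper)[0:16]` as the first 16 hex characters of the digest is an assumption. The message does not say whether the slice is taken over hex characters or raw bytes.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helpers; only the key shape comes from the commit message.
function wrapperCacheKey(encoding: string, wrapper: string): string {
  const digest = createHash("md5").update(wrapper).digest("hex");
  // Assumption: "[0:16]" means the first 16 hex characters.
  return `${encoding}:${digest.slice(0, 16)}`;
}

function countWrapperTokens(
  cache: Map<string, number>,
  encoding: string,
  wrapper: string,
  tokenize: (text: string) => number, // stand-in for the worker dispatch
): number {
  const key = wrapperCacheKey(encoding, wrapper);
  const hit = cache.get(key);
  if (hit !== undefined) return hit; // hit: no worker round-trip
  const count = tokenize(wrapper); // miss: compute, then write back
  cache.set(key, count);
  return count;
}
```

Any change to the wrapper text or the encoding produces a different key, so stale entries are never returned. They are simply never looked up again.
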
renovate[bot] 59ecf2daf0 chore(deps): update github/codeql-action action to v4.35.3 2026-05-09 05:37:39 +00:00
Kazuki Yamada 91a643ef51 perf(core): Add content-addressed token-count disk cache
Introduce a persistent token-count cache keyed by MD5(content) + encoding.
On warm runs (re-packing the same repo), the 600ms BPE metrics phase is
bypassed entirely, cutting total wall-clock time by ~28% (553ms).

- tokenCountCache.ts: new module — load/save JSON from /tmp/, in-memory
  Map<string,number> with 100k-entry LRU eviction and CACHE_VERSION guard
- calculateFileMetrics.ts: classify files as cache-hit / cache-miss before
  dispatching to workers; merge results back in original order
- packager.ts: start the cache load at t=0 without blocking; await it
  before metrics; save after metrics completes

Benchmark (n=30, paired t-test on repomix self-pack, warm cache):
  Baseline (no cache): 1949.9ms +/- 73.0ms
  Candidate (warm):    1396.9ms +/- 72.9ms
  Improvement:         553ms (28.4%), t=28.20

https://claude.ai/code/session_01Fm25x51fmGGeFMJyCm1CER
2026-05-09 12:33:31 +09:00
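
The classify-then-merge step mentioned for `calculateFileMetrics.ts` in the commit above could look roughly like this. `FileEntry` and `countWithCache` are hypothetical, keys are assumed to be precomputed from the content hash and encoding, and `tokenizeMisses` stands in for the worker dispatch. Eviction and persistence are left out.

```typescript
// Hypothetical types; keys are assumed precomputed (content hash + encoding).
interface FileEntry {
  path: string;
  key: string;
}

function countWithCache(
  files: FileEntry[],
  cache: Map<string, number>,
  tokenizeMisses: (misses: FileEntry[]) => number[], // stand-in for workers
): number[] {
  const results = new Array<number>(files.length);
  const missIndexes: number[] = [];

  // Pass 1: serve hits from the map, remember where the misses sit.
  files.forEach((file, i) => {
    const cached = cache.get(file.key);
    if (cached !== undefined) results[i] = cached;
    else missIndexes.push(i);
  });

  // Pass 2: tokenize only the misses, then slot them back in original order.
  const counts = tokenizeMisses(missIndexes.map((i) => files[i]));
  missIndexes.forEach((fileIndex, j) => {
    results[fileIndex] = counts[j];
    cache.set(files[fileIndex].key, counts[j]);
  });
  return results;
}
```

On a fully warm run `missIndexes` is empty, which matches the message's description of the metrics phase being bypassed.
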
autofix-ci[bot] 6749522c7f [autofix.ci] apply automated fixes 2026-05-09 01:43:41 +00:00
Kazuki Yamada 30d938f8ae perf(core): Skip redundant per-file scans, prefetch template, and use path-anchored wrapper extraction
Three targeted improvements to output generation and metrics:

1. **Skip redundant per-file scans in createRenderContext** (`outputGenerate.ts`)
   - `calculateMarkdownDelimiter` (backtick scan, ~4 ms) now runs only for
     markdown output; other styles use the default ``` fence.
   - `calculateFileLineCounts` (newline scan, ~6 ms) now runs only when
     the caller passes `needsLineCounts: true`; the regular output path
     (XML / plain / markdown) never uses line counts.
   - `packSkill.generateSkillReferences` passes the flag to keep its
     statistics and tree rendering correct.

2. **Prefetch compiled Handlebars template** (`packager.ts` / `outputGenerate.ts`)
   - `prefetchCompiledTemplate(style)` fires a background `getCompiledTemplate`
     call immediately after security-worker warm-up, so the ~50 ms Handlebars
     compile cost overlaps with `collectFiles` + security-check instead of
     sitting on the critical path.

3. **Path-anchored `extractOutputWrapper`** (`calculateMetrics.ts`)
   - `getFileContentStart` locates each file's content in the output string
     using a style-specific path anchor (`<file path="...">` for XML,
     `## File: ...` for markdown, `File: ...` for plain) rather than a raw
     `indexOf(content)`.
   - Prevents a false match when one file's content appears verbatim inside
     an earlier file in the output, which would cause `extractOutputWrapper`
     to return `null` and silently fall back to full-output tokenization.
   - The fast-path token calculation now passes `config.output.style` to
     `extractOutputWrapper` so the anchor logic is always engaged.

https://claude.ai/code/session_01Fm25x51fmGGeFMJyCm1CER
2026-05-09 10:41:57 +09:00
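
The false-match problem that item 3 of the commit above addresses can be shown with a small sketch for the XML style only. `findContentStart` is a hypothetical helper, not the project's `getFileContentStart`, and the exact anchor text used by the real output may differ in detail.

```typescript
// Hypothetical helper for the XML style only.
function findContentStart(
  output: string,
  path: string,
  content: string,
  from: number,
): number {
  const anchor = `<file path="${path}">`;
  const anchorAt = output.indexOf(anchor, from);
  if (anchorAt === -1) return -1;
  // Search for the content only after this file's own anchor.
  return output.indexOf(content, anchorAt + anchor.length);
}

// b.txt's content ("world") also appears verbatim inside a.txt.
const sampleOutput =
  '<file path="a.txt">\nhello\nworld\n</file>\n' +
  '<file path="b.txt">\nworld\n</file>\n';

const rawMatch = sampleOutput.indexOf("world"); // lands inside a.txt
const anchoredMatch = findContentStart(sampleOutput, "b.txt", "world", 0);
```

Here `rawMatch` points into the earlier file, while `anchoredMatch` lands after the `b.txt` anchor. That is the distinction the commit relies on.
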
renovate[bot] bb4d211206 chore(deps): update anthropics/claude-code-action action to v1.0.111 2026-05-09 01:33:22 +00:00
renovate[bot] 3234592eea chore(deps): update homebrew/actions digest to 503c7f3 2026-05-09 01:33:14 +00:00
autofix-ci[bot] 371927e211 [autofix.ci] apply automated fixes 2026-05-09 01:07:48 +00:00
Kazuki Yamada 1d2df3bea3 test(search): Update fileSearch tests for prescan-based ignore collection
Add fs.readdir mock (returning empty array) to all relevant beforeEach
blocks so collectIgnoreFilePatterns does not fail with "entries is not
iterable". Update globby option assertions to reflect gitignore: false
and ignoreFiles: [] now that patterns are pre-collected by the prescan.

https://claude.ai/code/session_01Fm25x51fmGGeFMJyCm1CER
2026-05-09 10:06:46 +09:00
autofix-ci[bot] 462be46258 [autofix.ci] apply automated fixes 2026-05-09 01:01:36 +00:00
Kazuki Yamada d0a27fe62c perf(search): Replace globby's ignore-file traversals with native fs.readdir prescan
globby's `gitignore: true` + `ignoreFiles` options each trigger an extra
full-tree traversal to discover and parse .gitignore / .ignore /
.repomixignore files. On the repomix repo itself this adds 200–500 ms to
the searchFiles phase (measured via --verbose [globby] log lines).

Replace both traversals with a single native fs.readdir-based prescan
(`collectIgnoreFilePatterns`) that:
- walks only non-skip directories in parallel (skipping node_modules,
  .git, dist, build, lib, etc.)
- reads .gitignore, .repomixignore, and .ignore in one pass
- prefixes each pattern with its directory path and merges into the
  main `ignore` array
- skips negation lines (!) to avoid cross-directory semantic issues

globby is then called with `gitignore: false, ignoreFiles: []` so it
performs only a single traversal for file discovery.

Benchmark results (--verbose [globby] elapsed, cache-warm, NODE_DISABLE_COMPILE_CACHE=1):
  src,tests scope: 124–156ms → 82–109ms (~35% reduction in searchFiles)
  Full repo scan:  664–1153ms → 195–499ms (~60% reduction in searchFiles)

https://claude.ai/code/session_01Fm25x51fmGGeFMJyCm1CER
2026-05-09 10:00:29 +09:00
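
The pattern-prefixing step of the prescan in the commit above might be sketched as below. `prefixIgnorePatterns` is a hypothetical name, and the mapping from ignore-file lines to glob patterns is one plausible simplification (leading `/` anchors to the directory, anything else matches at any depth). Real gitignore semantics have more cases than this handles.

```typescript
// Simplified sketch: real gitignore semantics cover more cases than this.
function prefixIgnorePatterns(dir: string, fileText: string): string[] {
  const patterns: string[] = [];
  for (const raw of fileText.split(/\r?\n/)) {
    const line = raw.trim();
    if (line === "" || line.startsWith("#")) continue; // blanks and comments
    if (line.startsWith("!")) continue; // negations are skipped, per the commit
    const anchored = line.startsWith("/");
    const body = anchored ? line.slice(1) : line;
    // Assumption: unanchored patterns match at any depth below `dir`.
    const scoped = anchored ? body : `**/${body}`;
    patterns.push(dir === "" ? scoped : `${dir}/${scoped}`);
  }
  return patterns;
}
```

For example, an ignore file in `packages/app` containing `dist` and `/build` would yield `packages/app/**/dist` and `packages/app/build` under these assumptions.
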
Kazuki Yamada db647834f5 test(core): Restore '# Directory Structure' assertion in markdown case
An earlier pushed commit (03b8e70b) accidentally dropped the
`expect(actualOutput).toContain('# Directory Structure')` assertion
inside the markdown branch of the integration test's case-style switch.
The assertion was present before that change and is still in the
working tree, so restore it to keep the markdown-style coverage
symmetric with the plain-style case below it.
2026-05-09 02:38:14 +09:00
Kazuki Yamada 03b8e70b9c test(core): Wire createSecurityTaskRunner mock into remaining packager tests
Final commit of the perf(core) Pre-warm security worker pool change —
extends the unit packager test and the integration packager test:

- tests/core/packager.test.ts: adds `createSecurityTaskRunner` mock to
  the orchestration test's `mockDeps` and to the `parallel error
  handling` `baseDeps()` shared fixture, updates the
  `validateFileSafety.toHaveBeenCalledWith` assertion to expect the new
  6th-argument deps object (`{ taskRunner: <Object> }`), and adds
  positive/negative gate assertions —
  `expect(deps.createSecurityTaskRunner).toHaveBeenCalled()` for the
  default unscoped path, `.not.toHaveBeenCalled()` for the
  `--include 'src'` and `explicitFiles` (--stdin) paths.

- tests/integration-tests/packager.test.ts: adds the
  `createSecurityTaskRunner` stub so the default-scope path no longer
  attempts to spawn a real worker pool (the previous unhandled-rejection
  noise from a missing worker file URL is gone with this change).

(See PR description / first source commit for the full perf change
rationale, benchmark numbers, and correctness notes.)
2026-05-09 02:35:18 +09:00
Kazuki Yamada fb281d4560 test(core): Wire createSecurityTaskRunner mock into smaller packager tests
Continuation of the perf(core) Pre-warm security worker pool change —
extends `mockDeps` / inline pack-test plumbing in the three smaller test
files so the default-scope path no longer attempts to spawn a real
worker pool from the test environment.

- tests/core/packager/diffsFunctionality.test.ts: adds
  `mockCreateSecurityTaskRunner` to both pack-call sites.
- tests/core/packager/splitOutput.test.ts: same — adds the stub to the
  inline mock deps.
- tests/core/security/validateFileSafety.test.ts: updates the
  `runSecurityCheck` call assertion to include the new
  `{ taskRunner: undefined }` deps argument forwarded by
  `validateFileSafety` when no pre-warmed runner is provided.

(See PR description / parent commit for the full perf change rationale,
benchmark numbers, and correctness notes.)
2026-05-09 02:30:26 +09:00