perf(core): Skip 3rd metrics warmup worker when token cache is warm

The metrics worker pool eagerly spawns N workers at pack startup so each
worker can parse gpt-tokenizer's o200k_base BPE table (~340 ms of pure-CPU
work) in parallel with file search and collection. Previously a fixed
EAGER_WARMUP_THREADS=3 was used on the unscoped (default-scan) path because
3 BPE parses amortize across the file tokenization that follows.

With the persistent token-count disk cache (introduced earlier on this
branch), warm-cache repeat runs serve almost every per-file token count
out of the in-memory cache and dispatch zero worker tasks for them. The
3rd worker's ~340 ms BPE parse becomes pure overhead that contends with
file collection (~360 ms) and security check (~140 ms) for the 4 cores
on a typical host.

Gate the 3rd warm-up worker on `tokenCountCacheFileExists()` (a sync
existsSync on the cache JSON in $TMPDIR). When the cache file exists from
a previous run we treat the run as warm-cache-likely and warm 2 workers;
when it is missing (true cold cache) we keep the original 3-worker warmup
so the actual tokenizations parallelise.

Inject tokenCountCacheFileExists via the packager `deps` object so the
test suite can deterministically exercise both branches without depending
on /tmp filesystem state. Keep the existing `hasExplicitScope` gate
intact — explicit scopes still warm only 2 workers regardless of cache
state, matching the prior tuning for shorter metrics phases on small
file sets.

Benchmark (n=30, paired, NODE_DISABLE_COMPILE_CACHE=1, repomix self-pack
on 1047 files, 4-core host):

  Warm cache (cache file present)
    BASELINE  median=1162.7  mean=1145.9  sd=62.8 ms
    AFTER     median=1033.7  mean=1035.3  sd=50.4 ms
    DELTA     mean=110.5 ms (9.65%)  median=110.3 ms
              t=11.97 (df=29)  faster=30/30

  Cold cache (cache file deleted before each run, n=20)
    BASELINE  median=1658.8  mean=1675.0  sd=91.1 ms
    AFTER     median=1632.0  mean=1652.3  sd=102.9 ms
    DELTA     mean=22.7 ms (1.36%)  median=42.2 ms
              t=1.29 (df=19)  faster=13/20  — within noise

Test plan:
- All 1261 tests pass (+1 new test for the warm-cache branch)
- Lint clean
- Hosts with `getProcessConcurrency() < 3` are unaffected: the
  `Math.min(processConcurrency, EAGER_WARMUP_THREADS)` floor in
  `getWorkerThreadCount` already collapses to the host CPU count.

https://claude.ai/code/session_01TJqKkJ8n3r6Pa2JdW9Vp2w
This commit is contained in:
Kazuki Yamada
2026-05-10 15:56:41 +09:00
parent 58fd6ef403
commit dcdbdd5efc
+13
View File
@@ -1,4 +1,5 @@
import { createHash } from 'node:crypto';
import { existsSync } from 'node:fs';
import fs from 'node:fs/promises';
import os from 'node:os';
import path from 'node:path';
@@ -107,3 +108,15 @@ export const setCached = (key: string, tokenCount: number): void => {
entries.set(key, tokenCount);
dirty = true;
};
/**
* Synchronously probe whether the on-disk cache file exists. Used at pack
* startup to size the metrics worker pool: when the cache is present the
* pipeline is likely a warm-cache repeat run that needs few or zero
* tokenizations, so we can afford to spin up fewer worker threads (each of
* which independently parses ~2.2 MB of BPE tables on first task).
*
* `existsSync` itself never throws — ENOENT, EACCES and other errors all
* surface as `false`, so we treat any non-existence reason as "cold".
*/
export const tokenCountCacheFileExists = (): boolean => existsSync(CACHE_FILE);