Lower `EAGER_WARMUP_THREADS` from 2 to 1 when `tokenCountCacheFileExists()`
returns true. With the persistent token-count disk cache populated by a
prior run, `calculateFileMetrics` serves every per-file token count from
the in-memory map and dispatches zero worker tasks. The only worker work
that survives caching on a warm rerun is a small fixed set of dispatches:
- the wrapper-token tokenization (cache hit after run #2)
- git diff staged/worktree token counts (only when
`output.git.includeDiffs` is enabled)
- git log token count (only when `output.git.includeLogs` is enabled)
That worst case is 2-3 short tasks (a few KB each) that fit a single warm
worker serially in well under 30 ms. Spawning a second warm worker means
a redundant ~340 ms BPE table parse that contends with the file-collection
main thread for CPU AND extends the final `pool.destroy()` blocking wait
(BPE-loaded workers take ~21 ms to terminate vs ~3 ms when idle).
Cold-cache (no cache file) behavior is preserved: the unscoped path keeps
3 warm workers and the explicit-scope path keeps 2, so the actual file
tokenizations still parallelise across the original worker counts.
The probe is a coarse heuristic — a cache file written by a previous run
that used a different `tokenCount.encoding` (e.g. cl100k_base instead of
the default o200k_base) yields no hits for the current run, so the metrics
phase pays one BPE parse sequentially on the critical path before
tokenizing files. This is a one-time cost on encoding switches; subsequent
runs rebuild the cache for the new encoding and hit again.
Benchmark (paired, n=25, repomix self-pack on 1068 files):
WARM CACHE (cache file present)
BASELINE mean=968.9ms median=976.0ms sd=40.3ms
AFTER mean=883.2ms median=875.0ms sd=33.1ms
DELTA mean=85.6 ms (8.84%) median=87.0 ms sd=42.7
t=10.02 (df=24) faster=24/25
COLD CACHE (cache file deleted before each run, n=12)
BASELINE mean=1606.3ms median=1588.0ms sd=58.6ms
AFTER mean=1593.2ms median=1598.5ms sd=58.6ms
DELTA mean=13.2 ms (0.82%) t=0.62 faster=9/12 — within noise
Stacks on top of the existing warm-cache wins on this branch (token-count
disk cache, output-wrapper cache, prefetched template, native ignore-file
prescan, etc.); this single change pushes warm-cache wall-clock another
~86 ms below the previous floor.
Companion to the previous commit. Plumb `tokenCountCacheFileExists` into
the packager `defaultDeps` so the metrics warm-up sizing can be exercised
deterministically from tests, and add a paired test that asserts the
2-warm-up-worker branch is taken when the persistent disk cache exists.
Also rename the cold-cache test to make the new gating explicit and refresh
its docstring with the warm/cold distinction.
https://claude.ai/code/session_01TJqKkJ8n3r6Pa2JdW9Vp2w
Add fs.readdir mock (returning empty array) to all relevant beforeEach
blocks so collectIgnoreFilePatterns does not fail with "entries is not
iterable". Update globby option assertions to reflect gitignore: false
and ignoreFiles: [] now that patterns are pre-collected by the prescan.
https://claude.ai/code/session_01Fm25x51fmGGeFMJyCm1CER
Earlier push commit (03b8e70b) accidentally dropped the
`expect(actualOutput).toContain('# Directory Structure')` assertion
inside the markdown branch of the integration test's case-style switch.
That line still exists on disk and was present pre-change; restoring it
to keep the markdown-style coverage symmetric with the plain-style
case below it.
Final commit of the perf(core) Pre-warm security worker pool change —
extends the unit packager test and the integration packager test:
- tests/core/packager.test.ts: adds `createSecurityTaskRunner` mock to
the orchestration test's `mockDeps` and to the `parallel error
handling` `baseDeps()` shared fixture, updates the
`validateFileSafety.toHaveBeenCalledWith` assertion to expect the new
6th-argument deps object (`{ taskRunner: <Object> }`), and adds
positive/negative gate assertions —
`expect(deps.createSecurityTaskRunner).toHaveBeenCalled()` for the
default unscoped path, `.not.toHaveBeenCalled()` for the
`--include 'src'` and `explicitFiles` (--stdin) paths.
- tests/integration-tests/packager.test.ts: adds the
`createSecurityTaskRunner` stub so the default-scope path no longer
attempts to spawn a real worker pool (the previous unhandled-rejection
noise from a missing worker file URL is gone with this change).
(See PR description / first source commit for the full perf change
rationale, benchmark numbers, and correctness notes.)
Continuation of the perf(core) Pre-warm security worker pool change —
extends `mockDeps` / inline pack-test plumbing in the three smaller test
files so the default-scope path no longer attempts to spawn a real
worker pool from the test environment.
- tests/core/packager/diffsFunctionality.test.ts: adds
`mockCreateSecurityTaskRunner` to both pack-call sites.
- tests/core/packager/splitOutput.test.ts: same — adds the stub to the
inline mock deps.
- tests/core/security/validateFileSafety.test.ts: updates the
`runSecurityCheck` call assertion to include the new
`{ taskRunner: undefined }` deps argument forwarded by
`validateFileSafety` when no pre-warmed runner is provided.
(See PR description / parent commit for the full perf change rationale,
benchmark numbers, and correctness notes.)
Packing.
Bumps EAGER_WARMUP_THREADS from 2 to 3 in src/core/packager.ts when the
user did not narrow the file set via --include / config.include / --stdin.
Tinypool fixes maxThreads at construction, so the 3rd worker must be
pre-warmed during the searchFiles + collectFiles window or it stalls
dispatch (a 4-thread / 2-warm experiment regressed by 27% paired in a
prior iteration). With explicit scope the file set is typically a few
hundred files, the metrics phase is shorter, and the 3rd worker's
~250ms BPE warm-up dominates the parallelism gain — paired benchmarks
regressed -11.85% on the 258-file `--include 'src,tests'` workload at
unconditional EAGER_WARMUP_THREADS=3, so the heuristic falls back to 2.
Reasoning.
After change 3 on this branch (eager metrics warm-up), the metrics phase
is the dominant wall-clock contributor on the default 1046-file workload
(~770 ms in `calculate metrics`, vs ~120 ms output generation, ~370 ms
search, ~270 ms collect, ~200 ms security check). Five sub-agent
investigations over independent scopes (CLI startup, file search/glob,
file collect/security, output generation, token counting) converged on
metrics worker count as the only candidate clearing the 2% bar without
regressing other phases. Output gen, security pre-warm, file-search
scoping, and CLI lazy-load were all measured below threshold or net-
negative; documented as the previous iteration's notes plus the
follow-on attempts here:
- EAGER_WARMUP_THREADS=3 unconditional: -11.85% paired regression on
the 258-file workload (n=20, t=-10.85), +2.92% on the 1046-file
workload — net negative because small workloads can't amortize the
extra BPE parse.
- Pre-warm the security worker pool gated on the metrics warm-up:
security-check phase shrunk from 197 ms to 110 ms, but the saving was
absorbed by the parallel `Process Files` branch and an offsetting
worker-spawn cost during collect. Paired n=30 measured -4.90% on
258-file and 0.81% (noise) on 1046-file. Reverted.
Verification.
Paired interleaved benchmarks (n=20, NODE_DISABLE_COMPILE_CACHE=1):
Default workload — `node bin/repomix.cjs --quiet` (1046 files):
| | min | median | mean | max | sd |
|--------|---------|---------|---------|---------|--------|
| BEFORE | 1820 ms | 1885 ms | 1886 ms | 2020 ms | 45 ms |
| AFTER | 1700 ms | 1845 ms | 1840 ms | 1970 ms | 62 ms |
- Mean paired Δ: +46.5 ms (2.46% wall-clock reduction)
- Median paired Δ: +50.0 ms (2.65%)
- Paired-delta SD: 65.3 ms · paired t = 3.18 (p < 0.01)
- AFTER faster in 15/20 pairs (75%)
Scoped workload — `node bin/repomix.cjs --include 'src,tests' --quiet`
(258 files):
| | min | median | mean | max | sd |
|--------|---------|---------|---------|---------|--------|
| BEFORE | 900 ms | 955 ms | 953 ms | 990 ms | 25 ms |
| AFTER | 910 ms | 940 ms | 946 ms | 1010 ms | 29 ms |
- Mean paired Δ: +6.5 ms (0.68%) — neutral within noise (t = 0.90)
- The heuristic falls back to 2 warm workers, so this branch matches
pre-change behavior; the small positive delta is sampling noise.
An independent reviewer's paired n=15 NODE_DISABLE_COMPILE_CACHE=1 run
on a separate sample reported +4.10% (t=6.61, 14/15 pairs) on the
default workload, consistent direction at higher magnitude.
Correctness.
- All 1260 unit tests pass (`npm test`); 3 new tests in
`tests/core/packager.test.ts` exercise both heuristic branches plus
the `--stdin` (explicitFiles) path.
- `npm run lint` clean (only pre-existing warnings unchanged from main).
- XML and Markdown output byte-identical between BEFORE and AFTER on
both workloads (verified via sha256sum).
- Worker-pool size confirmed via `--verbose` logs:
- Default scan: `min=1, max=3 threads` for `calculateMetrics`.
- `--include 'src,tests'`: `min=1, max=2 threads` (unchanged).
- Single-CPU and 2-CPU hosts are unaffected (`min(cpuCount, 3) =
min(cpuCount, 2)` for cpuCount ≤ 2).
- Public `pack()` API unchanged (no new parameters; the heuristic reads
existing `config.include` and `explicitFiles` arguments).
Risks.
The heuristic is a coarse proxy. Pathological cases:
- User runs default scan on a tiny repo (~50 files): 3 workers, +1
extra BPE parse. The cost is bounded by the eager-warm-up overlap
with searchFiles/collectFiles, so the worst case approaches the
paired noise floor (~30 ms sd on 258-file). Not measured below 50
files; expected to be neutral-to-slightly-negative within typical
run-to-run variance.
- User runs `--include 'huge-dir'` on a 5000-file project: 2 workers,
misses the parallelism win. Falls back to current production
behavior — no regression vs main.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Background
----------
Each metrics worker independently parses gpt-tokenizer's ~2.2 MB
`o200k_base.js` BPE table on its first task (~200-300 ms pure-CPU per
worker). The pool was previously created in `pack()` after the file
search and sort phases, so the only stages that could absorb the BPE
warm-up were `collectFiles` + git subprocesses + security check + file
processing. On a 258-file run this still left a residual ~80-130 ms
`await metricsWarmupPromise` stall before the metrics phase.
Change
------
Move `createMetricsTaskRunner` to fire before `searchFiles`. This adds
the ~130 ms glob scan to the hidden warm-up budget and shrinks the
residual stall to ~0-12 ms on the 258-file workload.
Pool sizing: Tinypool fixes `maxThreads` at construction, and the file
count is not yet known. Pre-warming exactly the workers we'll use is
essential — Tinypool queues tasks for newly spawned (cold) workers and
the pipeline can't progress until those workers finish their BPE parse
and pick up the queued task (an experiment with `maxThreads=cpuCount=4`
and only 2 warm workers regressed the 258-file workload by 27 % paired).
So the pool is sized to a fixed 2 workers (`numOfTasks = 2 ×
TASKS_PER_THREAD = 400` → `maxThreads = min(cpuCount, 2)`), matching the
security pool's hard cap and the typical metrics pool size for repos
≤400 files after the TASKS_PER_THREAD=200 sizing on this branch.
Larger repos (>400 files) would benefit from more parallelism, but the
1046-file regression check below shows the eager-warmup gain still
net-improves wall-clock at maxThreads=2 (the BPE warm-up cost
~250 ms × cpuCount-2 extra workers dominates the parallelism savings on
the metrics phase). On single-CPU hosts the heuristic naturally
collapses to maxThreads=1, identical to today's behavior.
The `try { } finally { cleanup }` block is widened to cover the new
early call so the worker pool is cleaned up on early throws too. A new
`searchFiles`-rejection test in `tests/core/packager.test.ts` exercises
that path explicitly.
`TASKS_PER_THREAD` is exported from `processConcurrency.ts` and consumed
by name in `packager.ts` to keep the eager-warmup constant tied to the
shared sizing rule.
Benchmark
---------
Both runs use n=… paired interleaved (alternating BEFORE-first /
AFTER-first ordering) with `NODE_DISABLE_COMPILE_CACHE=1` so cold-start
BPE parse is measured rather than masked. 4-vCPU Intel(R) Xeon(R) host.
`node bin/repomix.cjs --include 'src,tests' --quiet` (258 files, n=20):
| | min | median | mean | max | sd |
|--------|---------|---------|---------|---------|--------|
| BEFORE | 1007 ms | 1044 ms | 1054 ms | 1164 ms | 36 ms |
| AFTER | 893 ms | 966 ms | 962 ms | 1065 ms | 36 ms |
- Mean paired Δ: +91.6 ms (8.69 % wall-clock reduction)
- Median paired Δ: +97.5 ms (9.34 %)
- Paired-delta SD: 36.0 ms · paired t = 11.39 (p < 0.001)
- AFTER faster in 20/20 pairs (100 %)
Regression check — `node bin/repomix.cjs --quiet` (default, 1046 files,
n=15) on a clean repo (baseline binary built outside the working tree
so it does not get picked up as a workload file):
| | min | median | mean | max | sd |
|--------|---------|---------|---------|---------|--------|
| BEFORE | 1769 ms | 1872 ms | 1877 ms | 2063 ms | 79 ms |
| AFTER | 1751 ms | 1820 ms | 1837 ms | 2018 ms | 61 ms |
- Mean paired Δ: +40.0 ms (2.13 %)
- Median paired Δ: +48.6 ms (2.60 %)
- Paired-delta SD: 51.7 ms · paired t = 2.99 (p ≈ 0.01)
- AFTER faster in 11/15 pairs (73 %)
The larger workload also clears the 2 % threshold; the eager warm-up's
gain offsets the maxThreads=2 cap that's now applied unconditionally.
Correctness
-----------
- All 1257 unit tests pass (`npm test`); `npm run lint` clean (only
pre-existing warnings).
- XML and Markdown output byte-identical between BEFORE and AFTER on
both the 258-file and 1046-file workloads.
- Worker-pool size confirmed via `--verbose` logs: `min=1, max=2 threads`
for `calculateMetrics` on both workloads (was `max=2` on 258 files,
`max=4` on 1046 files before this change).
- New test `cleans up the metrics worker pool when searchFiles rejects`
exercises the widened `try/finally` cleanup path.
Background
----------
On a typical CLI run (`node bin/repomix.cjs --include 'src,tests' --quiet`,
258 files, 4-vCPU host), the metrics worker pool was sized as
`ceil(258 / 100) = 3 workers`. Combined with the security pool's hard cap
of 2 workers (securityCheck.ts:90) and the main thread, the process held
6 active threads on 4 cores during the overlap of `validateFileSafety`
and `calculateMetrics`.
Each metrics worker independently parses gpt-tokenizer's ~2.2 MB
`o200k_base.js` BPE table on its first task — a ~200-300 ms pure-CPU
operation per worker. Spawning 3 cold metrics workers in the warm-up
phase (calculateMetrics.ts:46-48) therefore drove the security workers
off the CPU during their own (concurrent) cold-start, inflating the
critical-path security phase.
Change
------
Raise `TASKS_PER_THREAD` from 100 to 200 so:
- ≤200 file repos: 1 metrics worker (was 1) — no change
- 201-400 file repos: 2 metrics workers (was 3) — -1 worker, the win
- 401-600 file repos: 3 metrics workers (was 4-cap) — -1 worker
- 601-800 file repos: 4 metrics workers (was 4-cap) — no change
- 801+ file repos: 4 metrics workers (was 4-cap) — no change (cap)
For the 258-file benchmark this brings active workers during the
metrics+security overlap to 2 + 2 = 4, matching CPU count, and halves
the parallel BPE-loading work in the warm-up phase.
Tests for `getWorkerThreadCount` and `createWorkerPool` are updated to
reflect the new ratio.
Benchmark
---------
`node bin/repomix.cjs --include 'src,tests' --quiet` (258 files), n=20
paired interleaved (alternating BEFORE-first / AFTER-first ordering):
| | min | p25 | median | p75 | mean | sd |
|--------|---------|---------|---------|---------|---------|--------|
| BEFORE | 1045 ms | 1092 ms | 1109 ms | 1122 ms | 1107 ms | 27 ms |
| AFTER | 937 ms | 973 ms | 991 ms | 1020 ms | 994 ms | 29 ms |
Mean paired Δ: +112.5 ms (10.17 % wall-clock reduction)
Median paired Δ: +115.4 ms (10.66 % wall-clock reduction)
Paired-delta SD: 36.2 ms (paired t = 13.88, p < 0.001)
AFTER faster in 20/20 pairs (100 %)
Regression check — `node bin/repomix.cjs --quiet` (default, 1572 files),
n=15 paired interleaved:
| | min | p25 | median | p75 | mean | sd |
|--------|---------|---------|---------|---------|---------|--------|
| BEFORE | 1933 ms | 1970 ms | 2016 ms | 2102 ms | 2028 ms | 62 ms |
| AFTER | 1955 ms | 1966 ms | 2004 ms | 2131 ms | 2034 ms | 74 ms |
Mean paired Δ: -6.2 ms (-0.31 %) (paired t = -0.29, p > 0.05)
Median paired Δ: -12.7 ms (statistically neutral)
No regression on the large workload — both 100 and 200 saturate the
per-CPU cap at 4 workers for ≥800 file repos, so the dispatch-time
behavior is identical there.
Correctness
-----------
- 1256 / 1256 unit tests pass.
- `npm run lint` clean (only pre-existing warnings unrelated to this change).
- No behavioral change to file processing, tokenization, security checks,
or output. Pool sizing is the only effect.
Three constructor variants were silently dropped during --compress:
- `const Foo(...);` and `const Foo.named(...) : ...;` parse as
`(declaration (constant_constructor_signature ...))` — a node type the
existing constructor query did not list.
- `const factory Foo() = Bar;` parses as
`(redirecting_factory_constructor_signature (const_builtin) (identifier) ...)`
whose first named child is `const_builtin`, so the leading-anchor
`. (identifier)` pattern failed to match.
- `external factory Foo.make();` parses as
`(declaration (factory_constructor_signature ...))` — bare under
`declaration`, not wrapped in `method_signature`, so the existing
factory query missed it.
Switch the constructor / factory / redirecting-factory queries to
capture the whole signature node as `@name.definition.method`. This
emits the same source line(s) DefaultParseStrategy already produces and
is robust across all body / external / const / redirecting variants.
Two pre-existing gaps surfaced while extending queryDart:
- Plain constructors (e.g. `Animal(this.name);`) live directly under
`declaration`, not wrapped in `method_signature`, so the existing
`(method_signature (constructor_signature ...))` query never matched
them. Add a sibling query against `(declaration (constructor_signature ...))`.
- Operator overloads (`operator +`, `operator []`, `operator []=`,
`operator ==`, ...) parse as `(method_signature (operator_signature ...))`
but `operator_signature` has no identifier name field — the operator
token surfaces as `(binary_operator)` / `([])` / `([]=)` children.
Capture the whole `operator_signature` as `@name.definition.method` so
DefaultParseStrategy emits its full source range.
Verified against `--compress` on a real Dart file: signatures that were
previously dropped (only their `///` doc comments survived) now appear
in compressed output.
intent(dart-query): make --compress preserve Dart definition kinds that were silently dropped — mixin, typedef, getter, setter, factory, and redirecting factory
decision(capture-naming): align Dart captures with the dominant @name.definition.X convention used by queryTypeScript/queryPython/queryRust; output is unchanged because DefaultParseStrategy matches via name.includes('name')
constraint(redirecting-factory): tree-sitter-dart grammar makes redirecting_factory_constructor_signature a child of `declaration`, not `method_signature`, so it must be queried bare to avoid a "Bad pattern structure" parse error
constraint(type-alias): type_alias's name node is `type_identifier`, not `identifier` — using `identifier` would silently match nothing
learned(external-keyword): `external` modifier in Dart is a sibling token outside function_signature/method_signature, so existing captures already cover `external void foo();` without changes
intent(ci-green): typos@1.45.1 flags `mis` as a typo of `miss`/`mist` and the
hyphenated `mis-classified` in the new PDF-magic regression test comment
trips the `Check typos` job. The unhyphenated `misclassified` is the more
common spelling and passes the dictionary.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cuts per-thread IPC overhead in calculateFileMetrics by ~80% by sending
fewer, larger batches to the metrics worker pool. At ~1000 files this
reduces the dispatched task count from ~100 to ~20 (~5 batches per
worker on a 4-thread pool instead of ~25), saving ~40 estimated
worker-ms of pure serialization/dispatch overhead per worker on the
critical path.
The previous size of 10 was tuned for fast worker availability to
overlap with output generation, but tinypool round-trips were measured
at ~2ms minimum (batches of trivially small files bottom out there),
so the IPC dominated for short batches. Increasing to 50 keeps load
balance comparable (~5:1 per thread vs. the prior ~25:1) and avoids
the load-imbalance pitfall of even larger sizes (e.g. 100), where one
oversized batch can monopolize a worker and stretch the tail.
Benchmark on this repo (1016 files, 4 cores; 100 paired interleaved
runs alternating BATCH=10 and BATCH=50 with the same prebuilt JS,
swapping only the constant, full pipeline timed end-to-end):
BATCH=10 mean 1.5589s median 1.5480s stdev 0.0966s
BATCH=50 mean 1.5111s median 1.5030s stdev 0.0625s
Saved 47.8ms (3.07%) mean / 45.0ms (2.91%) median
(trimmed mean 41.0ms, dropping top/bottom 10%)
Wins 69/100 paired runs faster with BATCH=50
t-test paired t = 4.545 (df=99, p < 0.001)
95% CI [27.2 ms, 68.4 ms] on the mean improvement
Sign test 2-sided p = 0.0002
Also adds a regression test for the batching path: with 120 input files
the test asserts result order/completeness and that the progress
callback fires once per dispatched batch (not per file, not just once).
All 1250 tests pass; output is byte-identical for the same source tree.
- validateFileSafety: pin the negative path of `if (config.security.enableSecurityCheck)`
— every other test enabled the check, so a regression that always runs
the security check would have passed silently.
- unifiedWorker:
- Add a positive workerData=securityCheck + ambiguous-task case so the
pair (override + this) distinguishes "inference always wins" from
"inference wins only when it yields a value".
- Stop pretending the handler-cache test verifies caching. Both branches
of `if (cached) return cached;` end with the same Map.set, and Node's
own module cache makes the dynamic import effectively free, so the
cache is unobservable from outside without exposing internals.
Renamed to "repeated calls" with a comment explaining the limitation.
- fileSystemReadDirectoryTool: translate the pre-existing Japanese comment
to English per CLAUDE.md.
- TokenCounter: extract `LoadEncodingFn` type alias instead of the
unusual `typeof loadEncoding`, so a signature drift between the local
function and the deps field would surface at the type level.
- outputGenerate: tests titled "throws RepomixError…" / "wraps … in
RepomixError" now assert the rejection is an instance of RepomixError
in addition to the message regex, matching the test names.
- LanguageParser: collapse the duplicate getParserForLang('javascript')
rejection assertions into a single .catch capture that checks both
type and message.
- calculateMetrics: vi.mocked(initTaskRunner).mockReset() before
mockReturnValueOnce so a future test that omits taskRunner can't
silently consume the override.
- packager: pre-attach a no-op .catch on the rejected warmupPromise so
vitest's unhandled-rejection detection doesn't fire before pack
awaits it. Production code mirrors this pattern in packager.ts:262.
Address two review threads on PR #1518 that flagged tests whose titles
overstated what was being verified.
- fileProcess: the longBase64 string is one continuous line, so the
truncateBase64 → removeEmptyLines ordering was never actually under
test (truncateBase64Content's regex does not span newlines). Rename
to describe the combined behavior the test really pins.
- skillTechStack: rename the per-directory case to reflect that root
and subpackage land in separate buckets keyed by getDirPath, and
add a second case with two package.json entries at the same path
to genuinely exercise the parsed.packageManager && !result.packageManager
guard at skillTechStack.ts:541.
Address PR review feedback (claude R4): the previous error-handling tests
overwrote the private `countFn` field via a cast, which silently breaks
on a rename. Add a `deps` parameter to TokenCounter that defaults to the
real `loadEncoding`, and switch the error-handling tests to inject a fake
that returns the throwing function directly. Matches the dependency
injection pattern documented in CLAUDE.md.
- shared/errorHandle: recognize duck-typed OperationCancelledError from
worker boundaries in isRepomixError (it extends RepomixError but the
name was missing from the structured-clone fallback comparison).
Add a regression test for the worker-boundary case.
Test improvements per coderabbit / claude review:
- cliReport: assert skill-directory + relative path on the same log line.
- processConcurrency: restore process.versions.bun by removing the property
when it didn't originally exist, instead of leaving it defined-as-undefined.
- logger: drop the no-op `process.env.REPOMIX_LOG_LEVEL = undefined` (it
coerces to the string "undefined" and is overwritten by the next delete).
- unifiedWorker: replace the tautological cache test with one that proves
cache uniqueness via onWorkerTermination cleanup count; add a test for
task-based inference overriding workerData (bundled-env reuse).
- calculateMetricsWorker: new direct test for the default export's items
vs. single-mode dispatch — unifiedWorker mocks this module so the branch
was otherwise untested.
- packRemoteRepositoryTool: hard-code the expected output path instead of
expect.any(String) to catch arg-swap regressions.
- memoryUtils: tighten getMemoryStats assertions with sanity bounds
(heapUsed <= heapTotal, rss > 0, heapUsagePercent <= 100) so a
unit-conversion regression (bytes vs MB) would fail the test.
The test asserted on a literal '.claude/skills/test-skill' substring, but
getDisplayPath calls path.relative which produces backslashes on Windows.
CI failed on all Windows runners. Constructing the expected substring
with path.join makes the assertion OS-native.
Targeted regression tests for the high-risk areas identified in the
v1.13.1..main audit, focusing on silent-correctness bugs and parallel
error handling — places that wouldn't surface in CI but would in the
field.
- core/metrics/calculateMetrics: pin numeric equivalence between the
fast path (Σ file tokens + wrapper tokens) and the slow path (full
output tokenization). Cover wrapper-extraction fallback, split-output
fallback, and worker pool cleanup when fileMetrics rejects.
- core/file/fileProcess: pin transform ordering invariants —
removeComments → removeEmptyLines (blank lines from comment removal
must be cleaned up; preserved when removeEmptyLines is off);
truncateBase64 → removeEmptyLines (multi-line base64 squashed first);
trim → showLineNumbers (no leading/trailing blanks numbered).
Plus worker/lightweight path parity for inputs that don't need
worker processing.
- core/packager: pin metrics worker pool cleanup on parallel branch
failures (validateFileSafety, produceOutput, calculateMetrics, warmup
rejection). Verify prefetchSortData failure is isolated and does not
block sortOutputFiles.
- core/skill/skillTechStack: cover untested fix-commit invariants —
root entry sorts first in monorepo output; configFiles deduplicated
within a directory; first-seen packageManager wins per directory.
intent(truncateBase64-tests): add explicit coverage for the new fast-path guards introduced alongside the regex-skip optimization
decision(test-cases): focus on the four cases that exercise guard behavior not previously asserted — empty input, exactly-below-threshold (255 chars), run-reset on non-base64 separator, and non-base64 data URI without `;base64,`
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
intent(asyncMap): tighten doc and test based on coderabbitai review feedback
decision(asyncMap-doc): explicitly note that workers are not cooperatively cancelled after a rejection — sibling workers keep claiming indices
decision(asyncMap-test): replace timing-sensitive `peakActive > 1` with exact `=== 4` — workers spawn synchronously via Promise.all so the cap is hit deterministically
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
intent(empty-dir-check): protect very large repos from FD exhaustion that unbounded Promise.all could trigger
rejected(p-limit): user wants to keep dependencies minimal — built a small in-tree helper instead
decision(asyncMap): single mapWithConcurrency helper rather than a p-limit-style limiter object — only call site is array map
decision(concurrency-limit): 20 in flight — well above libuv default thread pool (4) while still bounded for users who tuned UV_THREADPOOL_SIZE
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Merge the two separate globby traversals used by `searchFiles` into a
single one and parallelize the per-directory `readdir` calls used to
filter empty directories.
Background
----------
When `output.includeEmptyDirectories` is enabled (the default for
`repomix.config.json` in this repo, and any repo that wants an accurate
directory tree), `searchFiles` previously walked the working tree twice:
once with `onlyFiles: true` and a second time with `onlyDirectories:
true`. Each call re-traversed the tree and re-parsed every `.gitignore`
/ `.repomixignore` file. `findEmptyDirectories` then issued `readdir`
serially for every matched directory, awaiting each syscall before
starting the next.
Change
------
* Replace the two globby invocations with one `objectMode: true,
onlyFiles: false` call. Partition the returned `GlobEntry[]` by
`dirent.isFile()` / `dirent.isDirectory()`, matching the previous
`onlyFiles: true` semantics for symlinks and other non-file
non-directory entries.
* Rewrite `findEmptyDirectories` to run the per-directory `readdir`
checks concurrently via `Promise.all`. Ordering is preserved by the
result array and the caller sorts the final list anyway.
* When `includeEmptyDirectories` is disabled, keep the fast
`onlyFiles: true` path unchanged so the default CLI run pays no cost.
Benchmark (hyperfine, repomix packing itself, 30 runs, warmup 3)
----------------------------------------------------------------
Run 1: baseline 2.162s ± 0.042s → perf 2.017s ± 0.029s → -145ms (-6.7%)
Run 2: baseline 2.161s ± 0.023s → perf 2.030s ± 0.027s → -131ms (-6.1%)
Per-stage verbose timings:
baseline: [globby files 200ms] + [globby dirs 85ms] + [empty dirs 61ms]
perf: [combined globby 223ms] + [empty dirs 66ms]
saved: -57ms consistently on the critical path
intent(test-ownership): move tests into website/server/tests/ so they collocate with the code under test and stop reaching up through three parents; reviewer follow-up wanted dedicated coverage and the root vs. website/server package boundary makes collocation the right long-term layout
decision(vitest-config): give website/server its own vitest.config.ts + `test` script; root's existing tests/**/*.test.ts include no longer catches server tests since they moved outside that tree, so the two test runs stay independent
decision(tsconfig-test): add tsconfig.test.json extending the build config and lift lint-tsc to `-p tsconfig.test.json` — the build tsconfig's rootDir: "./src" excludes tests/, so a single lint command wouldn't have type-checked them
learned(valibot-instanceof): with tests now resolving valibot from the same website/server/node_modules as validateRequest, the cause-check can go back to `instanceof v.ValiError` — the duck-type workaround was only needed when the root harness and server pulled different valibot copies
constraint(ci-website): added a `test-website-server` job that links the local repomix build the same way lint-website-server does; tests don't actually import repomix today, but colocation means they easily could later and the link step keeps parity
intent(server): land validateRequest unit tests alongside this PR's rewrite per reviewer follow-up — previously only exercised indirectly via classifyRejectReason with mocks, so silently dropping `.cause` during a future refactor would not fail any test
learned(valibot-instanceof): website/server resolves valibot from its own node_modules (distinct copy from the root harness), so `instanceof v.ValiError` fails across that boundary even for genuine ValiError instances; duck-type the cause check on `{ name: 'ValiError', issues: [] }` to match classifyRejectReason's own contract
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fix(drift-guard): swap the hard-coded 'Either URL or file must be provided' literal for MESSAGES.MISSING_INPUT so the cause-chain test honors the classifier drift-guard the file's header comment promises — a future message rewrite now propagates to this assertion too, instead of silently falling back to 'unknown'
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fix(pack-event-test): migrate tests/website/server/packEventSchema.test.ts to the valibot-shaped `{ name, issues: [{ message, path: [{ key }] }] }` fixture so the invalid_format case — which relies on the path fallback — actually exercises the post-migration classifier
decision(path-formatting): replace the hand-rolled `path.map/filter/join` in both packEventSchema.ts and validation.ts with `v.getDotPath(issue)` — the valibot built-in does the same reconstruction and eliminates the two-site drift risk reviewers flagged
fix(validation-error-message): drop the leading `": "` when a valibot issue has no path (top-level checks like MISSING_INPUT / BOTH_PROVIDED), so the rendered AppError message reads cleanly
rejected(cause-walk-depth): coderabbit suggested walking `.cause` recursively up to depth 3 — declined because validateRequest wraps exactly once by contract and no second wrapper exists in the chain; defensive recursion would add surface area for no caller
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
intent(parity-coverage): the numeric-constraint enforcement block already rejects non-integer `maxFileSize` and `splitOutput`, but the three `v.pipe(v.number(), v.integer(), v.minValue(1))` siblings (`topFilesLength`, `git.sortByChangesMaxCommits`, `git.includeLogsCount`) only had min-bound rejection. Add non-integer cases for each so every pipe-guarded numeric field has matching fencepost coverage
intent(readability): an issue with no `path` (e.g., root-level schema mismatch) previously rendered as `[] message`; emit just `message` when segments are empty. Small quality-of-life for error output
intent(limitation-pin): add an integration test that documents the known ESM unwrap ambiguity — a CJS module shaped like `{ default: {...}, otherKey: ... }` has `otherKey` silently dropped by our heuristic. Non-issue for RepomixConfig today, but worth freezing so the behavior can't drift without someone noticing
intent(test-coverage): normalizeObjectNode is the load-bearing workaround for @valibot/to-json-schema's output gaps (missing additionalProperties:false, noisy empty required arrays). A bug in its tree walk would silently break the published repomix.config.json IntelliSense schema. Extract to its own module and pin the behavior: adding/respecting additionalProperties, stripping empty required, recursing through properties / anyOf / oneOf / items, handling null and primitives
Addresses PR #1493 round 5 follow-up (schema-drift tracking) from claude.
intent(eliminate-drift): prior reviews flagged the string-coupling
between packRequestSchema (produces zod messages) and
classifyRejectReason (matches them back to a metric label) as
fragile — a message rewrite in one place would silently land
requests in the 'other' bucket, quietly mislabeling the dashboard.
Extract all 11 messages to a shared MESSAGES module so the
producer, the consumer, and the test's expected values all
reference the same constants
decision(map-over-switch): classifyRejectReason's 11-case switch
becomes a MESSAGE_TO_REASON lookup table. Same runtime behavior,
easier to scan, and the keys are compile-time-typed via the
shared constants (so a typo in either side is a TS error)
decision(preserve-invalid_format-path): 'invalid_format' is keyed by
path (`format`), not by message text. Left as path-match so it
continues to work even if zod's default enum error message changes
improve(test-references-constants): test.each cases now reference
MESSAGES directly. The test continues to catch classifier-logic
drift (wrong label for a known key, missing entry in the lookup)
while schema/classifier drift becomes impossible by construction
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
intent(pre-merge-gate): buildCliConfig parses against repomixConfigCliSchema before mergeConfigs, so the intersect must preserve both the base-schema `output.filePath` and the CLI-only `output.stdout` + top-level `skillGenerate` before the merged schema ever sees the value. Add a targeted test so a future change to the intersect shape doesn't silently strip CLI fields at the cli-parse step
intent(ci-green): CI failed on `Cannot find package 'repomix'` because
the test imported `packRequestSchema.ts` which has `import from
'repomix'` — the root vitest harness doesn't install server-only
packages (repomix is this repo, hono is a server dep). Local runs
passed by coincidence of self-reference resolution
decision(construct-zoderror-directly): rewrite the test to construct
`z.ZodError` instances with messages matching the schema rather
than invoking the schema. Keeps classifier-drift coverage (if the
switch stops matching a known label, a test fails) at the cost of
not catching schema-drift (if a zod message in packRequestSchema is
edited without updating the test, the test still passes but
dashboards mislabel). Schema-drift catching would need a shared
constants module — tracked as a follow-up
decision(native-error-cause): use `new Error(msg, { cause })` instead
of importing `AppError`. AppError pulls `hono/utils/http-status`
(type-only, but some module resolvers still probe). Native Error
with cause exercises the same classifier code path
improve(table-driven): restructure the 11 label cases as `test.each`
so each case is named individually in the test output
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses PR #1493 round 3 review from claude.
intent(catch-schema-drift): every previous review round flagged the
fragile string coupling between classifyRejectReason and the zod
messages in packRequestSchema. The earlier response was that adding
tests would require new infrastructure in website/server/ — but
tests/website/cliCommand.test.ts already tests website/ code under
the root vitest harness. Extended the same pattern to cover every
rejection path through the real schema so message edits surface as
CI failures instead of silent 'other'-bucket dashboard drift
decision(table-driven-through-real-schema): the tests invoke
validateRequest(packRequestSchema, input) rather than constructing
synthetic ZodError objects — this catches both classifier drift AND
schema drift in one assertion. 13 tests covering 11 reject labels
+ cause-chain extraction + non-error input
improve(boolean-intent-comment): claude noted that Boolean() collapses
`undefined` (user didn't send field) with `false` (user explicitly
disabled), losing that distinction in the pack_options_usage metric.
This is intentional — the metric answers "what % of packs had
compress enabled" where both "off" and "unspecified" correctly mean
the feature wasn't active. Added a comment so future contributors
don't "fix" it
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
intent(regression-guard): `repomixConfigDefaultSchema.output` is strict and omits `stdout`, so any future change that alters `v.intersect` merging semantics would silently drop the CLI `--stdout` flag (and `skillGenerate`) after `mergeConfigs` validates the merged value. Pin the behavior with a targeted test
intent(interop-consolidation): drop `interopDefault: true` from the jiti setup in configLoad — the explicit ESM namespace unwrap at the call site already handles every module-format case we test (.ts / .mts / .js / .mjs / .cjs). Having both the jiti flag and the manual unwrap was redundant and made the intent fuzzier
intent(error-path-cleanliness): filter out empty path segments before joining in rethrowValidationErrorIfSchemaError — a malformed ValiError item (object without `key`) would otherwise produce `[output..style]`; dropping the empty entry keeps the path readable. Added a dedicated test covering the filter
intent(behavioral-coverage): previous tests verified structural equivalence to the zod schemas but never asserted that the integer / minValue / maxValue pipes actually reject invalid values — add focused rejection cases for maxFileSize, topFilesLength, splitOutput, and the git counts so the constraint migration is proven behaviorally
decision(default-schema-maxvalue): mirror the `v.maxValue(Number.MAX_SAFE_INTEGER)` from the base schema onto `repomixConfigDefaultSchema.splitOutput` so the upper bound is enforced regardless of which schema the caller parses against — the prior edit only touched the base entry
intent(test-coverage): the nuanced `defaultExport typeof object` check in configLoad was only exercised through real jiti imports — add two mocked jitiImport cases so the wrapped-object unwrap branch and the CJS-with-primitive-`default` pass-through branch are both guarded against regressions
intent(error-handle): drop the instanceof Error guard in rethrowValidationErrorIfSchemaError so ValiError / ZodError round-tripped through a worker (plain { name, message, issues }) is still recognized — aligns with isError / isRepomixError elsewhere in the file
intent(schema-parity): restore splitOutput's upper bound (Number.MAX_SAFE_INTEGER) in the generated JSON schema so editor hints match the previous zod output; also strip the empty required:[] arrays that @valibot/to-json-schema emits on every object node
intent(esm-unwrap): only unwrap jiti's .default when it's an object, preserving a CJS config that legitimately exports { default: 'plain', ...rest }; plain Symbol.toStringTag === 'Module' was too narrow — jiti returns non-Module namespace wrappers for .ts / .mts files
intent(test-coverage): add tests/shared/errorHandle.test.ts covering the Zod + Valibot + worker-serialized paths through rethrowValidationErrorIfSchemaError; tighten the default-schema assertion in configSchema.test.ts from /expected|invalid/i to toThrow(v.ValiError) + targeted message pattern
decision(path-segment-fallback): return '' for unknown object-shaped path items instead of falling into String(segment) → "[object Object]"; defensive, non-breaking
intent(cold-start): evaluate valibot as a simpler alternative to PR #1484's async deferral of zod
decision(validation-library): valibot v1 picked for smaller module graph and tree-shaking friendliness
decision(error-handling): rename rethrowValidationErrorIfZodError to rethrowValidationErrorIfSchemaError and duck-type both ZodError and ValiError so the helper stays library-agnostic
constraint(module-unwrap): jiti.import returns an ESM Module namespace even with interopDefault — zod v4 quietly unwrapped it, valibot does not, so configLoad manually picks .default
learned(zod-v4-modules): z.object().parse(moduleNamespace) reaches into .default automatically — not documented prominently and was masking the jiti interop gap
learned(cold-start-impact): ~10ms (~6%) faster startup in hyperfine A/B — below the -15ms bar from the task spec, but worth keeping as a branch for the simplicity win