The security worker pool currently spawns its 2 workers lazily inside
`runSecurityCheck`, paying a ~50 ms `@secretlint/core` +
`@secretlint/secretlint-rule-preset-recommend` module load on each
freshly spawned worker (~100 ms wall-clock for both workers loading
concurrently). That cold-start cost runs on the critical path inside
the security-check phase, before any scanning begins.
Mirror the existing `createMetricsTaskRunner` pattern: hoist the pool
construction to `pack()` and dispatch one no-op task per worker at the
pipeline entry, so the module load overlaps with the collectFiles + git
ops phase (~200 ms) instead of stalling the security check.
## Mechanism
- New `createSecurityTaskRunner(numOfTasks, deps?)` in
`src/core/security/securityCheck.ts` returns
`{ taskRunner, warmupPromise }`. The warm-up dispatches `maxThreads`
no-op tasks (`{ items: [] }`) — Tinypool spawns a fresh worker for
each concurrent task, fanning out the @secretlint/core load across
all workers in parallel.
- `runSecurityCheck` accepts an optional `taskRunner` in `deps`. When
provided, the caller owns the pool's lifecycle (creation + cleanup);
when omitted, runSecurityCheck creates and cleans up a fresh pool —
preserving the existing behavior for direct callers (e.g. the MCP
fileSystemReadFileTool path).
- `validateFileSafety` accepts and forwards an optional `taskRunner`.
- `pack()` calls `createSecurityTaskRunner` after `searchFiles` resolves
(file count is now known) and before the parallel collectFiles + git
ops block, so the warm-up runs concurrently with disk I/O. The
task runner is plumbed through `validateFileSafety` deps; the pool
is cleaned up alongside the metrics pool in the surrounding
try/finally.
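
The warm-up dispatch described above can be sketched as follows. This is a minimal illustration, not the code in `securityCheck.ts`: the `TaskRunner` interface, the `createPool` parameter, and the name `createWarmTaskRunner` are assumptions standing in for the Tinypool-backed runner.

```typescript
// Minimal stand-in for the pool interface (assumption; the real runner wraps Tinypool).
interface TaskRunner<T, R> {
  run(task: T): Promise<R>;
  cleanup(): Promise<void>;
}

interface SecurityTask {
  items: string[];
}

// Hypothetical factory: builds the runner and immediately dispatches one
// no-op task per worker so every worker pays its module load up front.
function createWarmTaskRunner<R>(
  maxThreads: number,
  createPool: (maxThreads: number) => TaskRunner<SecurityTask, R>,
): { taskRunner: TaskRunner<SecurityTask, R>; warmupPromise: Promise<R[]> } {
  const taskRunner = createPool(maxThreads);
  // One empty batch per worker; the tasks are issued concurrently, not in sequence.
  const warmupPromise = Promise.all(
    Array.from({ length: maxThreads }, () => taskRunner.run({ items: [] })),
  );
  return { taskRunner, warmupPromise };
}
```

The caller keeps `taskRunner` for the real scan and only awaits `warmupPromise` when it actually needs the workers, which is what lets the module load overlap with other work.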
## Scope gate
Pre-warming is gated on the same `hasExplicitScope` heuristic that
already differentiates 2- vs. 3-worker metrics warm-up:
| Workload | Pre-warm? |
|--------------------------------------------------|-----------|
| Default scan (no `--include` / `--stdin`) | yes |
| `--include`, `config.include`, or `--stdin` set | no |
Without the gate, the small/scoped workload regresses by 3.4 % paired
mean: the security check scans only ~5 batches and finishes in ~50–80
ms, so the up-front cost of constructing + destroying a second worker
pool outweighs the saved cold-start. The unconstrained scan runs
security over ~1000+ files where the hidden cold-start dominates.
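
A minimal sketch of the gate, assuming the heuristic only looks at the include patterns and at whether a `--stdin` file list was passed. The `ScopeInputs` shape and both function names are illustrative, not the identifiers used in `packager.ts`.

```typescript
// Illustrative input shape (assumption): the merged include patterns and the
// optional explicit file list that --stdin produces.
interface ScopeInputs {
  include: string[];
  explicitFiles?: string[];
}

// True when the user narrowed the file set in any way.
function hasExplicitScope({ include, explicitFiles }: ScopeInputs): boolean {
  return include.length > 0 || explicitFiles !== undefined;
}

// Pre-warm the security pool only for the unconstrained default scan.
function shouldPrewarmSecurityPool(inputs: ScopeInputs): boolean {
  return !hasExplicitScope(inputs);
}
```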
## Benchmark — `node bin/repomix.cjs --quiet` (1046 files)
Two independent paired n=50 runs (interleaved BEFORE/AFTER alternating
order, NODE_DISABLE_COMPILE_CACHE=1):
| | min | median | mean | max | sd |
|--------|---------|---------|---------|---------|--------|
| BEFORE | 1320 ms | 1454 ms | 1451 ms | 1590 ms | 49 ms |
| AFTER | 1318 ms | 1410 ms | 1416 ms | 1501 ms | 40 ms |
- Mean paired Δ: **+35.2 ms (2.42 % wall-clock reduction)**
- Median paired Δ: +32.5 ms (2.23 %)
- Paired-delta SD: 64.78 ms · paired t = **3.84** (p < 0.001)
- AFTER faster in **39/50** pairs (78 %)
Confirmation run (same setup, n=50): mean Δ +37.0 ms (2.55 %),
t = 3.93, 36/50 pairs faster.
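
For readers who want to check the statistics, the paired t value is the mean of the per-pair deltas divided by its standard error. The helpers below are a sketch for reproducing the arithmetic; the function names are illustrative and are not part of the benchmark harness.

```typescript
// t = meanDelta / (sdDelta / sqrt(n)), from summary statistics.
function pairedT(meanDelta: number, sdDelta: number, n: number): number {
  return meanDelta / (sdDelta / Math.sqrt(n));
}

// Same statistic computed from raw paired samples (delta = before - after,
// so a positive delta means AFTER was faster). Uses the sample SD (n - 1).
function pairedTFromSamples(before: number[], after: number[]): number {
  const deltas = before.map((b, i) => b - after[i]);
  const n = deltas.length;
  const mean = deltas.reduce((sum, d) => sum + d, 0) / n;
  const variance = deltas.reduce((sum, d) => sum + (d - mean) ** 2, 0) / (n - 1);
  return pairedT(mean, Math.sqrt(variance), n);
}

// With the figures reported above: pairedT(35.2, 64.78, 50) is about 3.84.
```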
## Regression check — `--include 'src,tests' --quiet` (258 files)
n=30 paired interleaved, NODE_DISABLE_COMPILE_CACHE=1:
| | min | median | mean | max |
|--------|--------|--------|--------|--------|
| BEFORE | 670 ms | 732 ms | 730 ms | 783 ms |
| AFTER | 688 ms | 728 ms | 729 ms | 786 ms |
- Mean paired Δ: +0.9 ms (0.13 %) — **neutral within noise**
(paired t = 0.17)
- AFTER faster in 16/30 pairs
The gate falls back to the original lazy-spawn path on this workload,
so AFTER == BEFORE up to noise. Without the gate this workload
regresses by 3.4 % paired (t = -4.88).
## Correctness
- All **1260** unit tests pass (`npm test`); `npm run lint` clean
(only the two pre-existing `biome-ignore` warnings unrelated to
this change).
- XML output **byte-identical** between BEFORE and AFTER on both the
default 1046-file workload and the `--include 'src,tests'`
258-file workload (verified via `diff` on full ~4.85 MB outputs).
- `runSecurityCheck`'s public signature gains an optional `taskRunner`
in deps; when omitted, behavior is unchanged. Existing callers
outside the pack pipeline (e.g. MCP `fileSystemReadFileTool`) still
spawn their own pool.
- The MCP main-thread security path is unaffected — it uses
`runSecretLint` directly (worker module loaded once at process
start) and never goes through the pool.
## Tests
- `tests/core/security/validateFileSafety.test.ts` — assertion on the
`runSecurityCheck` call updated to include the new `{ taskRunner }`
deps argument (currently undefined when no pre-warmed runner is
provided).
- `tests/core/packager.test.ts`,
`tests/core/packager/diffsFunctionality.test.ts`,
`tests/core/packager/splitOutput.test.ts`,
`tests/integration-tests/packager.test.ts` — extended `mockDeps` /
`baseDeps` with a stubbed `createSecurityTaskRunner` so the default
scope path no longer attempts to spawn a real worker pool from the
test environment. The pack-level assertion on `validateFileSafety`
now matches the new 6th-argument deps object via
`expect.objectContaining({ taskRunner: expect.any(Object) })`.
`node:fs/promises.readFile` wraps every read in a `FileHandle` object,
paying ~60μs of per-call JS bookkeeping vs. the callback-based path. With
~1000 files draining concurrently through `collectFiles`, the overhead
compounds onto the search → collect → security/process critical path.
Switch the single hot read site (`readRawFile` in `src/core/file/fileRead.ts`)
to `util.promisify(fs.readFile)`, which returns the same `Buffer` without
constructing a `FileHandle` per call. All downstream code
(`Buffer.indexOf(0)`, `isBinaryFile(buffer)`, `TextDecoder.decode`, BOM
strip, jschardet/iconv slow path) is unchanged.
Other readFile call sites in the codebase (config load, gitignore parse,
output instruction, MCP tools) are 1–2 calls each and remain on
`fs/promises` for their `'utf-8'` ergonomics.
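
The swap itself is small. A sketch of the pattern, assuming a helper named `readRawBuffer` (the real function in `fileRead.ts` does more: binary detection, decoding, BOM stripping); `demo` exists only to show usage.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';
import { promisify } from 'node:util';

// Callback-based fs.readFile wrapped in a promise. With no encoding
// argument it resolves to a Buffer, same as fs/promises.readFile.
const readFileBuffer = promisify(fs.readFile);

// Hypothetical helper name; only the read call is the point here.
async function readRawBuffer(filePath: string): Promise<Buffer> {
  return readFileBuffer(filePath);
}

// Usage sketch: write a scratch file, read it back, return the byte length.
async function demo(): Promise<number> {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'read-raw-'));
  const file = path.join(dir, 'sample.txt');
  fs.writeFileSync(file, 'hello');
  const buffer = await readRawBuffer(file);
  return buffer.length;
}
```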
## Mechanism
Isolated raw-I/O microbenchmark walking the repo with a minimal ignore
list (`node_modules`, `.git`, `lib`, `repomix-output*` only — 1086 files,
slightly looser than repomix's full default-ignore set), n=15 paired
interleaved, `NODE_DISABLE_COMPILE_CACHE=1`:
| concurrency | fs/promises median | promisify median | Δ |
|-------------|---------------------|------------------|--------|
| 50 | 118.3 ms | 55.3 ms | 63.0 ms |
| 100 | 116.2 ms | 47.1 ms | 69.1 ms |
| 200 | 122.7 ms | 49.9 ms | 72.7 ms |
| unlimited | 119.3 ms | 59.5 ms | 59.8 ms |
The savings are concurrency-independent — the overhead is per-call, not
contention.
In the full pipeline (1046 files after the default-ignore filter), the
60–70 ms read savings flow through to the `collect` phase (verbose
timings: 263 → 223 ms, ~40 ms phase-level reduction), then to wall-clock
with some absorption by the parallel `getGitDiffs` / `getGitLogs` branch
in the same `Promise.all`.
## Benchmark — `node bin/repomix.cjs --quiet` (1046 files)
n=50 paired interleaved, `NODE_DISABLE_COMPILE_CACHE=1`:
| | min | median | mean | max | sd |
|--------|---------|---------|---------|---------|--------|
| BEFORE | 1647 ms | 1800 ms | 1794 ms | 2091 ms | 81 ms |
| AFTER | 1606 ms | 1752 ms | 1756 ms | 1926 ms | 70 ms |
- Mean paired Δ: **+38.2 ms (2.13% wall-clock reduction)**
- Median paired Δ: +43.0 ms (2.40%)
- Paired-delta SD: 81.3 ms · paired t = **3.33** (p < 0.01)
- AFTER faster in **34/50** pairs (68%)
n=50 was the minimum credible sample size given a paired-delta SD ≈
80 ms and an effect size near 38 ms; an independent reviewer's two
n=20 runs straddled the claim (t=0.93 and t=4.27 respectively),
consistent with this distribution.
## Regression check — `node bin/repomix.cjs --include 'src,tests' --quiet` (258 files)
n=30 paired interleaved, `NODE_DISABLE_COMPILE_CACHE=1`:
| | min | median | mean | max | sd |
|--------|--------|--------|--------|--------|-------|
| BEFORE | 850 ms | 918 ms | 922 ms | 984 ms | 38 ms |
| AFTER | 844 ms | 911 ms | 912 ms | 992 ms | 38 ms |
- Mean paired Δ: +10.2 ms (1.11%) — **neutral within noise** (paired t = 1.85)
- AFTER faster in 17/30 pairs
## Correctness
- All **1260** unit tests pass (`npm test`); `npm run lint` clean
(only pre-existing warnings unrelated to this change).
- XML output **byte-identical** between BEFORE and AFTER on both the
default 1046-file workload and the `--include 'src,tests'` 258-file
workload (verified via `diff` on full ~4.83 MB outputs).
- The change is a one-site swap of an internal helper; the public
`readRawFile()` API and all `RawFile` content semantics are unchanged.
## Local review
Two independent sub-agent reviewers approved:
- **Code-correctness reviewer:** APPROVE. Verified API equivalence
(callback-based `fs.readFile` and `fs.promises.readFile` both delegate
to the same `uv_fs_read` op and surface identical `SystemError` codes
for ENOENT/EACCES/EISDIR/EMFILE), `Buffer` return-type match, libuv
thread-pool concurrency parity, no FD leak (callback path closes its
own FD before invoking the callback), and dependency-injection mocks
in `tests/core/file/fileCollect.test.ts` cover the new code path.
- **Benchmark-methodology reviewer:** APPROVE WITH NITS. Confirmed the
before/after binaries differ exactly at the documented one-site swap,
reproduced byte-equivalence, and ran two independent paired n=20
benchmarks bracketing the +38.2 ms claim. The t-stat math checks out
(38.2 / (81.3/√50) = 3.32 ≈ reported 3.33). Doc nit on the microbench
vs pipeline file-count difference is addressed inline above.
Packing.
Bumps EAGER_WARMUP_THREADS from 2 to 3 in src/core/packager.ts when the
user did not narrow the file set via --include / config.include / --stdin.
Tinypool fixes maxThreads at construction, so the 3rd worker must be
pre-warmed during the searchFiles + collectFiles window or it stalls
dispatch (a 4-thread / 2-warm experiment regressed by 27% paired in a
prior iteration). With explicit scope the file set is typically a few
hundred files, the metrics phase is shorter, and the 3rd worker's
~250ms BPE warm-up dominates the parallelism gain — paired benchmarks
regressed -11.85% on the 258-file `--include 'src,tests'` workload at
unconditional EAGER_WARMUP_THREADS=3, so the heuristic falls back to 2.
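
The selection rule can be sketched as below. The function name and the constant names are illustrative; only the values 2 and 3 and the `min(cpuCount, ...)` cap come from this change.

```typescript
const WARMUP_THREADS_DEFAULT_SCAN = 3; // unconstrained scan
const WARMUP_THREADS_EXPLICIT_SCOPE = 2; // --include / config.include / --stdin

// Hypothetical helper: choose how many metrics workers to pre-warm,
// never exceeding the number of CPUs and never dropping below one.
function resolveEagerWarmupThreads(cpuCount: number, hasExplicitScope: boolean): number {
  const target = hasExplicitScope ? WARMUP_THREADS_EXPLICIT_SCOPE : WARMUP_THREADS_DEFAULT_SCAN;
  return Math.max(1, Math.min(cpuCount, target));
}
```

On hosts with one or two CPUs both branches return the same value, so the change is a no-op there.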
Reasoning.
After change 3 on this branch (eager metrics warm-up), the metrics phase
is the dominant wall-clock contributor on the default 1046-file workload
(~770 ms in `calculate metrics`, vs ~120 ms output generation, ~370 ms
search, ~270 ms collect, ~200 ms security check). Five sub-agent
investigations over independent scopes (CLI startup, file search/glob,
file collect/security, output generation, token counting) converged on
metrics worker count as the only candidate clearing the 2% bar without
regressing other phases. Output gen, security pre-warm, file-search
scoping, and CLI lazy-load all measured below threshold or
net-negative. Those results are documented in the previous iteration's
notes; the follow-on attempts from this iteration were:
- EAGER_WARMUP_THREADS=3 unconditional: -11.85% paired regression on
the 258-file workload (n=20, t=-10.85), +2.92% on the 1046-file
workload — net negative because small workloads can't amortize the
extra BPE parse.
- Pre-warm the security worker pool gated on the metrics warm-up:
security-check phase shrunk from 197 ms to 110 ms, but the saving was
absorbed by the parallel `Process Files` branch and an offsetting
worker-spawn cost during collect. Paired n=30 measured -4.90% on
258-file and 0.81% (noise) on 1046-file. Reverted.
Verification.
Paired interleaved benchmarks (n=20, NODE_DISABLE_COMPILE_CACHE=1):
Default workload — `node bin/repomix.cjs --quiet` (1046 files):
| | min | median | mean | max | sd |
|--------|---------|---------|---------|---------|--------|
| BEFORE | 1820 ms | 1885 ms | 1886 ms | 2020 ms | 45 ms |
| AFTER | 1700 ms | 1845 ms | 1840 ms | 1970 ms | 62 ms |
- Mean paired Δ: +46.5 ms (2.46% wall-clock reduction)
- Median paired Δ: +50.0 ms (2.65%)
- Paired-delta SD: 65.3 ms · paired t = 3.18 (p < 0.01)
- AFTER faster in 15/20 pairs (75%)
Scoped workload — `node bin/repomix.cjs --include 'src,tests' --quiet`
(258 files):
| | min | median | mean | max | sd |
|--------|---------|---------|---------|---------|--------|
| BEFORE | 900 ms | 955 ms | 953 ms | 990 ms | 25 ms |
| AFTER | 910 ms | 940 ms | 946 ms | 1010 ms | 29 ms |
- Mean paired Δ: +6.5 ms (0.68%) — neutral within noise (t = 0.90)
- The heuristic falls back to 2 warm workers, so this branch matches
pre-change behavior; the small positive delta is sampling noise.
An independent reviewer's paired n=15 NODE_DISABLE_COMPILE_CACHE=1 run
on a separate sample reported +4.10% (t=6.61, 14/15 pairs) on the
default workload, consistent direction at higher magnitude.
Correctness.
- All 1260 unit tests pass (`npm test`); 3 new tests in
`tests/core/packager.test.ts` exercise both heuristic branches plus
the `--stdin` (explicitFiles) path.
- `npm run lint` clean (only pre-existing warnings unchanged from main).
- XML and Markdown output byte-identical between BEFORE and AFTER on
both workloads (verified via sha256sum).
- Worker-pool size confirmed via `--verbose` logs:
- Default scan: `min=1, max=3 threads` for `calculateMetrics`.
- `--include 'src,tests'`: `min=1, max=2 threads` (unchanged).
- Single-CPU and 2-CPU hosts are unaffected (`min(cpuCount, 3) =
min(cpuCount, 2)` for cpuCount ≤ 2).
- Public `pack()` API unchanged (no new parameters; the heuristic reads
existing `config.include` and `explicitFiles` arguments).
Risks.
The heuristic is a coarse proxy. Pathological cases:
- User runs default scan on a tiny repo (~50 files): 3 workers, +1
extra BPE parse. The cost is bounded by the eager-warm-up overlap
with searchFiles/collectFiles, so the worst case approaches the
paired noise floor (~30 ms sd on 258-file). Not measured below 50
files; expected to be neutral-to-slightly-negative within typical
run-to-run variance.
- User runs `--include 'huge-dir'` on a 5000-file project: 2 workers,
misses the parallelism win. Falls back to current production
behavior — no regression vs main.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Background
----------
Each metrics worker independently parses gpt-tokenizer's ~2.2 MB
`o200k_base.js` BPE table on its first task (~200-300 ms pure-CPU per
worker). The pool was previously created in `pack()` after the file
search and sort phases, so the only stages that could absorb the BPE
warm-up were `collectFiles` + git subprocesses + security check + file
processing. On a 258-file run this still left a residual ~80-130 ms
`await metricsWarmupPromise` stall before the metrics phase.
Change
------
Move `createMetricsTaskRunner` to fire before `searchFiles`. This adds
the ~130 ms glob scan to the hidden warm-up budget and shrinks the
residual stall to ~0-12 ms on the 258-file workload.
Pool sizing: Tinypool fixes `maxThreads` at construction, and the file
count is not yet known. Pre-warming exactly the workers we'll use is
essential — Tinypool queues tasks for newly spawned (cold) workers and
the pipeline can't progress until those workers finish their BPE parse
and pick up the queued task (an experiment with `maxThreads=cpuCount=4`
and only 2 warm workers regressed the 258-file workload by 27 % paired).
So the pool is sized to a fixed 2 workers (`numOfTasks = 2 ×
TASKS_PER_THREAD = 400` → `maxThreads = min(cpuCount, 2)`), matching the
security pool's hard cap and the typical metrics pool size for repos
≤400 files after the TASKS_PER_THREAD=200 sizing on this branch.
Larger repos (>400 files) would benefit from more parallelism, but the
1046-file regression check below shows that eager warm-up still
net-improves wall-clock at maxThreads=2: warming each of the
(cpuCount - 2) extra workers costs ~250 ms of BPE parsing, which
outweighs the parallelism savings in the metrics phase. On single-CPU
hosts the sizing rule naturally collapses to maxThreads=1, identical to
today's behavior.
The `try { } finally { cleanup }` block is widened to cover the new
early call so the worker pool is cleaned up on early throws too. A new
`searchFiles`-rejection test in `tests/core/packager.test.ts` exercises
that path explicitly.
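
The widened cleanup scope looks roughly like this. It is a reduced sketch with hypothetical names (`Pool`, `packSketch`), not the body of `pack()`; the point is that the pool is created before the `try` block and released in `finally` regardless of where the pipeline throws.

```typescript
interface Pool {
  cleanup(): Promise<void>;
}

// Hypothetical reduction of pack(): the pool exists before the search
// starts, so the try/finally has to cover the search as well.
async function packSketch(
  createPool: () => Pool,
  searchFiles: () => Promise<string[]>,
): Promise<number> {
  const pool = createPool();
  try {
    const files = await searchFiles(); // may reject
    return files.length;
  } finally {
    await pool.cleanup(); // runs on success and on early throws
  }
}
```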
`TASKS_PER_THREAD` is exported from `processConcurrency.ts` and consumed
by name in `packager.ts` to keep the eager-warmup constant tied to the
shared sizing rule.
Benchmark
---------
Both runs use n=… paired interleaved (alternating BEFORE-first /
AFTER-first ordering) with `NODE_DISABLE_COMPILE_CACHE=1` so cold-start
BPE parse is measured rather than masked. 4-vCPU Intel(R) Xeon(R) host.
`node bin/repomix.cjs --include 'src,tests' --quiet` (258 files, n=20):
| | min | median | mean | max | sd |
|--------|---------|---------|---------|---------|--------|
| BEFORE | 1007 ms | 1044 ms | 1054 ms | 1164 ms | 36 ms |
| AFTER | 893 ms | 966 ms | 962 ms | 1065 ms | 36 ms |
- Mean paired Δ: +91.6 ms (8.69 % wall-clock reduction)
- Median paired Δ: +97.5 ms (9.34 %)
- Paired-delta SD: 36.0 ms · paired t = 11.39 (p < 0.001)
- AFTER faster in 20/20 pairs (100 %)
Regression check — `node bin/repomix.cjs --quiet` (default, 1046 files,
n=15) on a clean repo (baseline binary built outside the working tree
so it does not get picked up as a workload file):
| | min | median | mean | max | sd |
|--------|---------|---------|---------|---------|--------|
| BEFORE | 1769 ms | 1872 ms | 1877 ms | 2063 ms | 79 ms |
| AFTER | 1751 ms | 1820 ms | 1837 ms | 2018 ms | 61 ms |
- Mean paired Δ: +40.0 ms (2.13 %)
- Median paired Δ: +48.6 ms (2.60 %)
- Paired-delta SD: 51.7 ms · paired t = 2.99 (p ≈ 0.01)
- AFTER faster in 11/15 pairs (73 %)
The larger workload also clears the 2 % threshold; the eager warm-up's
gain offsets the maxThreads=2 cap that's now applied unconditionally.
Correctness
-----------
- All 1257 unit tests pass (`npm test`); `npm run lint` clean (only
pre-existing warnings).
- XML and Markdown output byte-identical between BEFORE and AFTER on
both the 258-file and 1046-file workloads.
- Worker-pool size confirmed via `--verbose` logs: `min=1, max=2 threads`
for `calculateMetrics` on both workloads (was `max=2` on 258 files,
`max=4` on 1046 files before this change).
- New test `cleans up the metrics worker pool when searchFiles rejects`
exercises the widened `try/finally` cleanup path.
Background
----------
On a typical CLI run (`node bin/repomix.cjs --include 'src,tests' --quiet`,
258 files, 4-vCPU host), the metrics worker pool was sized as
`ceil(258 / 100) = 3 workers`. Combined with the security pool's hard cap
of 2 workers (securityCheck.ts:90) and the main thread, the process held
6 active threads on 4 cores during the overlap of `validateFileSafety`
and `calculateMetrics`.
Each metrics worker independently parses gpt-tokenizer's ~2.2 MB
`o200k_base.js` BPE table on its first task — a ~200-300 ms pure-CPU
operation per worker. Spawning 3 cold metrics workers in the warm-up
phase (calculateMetrics.ts:46-48) therefore drove the security workers
off the CPU during their own (concurrent) cold-start, inflating the
critical-path security phase.
Change
------
Raise `TASKS_PER_THREAD` from 100 to 200, so that on the 4-vCPU host:
- ≤100 file repos: 1 metrics worker (was 1), no change
- 101-200 file repos: 1 metrics worker (was 2), -1 worker
- 201-300 file repos: 2 metrics workers (was 3), -1 worker, the win
- 301-400 file repos: 2 metrics workers (was 4-cap), -2 workers
- 401-600 file repos: 3 metrics workers (was 4-cap), -1 worker
- 601+ file repos: 4 metrics workers (was 4-cap), no change (cap)
For the 258-file benchmark this brings active workers during the
metrics+security overlap to 2 + 2 = 4, matching CPU count, and cuts
the parallel BPE-loading work in the warm-up phase by a third (3 cold
metrics workers down to 2).
Tests for `getWorkerThreadCount` and `createWorkerPool` are updated to
reflect the new ratio.
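
The sizing rule implied by `ceil(258 / 100) = 3` and the 4-worker cap can be sketched as follows. This is an assumption about the shape of `getWorkerThreadCount`, written as a standalone function with an illustrative name so the before/after counts are easy to recompute.

```typescript
// Assumed sizing rule: one worker per `tasksPerThread` files, rounded up,
// capped at the CPU count and floored at one worker.
function workerCount(numOfTasks: number, tasksPerThread: number, cpuCount: number): number {
  return Math.max(1, Math.min(cpuCount, Math.ceil(numOfTasks / tasksPerThread)));
}

// 258 files on 4 CPUs: 3 workers at 100 per thread, 2 workers at 200 per thread.
const before258 = workerCount(258, 100, 4);
const after258 = workerCount(258, 200, 4);
```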
Benchmark
---------
`node bin/repomix.cjs --include 'src,tests' --quiet` (258 files), n=20
paired interleaved (alternating BEFORE-first / AFTER-first ordering):
| | min | p25 | median | p75 | mean | sd |
|--------|---------|---------|---------|---------|---------|--------|
| BEFORE | 1045 ms | 1092 ms | 1109 ms | 1122 ms | 1107 ms | 27 ms |
| AFTER | 937 ms | 973 ms | 991 ms | 1020 ms | 994 ms | 29 ms |
Mean paired Δ: +112.5 ms (10.17 % wall-clock reduction)
Median paired Δ: +115.4 ms (10.66 % wall-clock reduction)
Paired-delta SD: 36.2 ms (paired t = 13.88, p < 0.001)
AFTER faster in 20/20 pairs (100 %)
Regression check — `node bin/repomix.cjs --quiet` (default, 1572 files),
n=15 paired interleaved:
| | min | p25 | median | p75 | mean | sd |
|--------|---------|---------|---------|---------|---------|--------|
| BEFORE | 1933 ms | 1970 ms | 2016 ms | 2102 ms | 2028 ms | 62 ms |
| AFTER | 1955 ms | 1966 ms | 2004 ms | 2131 ms | 2034 ms | 74 ms |
Mean paired Δ: -6.2 ms (-0.31 %) (paired t = -0.29, p > 0.05)
Median paired Δ: -12.7 ms (statistically neutral)
No regression on the large workload — both 100 and 200 saturate the
per-CPU cap at 4 workers for ≥800 file repos, so the dispatch-time
behavior is identical there.
Correctness
-----------
- 1256 / 1256 unit tests pass.
- `npm run lint` clean (only pre-existing warnings unrelated to this change).
- No behavioral change to file processing, tokenization, security checks,
or output. Pool sizing is the only effect.
Handlebars and the per-style template modules (which transitively re-import
handlebars via `outputStyleUtils`) collectively add ~50 ms to the synchronous
CLI startup path, even though they are only consumed by `generateHandlebarOutput`
near the end of `pack()` — after file search, collection, processing, and the
security check have all run.
Switch the static `import Handlebars from 'handlebars'` and the three style
imports to an `import type` plus a dynamic `await import(...)` inside
`getCompiledTemplate`. The compiled-template cache (`compiledTemplateCache`)
ensures the imports only run once per process. The dynamic load now overlaps
with `calculateMetrics` (the two run inside the `Promise.all` in `packager.ts`),
so on workloads where the metrics phase is the wall-clock critical path the
import cost is fully hidden; on smaller workloads the cost moves off the
serial startup path.
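
The load-once behaviour can be illustrated with a small memoized loader. This is a generic sketch, not the code in the output generator: `lazy` is a hypothetical helper, and in the real change the loader would be something like `() => import('handlebars')`, with `compiledTemplateCache` playing the role of the cache.

```typescript
// Hypothetical helper: wrap an async loader so it runs at most once.
// Concurrent callers share the same in-flight promise.
function lazy<T>(load: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      cached = load();
    }
    return cached;
  };
}
```

Because the promise (not the resolved value) is cached, two callers that race during the first load still trigger only one import.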
## Benchmark — node bin/repomix.cjs --include 'src,tests' --quiet (258 files), n=30 paired interleaved
| | min | p25 | median | p75 | mean | sd |
|--------|---------|---------|---------|---------|---------|---------|
| BEFORE | 983 ms | 1061 ms | 1088 ms | 1108 ms | 1081.6 ms | 39.1 ms |
| AFTER | 998 ms | 1040 ms | 1061 ms | 1078 ms | 1058.8 ms | 32.7 ms |
- Mean paired Δ: +22.7 ms (~2.10 % wall-clock reduction)
- Median paired Δ: +24.5 ms
- AFTER faster in 22/30 pairs (73 %)
A second independent re-run on the same machine (n=15, AFTER-first ordering)
reproduced the direction with mean Δ +21.5 ms / median Δ +12 ms / 10/15 pairs
faster — paired-delta SD ≈ 48 ms, so the 95 % CI on the per-machine effect
straddles zero (t(14) ≈ 1.75, p ≈ 0.10). The mean magnitude is consistent
across runs but the per-pair variance on this 4-vCPU host is large relative
to the effect; the percentage-level claim should be read as "~2 % mean
reduction in the typical case, with run-to-run noise dominating any single
pair." Cleaner machines would likely tighten the CI.
## Benchmark — node bin/repomix.cjs --quiet (default, 1572 files), n=30 paired interleaved
| | min | p25 | median | p75 | mean | sd |
|--------|---------|---------|---------|---------|---------|---------|
| BEFORE | 1906 ms | 2040 ms | 2084 ms | 2147 ms | 2087.2 ms | 80.0 ms |
| AFTER | 1912 ms | 2024 ms | 2097 ms | 2145 ms | 2089.6 ms | 81.2 ms |
- Mean paired Δ: -2.5 ms · Median paired Δ: +2.5 ms — statistically neutral
- AFTER faster in 16/30 pairs (53 %)
- No regression: the import cost is fully absorbed by the longer metrics tail
on this workload.
## Correctness
- XML output (default style) byte-identical between BEFORE and AFTER (verified
via `cmp` on `--include 'src,tests'`).
- Markdown output byte-identical between BEFORE and AFTER (verified via `cmp`
with `--style markdown --output /tmp/{before,after}.md`).
- The compiled-template cache continues to dedupe per style; the
`markdownStyle.ts` top-level call to `registerHandlebarsHelpers()` runs the
first time the markdown branch is awaited (idempotent — the helper module
guards on `handlebarsHelpersRegistered`, so the duplicate-import path that
exists in `skillSectionGenerators.ts` continues to work unchanged).
- All 1256 tests pass (`npm test`); lint clean (only pre-existing warnings).
## Why other candidates were not chosen
Five investigation sub-agents (CLI startup, file I/O, security pipeline,
output, metrics) ran in parallel. Other candidates measured below threshold
or regressed on the larger workload:
- Run `lintSource` on the main thread instead of the security worker pool
(security candidate): +3.78 % on 258-file pack, but a reproducible -1.85 %
regression on the 1572-file pack — the main-thread `await lintSource` loop
starves I/O callbacks (worker-pool messages, git pipe drains) for ~190 ms
on the large repo and the savings are eaten by downstream phases.
`setImmediate` yielding reduced but did not eliminate the regression.
- `METRICS_BATCH_SIZE` 50 → 100 (metrics candidate): claimed alignment with
`TASKS_PER_THREAD = 100`, but measured -2.56 % regression on the 1572-file
pack (5/20 pairs faster). The original comment-warning held empirically.
- Skip `calculateFileLineCounts` / `calculateMarkdownDelimiter` for non-markdown
styles (output candidate): only 0.6–1.0 % wall-clock impact, below threshold.
- Skip the `**/.ignore` globby file-tree scan when no root `.ignore` exists
(file I/O candidate): warm-disk savings ~10–35 ms, below threshold.
The redirecting_factory_constructor_signature pattern previously stood on
its own. Wrap it in (declaration ...) like the other constructor and
factory patterns above so the comment about being a direct child of
'declaration' is reflected in the query shape itself. Behavior is
unchanged — the wrapper does not alter what gets matched in practice.
Addresses PR review feedback from coderabbitai.
Three constructor variants were silently dropped during --compress:
- `const Foo(...);` and `const Foo.named(...) : ...;` parse as
`(declaration (constant_constructor_signature ...))` — a node type the
existing constructor query did not list.
- `const factory Foo() = Bar;` parses as
`(redirecting_factory_constructor_signature (const_builtin) (identifier) ...)`
whose first named child is `const_builtin`, so the leading-anchor
`. (identifier)` pattern failed to match.
- `external factory Foo.make();` parses as
`(declaration (factory_constructor_signature ...))` — bare under
`declaration`, not wrapped in `method_signature`, so the existing
factory query missed it.
Switch the constructor / factory / redirecting-factory queries to
capture the whole signature node as `@name.definition.method`. This
emits the same source line(s) DefaultParseStrategy already produces and
is robust across all body / external / const / redirecting variants.
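
As an illustration of the whole-node capture, two of the patterns might look like the string below. The node names are taken from the description above; the exact query text in `queryDart` may differ, and this sketch has not been run against the tree-sitter-dart grammar. The helper is only a cheap structural check.

```typescript
// Illustrative only: captures the entire signature node rather than an
// inner identifier, so const / external variants match too.
const dartConstructorQuerySketch = `
(declaration
  (constant_constructor_signature) @name.definition.method) @definition.method

(declaration
  (factory_constructor_signature) @name.definition.method) @definition.method
`;

// Sanity check that a query string has balanced parentheses.
function hasBalancedParens(query: string): boolean {
  let depth = 0;
  for (const ch of query) {
    if (ch === '(') depth += 1;
    if (ch === ')') depth -= 1;
    if (depth < 0) return false;
  }
  return depth === 0;
}
```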
Use a kind-specific tag for extension_declaration instead of reusing
@definition.class, matching the convention used by the newly added
@definition.mixin / @definition.enum / @definition.type tags.
Output is unchanged: DefaultParseStrategy only branches on
name.includes('name'), and no consumer reads @definition.* values today.
This keeps the labels honest if per-kind handling lands later.
Addresses PR review feedback from gemini-code-assist and coderabbitai.
Two pre-existing gaps surfaced while extending queryDart:
- Plain constructors (e.g. `Animal(this.name);`) live directly under
`declaration`, not wrapped in `method_signature`, so the existing
`(method_signature (constructor_signature ...))` query never matched
them. Add a sibling query against `(declaration (constructor_signature ...))`.
- Operator overloads (`operator +`, `operator []`, `operator []=`,
`operator ==`, ...) parse as `(method_signature (operator_signature ...))`
but `operator_signature` has no identifier name field — the operator
token surfaces as `(binary_operator)` / `([])` / `([]=)` children.
Capture the whole `operator_signature` as `@name.definition.method` so
DefaultParseStrategy emits its full source range.
Verified against `--compress` on a real Dart file: signatures that were
previously dropped (only their `///` doc comments survived) now appear
in compressed output.
Carries upstream Dart 3.10 dot-shorthand and external-member fixes,
unblocking the additions in this PR. Previously deferred by
.npmrc min-release-age=7; 0.1.17 was published 2026-04-22 so the
window is now clear.
intent(dart-query): make --compress preserve Dart definition kinds that were silently dropped — mixin, typedef, getter, setter, factory, and redirecting factory
decision(capture-naming): align Dart captures with the dominant @name.definition.X convention used by queryTypeScript/queryPython/queryRust; output is unchanged because DefaultParseStrategy matches via name.includes('name')
constraint(redirecting-factory): tree-sitter-dart grammar makes redirecting_factory_constructor_signature a child of `declaration`, not `method_signature`, so it must be queried bare to avoid a "Bad pattern structure" parse error
constraint(type-alias): type_alias's name node is `type_identifier`, not `identifier` — using `identifier` would silently match nothing
learned(external-keyword): `external` modifier in Dart is a sibling token outside function_signature/method_signature, so existing captures already cover `external void foo();` without changes
Six items from claude's incremental review (`12:48:43Z`):
- monitoring/dashboard.json: Group the outcomes widget by both
`metric.label.outcome` and `metric.label.reason`. Previously all
failures collapsed into a single `turnstile_failed` series, which
contradicted the README claim that the `reason` label drives the
breakdown.
- monitoring/metrics/*.yaml: Narrow the metric filter to
`jsonPayload.event=("turnstile_siteverify" OR "pack_completed")`.
Without this anchor, any future code path attaching
`siteverifyDurationMs` to an unrelated log silently joins the
distribution and creates new metric label values.
- usePackRequest.ts: Mirror `progressMessage.value = null` alongside
the `progressStage.value = null` clear on token-acquisition aborted /
error branches. Prevents a future edit setting a verifying message
from leaking prior-run state.
- turnstile.test.ts: Add a focused `describe` block with five tests
asserting `siteverifyDurationMs` is attached to every post-siteverify
log (one success path + four reject branches). The metric YAML
filters on field presence, so a refactor that drops the field on any
branch would silently break the metric without other tests failing.
Uses the existing `vi.spyOn(logger, ...)` pattern; no clock injection
needed.
- monitoring/README.md: Note that the metric filter pins
`service_name="repomix-server-us"`, so future regions (`-eu`,
`-asia`) silently drop out until the filter is broadened or
per-region counterparts applied.
- monitoring/README.md: Add a `gcloud logging metrics describe` snippet
for verifying a YAML edit was actually applied (gcloud update is
silent on no-op vs effective change).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Seven items from gemini, claude initial review, and claude follow-up:
- turnstile.ts: Update misleading comment that claimed the metric filters
on `event=turnstile_siteverify` and `outcome=success`. The actual
Cloud Monitoring metrics in `monitoring/metrics/` filter on
`siteverifyDurationMs` field presence, which uniformly captures both
the parallel success log (event=turnstile_siteverify) and the four
rejectAndLog failure paths (event=pack_completed). The comment
contradicted README and YAML and would mislead future readers.
- turnstile.ts: Wrap rejectAndLog in a local `rejectWithDuration` helper
so every post-siteverify branch automatically carries
`siteverifyDurationMs`. Prevents drift if a fifth reject reason gets
added later.
- client.ts: Split the wire-protocol `PackProgressStage` (server-emitted
SSE values) from the display-only `DisplayProgressStage` superset that
adds `verifying`. Keeping the synthetic stage out of the wire type
prevents silent divergence with the server's `PackProgressStage`.
- usePackRequest.ts, TryItLoading.vue, TryItResult.vue: Switch the
display-side type to `DisplayProgressStage`. `onProgress` callbacks
still take the wire `PackProgressStage`.
- usePackRequest.ts: Clear `progressStage` on token-acquisition failure
branches (aborted / error). Functionally invisible since loading=false
hides the loading UI, but prevents the next submit's verifying flash
from briefly showing the previous run's stale state.
- monitoring/metrics/turnstile_siteverify_duration.yaml: Retune the
exponential bucket layout for the 100ms-1s SLO band where decisions
get made. Doubling buckets only placed ~3 boundaries between 100ms
and 1s; growthFactor=1.5 with scale=10 places 6 boundaries there.
18 finite buckets cover 10ms to ~9.85s, comfortably above the 5s
siteverify timeout so timeouts don't land in overflow.
- monitoring/README.md: Document that pre-network rejections
(secret_missing, missing_token, token_too_long) intentionally don't
carry siteverifyDurationMs, so they're excluded from both metrics
but still appear in the existing pack_requests metric.
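The bucket retune above can be sanity-checked numerically. This is a minimal sketch, assuming the exponential layout where boundary `i` sits at `scale * growthFactor^i`; `exponentialBounds` and `countBetween` are hypothetical helpers for illustration and are not part of the repo.

```typescript
// Hypothetical helper: boundary i of an exponential layout is scale * growthFactor^i.
function exponentialBounds(scale: number, growthFactor: number, count: number): number[] {
  return Array.from({ length: count }, (_, i) => scale * Math.pow(growthFactor, i));
}

// Count the boundaries that fall strictly inside (lowMs, highMs).
function countBetween(bounds: number[], lowMs: number, highMs: number): number {
  return bounds.filter((b) => b > lowMs && b < highMs).length;
}

// Old layout: doubling from 1 ms (16 boundaries reach 32768 ms).
const doublingBounds = exponentialBounds(1, 2, 16);
// Retuned layout: growthFactor 1.5 from 10 ms (18 boundaries).
const retunedBounds = exponentialBounds(10, 1.5, 18);

console.log('doubling, 100ms-1s:', countBetween(doublingBounds, 100, 1000));
console.log('retuned, 100ms-1s:', countBetween(retunedBounds, 100, 1000));
console.log('retuned top boundary (ms):', Math.round(Math.max(...retunedBounds)));
```

Running it prints how many boundaries each layout places inside the SLO band, plus the top boundary to compare against the 5s siteverify timeout.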
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move the metric setup from prose-only README instructions to checked-in
YAML files under `monitoring/metrics/` so the dashboard, the metrics it
depends on, and the apply commands all live next to each other.
- `turnstile_siteverify_duration.yaml`: distribution metric on
`jsonPayload.siteverifyDurationMs`, exponential buckets 1ms-32s.
- `turnstile_siteverify_outcomes.yaml`: counter metric with `outcome`
and `reason` labels for the success-vs-failure breakdown widget.
README updated with the gcloud commands.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wire up two new tiles to surface the `siteverifyDurationMs` field added
in the previous commit:
- "Turnstile siteverify latency P50 / P95 / P99" — line chart with a
1s threshold marker so a steady regression jumps off the chart.
- "Turnstile siteverify outcomes (by outcome)" — stacked area
breaking down success vs turnstile_failed counts over time.
Both depend on log-based metrics `turnstile_siteverify_duration`
(distribution) and `turnstile_siteverify_outcomes` (counter) that need
to be created once in the GCP Console — README documents the filter,
field, and label extractors so the setup is reproducible.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two changes targeting the visible "..." gap between Pack click and the
first SSE progress event observed after PR #1544 landed:
- Client: add a synthetic `verifying` PackProgressStage so the loading
UI displays "Verifying request..." while the server runs Turnstile
siteverify (typically 100-1000ms before the first 'cache-check' SSE
event arrives). The first onProgress callback from handlePackRequest
overwrites it with the real server-reported stage.
- Server: time the siteverify round-trip in `turnstileMiddleware` and
emit `siteverifyDurationMs` on every outcome (success / network
failure / rejected / action mismatch / hostname mismatch). Success
path adds a structured log with `event: turnstile_siteverify` so
Cloud Monitoring can build a log-based distribution metric for
p50/p95/p99 latency and alert on regressions during Cloudflare
incidents.
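The server-side timing can be pictured roughly as follows. This is a sketch under stated assumptions: `timeSiteverify`, `TimedOutcome`, and the outcome labels are hypothetical names, and the real `turnstileMiddleware` distinguishes more branches (action and hostname mismatch) than shown here.

```typescript
// Hypothetical outcome labels; the real middleware has more reject reasons.
type SiteverifyOutcome = 'success' | 'network_failure' | 'rejected';

interface TimedOutcome {
  outcome: SiteverifyOutcome;
  siteverifyDurationMs: number;
}

// Time an async siteverify call and attach the duration to whichever branch is taken.
// `verify` stands in for the real HTTP round-trip; `now` is injectable for tests.
async function timeSiteverify(
  verify: () => Promise<boolean>,
  now: () => number = Date.now,
): Promise<TimedOutcome> {
  const startedAt = now();
  let outcome: SiteverifyOutcome;
  try {
    outcome = (await verify()) ? 'success' : 'rejected';
  } catch {
    outcome = 'network_failure';
  }
  // Computed once after the await, so no branch can drop the field.
  return { outcome, siteverifyDurationMs: now() - startedAt };
}
```

Computing the duration in one place, after all branches converge, is what lets a field-presence filter see every post-siteverify log.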
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A valid `?repo=` permalink was treated as an intent signal so the
visitor's click path used a pre-minted token. Claude's follow-up review
flagged that this re-creates the dashboard counter inflation this PR is
meant to fix: any third-party page driving traffic to
`https://repomix.com/?repo=<owner/repo>` (Slack / Discord / Twitter card
validators that execute JS) would mint a token per visit, regardless of
whether the visitor ever submits.
Permalink visitors now pay the cold mint on their first click; the user's
first real form interaction (typing, mode click, option tweak, file
upload) is what gates the pre-mint.
Also document the idle-tab cliff in `expired-callback`: re-arming
pre-mint there would burn a challenge every 240s for the lifetime of
an open tab, which is worse than the cold-mint-on-return cost it would
save.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Commit 2a0922d dropped the `error` ref declaration from `useTurnstile`
but left two references behind — `error.value = null` at the top of
`mintToken()` and `error` in the returned object. The first throws
ReferenceError at runtime; the second is a tsc error. No external
caller reads `useTurnstile().error` (TryIt.vue only consumes
`usePackRequest.error`), so the export was already vestigial.
Co-Authored-By: Kazuki Yamada <yamadashy@users.noreply.github.com>
- Drop the unused `error` ref from `useTurnstile`. The widget-level
error-callback writes had no observer (only `usePackRequest.error`
feeds the UI), so the export and writes were vestigial.
- Drop `getResponse` from `TurnstileGlobal`. Never called anywhere in
the codebase; clearer to leave it off the typed surface.
- Don't `console.warn` on normal cancel/timeout flows in
`acquireTurnstileToken`. Move the warn after the `signal.aborted`
check so the dev console only logs genuine challenge / script-load
failures.
- Hoist the consecutive `if (widgetId.value)` guards in `mintToken` by
capturing the rendered widget id into a local const after the throw.
- Drop the redundant `userTouched.value` check in the post-pack
pre-mint guard. `userTouched` is necessarily true at this point —
it was a precondition for `isSubmitValid` being true when the submit
started.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Mark `userTouched=true` when arriving via a valid `?repo=` permalink so
the visitor's click path uses a pre-minted token. Browser autofill and
malformed `?repo=` values still don't burn a challenge.
- After a non-aborted submit completes, schedule a fresh `preMintToken()`
in the finally block. Warms the cache for the typical "view result →
tweak options → repack" flow and for `repackWithSelectedFiles`.
- Reduce the pre-mint debounce from 500ms to 300ms. Tightens the window
where a paste-and-click cadence misses the cache.
- Split composables to fit the 250-line file-size guideline:
* Extract token cache (cache state + single-flight mint + atomic
one-shot consumption) into `useTurnstileTokenCache.ts`. Shrinks
`useTurnstile.ts` from 358 → 241 lines and lets the widget file
focus on render lifecycle / supersede logic.
* Extract pre-mint debounce trigger into `usePreMintDebounce.ts`.
* Extract Turnstile token acquisition + user-facing failure copy into
`turnstileSubmit.ts`. Drops `usePackRequest.ts` from 345 → 331
lines; `submitRequest` is a single cohesive request lifecycle that
resists further splitting.
- Drop the unused `consumed` flag on `CachedToken` (Claude review). The
cache nulls the entry on consumption instead, which is what the
takeToken atomic-claim loop already relies on.
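The debounce trigger mentioned above could look something like this. It is a minimal sketch, not the contents of `usePreMintDebounce.ts`: `createDebouncedTrigger` and its `schedule` / `cancel` methods are hypothetical names chosen for illustration.

```typescript
// Hypothetical debounce trigger: schedule() restarts the timer, cancel() drops it.
function createDebouncedTrigger(action: () => void, delayMs: number) {
  let timer: ReturnType<typeof setTimeout> | null = null;

  // Drop any pending timer, e.g. at submit start or on unmount.
  function cancel(): void {
    if (timer !== null) {
      clearTimeout(timer);
      timer = null;
    }
  }

  // Restart the timer so only the last call within delayMs fires the action.
  function schedule(): void {
    cancel();
    timer = setTimeout(() => {
      timer = null;
      action();
    }, delayMs);
  }

  return { schedule, cancel };
}

// Usage: a 300 ms window, matching the debounce value described above.
const preMintTrigger = createDebouncedTrigger(() => console.log('pre-mint'), 300);
preMintTrigger.schedule();
preMintTrigger.cancel(); // nothing is logged because the timer was dropped
```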
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Drop the unused `invalidateCache` export from useTurnstile. Both call
paths (takeToken cache claim, expired-callback) already null
cachedToken inline, so the helper had no callers.
- Update stale `turnstile.getToken()` references in usePackRequest and
useTurnstileScript comments to match the renamed `takeToken()` /
`preMintToken()` API.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- useTurnstile: Make takeToken() one-shot under concurrency. Two callers
awaiting the same shared mintPromise both received the same token,
which siteverify rejects as `timeout-or-duplicate`. Claim the cache
atomically post-await and loop into a fresh mint if another caller won.
- usePackRequest: Drop pending pre-mint debounce timer at submitRequest
start and on unmount, and skip scheduling while loading is true. Stops
a debounce-firing-during-submit from minting an extra Turnstile
challenge alongside the click path's mint.
- TryItPackOptions: Emit userInput from option handlers and wire to
markUserTouched in TryIt. Without this, users hydrating via `?repo=`
who only tweak format/include patterns/checkboxes never tripped the
pre-mint gate, so their click path always cold-minted.
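The one-shot `takeToken()` fix in the first bullet can be sketched as below. This assumes a simplified cache with no TTL or abort handling; `createTokenCache` and the `mint` parameter are hypothetical stand-ins for the real composable and the Turnstile challenge.

```typescript
// Hypothetical minimal cache; `mint` stands in for the real Turnstile challenge.
function createTokenCache(mint: () => Promise<string>) {
  let cached: string | null = null;
  let inFlight: Promise<void> | null = null;

  // Single-flight: concurrent callers share one in-flight mint.
  function startMint(): Promise<void> {
    if (!inFlight) {
      inFlight = mint()
        .then((token) => {
          cached = token;
        })
        .finally(() => {
          inFlight = null;
        });
    }
    return inFlight;
  }

  // One-shot: claim the cached token after the await, and loop into a fresh
  // mint if another caller claimed it first.
  async function takeToken(): Promise<string> {
    for (;;) {
      if (cached !== null) {
        const token = cached;
        cached = null; // atomic claim: no await between the read and the clear
        return token;
      }
      await startMint();
    }
  }

  return { takeToken };
}
```

Because the read and the clear happen in the same synchronous step, two callers that awaited the same mint can never both walk away with the same token; the loser simply mints again.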
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
intent(ci-green): typos@1.45.1 flags `mis` as a typo of `miss`/`mist`, so the
hyphenated `mis-classified` in the new PDF-magic regression test comment
trips the `Check typos` job. The unhyphenated `misclassified` is the more
common spelling and passes the dictionary.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
intent(fast-click-race): When the user clicked Pack within the 500 ms pre-mint debounce window, takeToken() cold-pathed into mintToken() *and* the debounce timer later fired preMintToken() which started a second mintToken(). The generation-counter supersede logic in mintToken() rejected the first call as "Superseded by new Turnstile request", so the user's own click surfaced as a verification failure even though Turnstile would have happily issued a token.
fix(unified-startMint): Extract a single `startMint()` that both takeToken (cold path) and preMintToken share. Concurrent calls return the same in-flight promise, so only one `turnstile.execute()` ever runs and the supersede branch only triggers when there is genuinely a stale request.
fix(takeToken-abort-race): The abort signal is now threaded through a `waitWithAbort` helper that races the awaiter against the signal but lets the underlying mint keep going. If the user cancels mid-mint, the underlying challenge still runs to completion and the token lands in the cache for whoever submits next, instead of being thrown away.
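A helper in the spirit of `waitWithAbort` might look like the following. This is a sketch under assumptions, not the repo's implementation: the signature and the `'Aborted'` error message are hypothetical.

```typescript
// Hypothetical sketch: the waiter gives up on abort, but `work` is never
// cancelled and still settles on its own.
function waitWithAbort<T>(work: Promise<T>, signal: AbortSignal): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const onAbort = () => reject(new Error('Aborted'));
    if (signal.aborted) {
      onAbort();
    } else {
      signal.addEventListener('abort', onAbort, { once: true });
    }
    // Settling an already-rejected promise is a no-op, so a late result from
    // `work` is ignored by this waiter without affecting `work` itself.
    work
      .then(resolve, reject)
      .finally(() => signal.removeEventListener('abort', onAbort));
  });
}
```

Only the waiter observes the abort; any other code holding a reference to `work` (such as a cache-filling continuation) still sees it resolve.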
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
intent(latency-and-counter): The Cloudflare Turnstile dashboard's "challenges issued" counter sits at roughly 1.25× of GA page_view, which means simply loading `api.js` on every visitor (the PR #1541 pre-warm) is already inflating the counter; render() and execute() are not the only trigger. At the same time, click→token latency is the main remaining UX cost. This PR reshapes pre-warm so script load + challenge happen only when the user has shown real intent (entered a valid URL or chose a file *and* interacted with the form), achieving both a lower counter and a near-zero click→token latency.
fix(useTurnstile-api): Replace the single `getToken()` entry point with a `preMintToken()` / `takeToken()` pair backed by an in-memory `{ token, mintedAt, consumed }` cache. `preMintToken()` runs the challenge in the background and stashes the resulting token; `takeToken()` consumes the cache synchronously (instant submit) or awaits the in-flight mint, falling back to a cold mint with the supplied AbortSignal. `invalidateCache()` lets the caller drop the cache without minting a new token. TTL is bounded at 240 s — Cloudflare hard-caps tokens at 300 s, the margin absorbs clock skew and network round-trips so a token that's "almost expired" is never sent to siteverify.
fix(useTurnstile-no-mount-prewarm): Stop calling `loadTurnstileScript()` from `setContainer()`. The mount-time script load was the source of the page-view-shaped counter inflation. Pre-warm now only runs from the intent-gated trigger in `usePackRequest`, so visitors who never interact with the form never appear on the dashboard.
fix(usePackRequest-intent-gate): Add a `userTouched` ref that flips on real user interaction (input event, file upload, mode switch) and never goes back. A debounced (500 ms) `watch(isSubmitValid && userTouched)` calls `preMintToken()`, so URL-parameter hydration (`?repo=`), browser form restoration, and autofill never pre-mint. `submitRequest()` switches from `getToken()` to `takeToken()` so the cached token is consumed on the first click, with the cold mint path as a transparent fallback.
fix(TryItUrlInput-user-input-event): Emit a new `userInput` event from the URL field's actual `@input` handler. `TryIt.vue` wires it to `markUserTouched()`. Watching `inputUrl` directly would have re-fired during onMounted hydration and defeated the gate.
learned(cloudflare-counter-includes-script-load): Even with `execution: 'execute'`, the Cloudflare Turnstile dashboard counts api.js loads toward "challenges issued" (verified by comparing GA page_view ≈ 106 with CF issued ≈ 132 in the same 30 min window after the PR #1541 deploy). Treat any new place that loads api.js as a billable analytics side effect.
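The 240 s TTL bound on the token cache can be illustrated with a small freshness check. This is a sketch, assuming epoch-millisecond timestamps; `takeFreshToken` and `TOKEN_TTL_MS` are hypothetical names, and the `consumed` flag from the cache shape above is left out for brevity.

```typescript
// 240 s, per the message above: a margin under Cloudflare's 300 s hard cap.
const TOKEN_TTL_MS = 240_000;

interface CachedToken {
  token: string;
  mintedAt: number; // epoch milliseconds
}

// Hypothetical helper: return the token only while it is inside the TTL window.
function takeFreshToken(cached: CachedToken | null, nowMs: number): string | null {
  if (cached === null) return null;
  return nowMs - cached.mintedAt < TOKEN_TTL_MS ? cached.token : null;
}

// Usage: a token minted at t=0 is usable at 239 s but treated as expired at 240 s.
const example: CachedToken = { token: 'abc', mintedAt: 0 };
console.log(takeFreshToken(example, 239_000), takeFreshToken(example, 240_000));
```

A caller that gets `null` back would fall through to the cold mint path rather than send a nearly expired token to siteverify.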
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Match the frontmatter convention of the sibling review-loop.md so the
description shows up consistently in tooling that surfaces commands.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a review/triage/fix/verify/re-review cycle command that delegates
review to the codex reviewer agent. Sibling to the existing review-loop
command, but explicitly uses codex as the reviewer.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
intent(comment-drift): Claude's review surfaced three stale comments that no longer match the code after pre-warm was narrowed to the script-load step. No behaviour change — the comments were lying about what the code actually does now.
fix(useTurnstileScript-jsdoc): The JSDoc on `TurnstileRenderOptions.execution` claimed `'execute'` was chosen so the pre-warm path could render() without firing a challenge. PR #1541 proved that's not what `'execute'` does in practice — render() itself counts toward the dashboard's challenge counters, regardless of `execution`. Rewrote the comment to explain why we still pass `'execute'` (token-mint guardrail) and why we no longer pre-warm by rendering.
fix(useTurnstile-render-comment): The inline comment at the render() call site said this option is "what makes the pre-warm in setContainer() free of side-effects". setContainer() no longer calls render(), so the rationale is obsolete. Updated to describe the current role: a guardrail against an accidental render() minting a token before getToken() is ready.
fix(useTurnstile-race-comment): The single-flight cache comment said "pre-warm and getToken() can both race past the widgetId.value null check". Pre-warm doesn't enter ensureWidget anymore, so only back-to-back getToken() submits can race. Updated to reflect the narrower scope.
perf(preconnect-crossorigin): Add a `crossorigin` companion to the existing `<link rel="preconnect">`. Turnstile's `api.js` is fetched anonymously, but the Turnstile iframe issues CORS sub-requests on a separate browser connection pool. Without both hints, the iframe's first handshake still happens on click. Cheap defensive addition.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
intent(token-waste): Production telemetry (12-hour Cloudflare dashboard sample) showed 2,620 challenges issued and 986 solved against only ~150-200 actual pack clicks tracked in GA. Despite the widget being configured with `execution: 'execute'`, calling `turnstile.render()` at form-mount time was inflating the dashboard's challenge counters for every visitor: humans, crawlers, ad-blocked browsers, abandoned tabs. Cloudflare's docs say render() should be side-effect free in execute mode, but the analytics disagree.
fix(prewarm-scope): Drop the widget render from `setContainer()`. Pre-warm now only calls `loadTurnstileScript()` so the script is cached before the user clicks; the actual `turnstile.render()` happens on the first `getToken()` call. This restores the documented 1:1 relationship between solved challenges and actual pack submissions.
perf(preconnect): Add `preconnect` and `dns-prefetch` hints to challenges.cloudflare.com so the DNS / TLS / HTTP/2 handshake is warm before the click. Compensates for losing the render() pre-warm — the script load and the challenge round-trip both reuse the warmed connection.
learned(don't-trust-docs-when-telemetry-disagrees): When the dashboard says one thing and the docs say another, the dashboard wins. The previous PR added pre-warm believing render() was inert in execute mode; the analytics showed it wasn't. Going forward, treat any new render() call site as a billable side effect until proved otherwise on the dashboard.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>