`bun install` fires the `prepare` lifecycle hook before `setup-node`
runs, so it executes against the runner's default Node (currently
< 22). `node --run` is unrecognized there and `bun install` aborts
with exit code 9, breaking every Bun matrix job.
Other inner `node --run` script changes (browser, website-client,
website-server lint, browser build-all) are kept as-is — they only
fire when explicitly invoked from CI steps that have already run
`setup-node`, so the Node-version prerequisite is guaranteed.
Reverts the `prepare` portion of 5a1423e.
Follow-up to commit 042750c, which only converted workflow-level
invocations. With Node.js 22 as the floor, the chained scripts inside
each package.json can also use `node --run` directly, dropping the
intermediate npm process when these scripts run.
- root `prepare`
- browser `build-all`, `lint`
- website/client `lint`
- website/server `lint`
Node.js 20 reaches end-of-life on 2026-04-30, so raise the minimum
supported version to 22 (the next active LTS) and add Node.js 26 to the
CI matrix as the current release line.
- Bump engines.node to >=22.0.0 in package.json and scripts/memory
- Update CI matrix to [22.x, 24.x, 26.x] (drop 20.x and 25.x; 25.x EOL 2026-06)
- Update test-action.yml matrix to [22, 24, 26]
- Drop the obsolete `node --run` workaround comment in ci.yml since
`node --run` is supported on all matrix versions
- Update Node.js version mentions in English docs, llms-install.md,
configShard, bug report template, and code samples in hi/vi
github-actions guides
Dockerfile (node:22-slim) is intentionally left at the minimum supported
version so the published image confirms Repomix runs on the floor.
Carries upstream Dart 3.10 dot-shorthand and external-member fixes,
unblocking the additions in this PR. Previously deferred by
.npmrc min-release-age=7; 0.1.17 was published 2026-04-22 so the
window is now clear.
intent(schema-generation): website-generate-schema silently threw after the zod→valibot migration because generateSchema.ts still called z.toJSONSchema on a valibot schema — the script's `.catch(console.error)` was hiding the TypeError in CI
decision(schema-converter): @valibot/to-json-schema over maintaining a parallel zod schema — keeps the source of truth in src/config/configSchema.ts as the only schema definition
constraint(vscode-intellisense): must preserve additionalProperties: false on generated objects so editors still flag typos in repomix.config.json — @valibot/to-json-schema omits it, so generateSchema now walks the tree to add it
learned(valibot-json-schema-gap): v.object strips unknown keys at runtime but the converter does not emit additionalProperties: false. Post-processing is the lightest-touch fix; switching to v.strictObject would make runtime parsing throw on extra keys, which is a breaking change for existing users
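The post-processing step described above can be sketched roughly as follows. This is a minimal illustration, not the code in generateSchema.ts: the `SchemaNode` type, the function name `addAdditionalPropertiesFalse`, and the sample schema are all assumptions made for the example, and the walk is deliberately naive (it treats any node with `type: 'object'` and a `properties` map as an object schema).

```typescript
// Minimal JSON-Schema-like node type for this sketch (hypothetical, not the
// converter's real output type).
type SchemaNode = { [key: string]: unknown };

function isPlainObject(value: unknown): value is SchemaNode {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Recursively visit every nested value and mark object schemas as closed.
function addAdditionalPropertiesFalse(node: unknown): void {
  if (Array.isArray(node)) {
    for (const item of node) addAdditionalPropertiesFalse(item);
    return;
  }
  if (!isPlainObject(node)) return;

  // Only touch nodes that describe an object and do not already state a policy.
  if (node.type === 'object' && isPlainObject(node.properties) && !('additionalProperties' in node)) {
    node.additionalProperties = false;
  }
  for (const value of Object.values(node)) addAdditionalPropertiesFalse(value);
}

// Sample input standing in for the converter output (made up for this example).
const sampleSchema: SchemaNode = {
  type: 'object',
  properties: {
    output: { type: 'object', properties: { filePath: { type: 'string' } } },
    include: { type: 'array', items: { type: 'string' } },
  },
};

addAdditionalPropertiesFalse(sampleSchema);
console.log(JSON.stringify(sampleSchema, null, 2));
```

Mutating the generated tree keeps the runtime schema untouched, which is the point of the decision above: editors see closed objects, while parsing behaviour for existing configs does not change.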
intent(cold-start): evaluate valibot as a simpler alternative to PR #1484's async deferral of zod
decision(validation-library): valibot v1 picked for smaller module graph and tree-shaking friendliness
decision(error-handling): rename rethrowValidationErrorIfZodError to rethrowValidationErrorIfSchemaError and duck-type both ZodError and ValiError so the helper stays library-agnostic
constraint(module-unwrap): jiti.import returns an ESM Module namespace even with interopDefault — zod v4 quietly unwrapped it, valibot does not, so configLoad manually picks .default
learned(zod-v4-modules): z.object().parse(moduleNamespace) reaches into .default automatically — not documented prominently and was masking the jiti interop gap
learned(cold-start-impact): ~10ms (~6%) faster startup in hyperfine A/B — below the -15ms bar from the task spec, but worth keeping as a branch for the simplicity win
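Two of the notes above (the library-agnostic error check and the manual `.default` unwrap) can be illustrated with a short sketch. The helper names `isSchemaError` and `unwrapDefaultExport` are hypothetical and are not the functions in the repository. The check assumes that both error classes expose a `name` of `ZodError` or `ValiError` plus an `issues` array.

```typescript
// Duck-typed check: no import from zod or valibot is needed.
// Assumption: both error types carry `name` and an `issues` array.
function isSchemaError(error: unknown): error is { name: string; issues: unknown[] } {
  if (typeof error !== 'object' || error === null) return false;
  const candidate = error as { name?: unknown; issues?: unknown };
  return (
    (candidate.name === 'ZodError' || candidate.name === 'ValiError') &&
    Array.isArray(candidate.issues)
  );
}

// Pick `.default` from an ESM namespace-like object, otherwise pass through.
// Caveat: a plain config object that itself has a `default` key would also be
// unwrapped, so a real loader needs to know which shape it was handed.
function unwrapDefaultExport(loaded: unknown): unknown {
  if (typeof loaded === 'object' && loaded !== null && 'default' in loaded) {
    return (loaded as { default: unknown }).default;
  }
  return loaded;
}

// Stand-ins for a thrown validation error and a loaded config module.
const fakeValiError = Object.assign(new Error('Invalid type'), {
  name: 'ValiError',
  issues: [{ message: 'Invalid type' }],
});
const fakeNamespace = { default: { output: { style: 'xml' } } };

console.log(isSchemaError(fakeValiError));
console.log(isSchemaError(new Error('unrelated')));
console.log(unwrapDefaultExport(fakeNamespace));
```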
The previous @secretlint/profiler optimization (cfc626a) silently regressed
after the merge from main upgraded @secretlint/core to 11.5.0. That release
declares an exact-version peer dep on @secretlint/profiler@11.5.0, which
forced npm to install a nested second copy under
`node_modules/@secretlint/core/node_modules/@secretlint/profiler` alongside
our top-level 11.4.1 profiler. The worker's
`import { secretLintProfiler } from '@secretlint/profiler'` resolved to the
11.4.1 singleton, so the `Object.defineProperty(secretLintProfiler, 'mark', ...)`
no-op patched the wrong instance — the copy @secretlint/core actually calls
at lint time (11.5.0) kept running its O(n^2) PerformanceObserver bookkeeping.
As a result the security phase was back to ~2.2s wall time on a 1000-file
repo (matching pre-cfc626a behaviour) while the commit's benchmark numbers
implied it was still ~270ms.
Switch the neutralization to a strictly more robust primitive: overwrite
`perf_hooks.performance.mark` with a no-op inside the worker thread. Every
copy of @secretlint/profiler — hoisted, nested, or future — calls
`this.perf.mark(...)` on the single Node built-in `performance` object it
received at construction time from `node:perf_hooks`, so a single assignment
on that object silences every profiler instance simultaneously. Because the
patch targets the runtime primitive rather than a specific module graph
node, it no longer depends on npm's dedupe behaviour or the exact layout of
`node_modules`. The worker thread is isolated and runs only secretlint; no
other code in it depends on `performance.mark`, so there is no observable
side effect.
Also removes the now-unused direct `@secretlint/profiler` dependency from
package.json.
1. Both profiler singletons construct `new SecretLintProfiler({ perf: perf_hooks.performance, ... })`
2. Both call `this.perf.mark(marker.type)` for every event
3. Overwriting `perf_hooks.performance.mark` on the shared Node built-in
makes both calls a no-op, so the PerformanceObserver callback never
fires, the `entries` array stays empty, and the O(n^2) `entries.find()`
scan is eliminated
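The three points above can be sketched in a few lines. This is an illustration of the idea rather than the worker's actual code: the helper name `silencePerformanceMark` and the demo mark name are made up for the example, and the cast exists only to satisfy the type checker.

```typescript
import { performance } from 'node:perf_hooks';

// Replace `mark` on the shared built-in object with a no-op. At runtime this
// is a plain property assignment on the `performance` instance.
function silencePerformanceMark(perf: object): void {
  (perf as { mark: unknown }).mark = () => undefined;
}

silencePerformanceMark(performance);

// Any caller that holds the same built-in object now records nothing,
// regardless of which module copy it was imported through.
performance.mark('secretlint-demo-mark');
const recorded = performance.getEntriesByName('secretlint-demo-mark');
console.log(recorded.length);
```

Because the assignment changes process-wide state for that thread, it only makes sense in a context like the one described above, where nothing else in the worker relies on `performance.mark`.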
Verified on this host (4-core Linux container) against origin/main, which
is what the CI `perf-benchmark.yml` workflow compares. Interleaved pairs,
no verbose mode, 40 runs.
| Stat | main (ms) | PR (ms) | Δ |
|--------------|-----------|---------|-----------------|
| min | 1965 | 1914 | -51ms (-2.60%) |
| median | 2181 | 2066 | -115ms (-5.27%) |
| trimmed mean | 2216 | 2103 | -113ms (-5.11%) |
| mean | 2282 | 2159 | -123ms (-5.38%) |
| Stat | main (ms) | PR (ms) | Δ |
|--------------|-----------|---------|-----------------|
| min | 2445 | 2309 | -136ms (-5.56%) |
| median | 2628 | 2495 | -133ms (-5.06%) |
| trimmed mean | 2624 | 2509 | -114ms (-4.36%) |
| mean | 2658 | 2518 | -141ms (-5.29%) |
| Phase              | branch pre-fix | PR          | Δ              |
|--------------------|----------------|-------------|----------------|
| File collection    | ~471 ms        | ~446 ms     | ~-25 ms        |
| File processing    | ~103 ms        | ~95 ms      | ~-8 ms         |
| **Security check** | **~2229 ms**   | **~217 ms** | **~-2012 ms**  |
| Git log tokens     | ~446 ms        | ~451 ms     | ~0             |
| Selective metrics  | ~454 ms        | ~455 ms     | ~0             |
| Output token count | ~610 ms        | ~610 ms     | ~0             |
In non-verbose mode the security-phase work overlaps heavily with other
async phases, so end-to-end savings are smaller than the phase-timer drop.
Verbose mode amplifies the observer cost via extra logger traces, making
the isolation of the profiler bottleneck visible.
- [x] `npm run lint` — 0 errors, 2 pre-existing warnings in unrelated files
- [x] `npm run test` — 1102/1102 tests pass
- [x] Secret detection still works end-to-end (verified by packing a
fixture file containing an RSA private key block; secretlint still
flags and excludes it)
@secretlint/profiler is a module-level singleton loaded transitively by
@secretlint/core. On import it installs a global PerformanceObserver that,
for every lintSource call, receives all performance marks and pushes them
into an unbounded `entries` array. For each incoming mark it also runs an
O(n) `entries.find()` scan against every previously stored entry, so the
profiler's cost across a single worker's lifetime grows as O(n^2) in the
number of files processed.
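The growth pattern can be reproduced with a toy model. This is not secretlint's code: it only imitates "scan all stored entries with `find()`, then append", and the function name `countFindComparisons` is made up for the example. How many marks are emitted per linted file is not modelled.

```typescript
type Entry = { id: number };

// Counts how many predicate calls `entries.find()` makes when every incoming
// mark is checked against all previously stored entries and then appended.
function countFindComparisons(markCount: number): number {
  const entries: Entry[] = [];
  let comparisons = 0;
  for (let id = 0; id < markCount; id++) {
    // Ids are unique, so find() never matches and scans the whole array.
    entries.find((entry) => {
      comparisons++;
      return entry.id === id;
    });
    entries.push({ id });
  }
  return comparisons;
}

console.log(countFindComparisons(1000)); // 499500, i.e. n(n-1)/2
console.log(countFindComparisons(2000)); // 1999000, roughly 4x for 2x the marks
```

Doubling the number of marks roughly quadruples the scan work, which is why the cost only becomes visible once a single worker has processed many files.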
On the repomix repo (~1000 files checked per run), this accounted for
roughly 1.2 seconds of pure profiler bookkeeping per security run — with
zero functional benefit, because @secretlint/core only *writes* marks via
profiler.mark() and never reads getEntries() / getMeasures() during a lint.
Fix: replace `secretLintProfiler.mark` with a no-op at worker module load
via `Object.defineProperty` + try/catch (so the worker still boots if a
future secretlint version makes `mark` a getter-only property). Because
mark() is the only method that calls performance.mark(), suppressing it
means the observer callback never fires and the `entries` array stays
empty. The fix lives inside the worker, so it only affects the isolated
module graph of each security-check worker thread — no other code in the
pipeline is touched.
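A minimal sketch of the guarded no-op patch described above, using a stand-in object instead of the real profiler singleton. `fakeProfiler` and `neutralizeMark` are hypothetical names for illustration. The helper returns `false` when the property cannot be redefined, so the caller can carry on instead of crashing.

```typescript
// Hypothetical stand-in for the imported profiler singleton.
const fakeProfiler = {
  calls: 0,
  mark(_marker: { type: string }): void {
    this.calls++;
  },
};

// Swap `mark` for a no-op; report failure instead of throwing so that the
// surrounding module can still finish loading.
function neutralizeMark(target: object): boolean {
  try {
    Object.defineProperty(target, 'mark', {
      value: () => undefined,
      writable: true,
      configurable: true,
    });
    return true;
  } catch {
    // Reached when the property cannot be redefined, e.g. on a frozen object.
    return false;
  }
}

const patched = neutralizeMark(fakeProfiler);
fakeProfiler.mark({ type: 'demo' });
console.log(patched, fakeProfiler.calls);
```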
Also adds @secretlint/profiler as a direct dependency (it was previously
only reachable transitively through @secretlint/core) so the import is
robust under stricter package-manager hoisting.
Interleaved paired runs on the repomix repo itself (20-40 runs / session,
sequential spawnSync driver):
| Phase | MAIN (ms) | PR (ms) | Δ |
|--------------------------|------------|----------|-----------|
| File collection | ~500 | ~500 | ~0 |
| File processing | ~100 | ~100 | ~0 |
| **Security check** | **~1500** | **~270** | **~-1230** |
| Git diff token calc | ~630 | ~595 | ~-35 |
| Git log token calc | ~640 | ~595 | ~-45 |
| Selective metrics | ~660 | ~620 | ~-40 |
| Output token count | ~830 | ~810 | ~-20 |
The internal security-phase timer drops by ~1.2 seconds per run, which
matches the O(n^2) profiler theory.
| Stat             | MAIN (ms) | PR (ms)  | Δ                 |
|------------------|-----------|----------|-------------------|
| min              | 1890      | 1772     |                   |
| p25              | 2090      | 1937     |                   |
| **median**       | **2220**  | **2075** | **-144 (-6.50%)** |
| p75              | 2338      | 2237     |                   |
| **trimmed mean** | **2207**  | **2087** | **-121 (-5.47%)** |
| paired median Δ  |           |          | **-99ms**         |
(40 interleaved pairs, same machine; best of multiple sessions. Local
machine has high run-to-run variance — sessions range from ~3% to ~6.5%
— so the CI benchmark is the authoritative end-to-end signal.)
- [x] `npm run lint` — 0 errors, 2 pre-existing warnings in unrelated files
- [x] `npm run test` — 1102/1102 tests pass
- [x] Secret detection still works end-to-end (verified by packing a
fixture file containing a fake AWS access key; secretlint still
flags and excludes it)
Add hyperfine to devcontainer and a bench-cores.sh script that runs
hyperfine benchmarks under different CPU core counts using taskset.
This enables verifying performance optimizations across varying levels
of CPU parallelism.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The single binary copy approach doesn't work reliably since worker files
may live outside repomix.cjs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add `bench` and `bench-compare` npm scripts for measuring CLI performance
with hyperfine. `bench` builds first, then runs 10 iterations with 2 warmup
runs. `bench-compare` skips the build step for A/B comparisons against a
pre-built baseline.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace tiktoken (WASM-based) with gpt-tokenizer (pure JS) for token
counting, eliminating ~200ms WASM initialization overhead while keeping
the existing worker pool infrastructure for parallel processing.
Changes:
- Swap tiktoken dependency for gpt-tokenizer in package.json
- Rewrite TokenCounter to use gpt-tokenizer's async dynamic import
with lazy-loaded encoding modules cached at module level
- Add TOKEN_ENCODINGS constant with Zod enum validation in config
schema, replacing unsafe type assertion
- Use { disallowedSpecial: new Set() } to match tiktoken's
encode(content, [], []) behavior (treat all text as plain text)
- Add p50k_edit encoding for backward compatibility
- Update worker to handle async getTokenCounter initialization
- Rewrite tests to use real gpt-tokenizer with exact token counts
The worker pool, parallel chunk processing, and pre-warming
infrastructure are preserved — only the underlying tokenizer changes.
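The "lazy-loaded encoding modules cached at module level" pattern can be sketched independently of the tokenizer library. Everything here is an assumption for illustration: the `Encoder` shape, the `EncoderLoader` type, `getEncoder`, and the demo loader are hypothetical. In the real TokenCounter the loader would be a dynamic import of an encoding module.

```typescript
// Hypothetical encoder shape and loader type for this sketch.
type Encoder = { encode(text: string): number[] };
type EncoderLoader = (encodingName: string) => Promise<Encoder>;

// Module-level cache: one in-flight or resolved promise per encoding name.
const encoderCache = new Map<string, Promise<Encoder>>();

function getEncoder(encodingName: string, load: EncoderLoader): Promise<Encoder> {
  let cached = encoderCache.get(encodingName);
  if (cached === undefined) {
    // Cache the promise itself so concurrent callers share a single load.
    cached = load(encodingName);
    encoderCache.set(encodingName, cached);
  }
  return cached;
}

// Demo loader that counts how often it is invoked (not a real tokenizer).
let loadCalls = 0;
const demoLoader: EncoderLoader = async (_encodingName) => {
  loadCalls++;
  return { encode: (text) => Array.from(text, (ch) => ch.charCodeAt(0)) };
};

const firstRequest = getEncoder('o200k_base', demoLoader);
const secondRequest = getEncoder('o200k_base', demoLoader);
console.log(firstRequest === secondRequest, loadCalls);
```

One design caveat of caching the promise: a rejected load stays cached, so a production version would likely evict the entry on failure.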
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace log-update dependency with picospinner (from tinylibs) to reduce
transitive dependencies. picospinner provides built-in spinner functionality
(frames, symbols, succeed/fail states) that was previously manually
implemented on top of log-update, simplifying cliSpinner.ts.
This removes 12 transitive packages (ansi-escapes, cli-cursor, slice-ansi,
wrap-ansi, string-width, etc.) from the dependency tree.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
`prepack` does not reliably run when installing repomix as a git dependency,
causing the server Docker build to fail (lib/ directory missing).
Change to `prepare`, which runs reliably on git dependency installs.
Also change `npm ci --omit=dev` to `npm prune --omit=dev` in the Dockerfile and CI,
since `npm prune` does not trigger lifecycle scripts (avoiding a `prepare` failure
when devDependencies are no longer available).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
fast-xml-parser has accumulated 10 CVEs (6 in 2026 alone), with a recurring
pattern of incomplete fixes in its DOCTYPE/entity parser. Since Repomix only
uses the XMLBuilder functionality (not the parser), switching to
fast-xml-builder, the standalone builder package that fast-xml-parser v5
internally delegates to, removes the noise from the 9 of those 10 CVEs that
are parser-side while maintaining identical behavior.
- Replace fast-xml-parser (831KB) with fast-xml-builder (176KB) as dependency
- Add @xmldom/xmldom as devDependency for XML validation in tests
- Update import in outputGenerate.ts (named → default export)
- Migrate test XML parsing from fast-xml-parser's XMLParser to @xmldom/xmldom's
DOMParser, providing cross-implementation validation of generated XML
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The `prepare` script runs on `npm ci` and `npm install`, which caused
Docker and CI failures when devDependencies (rimraf, tsc) were unavailable.
`prepack` only runs on `npm pack`, `npm publish`, and git dependency
install, which avoids the issue while still supporting the website/server
git dependency requirement.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The previous ZIP-based archive download used fflate's in-memory extraction,
which failed on large repositories (e.g. facebook/react) due to memory
constraints and ZIP64 limitations.
Switch to the tar.gz format, using Node.js built-in zlib plus the tar package,
enabling a full streaming pipeline (HTTP response -> gunzip -> tar extract -> disk)
with no temporary files and constant memory usage regardless of repo size.
Key changes:
- Replace fflate with tar package for archive extraction
- Change archive URLs from .zip to .tar.gz
- Use streaming pipeline instead of download-then-extract
- Leverage tar's built-in strip and path traversal protection
- Explicitly destroy streams after pipeline for Bun compatibility
- Use child_process runtime under Bun to avoid worker_threads hang
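The streaming shape described above can be sketched with Node.js built-ins only. This is an assumption-laden illustration, not the repository's implementation: the tar extraction stage is represented by an injected `sink` stream, the HTTP body by an in-memory source, and `gunzipInto` is a made-up name.

```typescript
import { Readable, Writable } from 'node:stream';
import { pipeline } from 'node:stream/promises';
import { createGunzip, gzipSync } from 'node:zlib';

// Streams a gzip-compressed source through gunzip into `sink` chunk by chunk,
// without buffering the whole payload. In the pipeline described above the
// sink would be a tar extractor writing to disk; here it is injected.
async function gunzipInto(source: Readable, sink: Writable): Promise<void> {
  const gunzip = createGunzip();
  try {
    await pipeline(source, gunzip, sink);
  } finally {
    // Explicit teardown, mirroring the "destroy streams after pipeline" note.
    source.destroy();
    gunzip.destroy();
  }
}

// Demo: an in-memory source and a collecting sink stand in for the HTTP
// response body and the tar extractor.
const collected: Buffer[] = [];
const demoSink = new Writable({
  write(chunk: Buffer, _encoding, callback) {
    collected.push(chunk);
    callback();
  },
});
const demoSource = Readable.from([gzipSync(Buffer.from('hello tar.gz'))]);
const demoDone = gunzipInto(demoSource, demoSink);
```

Backpressure is handled by `pipeline`, so memory use is bounded by the stream buffers rather than by the archive size.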
Source maps (.js.map, .d.ts.map) were included in the npm package but
served no purpose since the original TypeScript source files are not
published. Without sourcesContent or the actual .ts files, these maps
cannot be used for debugging or "Go to Definition" functionality.
Changes:
- Disable sourceMap and declarationMap in tsconfig.build.json
- Enable removeComments to further reduce package size
- Explicitly set declaration: true for clarity
- Remove --sourceMap flag from build script (now handled by tsconfig)
This reduces the lib/ directory size from 2.4MB to 1.2MB (~50% reduction).
Add Docker Compose configuration to run the server in bundled mode locally,
similar to the production Cloud Run environment. This enables testing of
esbuild bundling, compile cache, and startup performance without deploying.
Usage: npm run website-bundle
Enable building repomix when installed directly from GitHub repository.
The `prepare` script runs `npm run build` and is triggered when:
- Installing from GitHub (e.g., `npm install github:yamadashy/repomix#main`)
- Running `npm install` locally for development
This does not affect npm registry installs since the published tarball
already includes the pre-built lib/ directory.
- Update web-tree-sitter from ^0.25.10 to ^0.26.3
- Update @repomix/tree-sitter-wasms from ^0.1.15 to ^0.1.16
The tree-sitter-wasms package was rebuilt with tree-sitter-cli 0.26.x
to ensure WASM compatibility with the new web-tree-sitter version.
Key changes in web-tree-sitter 0.26.x:
- Rewritten in TypeScript with better type definitions
- ESM and CommonJS dual module support
- WASM build switched from emscripten to wasi-sdk