repomix-mirror/tests/core/packager/splitOutput.test.ts
Claude fb4c895085 perf(core): Pre-warm security worker pool to overlap @secretlint/core load
The security worker pool currently spawns its 2 workers lazily inside
`runSecurityCheck`, paying a ~50 ms `@secretlint/core` +
`@secretlint/secretlint-rule-preset-recommend` module load on each
freshly spawned worker (~100 ms wall-clock for both workers loading
concurrently). That cold-start cost runs on the critical path inside
the security-check phase, before any scanning begins.

Mirror the existing `createMetricsTaskRunner` pattern: hoist the pool
construction to `pack()` and dispatch one no-op task per worker at the
pipeline entry, so the module load overlaps with the collectFiles + git
ops phase (~200 ms) instead of stalling the security check.

## Mechanism

- New `createSecurityTaskRunner(numOfTasks, deps?)` in
  `src/core/security/securityCheck.ts` returns
  `{ taskRunner, warmupPromise }`. The warm-up dispatches `maxThreads`
  no-op tasks (`{ items: [] }`) — Tinypool spawns a fresh worker for
  each concurrent task, fanning out the @secretlint/core load across
  all workers in parallel.
- `runSecurityCheck` accepts an optional `taskRunner` in `deps`. When
  provided, the caller owns the pool's lifecycle (creation + cleanup);
  when omitted, runSecurityCheck creates and cleans up a fresh pool —
  preserving the existing behavior for direct callers (e.g. the MCP
  fileSystemReadFileTool path).
- `validateFileSafety` accepts and forwards an optional `taskRunner`.
- `pack()` calls `createSecurityTaskRunner` after `searchFiles` resolves
  (file count is now known) and before the parallel collectFiles + git
  ops block, so the warm-up runs concurrently with disk I/O. The
  task runner is plumbed through `validateFileSafety` deps; the pool
  is cleaned up alongside the metrics pool in the surrounding
  try/finally.
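
The warm-up and ownership rules above can be sketched roughly as follows. This is a minimal illustration, not the code in `src/core/security/securityCheck.ts`: `TaskRunner`, `RunnerFactory`, `createWarmRunner`, and `runCheck` are hypothetical names, the pool is abstracted behind a factory instead of Tinypool, and swallowing warm-up errors is an assumption made here.

```typescript
// Hypothetical minimal shapes; the real repomix types may differ.
interface SecurityTask {
  items: unknown[];
}

interface TaskRunner<T, R> {
  run(task: T): Promise<R>;
  cleanup(): Promise<void>;
}

// Assumed factory: builds a pool-backed runner with `maxThreads` workers.
type RunnerFactory = (maxThreads: number) => TaskRunner<SecurityTask, unknown[]>;

function createWarmRunner(maxThreads: number, createRunner: RunnerFactory) {
  const taskRunner = createRunner(maxThreads);
  // Dispatch one empty task per worker, all at once, so every worker
  // pays its module-load cost in parallel rather than on first real use.
  // Errors are swallowed (assumption): a failed warm-up should not fail the run.
  const warmupPromise = Promise.all(
    Array.from({ length: maxThreads }, () =>
      taskRunner.run({ items: [] }).catch(() => []),
    ),
  );
  return { taskRunner, warmupPromise };
}

// Ownership rule: an injected runner is left alive for its owner to clean up;
// a runner created here is also cleaned up here.
async function runCheck(
  items: unknown[],
  createRunner: RunnerFactory,
  injected?: TaskRunner<SecurityTask, unknown[]>,
): Promise<unknown[]> {
  const runner = injected ?? createRunner(2); // 2 workers, as in the lazy path
  try {
    return await runner.run({ items });
  } finally {
    if (!injected) await runner.cleanup();
  }
}
```

In this sketch the warm-up tasks are dispatched as soon as the runner is constructed, so the caller can go straight on to disk I/O and let the module load proceed in the background.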

## Scope gate

Pre-warming is gated on the same `hasExplicitScope` heuristic that
already differentiates 2- vs. 3-worker metrics warm-up:

| Workload                                         | Pre-warm? |
|--------------------------------------------------|-----------|
| Default scan (no `--include` / `--stdin`)        | yes       |
| `--include`, `config.include`, or `--stdin` set  | no        |

Without the gate, the small/scoped workload regresses by 3.4 % (paired
mean): the security check scans only ~5 batches and finishes in ~50–80
ms, so the up-front cost of constructing + destroying a second worker
pool outweighs the saved cold-start. The unconstrained scan runs
security over 1000+ files, where the hidden cold-start dominates.
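
As a rough sketch of the gate in the table above: `hasExplicitScope` is the name used in this commit, but its body here, the `ScopeOptions` shape, and `shouldPrewarmSecurityPool` are assumptions for illustration, not the repomix config schema.

```typescript
// Hypothetical option shape; field names are illustrative only.
interface ScopeOptions {
  cliInclude?: string; // value passed to --include
  configInclude?: string[]; // config.include
  stdin?: boolean; // --stdin
}

// True when the user narrowed the scan explicitly.
function hasExplicitScope(opts: ScopeOptions): boolean {
  return (
    Boolean(opts.cliInclude) ||
    (opts.configInclude?.length ?? 0) > 0 ||
    opts.stdin === true
  );
}

// Pre-warm only for the default, unscoped scan.
function shouldPrewarmSecurityPool(opts: ScopeOptions): boolean {
  return !hasExplicitScope(opts);
}
```

With these assumptions, `shouldPrewarmSecurityPool({})` is true, while passing `{ cliInclude: 'src,tests' }` or `{ stdin: true }` falls back to the lazy-spawn path.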

## Benchmark — `node bin/repomix.cjs --quiet` (1046 files)

Two independent paired n=50 runs (BEFORE/AFTER interleaved in
alternating order, NODE_DISABLE_COMPILE_CACHE=1). First run:

|        | min     | median  | mean    | max     | sd     |
|--------|---------|---------|---------|---------|--------|
| BEFORE | 1320 ms | 1454 ms | 1451 ms | 1590 ms | 49 ms  |
| AFTER  | 1318 ms | 1410 ms | 1416 ms | 1501 ms | 40 ms  |

- Mean paired Δ:   **+35.2 ms (2.42 % wall-clock reduction)**
- Median paired Δ: +32.5 ms (2.23 %)
- Paired-delta SD: 64.78 ms · paired t = **3.84** (p < 0.001)
- AFTER faster in **39/50** pairs (78 %)

Confirmation run (same setup, n=50): mean Δ +37.0 ms (2.55 %),
t = 3.93, 36/50 pairs faster.
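
For reference, the paired t statistic reported for the first run follows from the mean and standard deviation of the per-pair deltas, t = mean / (sd / sqrt(n)). The helper below is a hypothetical check of that arithmetic, not part of the benchmark harness.

```typescript
// Paired t statistic from summary statistics of the per-pair deltas.
function pairedT(meanDelta: number, sdDelta: number, n: number): number {
  const standardError = sdDelta / Math.sqrt(n);
  return meanDelta / standardError;
}

// First run: mean delta 35.2 ms, delta SD 64.78 ms, n = 50 pairs.
const t = pairedT(35.2, 64.78, 50);
console.log(t.toFixed(2)); // prints "3.84"
```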

## Regression check — `--include 'src,tests' --quiet` (258 files)

n=30 paired interleaved, NODE_DISABLE_COMPILE_CACHE=1:

|        | min    | median | mean   | max    |
|--------|--------|--------|--------|--------|
| BEFORE | 670 ms | 732 ms | 730 ms | 783 ms |
| AFTER  | 688 ms | 728 ms | 729 ms | 786 ms |

- Mean paired Δ:   +0.9 ms (0.13 %) — **neutral within noise**
  (paired t = 0.17)
- AFTER faster in 16/30 pairs

The gate falls back to the original lazy-spawn path on this workload,
so AFTER == BEFORE up to noise. Without the gate this workload
regresses by 3.4 % paired (t = -4.88).

## Correctness

- All **1260** unit tests pass (`npm test`); `npm run lint` clean
  (only the two pre-existing `biome-ignore` warnings unrelated to
  this change).
- XML output **byte-identical** between BEFORE and AFTER on both the
  default 1046-file workload and the `--include 'src,tests'`
  258-file workload (verified via `diff` on full ~4.85 MB outputs).
- `runSecurityCheck`'s public signature gains an optional `taskRunner`
  in deps; when omitted, behavior is unchanged. Existing callers
  outside the pack pipeline (e.g. MCP `fileSystemReadFileTool`) still
  spawn their own pool.
- The MCP main-thread security path is unaffected — it uses
  `runSecretLint` directly (worker module loaded once at process
  start) and never goes through the pool.

## Tests

- `tests/core/security/validateFileSafety.test.ts` — assertion on the
  `runSecurityCheck` call updated to include the new `{ taskRunner }`
  deps argument (currently undefined when no pre-warmed runner is
  provided).
- `tests/core/packager.test.ts`,
  `tests/core/packager/diffsFunctionality.test.ts`,
  `tests/core/packager/splitOutput.test.ts`,
  `tests/integration-tests/packager.test.ts` — extended `mockDeps` /
  `baseDeps` with a stubbed `createSecurityTaskRunner` so the default
  scope path no longer attempts to spawn a real worker pool from the
  test environment. The pack-level assertion on `validateFileSafety`
  now matches the new 6th-argument deps object via
  `expect.objectContaining({ taskRunner: expect.any(Object) })`.
2026-05-08 17:05:51 +00:00


import { describe, expect, it, vi } from 'vitest';
import { pack } from '../../../src/core/packager.js';
import { createMockConfig } from '../../testing/testUtils.js';

describe('packager split output', () => {
  it('passes split output results correctly through the packager', async () => {
    const processedFiles = [
      { path: 'a/file1.txt', content: '11111' },
      { path: 'b/file2.txt', content: '22222' },
    ];
    const allFilePaths = ['a/file1.txt', 'b/file2.txt'];

    const mockConfig = createMockConfig({
      cwd: '/test',
      output: {
        filePath: 'repomix-output.xml',
        splitOutput: 12,
        copyToClipboard: false,
        stdout: false,
        git: {
          includeDiffs: false,
          includeLogs: false,
        },
      },
    });

    const produceOutput = vi.fn().mockResolvedValue({
      outputFiles: ['repomix-output.1.xml', 'repomix-output.2.xml'],
      outputForMetrics: ['x'.repeat(10), 'x'.repeat(10)],
    });

    const calculateMetrics = vi.fn().mockResolvedValue({
      totalFiles: 2,
      totalCharacters: 0,
      totalTokens: 0,
      fileCharCounts: {},
      fileTokenCounts: {},
      gitDiffTokenCount: 0,
      gitLogTokenCount: 0,
    });

    const result = await pack(['root'], mockConfig, () => {}, {
      searchFiles: vi.fn().mockResolvedValue({ filePaths: allFilePaths, emptyDirPaths: [] }),
      sortPaths: vi.fn().mockImplementation((paths) => paths),
      collectFiles: vi.fn().mockResolvedValue({ rawFiles: processedFiles, skippedFiles: [] }),
      processFiles: vi.fn().mockReturnValue(processedFiles),
      validateFileSafety: vi.fn().mockResolvedValue({
        safeFilePaths: allFilePaths,
        safeRawFiles: processedFiles,
        suspiciousFilesResults: [],
        suspiciousGitDiffResults: [],
        suspiciousGitLogResults: [],
      }),
      getGitDiffs: vi.fn().mockResolvedValue(undefined),
      getGitLogs: vi.fn().mockResolvedValue(undefined),
      produceOutput,
      calculateMetrics,
      createMetricsTaskRunner: vi.fn().mockReturnValue({
        taskRunner: {
          run: vi.fn().mockResolvedValue(0),
          cleanup: vi.fn().mockResolvedValue(undefined),
        },
        warmupPromise: Promise.resolve(),
      }),
      createSecurityTaskRunner: vi.fn().mockReturnValue({
        taskRunner: {
          run: vi.fn().mockResolvedValue([]),
          cleanup: vi.fn().mockResolvedValue(undefined),
        },
        warmupPromise: Promise.resolve(),
      }),
    });

    expect(produceOutput).toHaveBeenCalledWith(
      ['root'],
      mockConfig,
      processedFiles,
      allFilePaths,
      undefined,
      undefined,
      expect.any(Function),
      [{ rootLabel: 'root', files: allFilePaths }],
      undefined,
    );

    expect(calculateMetrics).toHaveBeenCalledWith(
      processedFiles,
      expect.anything(),
      expect.anything(),
      mockConfig,
      undefined,
      undefined,
      expect.objectContaining({ taskRunner: expect.anything() }),
    );

    // Verify that calculateMetrics received a promise that resolves to the expected split output
    const outputArg = calculateMetrics.mock.calls[0][1];
    await expect(outputArg).resolves.toEqual(['x'.repeat(10), 'x'.repeat(10)]);

    expect(result.outputFiles).toEqual(['repomix-output.1.xml', 'repomix-output.2.xml']);
  });
});