mirror of
https://github.com/yamadashy/repomix.git
synced 2026-05-30 11:18:53 +02:00
fb4c895085
The security worker pool currently spawns its 2 workers lazily inside
`runSecurityCheck`, paying a ~50 ms `@secretlint/core` +
`@secretlint/secretlint-rule-preset-recommend` module load on each
freshly spawned worker (~100 ms wall-clock for both workers loading
concurrently). That cold-start cost runs on the critical path inside
the security-check phase, before any scanning begins.
Mirror the existing `createMetricsTaskRunner` pattern: hoist the pool
construction to `pack()` and dispatch one no-op task per worker at the
pipeline entry, so the module load overlaps with the collectFiles + git
ops phase (~200 ms) instead of stalling the security check.
## Mechanism
- New `createSecurityTaskRunner(numOfTasks, deps?)` in
`src/core/security/securityCheck.ts` returns
`{ taskRunner, warmupPromise }`. The warm-up dispatches `maxThreads`
no-op tasks (`{ items: [] }`) — Tinypool spawns a fresh worker for
each concurrent task, fanning out the @secretlint/core load across
all workers in parallel.
- `runSecurityCheck` accepts an optional `taskRunner` in `deps`. When
provided, the caller owns the pool's lifecycle (creation + cleanup);
when omitted, runSecurityCheck creates and cleans up a fresh pool —
preserving the existing behavior for direct callers (e.g. the MCP
fileSystemReadFileTool path).
- `validateFileSafety` accepts and forwards an optional `taskRunner`.
- `pack()` calls `createSecurityTaskRunner` after `searchFiles` resolves
(file count is now known) and before the parallel collectFiles + git
ops block, so the warm-up runs concurrently with disk I/O. The
task runner is plumbed through `validateFileSafety` deps; the pool
is cleaned up alongside the metrics pool in the surrounding
try/finally.
## Scope gate
Pre-warming is gated on the same `hasExplicitScope` heuristic that
already differentiates 2- vs. 3-worker metrics warm-up:
| Workload | Pre-warm? |
|--------------------------------------------------|-----------|
| Default scan (no `--include` / `--stdin`) | yes |
| `--include`, `config.include`, or `--stdin` set | no |
Without the gate, the small/scoped workload regresses by 3.4 % paired
mean: the security check scans only ~5 batches and finishes in ~50–80
ms, so the up-front cost of constructing + destroying a second worker
pool outweighs the saved cold-start. The unconstrained scan runs
security over ~1000+ files where the hidden cold-start dominates.
## Benchmark — `node bin/repomix.cjs --quiet` (1046 files)
Two independent paired n=50 runs (interleaved BEFORE/AFTER alternating
order, NODE_DISABLE_COMPILE_CACHE=1):
| | min | median | mean | max | sd |
|--------|---------|---------|---------|---------|--------|
| BEFORE | 1320 ms | 1454 ms | 1451 ms | 1590 ms | 49 ms |
| AFTER | 1318 ms | 1410 ms | 1416 ms | 1501 ms | 40 ms |
- Mean paired Δ: **+35.2 ms (2.42 % wall-clock reduction)**
- Median paired Δ: +32.5 ms (2.23 %)
- Paired-delta SD: 64.78 ms · paired t = **3.84** (p < 0.001)
- AFTER faster in **39/50** pairs (78 %)
Confirmation run (same setup, n=50): mean Δ +37.0 ms (2.55 %),
t = 3.93, 36/50 pairs faster.
## Regression check — `--include 'src,tests' --quiet` (258 files)
n=30 paired interleaved, NODE_DISABLE_COMPILE_CACHE=1:
| | min | median | mean | max |
|--------|--------|--------|--------|--------|
| BEFORE | 670 ms | 732 ms | 730 ms | 783 ms |
| AFTER | 688 ms | 728 ms | 729 ms | 786 ms |
- Mean paired Δ: +0.9 ms (0.13 %) — **neutral within noise**
(paired t = 0.17)
- AFTER faster in 16/30 pairs
The gate falls back to the original lazy-spawn path on this workload,
so AFTER == BEFORE up to noise. Without the gate this workload
regresses by 3.4 % paired (t = -4.88).
## Correctness
- All **1260** unit tests pass (`npm test`); `npm run lint` clean
(only the two pre-existing `biome-ignore` warnings unrelated to
this change).
- XML output **byte-identical** between BEFORE and AFTER on both the
default 1046-file workload and the `--include 'src,tests'`
258-file workload (verified via `diff` on full ~4.85 MB outputs).
- `runSecurityCheck`'s public signature gains an optional `taskRunner`
in deps; when omitted, behavior is unchanged. Existing callers
outside the pack pipeline (e.g. MCP `fileSystemReadFileTool`) still
spawn their own pool.
- The MCP main-thread security path is unaffected — it uses
`runSecretLint` directly (worker module loaded once at process
start) and never goes through the pool.
## Tests
- `tests/core/security/validateFileSafety.test.ts` — assertion on the
`runSecurityCheck` call updated to include the new `{ taskRunner }`
deps argument (currently undefined when no pre-warmed runner is
provided).
- `tests/core/packager.test.ts`,
`tests/core/packager/diffsFunctionality.test.ts`,
`tests/core/packager/splitOutput.test.ts`,
`tests/integration-tests/packager.test.ts` — extended `mockDeps` /
`baseDeps` with a stubbed `createSecurityTaskRunner` so the default
scope path no longer attempts to spawn a real worker pool from the
test environment. The pack-level assertion on `validateFileSafety`
now matches the new 6th-argument deps object via
`expect.objectContaining({ taskRunner: expect.any(Object) })`.
118 lines
5.2 KiB
TypeScript
118 lines
5.2 KiB
TypeScript
import { afterEach, describe, expect, it, vi } from 'vitest';
|
|
import type { RepomixConfigMerged } from '../../../src/config/configSchema.js';
|
|
import type { RawFile } from '../../../src/core/file/fileTypes.js';
|
|
import type { SuspiciousFileResult } from '../../../src/core/security/securityCheck.js';
|
|
import { validateFileSafety } from '../../../src/core/security/validateFileSafety.js';
|
|
import { logger } from '../../../src/shared/logger.js';
|
|
import type { RepomixProgressCallback } from '../../../src/shared/types.js';
|
|
|
|
describe('validateFileSafety', () => {
|
|
it('should validate file safety and return safe files and paths', async () => {
|
|
const rawFiles: RawFile[] = [
|
|
{ path: 'file1.txt', content: 'content1' },
|
|
{ path: 'file2.txt', content: 'content2' },
|
|
{ path: 'file3.txt', content: 'content3' },
|
|
];
|
|
const safeRawFiles = [rawFiles[0], rawFiles[1]];
|
|
const config: RepomixConfigMerged = {
|
|
security: { enableSecurityCheck: true },
|
|
} as RepomixConfigMerged;
|
|
const progressCallback: RepomixProgressCallback = vi.fn();
|
|
const suspiciousFilesResults: SuspiciousFileResult[] = [
|
|
{ filePath: 'file2.txt', messages: ['something suspicious.'], type: 'file' },
|
|
];
|
|
const deps = {
|
|
runSecurityCheck: vi.fn().mockResolvedValue(suspiciousFilesResults),
|
|
filterOutUntrustedFiles: vi.fn().mockReturnValue(safeRawFiles),
|
|
};
|
|
|
|
const result = await validateFileSafety(rawFiles, progressCallback, config, undefined, undefined, deps);
|
|
|
|
expect(deps.runSecurityCheck).toHaveBeenCalledWith(rawFiles, progressCallback, undefined, undefined, {
|
|
taskRunner: undefined,
|
|
});
|
|
expect(deps.filterOutUntrustedFiles).toHaveBeenCalledWith(rawFiles, suspiciousFilesResults);
|
|
expect(result).toEqual({
|
|
safeRawFiles,
|
|
safeFilePaths: ['file1.txt', 'file2.txt'],
|
|
suspiciousFilesResults,
|
|
suspiciousGitDiffResults: [],
|
|
suspiciousGitLogResults: [],
|
|
});
|
|
});
|
|
|
|
describe('git content warnings', () => {
|
|
afterEach(() => {
|
|
vi.restoreAllMocks();
|
|
});
|
|
|
|
it('warns about each suspicious git diff/log entry with singular/plural form', async () => {
|
|
const warnSpy = vi.spyOn(logger, 'warn').mockImplementation(() => {});
|
|
const config: RepomixConfigMerged = {
|
|
security: { enableSecurityCheck: true },
|
|
} as RepomixConfigMerged;
|
|
const progressCallback: RepomixProgressCallback = vi.fn();
|
|
|
|
const allResults: SuspiciousFileResult[] = [
|
|
{ filePath: 'work_tree', messages: ['leaked key'], type: 'gitDiff' },
|
|
{ filePath: 'staged', messages: ['leaked key', 'token'], type: 'gitDiff' },
|
|
{ filePath: 'commit_log', messages: ['secret', 'pw', 'token'], type: 'gitLog' },
|
|
];
|
|
const deps = {
|
|
runSecurityCheck: vi.fn().mockResolvedValue(allResults),
|
|
filterOutUntrustedFiles: vi.fn().mockReturnValue([]),
|
|
};
|
|
|
|
const result = await validateFileSafety([], progressCallback, config, undefined, undefined, deps);
|
|
|
|
expect(result.suspiciousGitDiffResults).toHaveLength(2);
|
|
expect(result.suspiciousGitLogResults).toHaveLength(1);
|
|
|
|
const warnings = warnSpy.mock.calls.map((c) => String(c[0]));
|
|
// Header lines for each section
|
|
expect(warnings).toContain('Security issues found in Git diffs, but they will still be included in the output');
|
|
expect(warnings).toContain('Security issues found in Git logs, but they will still be included in the output');
|
|
// Singular form for 1 message, plural for >1
|
|
expect(warnings).toContain(' - work_tree: 1 issue detected');
|
|
expect(warnings).toContain(' - staged: 2 issues detected');
|
|
expect(warnings).toContain(' - commit_log: 3 issues detected');
|
|
});
|
|
|
|
it('does not log a header when no suspicious git content is found', async () => {
|
|
const warnSpy = vi.spyOn(logger, 'warn').mockImplementation(() => {});
|
|
const config: RepomixConfigMerged = {
|
|
security: { enableSecurityCheck: true },
|
|
} as RepomixConfigMerged;
|
|
|
|
await validateFileSafety([], vi.fn(), config, undefined, undefined, {
|
|
runSecurityCheck: vi.fn().mockResolvedValue([]),
|
|
filterOutUntrustedFiles: vi.fn().mockReturnValue([]),
|
|
});
|
|
|
|
expect(warnSpy).not.toHaveBeenCalled();
|
|
});
|
|
});
|
|
|
|
it('skips runSecurityCheck entirely when enableSecurityCheck is false', async () => {
|
|
// Pin the negative path of the `if (config.security.enableSecurityCheck)` guard.
|
|
// Dropping the guard would still pass every other test in this file because
|
|
// they all enable the check; this one fails if the guard ever regresses.
|
|
const config: RepomixConfigMerged = {
|
|
security: { enableSecurityCheck: false },
|
|
} as RepomixConfigMerged;
|
|
const rawFiles: RawFile[] = [{ path: 'file1.txt', content: 'content' }];
|
|
const deps = {
|
|
runSecurityCheck: vi.fn(),
|
|
filterOutUntrustedFiles: vi.fn().mockReturnValue(rawFiles),
|
|
};
|
|
|
|
const result = await validateFileSafety(rawFiles, vi.fn(), config, undefined, undefined, deps);
|
|
|
|
expect(deps.runSecurityCheck).not.toHaveBeenCalled();
|
|
expect(result.suspiciousFilesResults).toEqual([]);
|
|
expect(result.suspiciousGitDiffResults).toEqual([]);
|
|
expect(result.suspiciousGitLogResults).toEqual([]);
|
|
expect(result.safeFilePaths).toEqual(['file1.txt']);
|
|
});
|
|
});
|