mirror of
https://github.com/yamadashy/repomix.git
synced 2026-05-30 11:18:53 +02:00
fb4c895085
The security worker pool currently spawns its 2 workers lazily inside
`runSecurityCheck`, paying a ~50 ms `@secretlint/core` +
`@secretlint/secretlint-rule-preset-recommend` module load on each
freshly spawned worker (~100 ms wall-clock for both workers loading
concurrently). That cold-start cost runs on the critical path inside
the security-check phase, before any scanning begins.
Mirror the existing `createMetricsTaskRunner` pattern: hoist the pool
construction to `pack()` and dispatch one no-op task per worker at the
pipeline entry, so the module load overlaps with the collectFiles + git
ops phase (~200 ms) instead of stalling the security check.
## Mechanism
- New `createSecurityTaskRunner(numOfTasks, deps?)` in
`src/core/security/securityCheck.ts` returns
`{ taskRunner, warmupPromise }`. The warm-up dispatches `maxThreads`
no-op tasks (`{ items: [] }`) — Tinypool spawns a fresh worker for
each concurrent task, fanning out the @secretlint/core load across
all workers in parallel.
- `runSecurityCheck` accepts an optional `taskRunner` in `deps`. When
provided, the caller owns the pool's lifecycle (creation + cleanup);
when omitted, runSecurityCheck creates and cleans up a fresh pool —
preserving the existing behavior for direct callers (e.g. the MCP
fileSystemReadFileTool path).
- `validateFileSafety` accepts and forwards an optional `taskRunner`.
- `pack()` calls `createSecurityTaskRunner` after `searchFiles` resolves
(file count is now known) and before the parallel collectFiles + git
ops block, so the warm-up runs concurrently with disk I/O. The
task runner is plumbed through `validateFileSafety` deps; the pool
is cleaned up alongside the metrics pool in the surrounding
try/finally.
## Scope gate
Pre-warming is gated on the same `hasExplicitScope` heuristic that
already differentiates 2- vs. 3-worker metrics warm-up:
| Workload | Pre-warm? |
|--------------------------------------------------|-----------|
| Default scan (no `--include` / `--stdin`) | yes |
| `--include`, `config.include`, or `--stdin` set | no |
Without the gate, the small/scoped workload regresses by 3.4 % paired
mean: the security check scans only ~5 batches and finishes in ~50–80
ms, so the up-front cost of constructing + destroying a second worker
pool outweighs the saved cold-start. The unconstrained scan runs
security over ~1000+ files where the hidden cold-start dominates.
## Benchmark — `node bin/repomix.cjs --quiet` (1046 files)
Two independent paired n=50 runs (interleaved BEFORE/AFTER alternating
order, NODE_DISABLE_COMPILE_CACHE=1):
| | min | median | mean | max | sd |
|--------|---------|---------|---------|---------|--------|
| BEFORE | 1320 ms | 1454 ms | 1451 ms | 1590 ms | 49 ms |
| AFTER | 1318 ms | 1410 ms | 1416 ms | 1501 ms | 40 ms |
- Mean paired Δ: **+35.2 ms (2.42 % wall-clock reduction)**
- Median paired Δ: +32.5 ms (2.23 %)
- Paired-delta SD: 64.78 ms · paired t = **3.84** (p < 0.001)
- AFTER faster in **39/50** pairs (78 %)
Confirmation run (same setup, n=50): mean Δ +37.0 ms (2.55 %),
t = 3.93, 36/50 pairs faster.
## Regression check — `--include 'src,tests' --quiet` (258 files)
n=30 paired interleaved, NODE_DISABLE_COMPILE_CACHE=1:
| | min | median | mean | max |
|--------|--------|--------|--------|--------|
| BEFORE | 670 ms | 732 ms | 730 ms | 783 ms |
| AFTER | 688 ms | 728 ms | 729 ms | 786 ms |
- Mean paired Δ: +0.9 ms (0.13 %) — **neutral within noise**
(paired t = 0.17)
- AFTER faster in 16/30 pairs
The gate falls back to the original lazy-spawn path on this workload,
so AFTER == BEFORE up to noise. Without the gate this workload
regresses by 3.4 % paired (t = -4.88).
## Correctness
- All **1260** unit tests pass (`npm test`); `npm run lint` clean
(only the two pre-existing `biome-ignore` warnings unrelated to
this change).
- XML output **byte-identical** between BEFORE and AFTER on both the
default 1046-file workload and the `--include 'src,tests'`
258-file workload (verified via `diff` on full ~4.85 MB outputs).
- `runSecurityCheck`'s public signature gains an optional `taskRunner`
in deps; when omitted, behavior is unchanged. Existing callers
outside the pack pipeline (e.g. MCP `fileSystemReadFileTool`) still
spawn their own pool.
- The MCP main-thread security path is unaffected — it uses
`runSecretLint` directly (worker module loaded once at process
start) and never goes through the pool.
## Tests
- `tests/core/security/validateFileSafety.test.ts` — assertion on the
`runSecurityCheck` call updated to include the new `{ taskRunner }`
deps argument (currently undefined when no pre-warmed runner is
provided).
- `tests/core/packager.test.ts`,
`tests/core/packager/diffsFunctionality.test.ts`,
`tests/core/packager/splitOutput.test.ts`,
`tests/integration-tests/packager.test.ts` — extended `mockDeps` /
`baseDeps` with a stubbed `createSecurityTaskRunner` so the default
scope path no longer attempts to spawn a real worker pool from the
test environment. The pack-level assertion on `validateFileSafety`
now matches the new 6th-argument deps object via
`expect.objectContaining({ taskRunner: expect.any(Object) })`.
178 lines
6.0 KiB
TypeScript
178 lines
6.0 KiB
TypeScript
import { beforeEach, describe, expect, test, vi } from 'vitest';
|
|
import type { RepomixConfigMerged } from '../../../src/config/configSchema.js';
|
|
import type { ProcessedFile } from '../../../src/core/file/fileTypes.js';
|
|
import * as gitDiffModule from '../../../src/core/git/gitDiffHandle.js';
|
|
import * as gitRepositoryModule from '../../../src/core/git/gitRepositoryHandle.js';
|
|
import { pack } from '../../../src/core/packager.js';
|
|
import { createMockConfig } from '../../testing/testUtils.js';
|
|
|
|
// Mock the dependencies
|
|
vi.mock('../../../src/core/git/gitDiffHandle.js', () => ({
|
|
getWorkTreeDiff: vi.fn(),
|
|
getStagedDiff: vi.fn(),
|
|
getGitDiffs: vi.fn(),
|
|
}));
|
|
|
|
vi.mock('../../../src/core/git/gitRepositoryHandle.js', () => ({
|
|
isGitRepository: vi.fn(),
|
|
isGitInstalled: vi.fn().mockResolvedValue(false),
|
|
getFileChangeCount: vi.fn().mockResolvedValue({}),
|
|
}));
|
|
|
|
describe('Git Diffs Functionality', () => {
|
|
let mockConfig: RepomixConfigMerged;
|
|
const mockRootDir = '/test/repo';
|
|
const sampleDiff = `diff --git a/file1.js b/file1.js
|
|
index 123..456 100644
|
|
--- a/file1.js
|
|
+++ b/file1.js
|
|
@@ -1,5 +1,5 @@
|
|
-old line
|
|
+new line
|
|
`;
|
|
|
|
beforeEach(() => {
|
|
vi.resetAllMocks();
|
|
|
|
// Sample minimal config using createMockConfig utility
|
|
mockConfig = createMockConfig({
|
|
cwd: mockRootDir,
|
|
output: {
|
|
filePath: 'repomix-output.txt',
|
|
style: 'plain',
|
|
git: {
|
|
includeDiffs: false,
|
|
},
|
|
},
|
|
});
|
|
|
|
// Set up our mocks
|
|
vi.mocked(gitRepositoryModule.isGitRepository).mockResolvedValue(true);
|
|
vi.mocked(gitDiffModule.getWorkTreeDiff).mockResolvedValue(sampleDiff);
|
|
vi.mocked(gitDiffModule.getStagedDiff).mockResolvedValue('');
|
|
});
|
|
|
|
test('should not fetch diffs when includeDiffs is disabled', async () => {
|
|
// Mock the dependencies for pack
|
|
const mockSearchFiles = vi.fn().mockResolvedValue({ filePaths: [] });
|
|
const mockCollectFiles = vi.fn().mockResolvedValue({ rawFiles: [], skippedFiles: [] });
|
|
const mockProcessFiles = vi.fn().mockResolvedValue([]);
|
|
const mockValidateFileSafety = vi.fn().mockResolvedValue({
|
|
safeFilePaths: [],
|
|
safeRawFiles: [],
|
|
suspiciousFilesResults: [],
|
|
});
|
|
const mockProduceOutput = vi.fn().mockResolvedValue({
|
|
outputForMetrics: 'mocked output',
|
|
});
|
|
const mockCalculateMetrics = vi.fn().mockResolvedValue({
|
|
totalFiles: 0,
|
|
totalCharacters: 0,
|
|
totalTokens: 0,
|
|
fileCharCounts: {},
|
|
fileTokenCounts: {},
|
|
});
|
|
const mockSortPaths = vi.fn().mockImplementation((paths) => paths);
|
|
const mockCreateMetricsTaskRunner = vi.fn().mockReturnValue({
|
|
taskRunner: {
|
|
run: vi.fn().mockResolvedValue(0),
|
|
cleanup: vi.fn().mockResolvedValue(undefined),
|
|
},
|
|
warmupPromise: Promise.resolve(),
|
|
});
|
|
const mockCreateSecurityTaskRunner = vi.fn().mockReturnValue({
|
|
taskRunner: {
|
|
run: vi.fn().mockResolvedValue([]),
|
|
cleanup: vi.fn().mockResolvedValue(undefined),
|
|
},
|
|
warmupPromise: Promise.resolve(),
|
|
});
|
|
|
|
// Config with diffs disabled
|
|
if (mockConfig.output.git) {
|
|
mockConfig.output.git.includeDiffs = false;
|
|
}
|
|
|
|
await pack([mockRootDir], mockConfig, vi.fn(), {
|
|
searchFiles: mockSearchFiles,
|
|
collectFiles: mockCollectFiles,
|
|
processFiles: mockProcessFiles,
|
|
validateFileSafety: mockValidateFileSafety,
|
|
produceOutput: mockProduceOutput,
|
|
calculateMetrics: mockCalculateMetrics,
|
|
createMetricsTaskRunner: mockCreateMetricsTaskRunner,
|
|
createSecurityTaskRunner: mockCreateSecurityTaskRunner,
|
|
sortPaths: mockSortPaths,
|
|
});
|
|
|
|
// Should not call getWorkTreeDiff
|
|
expect(gitDiffModule.getWorkTreeDiff).not.toHaveBeenCalled();
|
|
});
|
|
|
|
test('should calculate diff token count correctly', async () => {
|
|
// Create a processed files array with a sample file
|
|
const processedFiles: ProcessedFile[] = [
|
|
{
|
|
path: 'test.js',
|
|
content: 'console.log("test");',
|
|
},
|
|
];
|
|
|
|
// Mock dependencies
|
|
const mockSearchFiles = vi.fn().mockResolvedValue({ filePaths: ['test.js'] });
|
|
const mockCollectFiles = vi.fn().mockResolvedValue({ rawFiles: processedFiles, skippedFiles: [] });
|
|
const mockProcessFiles = vi.fn().mockResolvedValue(processedFiles);
|
|
const mockValidateFileSafety = vi.fn().mockResolvedValue({
|
|
safeFilePaths: ['test.js'],
|
|
safeRawFiles: processedFiles,
|
|
suspiciousFilesResults: [],
|
|
});
|
|
const mockProduceOutput = vi.fn().mockResolvedValue({
|
|
outputForMetrics: 'Generated output with diffs included',
|
|
});
|
|
const mockCalculateMetrics = vi.fn().mockResolvedValue({
|
|
totalFiles: 1,
|
|
totalCharacters: 30,
|
|
totalTokens: 10,
|
|
fileCharCounts: { 'test.js': 10 },
|
|
fileTokenCounts: { 'test.js': 5 },
|
|
gitDiffTokenCount: 15, // Mock diff token count
|
|
});
|
|
const mockSortPaths = vi.fn().mockImplementation((paths) => paths);
|
|
const mockCreateMetricsTaskRunner = vi.fn().mockReturnValue({
|
|
taskRunner: {
|
|
run: vi.fn().mockResolvedValue(0),
|
|
cleanup: vi.fn().mockResolvedValue(undefined),
|
|
},
|
|
warmupPromise: Promise.resolve(),
|
|
});
|
|
const mockCreateSecurityTaskRunner = vi.fn().mockReturnValue({
|
|
taskRunner: {
|
|
run: vi.fn().mockResolvedValue([]),
|
|
cleanup: vi.fn().mockResolvedValue(undefined),
|
|
},
|
|
warmupPromise: Promise.resolve(),
|
|
});
|
|
|
|
// Config with diffs enabled
|
|
if (mockConfig.output.git) {
|
|
mockConfig.output.git.includeDiffs = true;
|
|
}
|
|
|
|
const result = await pack([mockRootDir], mockConfig, vi.fn(), {
|
|
searchFiles: mockSearchFiles,
|
|
collectFiles: mockCollectFiles,
|
|
processFiles: mockProcessFiles,
|
|
validateFileSafety: mockValidateFileSafety,
|
|
produceOutput: mockProduceOutput,
|
|
calculateMetrics: mockCalculateMetrics,
|
|
createMetricsTaskRunner: mockCreateMetricsTaskRunner,
|
|
createSecurityTaskRunner: mockCreateSecurityTaskRunner,
|
|
sortPaths: mockSortPaths,
|
|
});
|
|
|
|
// Check gitDiffTokenCount in the result
|
|
expect(result.gitDiffTokenCount).toBe(15);
|
|
});
|
|
});
|