repomix-mirror

mirror of https://github.com/yamadashy/repomix.git synced 2026-05-30 11:18:53 +02:00

Author	SHA1	Message	Date
autofix-ci[bot]	371927e211	[autofix.ci] apply automated fixes	2026-05-09 01:07:48 +00:00
Kazuki Yamada	1d2df3bea3	test(search): Update fileSearch tests for prescan-based ignore collection Add fs.readdir mock (returning empty array) to all relevant beforeEach blocks so collectIgnoreFilePatterns does not fail with "entries is not iterable". Update globby option assertions to reflect gitignore: false and ignoreFiles: [] now that patterns are pre-collected by the prescan. https://claude.ai/code/session_01Fm25x51fmGGeFMJyCm1CER	2026-05-09 10:06:46 +09:00
Kazuki Yamada	3ff49306e1	refactor(file): Simplify cheap pre-screen down to NULL probe + BOM exemption intent(simpler): The cheap pre-screen from the previous commit mirrored the three rules in `isbinaryfile@5.0.2`'s `isBinaryCheck` that flag even valid UTF-8 as binary (PDF magic / NULL / suspicious control-byte ratio >10%), but once `TextDecoder('utf-8', { fatal: true })` runs before `isBinaryFile`, valid UTF-8 buffers never reach the protobuf detector. The main goal of avoiding the pathological case is achieved by the TextDecoder reorder alone; mirroring PDF magic and the suspicious-byte ratio (1) only rescued edge cases with almost no real-world impact and (2) carried coupling to `isbinaryfile`'s internal implementation, which was a poor trade. fix(simplify): Reduce the cheap pre-screen to the minimal NULL-byte probe + BOM exemption. decision(keep-null-probe): Keep the NULL byte check for an independent, legitimate reason: `U+0000` is an illegal character in XML 1.0 and breaks the validity of this tool's main output format (XML). `TextDecoder` passes `0x00` through as valid UTF-8 (U+0000), so unless it is rejected here, a buffer containing NULL is packed as text and downstream XML parsers fail. This is not a mirror of an `isbinaryfile` rule but repomix's own output-robustness requirement. decision(drop-pdf-magic): PDFs are rejected earlier by `is-binary-path` via the `.pdf` extension. Extensionless ASCII-only PDF stubs almost never occur in practice (real PDFs contain a cross-ref table and binary streams and take the path that fails UTF-8 decoding). Little value in guarding against them. decision(drop-suspicious-ratio): Valid UTF-8 buffers with a high ratio of pure C0 control bytes do not exist in real projects. The maintenance cost of fully mirroring `isbinaryfile`'s UTF-8 lookahead (drift risk such as the DEL boundary) outweighs the benefit. constraint(coupling-minimal): The NULL byte is a universal binary signal and is not tied to `isbinaryfile`'s rules. Likewise, the BOM exemption follows the standard BOM specifications and is unaffected by upstream drift. test(cleanup): Remove 3 related regression tests (PDF magic / suspicious ratio / DEL boundary). They guaranteed the behavior of the removed rules; the current implementation intentionally packs all of these inputs as text. The remaining 3 tests (UTF-8 multi-byte / UTF-8 BOM+NULL / UTF-16 LE BOM) each directly guard the pathological case or regression this PR aims to avoid. bench(no-regression, M-series Mac, hyperfine --warmup 1 --runs 5): - `node bin/repomix.cjs --quiet`: 399ms → 418ms (within noise; the difference from removing the hand-written 512-byte JS loop is at the noise floor) - Output diff: 0 files removed relative to the existing base, Korean md +1 (silent drop resolved, unchanged) - fileRead.ts: 175 → 159 lines (-16 lines). Effectively ~35 lines removed in pre-screen-related code. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 23:11:29 +09:00
Kazuki Yamada	0248b61085	fix(file): Include UTF-8 BOM in cheap pre-screen exemption intent(no-regression): Flagged in the codex re-review: the BOM exemption function `hasNonUtf8TextBom` covered only UTF-16/UTF-32/GB18030 and did not include the UTF-8 BOM (`EF BB BF`). `isbinaryfile@5.0.2`'s `isBinaryCheck` returns `false` as soon as it sees a UTF-8 BOM, so before the change a buffer such as `EF BB BF 00 41` flowed to the fast path as text. With this diff, the NULL byte after the UTF-8 BOM is caught first by the cheap probe and judged binary, which is a regression. fix(utf8-bom-exempt): Rename the function to `hasTextBom` and add the UTF-8 BOM (`EF BB BF`) check first. Keep the BOM exemptions in the same order as upstream `isbinaryfile`, skipping the cheap probe so the buffer reaches the UTF-8 fast path. constraint(test-source-non-binary): Flagged in codex iter2: embedding a raw NUL byte literal in the expected value of the new BOM exemption test makes `fileRead.test.ts` itself be treated as binary by `grep`/`rg`, hiding it from cross-file search and CI text-scanning jobs. Replaced it with the `'\0A'` escape notation so the runtime string (char codes 0, 65) stays the same while the source is ASCII again. test(regression): Add 1 test. - Confirm that `EF BB BF 00 41` (UTF-8 BOM + NULL + 'A') is not skipped as `binary-content` and is decoded as `'\0A'` on the UTF-8 fast path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 20:41:30 +09:00
Kazuki Yamada	6e81c449b6	fix(test): Drop hyphen from "mis-classified" to satisfy typos check intent(ci-green): typos@1.45.1 flags `mis` as a typo of `miss`/`mist` and the hyphenated `mis-classified` in the new PDF-magic regression test comment trips the `Check typos` job. The unhyphenated `misclassified` is the more common spelling and passes the dictionary. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 20:32:38 +09:00
Kazuki Yamada	5ab82d5152	fix(file): Mirror isbinaryfile's PDF magic + suspicious-byte ratio rules in cheap pre-screen intent(no-regression): Flagged in codex review: the original diff built the cheap pre-screen from only a NULL-byte probe + UTF-16/UTF-32 BOM exemption, but `isbinaryfile@5.0.2`'s `isBinaryCheck` has two other rules that flag even valid UTF-8 as binary: (1) the first 5 bytes are `%PDF-` (PDF magic), (2) the ratio of suspicious control bytes in the first 512 bytes exceeds 10%. Without these in the cheap pre-screen, extensionless/`.txt` files starting with `%PDF-` and valid UTF-8 files with a high ratio of ASCII control characters, which were previously skipped, would regress to being included in the pack. fix(pdf-magic): Add the `%PDF-` check after the UTF-16/UTF-32 BOM exemption and before the NULL probe, the same position as in upstream `isbinaryfile`. fix(suspicious-ratio): Add a suspicious counter to the existing NULL probe loop and flag the buffer as binary after the loop at the `>10%` threshold. The suspicious set is `isbinaryfile`'s condition `(b < 7 || b > 14) && (b < 32 || b > 127)` narrowed and simplified for valid-UTF-8 input: `b < 7` or `b in 0x0F..0x1F`. This is disjoint from the continuation/lead bytes of valid UTF-8 multi-byte sequences (0x80..0xFF), so a flat byte scan without UTF-8 awareness gives correct results. The protobuf detector (`isBinaryProto`) is intentionally not mirrored; it is the very pathological case this PR avoids. constraint(del-boundary): Flagged in the codex re-review: 0x7F (DEL) was initially included in the suspicious set, but `isbinaryfile`'s condition `b < 32 || b > 127` excludes 127. The revised version drops `b === 0x7f` and corrects the comment to match upstream behavior. test(regression): Add 3 tests. - Confirm that valid-UTF-8 PDF magic (`%PDF-1.4\n...`) is skipped as `binary-content` - Confirm that a buffer of 64 bytes of only 0x01 (100% suspicious) is skipped as `binary-content` - Pin as a boundary that a buffer of 64 bytes of only 0x7F (valid UTF-8, 100% DEL) is not skipped bench(no-perf-regression, M-series Mac, hyperfine --warmup 1 --runs 5): - `node bin/repomix.cjs --quiet`: 406ms → 399ms (within noise) - The added cost of the pre-screen is one 512-byte linear scan, negligible compared with the UTF-8 fast path (which already scans the whole buffer with TextDecoder) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 16:15:44 +09:00
Kazuki Yamada	8a08815ba1	perf(file): Try UTF-8 decode before isBinaryFile to dodge protobuf-detector pathological case intent(latency): The wall-clock time for `node bin/repomix.cjs` packing the repository itself roughly tripled, from 0.38s to 1.15s, with PR #1533 (`docs(website): Add localized page metadata`, commit `9bd663ae`). 9bd663ae only appended YAML frontmatter (`title` + `description`) to 336 md files (14 languages × 24 pages); the output size was nearly unchanged, yet the run time grew. root-cause(isbinaryfile): `readRawFile` passes every file's buffer through `isBinaryFile` (= the `isbinaryfile` package) before proceeding to the UTF-8 fast path. `isbinaryfile`'s `isBinaryCheck` includes a protobuf-shape detector (`isBinaryProto`), which interprets the leading bytes of any file as a varint and allocates an array with `new Array(varint)`. For some legitimate UTF-8 byte sequences this loop spins for several seconds or throws `RangeError: Invalid array length`. Concrete example: `website/client/src/ko/guide/tips/best-practices.md` (4,243 bytes, valid UTF-8 Korean md) alone took ~3,500ms in the `isBinaryFile` call and finally threw; the error was swallowed by the outer try/catch and the file was silently dropped as `encoding-error`. A default pack paid ~3,500ms for this one file on every run. This is an upstream `isbinaryfile` bug (allocating `new Array(n)` without a bound on untrusted input), but we defend against it here rather than wait for a fix. fix(reorder): Move `isBinaryFile` after the UTF-8 fast path and apply it only to buffers that fail to decode as UTF-8. Binaries containing NULL bytes (= U+0000, valid UTF-8) would pass straight through the UTF-8 fast path, so insert a cheap 512-byte NULL-byte probe before `isBinaryFile`. NULL is the strongest binary signal and the only `isBinaryCheck` rule that triggers on input that passes the UTF-8 fatal decode. The remaining heuristics (PDF magic / suspicious-byte ratio / protobuf shape) require non-UTF-8 byte sequences, so only files that miss the UTF-8 fast path are passed to `isBinaryFile` as before. constraint(utf16-utf32-bom): UTF-16 LE encodes ASCII `A` as `0x41 0x00`, and the UTF-32 BE BOM begins with `0x00 0x00 0xFE 0xFF`. Running the NULL probe as-is would misjudge these text files as binary. `isbinaryfile`'s own `isBinaryCheck` has BOM exemptions, so `hasNonUtf8TextBom` mirrors them: buffers starting with a UTF-16/UTF-32 BOM skip the probe and fall straight through to the slow path (jschardet+iconv). Behavior is identical to pre-change. side-effect(restored-file): The Korean Markdown file above used to throw and be silently dropped from the output; after this fix it is correctly included. test(regression): Add 2 tests to `tests/core/file/fileRead.test.ts`. - Valid UTF-8 multi-byte input (consecutive 3-byte Hangul, no NULL) round-trips unchanged as text - A UTF-16 LE BOM file ("Hello\n") is decoded correctly on the slow path even though it contains NULL bench(local, M-series Mac, hyperfine --warmup 1 --runs 5): - `node bin/repomix.cjs --quiet` overall: 1152ms → 406ms (about 2.8× faster) - `--include 'website/client/src/ko/guide/tips/best-practices.md'` alone: 880ms → 170ms (about 5.2× faster) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 15:56:10 +09:00
Claude	1bef40ba2d	test: Tighten misleading test names and pin packageManager guard Address two review threads on PR #1518 that flagged tests whose titles overstated what was being verified. - fileProcess: the longBase64 string is one continuous line, so the truncateBase64 → removeEmptyLines ordering was never actually under test (truncateBase64Content's regex does not span newlines). Rename to describe the combined behavior the test really pins. - skillTechStack: rename the per-directory case to reflect that root and subpackage land in separate buckets keyed by getDirPath, and add a second case with two package.json entries at the same path to genuinely exercise the parsed.packageManager && !result.packageManager guard at skillTechStack.ts:541.	2026-04-26 13:24:32 +00:00
Kazuki Yamada	402e4906d7	test: Pin v1.14.0 regression-prone invariants Targeted regression tests for the high-risk areas identified in the v1.13.1..main audit, focusing on silent-correctness bugs and parallel error handling — places that wouldn't surface in CI but would in the field. - core/metrics/calculateMetrics: pin numeric equivalence between the fast path (Σ file tokens + wrapper tokens) and the slow path (full output tokenization). Cover wrapper-extraction fallback, split-output fallback, and worker pool cleanup when fileMetrics rejects. - core/file/fileProcess: pin transform ordering invariants — removeComments → removeEmptyLines (blank lines from comment removal must be cleaned up; preserved when removeEmptyLines is off); truncateBase64 → removeEmptyLines (multi-line base64 squashed first); trim → showLineNumbers (no leading/trailing blanks numbered). Plus worker/lightweight path parity for inputs that don't need worker processing. - core/packager: pin metrics worker pool cleanup on parallel branch failures (validateFileSafety, produceOutput, calculateMetrics, warmup rejection). Verify prefetchSortData failure is isolated and does not block sortOutputFiles. - core/skill/skillTechStack: cover untested fix-commit invariants — root entry sorts first in monorepo output; configFiles deduplicated within a directory; first-seen packageManager wins per directory.	2026-04-26 19:43:53 +09:00
Kazuki Yamada	cbdfc29b4d	test: Cover error/edge paths in core (output, file, security, treeSitter) Lift the four most impactful uncovered files past 90% lines without introducing fragile or contrived tests. Each block targets real user-facing branches (error handling, optional features, init/dispose). - core/output/outputGenerate (78% -> ~90%): - buildOutputGeneratorContext: instructionFilePath success and missing-file paths; pre-computed vs. searchFiles fallback for empty directories; full-tree mode (success and listing failure); searchFiles failure wrap. - generateOutput: unsupported style throws RepomixError. - core/security/validateFileSafety (79% -> ~95%): - logSuspiciousContentWarning loop: header line per section, plus singular ("issue") and plural ("issues") suffix per result. - No-op behavior when no suspicious git diff/log entries exist. - core/file/fileSearch (88% -> ~92%): - handleGlobbyError: EPERM and EACCES translated to PermissionError; other error codes pass through. - Outer catch: generic Error wrapped with directory context; non-Error throw produces the generic fallback message. - core/treeSitter/languageParser (74% -> ~88%): - getResources before init() throws RepomixError. - init() is idempotent (Parser.init is called only once across two calls). - Parser.init() failure is wrapped as RepomixError. - dispose() resets state so subsequent calls require re-init. Coverage: - Statements 89.51% -> 90.23% - Branches 79.31% -> 80.26% - Functions 89.37% -> 89.69% - Lines 90.06% -> 90.80%	2026-04-26 19:35:00 +09:00
Kazuki Yamada	9aac452504	test: Raise overall coverage from 87.9% to 90.1% Cover previously-untested paths across the shared, cli, core, and mcp layers, focusing on branches that represent real user-facing behavior rather than line-coverage chasing. Highlights: - shared/errorHandle: cover handleError (RepomixError, unexpected Error, unknown values, duck-typed worker errors, debug-level branches) and the three error class constructors. - shared/logger: cover setLogLevelByWorkerData for env-var, workerData (array and object shapes), and invalid/missing inputs. - shared/memoryUtils: add a fresh test file covering stats, log helpers, and withMemoryLogging success/error paths. - shared/processConcurrency: cover cleanupWorkerPool (Node, Bun-skip, swallowed teardown errors) and the run/cleanup delegation. - shared/unifiedWorker: cover the cache-hit path and the workerData (array/object) and REPOMIX_WORKER_TYPE detection branches. - core/metrics/TokenCounter: cover the catch branch (Error, non-Error throws, with/without filePath). - core/file/fileManipulate: cover removeEmptyLines on inherited base and composite manipulators. - cli/cliReport: cover skill-directory and split-output summary lines. - mcp/tools/packRemoteRepositoryTool: add tests mirroring the packCodebaseTool pattern (success, runCli failure, runCli throw, workspace creation failure). - mcp/tools/fileSystemReadDirectoryTool: switch to mocking node:fs/promises so existing mocks actually intercept calls, and cover the file-vs-dir, listing, empty-directory, and readdir-error paths. Result: - Statements 87.29% -> 89.51% - Branches 76.16% -> 79.31% - Functions 87.60% -> 89.37% - Lines 87.89% -> 90.06%	2026-04-26 19:28:09 +09:00
Kazuki Yamada	44db93451d	test(core): Cover precondition guards in truncateBase64 intent(truncateBase64-tests): add explicit coverage for the new fast-path guards introduced alongside the regex-skip optimization decision(test-cases): focus on the four cases that exercise guard behavior not previously asserted — empty input, exactly-below-threshold (255 chars), run-reset on non-base64 separator, and non-base64 data URI without `;base64,` Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 18:09:02 +09:00
Claude	4bd257e280	perf(core): Speed up empty-directory detection in file search Merge the two separate globby traversals used by `searchFiles` into a single one and parallelize the per-directory `readdir` calls used to filter empty directories. Background ---------- When `output.includeEmptyDirectories` is enabled (the default for `repomix.config.json` in this repo, and any repo that wants an accurate directory tree), `searchFiles` previously walked the working tree twice: once with `onlyFiles: true` and a second time with `onlyDirectories: true`. Each call re-traversed the tree and re-parsed every `.gitignore` / `.repomixignore` file. `findEmptyDirectories` then issued `readdir` serially for every matched directory, awaiting each syscall before starting the next. Change ------ * Replace the two globby invocations with one `objectMode: true, onlyFiles: false` call. Partition the returned `GlobEntry[]` by `dirent.isFile()` / `dirent.isDirectory()`, matching the previous `onlyFiles: true` semantics for symlinks and other non-file non-directory entries. * Rewrite `findEmptyDirectories` to run the per-directory `readdir` checks concurrently via `Promise.all`. Ordering is preserved by the result array and the caller sorts the final list anyway. * When `includeEmptyDirectories` is disabled, keep the fast `onlyFiles: true` path unchanged so the default CLI run pays no cost. Benchmark (hyperfine, repomix packing itself, 30 runs, warmup 3) ---------------------------------------------------------------- Run 1: baseline 2.162s ± 0.042s → perf 2.017s ± 0.029s → -145ms (-6.7%) Run 2: baseline 2.161s ± 0.023s → perf 2.030s ± 0.027s → -131ms (-6.1%) Per-stage verbose timings: baseline: [globby files 200ms] + [globby dirs 85ms] + [empty dirs 61ms] perf: [combined globby 223ms] + [empty dirs 66ms] saved: -57ms consistently on the critical path	2026-04-26 16:41:38 +09:00
Kazuki Yamada	6fecdca6b3	test(core): Add combined worker + lightweight pipeline integration test Add test that exercises all transforms together: removeComments (worker) + truncateBase64 + removeEmptyLines + showLineNumbers (lightweight) to verify the full two-phase pipeline produces correct output. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 18:27:55 +09:00
Kazuki Yamada	f1067163ec	refactor(core): Simplify into single applyLightweightTransforms and remove redundant trim Merge applyPreCompressTransforms and applyPostCompressTransforms into a single applyLightweightTransforms function. Move truncateBase64 to post-worker phase since tree-sitter handles string literals as single AST nodes regardless of content size. Remove redundant trim from worker processContent — the main thread applyLightweightTransforms already handles it. Final pipeline: Worker: removeComments → compress Main: truncateBase64 → removeEmptyLines → trim → showLineNumbers Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 18:01:22 +09:00
Kazuki Yamada	47e4a65b61	fix(core): Move removeEmptyLines to post-compress to preserve ordering Move removeEmptyLines from applyPreCompressTransforms to applyPostCompressTransforms so it runs after removeComments. This ensures empty lines created by comment removal are cleaned up. Transform order: truncateBase64 (pre) → [removeComments → compress] (worker) → removeEmptyLines → trim → showLineNumbers (post) Simplify applyPreCompressTransforms to only handle truncateBase64 with an early return when disabled. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 17:57:00 +09:00
Kazuki Yamada	cac35d0465	fix(core): Preserve transform order by splitting into pre/post compress phases Split applyLightweightTransforms into applyPreCompressTransforms and applyPostCompressTransforms to preserve the original execution order: truncateBase64 → removeComments → removeEmptyLines → trim → compress → showLineNumbers Pre-compress transforms (truncateBase64, removeEmptyLines) must run before tree-sitter parsing to avoid performance regression with large base64 strings and to ensure empty line removal affects chunk merging. Action: split lightweight transforms into pre-compress and post-compress phases Why: previous refactor changed execution order, causing tree-sitter to receive untreated base64 and content with empty lines, altering compress output Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 17:40:11 +09:00
Kazuki Yamada	e978decb2b	test(core): Add regression tests for base64 truncation and lastIndex safety Add test for consecutive truncateBase64Content calls to verify global regex lastIndex reset works correctly. Add test for truncateBase64 config branch in applyLightweightTransforms. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 17:33:54 +09:00
Kazuki Yamada	3e70628307	refactor(core): Separate lightweight transforms from worker processing Extract lightweight file transforms (truncateBase64, removeEmptyLines, trim, showLineNumbers) into applyLightweightTransforms() on the main thread, keeping only heavy operations (removeComments, compress) in worker processContent(). This eliminates dual management of the same logic across worker and main thread paths. Also pre-compile base64 regex patterns at module level to avoid re-creation per file call. Action: split processContent into heavy (worker) and lightweight (main thread) phases Action: extract applyLightweightTransforms() as single source of truth for lightweight ops Action: hoist regex patterns in truncateBase64.ts to module scope with lastIndex reset Why: lightweight transforms were duplicated in both processFilesMainThread and processContent Why: regex re-compilation per file added unnecessary overhead for large repos Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 17:24:32 +09:00
Kazuki Yamada	e2101a50d1	test(core): Add boundary test for exactly 256-char base64 string Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 22:59:45 +09:00
Kazuki Yamada	25ec27028e	fix(core): Reduce false positives in truncateBase64 for path-like strings Raise MIN_BASE64_LENGTH_STANDALONE from 60 to 256 since truncating short strings saves negligible tokens. Require digits in isLikelyBase64 heuristic since real base64-encoded binary data virtually always contains numbers, while XPath and file path strings typically do not. Closes #1298 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 22:52:33 +09:00
Kazuki Yamada	f38828aa90	fix(test): Use path.sep in fileSearch test for cross-platform compatibility Mock data and expected sort order now use path.sep instead of hardcoded '/' separators. On Windows, path.sep is '\' so sortPaths splits differently, producing a different sort order. Co-Authored-By: Claude Opus 4.6 (1M context) <koukun0120@gmail.com>	2026-03-20 01:16:01 +09:00
Kazuki Yamada	f5977b2e6a	test(core): Use exact sorted order assertion in fileSearch test Replace weak arrayContaining assertion with exact toEqual using the correct sorted order, so the test verifies both content and sort behavior. Co-Authored-By: Claude Opus 4.6 (1M context) <koukun0120@gmail.com>	2026-03-20 01:08:03 +09:00
yamadashy	d9fa509ee6	perf(core): Optimize sortPaths with decorate-sort-undecorate pattern Pre-compute path.split() once per path before sorting, avoiding O(N log N) repeated string allocations during comparisons. Benchmark: 10,000 files 65ms → 11ms (6x speedup). Co-Authored-By: Claude Opus 4.6 (1M context) <koukun0120@gmail.com>	2026-03-20 00:54:33 +09:00
Kazuki Yamada	e97691dd36	perf(core): Replace worker threads with promise pool for file collection After the UTF-8 fast path optimization eliminated the CPU-heavy jschardet bottleneck, file collection became I/O-bound. Worker threads now add pure overhead (Tinypool init, structured clone, IPC) without benefit. Benchmark (954 files, M2 Pro 10-core): - Worker Threads: ~108ms → Promise Pool (c=50): ~37ms (2.9x faster) Changes: - Replace Tinypool worker dispatch with a simple promise pool (c=50) - Inject readRawFile via deps for testability - Remove unused concurrentTasksPerWorker from WorkerOptions - Simplify tests to use readRawFile mock instead of 5+ module mocks	2026-02-17 23:09:18 +09:00
Kazuki Yamada	7dcdbae24d	perf(core): Add UTF-8 fast path to skip expensive jschardet encoding detection Previously, every file went through jschardet.detect() which scans the entire buffer through multiple encoding probers (MBCS, SBCS, Latin1) with frequency table lookups — the most expensive CPU operation in file collection. Since ~99% of source code files are UTF-8, we now try TextDecoder('utf-8', { fatal: true }) first. If it succeeds, jschardet and iconv are skipped entirely. Non-UTF-8 files (e.g., Shift-JIS, EUC-KR) fall back to the original detection path. Additionally, set concurrentTasksPerWorker=3 for fileCollect workers to better overlap I/O waits within each worker thread. Benchmark results (838 files, 10 CPU cores): - Before: ~616ms - After: ~108ms (5.7x faster)	2026-02-17 23:09:18 +09:00
Kazuki Yamada	f41b75c560	fix(test): fix lint errors and update test signatures for filePathsByRoot - Remove unused imports (generateFileTree, treeToString) in fileTreeGenerate.test.ts - Add filePathsByRoot parameter to generateOutput and produceOutput calls in tests - Update expect assertions to include filePathsByRoot argument	2026-01-04 23:11:28 +09:00
spandan-kumar	3f2680e5d5	feat(tree): add multi-root directory labels When packing multiple directories, the directory tree output now shows labeled sections like [cli]/, [config]/ to clarify which files belong to which root directory. - Add FilesByRoot interface and generateTreeStringWithRoots function - Update output pipeline to pass file-to-root mapping - Add unit tests for new tree generation functions - Update existing tests for new function signatures Closes #1023	2026-01-04 22:57:10 +09:00
Kazuki Yamada	cdae79d115	fix(file): Replace strip-comments with @repomix/strip-comments Replace the original strip-comments package with @repomix/strip-comments, which provides enhanced support for: - Go directives (//go:build, //go:generate, etc.) - C++ document comments (///) - Python docstrings (""" and ''') and hash comments This removes the custom GoManipulator, PythonManipulator, and CppManipulator implementations in favor of the improved library support. Note: preserveNewlines option keeps newlines for line number preservation, so docstrings are replaced with empty lines rather than being fully removed.	2025-12-15 00:15:17 +09:00
Kazuki Yamada	47398ae820	test(file): Add test for legitimate U+FFFD character handling Verify that files containing intentional U+FFFD characters in the source are correctly read (not skipped), testing the TextDecoder validation path.	2025-12-14 19:44:47 +09:00
Kazuki Yamada	c4354e7745	fix(file): improve U+FFFD detection for UTF-8 encoding - Use TextDecoder('utf-8', { fatal: true }) to distinguish actual decode errors from legitimate U+FFFD characters in UTF-8 files - Change test temp directory from tests/fixtures to os.tmpdir() to avoid clobbering committed fixtures and reduce parallel-run collisions - Non-UTF-8 files still use iconv.decode() fallback behavior Addresses CodeRabbit review comments on PR #1007	2025-12-14 18:56:34 +09:00
Kazuki Yamada	72b27e4c9f	fix(file): remove jschardet confidence check for encoding detection Remove the confidence < 0.2 check that was causing valid UTF-8/ASCII files to be incorrectly skipped. Files are now only skipped if they contain actual decode errors (U+FFFD replacement characters). This fixes issues where: - Valid Python files were skipped with confidence=0.00 (#869) - HTML files with Thymeleaf syntax (~{}) were incorrectly detected as binary (#847) The isbinaryfile library (added in PR #1006) now handles binary detection more accurately, making the confidence-based heuristic unnecessary. Fixes #869	2025-12-14 18:44:48 +09:00
Kazuki Yamada	7f0d05d703	feat(core): Replace istextorbinary with is-binary-path and isbinaryfile Migrate from istextorbinary (last updated 2023-12) to actively maintained packages: - is-binary-path: Extension-based binary detection (updated 2024-04) - isbinaryfile: Content-based binary detection with zero dependencies (updated 2025-12) Improvements: - Binary extension coverage: 13 → 262 extensions (~20x increase) - Content detection: Better UTF-16/CJK support, statistical analysis (512 bytes vs 72 bytes) The two-stage detection logic (extension check → content check) is preserved.	2025-12-14 18:03:43 +09:00
Kazuki Yamada	b99d08398f	test(core): Add parent directory ignore file tests Add comprehensive tests for parent directory ignore file handling to address PR #964 review feedback (Risk: High concern from claude[bot]). Added three new test cases: 1. Parent directory .ignore file handling - Verifies .ignore files in parent directories are respected - Tests with useDotIgnore: true configuration - Ensures patterns apply to nested subdirectories 2. Parent directory .repomixignore file handling - Verifies .repomixignore files in parent directories are respected - Tests default configuration (.repomixignore always enabled) - Ensures patterns apply to nested subdirectories 3. Git worktree + parent .gitignore interaction - Verifies worktree environments handle parent .gitignore correctly - Combines worktree detection with parent .gitignore pattern application - Tests that .git file (not directory) is properly handled in worktree - Ensures gitignore: true option enables parent .gitignore handling All tests follow the same pattern as existing "should respect parent directory .gitignore patterns (v16 behavior)" test, providing consistent coverage for .gitignore, .ignore, and .repomixignore files. These tests ensure that globby v16's parent directory ignore file handling works correctly for all supported ignore file types, not just .gitignore.	2025-11-24 19:11:18 +09:00
Kazuki Yamada	dd25beccfd	test(core): Add type guards to globby options in tests Fix TS18048 errors in createBaseGlobbyOptions consistency tests by adding expect(options).toBeDefined() and if (!options) continue guards. This ensures type safety and prevents undefined access to globby call options. All three tests now properly guard against potentially undefined options: - should use consistent base options across all globby calls - should respect gitignore config consistently across all functions - should apply custom ignore patterns consistently across all functions This addresses the coderabbitai feedback on PR #964.	2025-11-24 17:44:58 +09:00
Kazuki Yamada	f0d8de48ca	refactor(core): Address PR feedback for globby v16 update This commit addresses three suggestions from AI code review bots on PR #964: 1. Remove unnecessary array spreads in createBaseGlobbyOptions - Removed defensive copying of ignorePatterns and ignoreFilePatterns - Arrays are already created fresh in calling functions, making spreads redundant - Minor performance optimization by avoiding unnecessary array allocations 2. Extract prepareIgnoreContext helper function - Centralized duplicate ignore pattern preparation logic - Eliminated code duplication across searchFiles, listDirectories, and listFiles - The new helper handles: * Getting ignore patterns and ignore file patterns * Normalizing patterns for consistent trailing slash handling * Git worktree special case handling - Improves maintainability and ensures consistency across all globby calls 3. Add explanatory comment to v16 behavior test - Documented why v16's behavior is superior (matches Git's standard behavior) - Clarifies that v16 respects parent directory .gitignore files - Helps future maintainers understand the intentional breaking change All 856 tests pass with no regressions.	2025-11-24 17:44:58 +09:00
Kazuki Yamada	c9d296eec6	chore: Fix linting errors - Add website/server/dist/ to .gitignore for secretlint - Fix TypeScript type errors in fileSearch.test.ts - Format imports in fileSearch.ts (biome)	2025-11-24 17:44:58 +09:00
Kazuki Yamada	4b2d8c12d0	test(core): Add regression tests for globby v16 update - Add test for parent directory .gitignore pattern handling (v16 behavior) - Add tests for createBaseGlobbyOptions consistency across all functions - Verify gitignore option is passed correctly to all globby calls - Ensure no regression from v15 to v16 upgrade These tests prove that: 1. Parent .gitignore files are respected with globby v16 2. All 4 globby calls (searchFiles files/dirs, listDirectories, listFiles) use consistent base options 3. gitignore configuration is applied uniformly across all functions All 856 tests pass, confirming no regression from the changes.	2025-11-24 17:44:58 +09:00
Kazuki Yamada	3e410ce4dd	feat(core): Improve .gitignore handling with globby v16 - Upgrade globby from v15 to v16 - Use gitignore option to respect parent directory .gitignore files - This matches Git's standard behavior where parent .gitignore patterns apply to subdirectories - Move .gitignore handling from ignoreFiles to gitignore option - Update tests to reflect the new behavior This change improves compatibility with Git and provides more accurate file filtering when running Repomix in subdirectories.	2025-11-24 17:44:58 +09:00
Kazuki Yamada	44d172bcb9	fix(core): Correct .ignore file priority order Fixed the priority order of ignore files to match the intended behavior: - .gitignore (lowest priority) - .ignore (medium priority) - .repomixignore (highest priority) The previous implementation had .repomixignore at the lowest priority, which was incorrect. Repomix-specific ignore rules should take precedence over generic ignore files. This ensures that: 1. .repomixignore can override .ignore and .gitignore rules 2. .ignore can override .gitignore rules 3. The priority order documented in README is correctly implemented	2025-11-08 19:45:58 +09:00
Kazuki Yamada	bb7fae2b45	feat(core): Add .ignore file support This PR adds support for .ignore files, which are used by tools like ripgrep and the silver searcher. This allows users to maintain a single .ignore file that works across multiple tools instead of maintaining separate ignore files. Changes: - Add ignore.useDotIgnore config option (default: true) - Add --no-dot-ignore CLI flag to disable .ignore file usage - Update ignore file priority: .repomixignore > .ignore > .gitignore > default patterns - Add comprehensive tests for .ignore file handling - Update documentation to reflect new .ignore file support The .ignore file is enabled by default but can be disabled via configuration or CLI flag, maintaining backward compatibility. Resolves #937	2025-11-08 15:51:21 +09:00
Kazuki Yamada	72735cfdb1	test(coverage): Improve test coverage for CLI and core modules Added comprehensive test coverage for critical CLI and core functionality: - Created new test file for cliSpinner with 15 tests covering: * Spinner start/stop/update operations * Quiet/verbose/stdout mode handling * Success/fail message display * Interval management - Enhanced initAction tests (11→17 tests): * Added isCancel handling for user cancellation * Added return value validation tests * Covered config and ignore file creation flows - Enhanced cliReport tests (8→15 tests): * Added git diffs/logs reporting tests * Added security check reporting for git content * Added single vs multiple issue message handling - Enhanced permissionCheck tests (13→16 tests): * Added macOS-specific error message tests * Added platform-specific error handling tests * Added unknown error code handling - Enhanced outputGenerate tests (7→12 tests): * Added git diffs/logs inclusion tests * Added JSON format output tests * Added file/directory structure exclusion tests Overall improvements: - Test count: 804 → 840 (+36 tests) - Code coverage: 70.63% → 71.00% (+0.37%) - Branch coverage: 77.64% → 78.55% (+0.91%) - Significant improvement in CLI modules (cliSpinner: 25% → 59.61%)	2025-10-31 01:18:21 +09:00
Kazuki Yamada	ea1cc485c2	chore(config): disable organizeImports for src/index.ts Added override configuration to disable Biome's organizeImports feature specifically for src/index.ts to allow manual import order management while keeping automatic import organization enabled for other files.	2025-09-21 13:54:12 +09:00
Kazuki Yamada	f87e00dbdf	chore(lint): upgrade biome to v2.2.4 and fix all lint errors Updated biome from v1.9.4 to v2.2.4 to take advantage of latest linting improvements. - Upgraded @biomejs/biome from ^1.9.4 to ^2.2.4 - Updated biome.json configuration for v2 compatibility: - Changed schema to 2.2.4 - Updated file includes/ignores syntax - Added Vue file overrides to disable noUnusedVariables/noUnusedImports - Fixed all lint errors: - Added radix parameter to parseInt calls - Prefixed unused parameters with underscore - Removed unused imports - Fixed biome suppression comments - Removed !important from CSS - Added type ignores for Vue component definitions All 325 files now pass lint with 0 warnings and 0 errors.	2025-09-21 13:39:43 +09:00
Kazuki Yamada	5898d6397c	refactor(workers): improve code quality and type safety Address PR review feedback: - Fix worker path to use relative path instead of lib directory - Add proper function overloads for defaultActionWorker - Remove unsafe type assertions in worker code - Improve error handling with optional stack property - Extract log level validation logic to reduce duplication - Add NaN check for environment variable parsing All tests pass and linting issues resolved.	2025-09-20 22:11:44 +09:00
Kazuki Yamada	78b25b86e7	feat(core): use direct globby import instead of worker isolation Replace executeGlobbyInWorker with direct globby calls since worker isolation is no longer necessary for globby execution. - Remove src/core/file/globbyExecute.ts wrapper - Remove src/core/file/workers/globbyWorker.ts - Update fileSearch.ts to import and use globby directly - Update tests to mock globby instead of executeGlobbyInWorker - Simplify integration tests by removing worker mocks	2025-09-18 23:53:27 +09:00
Kazuki Yamada	ddd2814f84	fix(tests): Update test mocks to use new WorkerOptions interface	2025-08-31 16:32:49 +09:00
Kazuki Yamada	8f07b63a61	feat(core): Add runtime selection support for worker pools Add WorkerRuntime type and configurable runtime parameter to createWorkerPool and initTaskRunner functions. This allows choosing between 'worker_threads' and 'child_process' runtimes based on performance requirements. - Add WorkerRuntime type definition for type safety - Add optional runtime parameter to createWorkerPool with child_process default - Add optional runtime parameter to initTaskRunner with child_process default - Configure fileCollectWorker to use worker_threads for better performance - Update all test files to use WorkerRuntime type - Add comprehensive tests for runtime parameter functionality - Maintain backward compatibility with existing code The fileCollectWorker now benefits from worker_threads faster startup and shared memory, while other workers continue using child_process for stability.	2025-08-31 16:18:12 +09:00
Kazuki Yamada	575ae2bca4	test(core): Remove misleading Go nested block comments test Removed the Go nested block comments test case as it was unnecessary and potentially misleading. Go block comments do not nest according to the language specification, so testing this behavior is not needed and could cause confusion about the expected behavior. The remaining tests adequately cover Go comment parsing functionality.	2025-08-31 00:39:05 +09:00
Kazuki Yamada	23a0f00005	fix(core): Correct Go block comment parsing to match language spec Go block comments do not nest according to the language specification. The first */ sequence should close the comment, regardless of any /* sequences within it. This change removes the blockCommentDepth tracking and ensures correct parsing behavior for Go code containing sequences like /* comment with /* nested */ part */. Updated test expectations to reflect the correct Go language behavior.	2025-08-31 00:05:04 +09:00

141 Commits