Add test that exercises all transforms together: removeComments (worker)
+ truncateBase64 + removeEmptyLines + showLineNumbers (lightweight) to
verify the full two-phase pipeline produces correct output.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Merge applyPreCompressTransforms and applyPostCompressTransforms into
a single applyLightweightTransforms function. Move truncateBase64 to
post-worker phase since tree-sitter handles string literals as single
AST nodes regardless of content size.
Remove redundant trim from worker processContent — the main thread
applyLightweightTransforms already handles it.
Final pipeline:
Worker: removeComments → compress
Main: truncateBase64 → removeEmptyLines → trim → showLineNumbers
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move removeEmptyLines from applyPreCompressTransforms to
applyPostCompressTransforms so it runs after removeComments.
This ensures empty lines created by comment removal are cleaned up.
Transform order: truncateBase64 (pre) → [removeComments → compress] (worker) → removeEmptyLines → trim → showLineNumbers (post)
Simplify applyPreCompressTransforms to only handle truncateBase64
with an early return when disabled.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Split applyLightweightTransforms into applyPreCompressTransforms and
applyPostCompressTransforms to preserve the original execution order:
truncateBase64 → removeComments → removeEmptyLines → trim → compress → showLineNumbers
Pre-compress transforms (truncateBase64, removeEmptyLines) must run
before tree-sitter parsing to avoid performance regression with large
base64 strings and to ensure empty line removal affects chunk merging.
Action: split lightweight transforms into pre-compress and post-compress phases
Why: previous refactor changed execution order, causing tree-sitter to receive
untruncated base64 strings and content still containing empty lines, altering compress output
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add test for consecutive truncateBase64Content calls to verify global
regex lastIndex reset works correctly. Add test for truncateBase64
config branch in applyLightweightTransforms.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract lightweight file transforms (truncateBase64, removeEmptyLines,
trim, showLineNumbers) into applyLightweightTransforms() on the main
thread, keeping only heavy operations (removeComments, compress) in
worker processContent(). This eliminates dual management of the same
logic across worker and main thread paths.
Also pre-compile base64 regex patterns at module level to avoid
re-creation per file call.
Action: split processContent into heavy (worker) and lightweight (main thread) phases
Action: extract applyLightweightTransforms() as single source of truth for lightweight ops
Action: hoist regex patterns in truncateBase64.ts to module scope with lastIndex reset
Why: lightweight transforms were duplicated in both processFilesMainThread and processContent
Why: regex re-compilation per file added unnecessary overhead for large repos
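The lastIndex pitfall the hoisting introduces can be sketched as follows (pattern and names are illustrative, not the actual truncateBase64.ts code): a module-scope global regex keeps its `lastIndex` between `.test()` calls, so it must be reset before each use.

```typescript
// Module-scope pattern: compiled once per process instead of once per file.
// Illustrative pattern only — not the real base64 heuristic.
const BASE64_PATTERN = /[A-Za-z0-9+\/]{40,}={0,2}/g;

function containsBase64(content: string): boolean {
  // A /g regex resumes .test() from lastIndex; without this reset, a second
  // call on the same string can spuriously return false.
  BASE64_PATTERN.lastIndex = 0;
  return BASE64_PATTERN.test(content);
}
```

Consecutive calls on the same input now return consistent results, which is what the lastIndex-reset test above exercises.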
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace string += with array accumulation + join('\n') in mergeAdjacentChunks
to avoid O(k²) copying when merging adjacent tree-sitter code chunks
- Extract searchInLines from searchInContent in grepRepomixOutputTool so
performGrepSearch splits content once and reuses the lines array for both
search and formatting, avoiding a redundant O(n) split on large files
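The accumulation change can be sketched like this (a minimal illustration of the pattern, not the actual mergeAdjacentChunks code): repeated `merged += chunk` copies the growing string on every iteration, O(k²) total for k chunks, while push + a single join copies each byte a constant number of times.

```typescript
// Collect pieces in an array and join once at the end instead of
// concatenating onto an ever-growing string.
function mergeChunks(chunks: string[]): string {
  const parts: string[] = [];
  for (const chunk of chunks) {
    parts.push(chunk); // amortized O(1); no intermediate string allocations
  }
  return parts.join('\n'); // single pass to build the final string
}
```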
Pipeline-level optimizations that produce measurable end-to-end improvement:
- Pre-initialize metrics worker pool during file collection phase so tiktoken
WASM loading overlaps with security checks and file processing. First token
count task dropped from 381ms to 22ms (worker already warmed).
- Lazy-load Jiti via dynamic import — only loaded when TS/JS config files are
detected, saving startup time for the common JSON/default config path.
- Fix O(n²) file path re-grouping in packager by using Map + Set for O(1)
membership checks instead of .find() + .includes().
- Move binary extension check before fs.stat in fileRead to skip unnecessary
stat syscalls for binary files.
- Parallelize split output file writes with Promise.all instead of sequential
for-loop.
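The Map + Set re-grouping fix can be sketched as follows (hypothetical names; the real packager groups by its own keys): indexing groups in a Map and tracking membership in Sets replaces the `.find()` + `.includes()` scans that made the loop O(n²).

```typescript
// Group file paths by directory with O(1) lookups per path.
function groupByDir(paths: string[]): Map<string, Set<string>> {
  const groups = new Map<string, Set<string>>();
  for (const p of paths) {
    const dir = p.slice(0, p.lastIndexOf('/') + 1) || '.';
    let members = groups.get(dir); // O(1) instead of array .find()
    if (!members) {
      members = new Set();
      groups.set(dir, members);
    }
    members.add(p); // O(1) instead of array .includes()
  }
  return groups;
}
```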
Benchmark (15 runs each, median ± IQR, packing repomix repo ~1000 files):
main branch: 3515ms (P25: 3443, P75: 3581)
perf branch: 3318ms (P25: 3215, P75: 3383)
Improvement: -197ms (-5.6%)
Pipeline stage breakdown (instrumented):
- Metrics first-file init: 381ms → 22ms (worker pre-warmed)
- Total metrics stage: 793ms → ~450ms
All 1096 tests pass. Lint clean.
https://claude.ai/code/session_01JoNjFe7S2roMfHfNcw6bso
Previously, the interactive overwrite prompt confirmed but did not
remove the old directory, leaving stale files (e.g. renamed
tech-stack.md) behind. Now the directory is removed before
regeneration, consistent with --force behavior.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Aligns with files.md pattern (## File: <path>). Each package is now
a ## section under a single # Tech Stacks heading, with ### subsections
for Languages, Frameworks, Dependencies, etc.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Instead of merging all dependency files into a single flat list,
detectTechStack now returns a TechStackInfo[] grouped by package
directory. Each directory containing a dependency file produces its
own entry with path, languages, frameworks, dependencies, etc.
generateTechStackMd renders each package as a separate section with
`path: (root)` or `path: packages/xxx`, separated by `---`. This
gives AI consumers clearer per-package context and makes line-based
retrieval easier.
Removes deduplicateDependencies as dependencies are now scoped
per-package and don't need cross-package deduplication. configFiles
stores filenames only (not full paths) since the package path
provides the directory context.
Closes #1182
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Use first-wins for packageManager to match other dedup strategies
- Deduplicate dependencies by name:version to preserve version skew
- Normalize Node.js version v prefix before runtime version dedup
- Fix stale comment referencing root-level-only detection
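The dedup rules above can be sketched as follows (assumed shapes; `Dep` and the function names are illustrative): first-wins on a composite name:version key keeps the same package at two different versions as two entries, and the Node.js `v` prefix is stripped before runtime versions are compared.

```typescript
interface Dep { name: string; version: string }

// First-wins dedup keyed on name:version, so version skew across
// workspace packages is preserved as separate entries.
function dedupeDependencies(deps: Dep[]): Dep[] {
  const seen = new Set<string>();
  const result: Dep[] = [];
  for (const dep of deps) {
    const key = `${dep.name}:${dep.version}`;
    if (!seen.has(key)) {
      seen.add(key);
      result.push(dep);
    }
  }
  return result;
}

// Normalize "v20.11.0" -> "20.11.0" so the same runtime version written
// with and without the prefix collapses to one entry.
const normalizeNodeVersion = (v: string): string => v.replace(/^v/, '');
```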
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Defer PicoSpinner instantiation to avoid unnecessary object allocation
when the spinner will never be displayed (quiet, verbose, or stdout mode).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace log-update dependency with picospinner (from tinylibs) to reduce
transitive dependencies. picospinner provides built-in spinner functionality
(frames, symbols, succeed/fail states) that was previously manually
implemented on top of log-update, simplifying cliSpinner.ts.
This removes 12 transitive packages (ansi-escapes, cli-cursor, slice-ansi,
wrap-ansi, string-width, etc.) from the dependency tree.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Deduplicate runtimeVersions by runtime:version pair to prevent
duplicate entries when multiple version files exist across
subdirectories in monorepos.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previously, detectTechStack() only checked root-level dependency files,
causing tech-stack.md to be empty for monorepo setups using --include
to target a specific package.
Now all dependency files in processedFiles are checked regardless of
directory depth. Since processedFiles is already filtered by
--include/--ignore, this naturally scopes detection to the user's
target. Also adds dependency deduplication for cases where multiple
package.json files define the same package, and stores config file
full paths to distinguish files across subdirectories.
Closes #1182
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Raise MIN_BASE64_LENGTH_STANDALONE from 60 to 256 since truncating short
strings saves negligible tokens. Require digits in isLikelyBase64 heuristic
since real base64-encoded binary data virtually always contains numbers,
while XPath and file path strings typically do not.
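The tightened heuristic can be sketched like this (an illustrative simplification, not the actual truncateBase64 implementation): a long base64-alphabet run only counts as base64 if it contains at least one digit, which filters out XPath expressions and file paths built from letters and slashes.

```typescript
const MIN_BASE64_LENGTH_STANDALONE = 256;

function isLikelyBase64(candidate: string): boolean {
  // Too short: truncating saves negligible tokens.
  if (candidate.length < MIN_BASE64_LENGTH_STANDALONE) return false;
  // Must be drawn entirely from the base64 alphabet (with optional padding).
  if (!/^[A-Za-z0-9+\/]+={0,2}$/.test(candidate)) return false;
  // Encoded binary data virtually always contains digits; paths rarely do.
  return /[0-9]/.test(candidate);
}
```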
Closes #1298
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Extract duplicated DefaultActionRunnerResult mock into
createMockDefaultActionResult() helper function
- Add missing REPOMIX_REMOTE_TRUST_CONFIG env var mention in ko, pt-br,
ru library usage docs for consistency with other languages
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove the intermediate isRemote flag that inverted remoteTrustConfig
only to be re-inverted back to skipLocalConfig in defaultAction. Now
remoteAction computes skipLocalConfig directly, reducing the internal
flag chain from 3 concepts to 2 (remoteTrustConfig → skipLocalConfig).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Move absolute path validation for --config to before repository
download/clone, avoiding wasted I/O on invalid input
- Consolidate duplicate findConfigFile calls in skipLocalConfig branch
into a single search with conditional handling
- Add test for relative --config rejection even with --remote-trust-config
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Relative --config paths in remote mode would resolve against the cloned
temp directory, potentially loading and executing malicious config files
(e.g., repomix.config.ts) from untrusted repositories.
Now rejects relative paths with a clear error message guiding users to
use absolute paths instead.
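The guard can be sketched as follows (hypothetical function name; the real validation lives in the remote action flow): in remote mode only absolute `--config` paths are accepted, so the value can never resolve inside the cloned temp directory.

```typescript
import * as path from 'node:path';

// Reject relative --config paths when packing a remote repository, since a
// relative path would resolve against the untrusted cloned checkout.
function assertAbsoluteConfigPath(configPath: string): void {
  if (!path.isAbsolute(configPath)) {
    throw new Error(
      `--config must be an absolute path when used with --remote: ${configPath}`,
    );
  }
}
```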
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The --config flag represents an explicit user choice and should not be
blocked in remote mode. Only auto-detected config files in the cloned
repo are skipped.
Also adds a logger.note() message when a config file is found in the
remote repository but skipped, guiding users to --remote-trust-config.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When using `repomix --remote <url>` or the MCP `pack_remote_repository` tool,
config files (repomix.config.ts/js) from the cloned repository were executed
via jiti, allowing a malicious repository to achieve arbitrary code execution
on the user's machine.
This commit skips all local config file loading when processing remote
repositories. The `isRemote` flag is propagated from remoteAction through
defaultAction to loadFileConfig, which skips local config auto-detection
and --config flag resolution. Global config and CLI options continue to
work normally.
Users who need to trust remote configs can do so in a future release via
an explicit opt-in flag (e.g., --trust-remote-config).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use a strict error handler for @xmldom/xmldom's DOMParser that throws on
all severity levels (warning, error, fatalError). By default, xmldom
silently continues parsing malformed XML, which could mask XMLBuilder
regressions. This ensures tests fail immediately on any XML well-formedness
issue.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
fast-xml-parser has accumulated 10 CVEs (6 in 2026 alone), with a recurring
pattern of incomplete fixes in its DOCTYPE/entity parser. Since Repomix only
uses the XMLBuilder functionality (not the parser), switching to
fast-xml-builder — the standalone builder package that fast-xml-parser v5
internally delegates to — eliminates the noise from 9 of the 10 parser-side
CVEs while maintaining identical builder behavior.
- Replace fast-xml-parser (831KB) with fast-xml-builder (176KB) as dependency
- Add @xmldom/xmldom as devDependency for XML validation in tests
- Update import in outputGenerate.ts (named → default export)
- Migrate test XML parsing from fast-xml-parser's XMLParser to @xmldom/xmldom's
DOMParser, providing cross-implementation validation of generated XML
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Mock data and expected sort order now use path.sep instead of hardcoded
'/' separators. On Windows, path.sep is '\' so sortPaths splits
differently, producing a different sort order.
Co-Authored-By: Claude Opus 4.6 (1M context) <koukun0120@gmail.com>
Replace weak arrayContaining assertion with exact toEqual using the
correct sorted order, so the test verifies both content and sort behavior.
Co-Authored-By: Claude Opus 4.6 (1M context) <koukun0120@gmail.com>
With the streaming pipeline, errors propagate as native Error objects
rather than RepomixError, so the isExtractionError check was always
false. Retrying extraction errors is acceptable since the retry loop
is bounded to 3 attempts.
The previous ZIP-based archive download used fflate's in-memory extraction,
which failed on large repositories (e.g. facebook/react) due to memory
constraints and ZIP64 limitations.
Switch to tar.gz format with Node.js built-in zlib + tar package, enabling
a full streaming pipeline (HTTP response -> gunzip -> tar extract -> disk)
with no temporary files and constant memory usage regardless of repo size.
Key changes:
- Replace fflate with tar package for archive extraction
- Change archive URLs from .zip to .tar.gz
- Use streaming pipeline instead of download-then-extract
- Leverage tar's built-in strip and path traversal protection
- Explicitly destroy streams after pipeline for Bun compatibility
- Use child_process runtime under Bun to avoid worker_threads hang
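The streaming idea can be sketched with Node built-ins alone (gunzip-to-memory here for brevity; the real pipeline pipes the HTTP response through `createGunzip()` into the tar package's extractor and onto disk): bytes flow through the pipeline chunk by chunk, so memory stays constant regardless of archive size.

```typescript
import { Readable, Writable } from 'node:stream';
import { pipeline } from 'node:stream/promises';
import { createGunzip, gzipSync } from 'node:zlib'; // gzipSync only for the usage example

// source -> gunzip -> sink, with backpressure handled by pipeline().
async function gunzipToString(gzipped: Buffer): Promise<string> {
  const chunks: Buffer[] = [];
  const sink = new Writable({
    write(chunk: Buffer, _enc, cb) {
      chunks.push(chunk);
      cb();
    },
  });
  await pipeline(Readable.from([gzipped]), createGunzip(), sink);
  return Buffer.concat(chunks).toString('utf-8');
}
```

Usage: `await gunzipToString(gzipSync('hello'))` returns `'hello'`; in the real flow the sink is replaced by tar extraction.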
File collection was replaced with a promise pool approach in 96ff05dc,
but the worker-related code remained. This removes the now-unused
fileCollectWorker and all references to it from the worker system.
After the UTF-8 fast path optimization eliminated the CPU-heavy jschardet
bottleneck, file collection became I/O-bound. Worker threads now add pure
overhead (Tinypool init, structured clone, IPC) without benefit.
Benchmark (954 files, M2 Pro 10-core):
- Worker Threads: ~108ms → Promise Pool (c=50): ~37ms (2.9x faster)
Changes:
- Replace Tinypool worker dispatch with a simple promise pool (c=50)
- Inject readRawFile via deps for testability
- Remove unused concurrentTasksPerWorker from WorkerOptions
- Simplify tests to use readRawFile mock instead of 5+ module mocks
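The promise pool can be sketched as follows (a minimal version of the pattern; the actual implementation differs in details): at most `concurrency` tasks are in flight at once, with no worker threads, structured clone, or IPC involved.

```typescript
// Run worker(item) over all items with bounded concurrency, preserving order.
async function promisePool<T, R>(
  items: T[],
  worker: (item: T) => Promise<R>,
  concurrency = 50,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  const runners = Array.from(
    { length: Math.min(concurrency, items.length) },
    async () => {
      while (next < items.length) {
        const index = next++; // single-threaded JS makes this claim race-free
        results[index] = await worker(items[index]);
      }
    },
  );
  await Promise.all(runners);
  return results;
}
```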
Previously, every file went through jschardet.detect() which scans the entire
buffer through multiple encoding probers (MBCS, SBCS, Latin1) with frequency
table lookups — the most expensive CPU operation in file collection.
Since ~99% of source code files are UTF-8, we now try TextDecoder('utf-8',
{ fatal: true }) first. If it succeeds, jschardet and iconv are skipped entirely.
Non-UTF-8 files (e.g., Shift-JIS, EUC-KR) fall back to the original detection path.
Additionally, set concurrentTasksPerWorker=3 for fileCollect workers to better
overlap I/O waits within each worker thread.
Benchmark results (838 files, 10 CPU cores):
- Before: ~616ms
- After: ~108ms (5.7x faster)
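The fast path can be sketched like this (assumed shape; the actual fallback branch runs jschardet + iconv-lite): a fatal `TextDecoder` throws on any invalid UTF-8 byte sequence, so a successful decode proves the buffer is valid UTF-8 and full charset detection can be skipped.

```typescript
const utf8Decoder = new TextDecoder('utf-8', { fatal: true });

function decodeFile(buffer: Uint8Array): { text: string; fastPath: boolean } {
  try {
    // ~99% of source files take this branch; no prober scan needed.
    return { text: utf8Decoder.decode(buffer), fastPath: true };
  } catch {
    // Non-UTF-8 (e.g. Shift-JIS, EUC-KR): fall back to detection here.
    return { text: '', fastPath: false };
  }
}
```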
Allow users to run `repomix https://github.com/user/repo` or
`repomix git@github.com:user/repo.git` without the `--remote` flag.
Only explicit URL formats (https:// and git@) are auto-detected.
Shorthand format (owner/repo) is not auto-detected to avoid
ambiguity with local directory paths.
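The detection rule can be sketched as follows (hypothetical helper name): only unambiguous remote URL prefixes are treated as remote, while bare `owner/repo` is left alone because it could be a local directory.

```typescript
// Auto-detect only explicit remote URL formats; never shorthand.
function isExplicitRemoteUrl(target: string): boolean {
  return target.startsWith('https://') || target.startsWith('git@');
}
```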
Closes #1120
Define CliCommandPackOptions interface locally in cliCommand.ts instead of
importing PackOptions from usePackOptions.ts which depends on Vue module.
This prevents tsc from following the import chain to Vue in CI.
Address PR review comments:
- Add shell escaping for user-controlled values (repositoryUrl, includePatterns, ignorePatterns)
to prevent command injection when users copy-paste the generated command
- Skip --remote flag for uploaded file names by validating with isValidRemoteValue
- Add unit tests for generateCliCommand covering all option combinations
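The escaping can be sketched with the standard POSIX single-quote technique (an illustrative sketch; the project's helper may differ): wrap the value in single quotes and replace each embedded single quote with `'\''`, so a copy-pasted command cannot inject extra shell syntax.

```typescript
// Escape a value for safe interpolation into a POSIX shell command line.
function shellEscape(value: string): string {
  // close quote, emit an escaped literal quote, reopen quote
  return `'${value.replace(/'/g, `'\\''`)}'`;
}
```

For example, `shellEscape("a'b; rm -rf /")` yields a single-quoted token the shell treats as one literal argument.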