11 Commits

Author SHA1 Message Date
Kazuki Yamada 55d3293134 perf(core): Skip binary files during GitHub archive tar extraction
Add an archive entry filter that checks file extensions with isBinaryPath
before writing to disk, avoiding unnecessary I/O for binary files (images,
fonts, executables, etc.) that would be excluded later anyway.

The filter strips the leading tar segment (e.g. "repo-branch/") since tar's
filter callback receives paths before strip is applied.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 23:25:19 +09:00
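The entry filter described in 55d3293134 might look roughly like the sketch below. It is not the repository's code: `BINARY_EXTENSIONS` is a tiny hypothetical stand-in for the isBinaryPath check, and the names `stripLeadingSegment` and `shouldExtractEntry` are assumptions made for illustration.

```typescript
// Hypothetical stand-in for isBinaryPath; a real binary-extension list is far longer.
const BINARY_EXTENSIONS = new Set(["png", "jpg", "gif", "woff2", "ttf", "exe", "zip"]);

/** Drop the first path segment, e.g. "repo-branch/src/a.ts" -> "src/a.ts". */
export function stripLeadingSegment(entryPath: string): string {
  const slash = entryPath.indexOf("/");
  return slash === -1 ? "" : entryPath.slice(slash + 1);
}

/** Returns true when the archive entry should be written to disk. */
export function shouldExtractEntry(entryPath: string): boolean {
  // The filter sees the unstripped path, so remove "repo-branch/" first.
  const relativePath = stripLeadingSegment(entryPath);
  const fileName = relativePath.slice(relativePath.lastIndexOf("/") + 1);
  const dot = fileName.lastIndexOf(".");
  // No extension, a dotfile such as ".gitignore", or a directory entry: keep it.
  if (dot <= 0) return true;
  return !BINARY_EXTENSIONS.has(fileName.slice(dot + 1).toLowerCase());
}
```

A predicate of this shape could be supplied as the filter callback the commit mentions, so binary entries are rejected before any bytes are written.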
Kazuki Yamada b7cf3f05e2 refactor(git): Simplify URL construction by leveraging codeload.github.com auto-resolution
codeload.github.com resolves branches, tags, and SHAs automatically
without refs/heads/ or refs/tags/ prefixes. This eliminates the tag
fallback URL entirely and simplifies buildGitHubArchiveUrl to a single
return statement, saving an extra round trip for tag-based downloads.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 14:47:30 +09:00
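A single-return buildGitHubArchiveUrl, as described in b7cf3f05e2, could be sketched as follows. The `RepoRef` shape, its field names, and the exact `/tar.gz/` path layout are assumptions for illustration; the `HEAD` default follows the earlier change in d5c5cd8bdc.

```typescript
/** Hypothetical input shape; the field names are assumptions for this sketch. */
export interface RepoRef {
  owner: string;
  repo: string;
  ref?: string; // branch, tag, or commit SHA, passed without a refs/ prefix
}

/** One URL for every ref type, so no tag fallback request is needed. */
export function buildGitHubArchiveUrl({ owner, repo, ref = "HEAD" }: RepoRef): string {
  return `https://codeload.github.com/${owner}/${repo}/tar.gz/${ref}`;
}
```

For example, `buildGitHubArchiveUrl({ owner: "facebook", repo: "react", ref: "main" })` would produce a URL on codeload.github.com directly, which is also what avoids the redirect discussed in 951249f9cf.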
Kazuki Yamada 951249f9cf perf(git): Use codeload.github.com directly to skip 302 redirect
Download archives from codeload.github.com instead of github.com/archive
to eliminate the intermediate 302 redirect, saving ~100-300ms per request.
This is the same pattern used by create-react-app and degit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 14:35:34 +09:00
Kazuki Yamada 2e98b5cc18 refactor(core): Remove dead isExtractionError check from retry logic
With the streaming pipeline, errors propagate as native Error objects
rather than RepomixError, so the isExtractionError check was always
false. Retrying extraction errors is acceptable since the retry loop
is bounded to 3 attempts.
2026-02-23 22:58:56 +09:00
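The bounded retry behaviour that 2e98b5cc18 relies on can be illustrated with a minimal sketch. The helper name `withRetries`, its parameters, and the base delay are hypothetical; only the three-attempt bound and the exponential backoff come from the commit messages in this log.

```typescript
/**
 * Run an async operation up to maxAttempts times, retrying on any thrown
 * value (native Error included) with exponential backoff between attempts.
 */
export async function withRetries<T>(
  operation: (attempt: number) => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 100, // assumed value, not taken from the repository
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        // Delay doubles each time: baseDelayMs, 2x, 4x, ...
        const delayMs = baseDelayMs * Math.pow(2, attempt - 1);
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // All attempts failed: surface the most recent error to the caller.
  throw lastError;
}
```

Because the loop is bounded, retrying an error that will never succeed costs at most two extra attempts, which is the trade-off the commit accepts.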
Kazuki Yamada ef194b8eeb perf(core): Replace ZIP archive download with streaming tar.gz extraction
The previous ZIP-based archive download used fflate's in-memory extraction,
which failed on large repositories (e.g. facebook/react) due to memory
constraints and ZIP64 limitations.

Switch to tar.gz format with Node.js built-in zlib + tar package, enabling
a full streaming pipeline (HTTP response -> gunzip -> tar extract -> disk)
with no temporary files and constant memory usage regardless of repo size.

Key changes:
- Replace fflate with tar package for archive extraction
- Change archive URLs from .zip to .tar.gz
- Use streaming pipeline instead of download-then-extract
- Leverage tar's built-in strip and path traversal protection
- Explicitly destroy streams after pipeline for Bun compatibility
- Use child_process runtime under Bun to avoid worker_threads hang
2026-02-18 00:22:07 +09:00
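The streaming shape described in ef194b8eeb can be sketched with Node.js built-ins alone. This is an illustration under assumptions: a generic `Writable` sink stands in for the tar extraction stage, an in-memory gzip buffer stands in for the HTTP response body, and the names `gunzipInto` and `roundTrip` are hypothetical.

```typescript
import { Readable, Writable } from "node:stream";
import { pipeline } from "node:stream/promises";
import { createGunzip, gzipSync } from "node:zlib";

/** Stream gzip-compressed bytes through gunzip into a sink, then release the streams. */
export async function gunzipInto(source: Readable, sink: Writable): Promise<void> {
  const gunzip = createGunzip();
  try {
    // Chunks flow through one at a time; nothing is buffered as a whole archive.
    await pipeline(source, gunzip, sink);
  } finally {
    // Explicit cleanup, mirroring the commit's note about Bun compatibility.
    source.destroy();
    gunzip.destroy();
    sink.destroy();
  }
}

/** Demo: compress a string in memory, then stream it back out through the pipeline. */
export async function roundTrip(text: string): Promise<string> {
  const chunks: Buffer[] = [];
  // Stand-in for the extraction stage: just collect the decompressed bytes.
  const sink = new Writable({
    write(chunk: Buffer, _encoding, callback) {
      chunks.push(chunk);
      callback();
    },
  });
  await gunzipInto(Readable.from([gzipSync(Buffer.from(text))]), sink);
  return Buffer.concat(chunks).toString("utf8");
}
```

In the pipeline the commit describes, the source would be the HTTP response and the sink would be the tar package's extraction stream writing to disk, with its strip option removing the leading directory.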
Kazuki Yamada 66e572f62e test(core): Add test for skipping retry on extraction error
Verify that extraction errors cause immediate failure without retrying,
since the same archive will produce the same extraction error.
2026-02-14 18:55:22 +09:00
Kazuki Yamada a94ce0f2ff fix(tests): Update test mocks for vitest v4 compatibility
Vitest v4 changed how vi.fn() and vi.mock() work with class constructors.
Arrow functions in mockImplementation no longer work as constructors
when called with 'new' keyword.

Changes:
- Use regular function syntax instead of arrow functions for constructor mocks
- Use vi.hoisted() to define class mocks that can be used in vi.mock() factories
- Replace vi.fn().mockReturnValue() with vi.fn().mockImplementation() for class mocks
- Update mock instance retrieval to use vi.mocked().mock.results[0].value
2026-01-03 16:28:31 +09:00
Kazuki Yamada ea1cc485c2 chore(config): disable organizeImports for src/index.ts
Added override configuration to disable Biome's organizeImports feature
specifically for src/index.ts to allow manual import order management
while keeping automatic import organization enabled for other files.
2025-09-21 13:54:12 +09:00
Kazuki Yamada d5c5cd8bdc test(git): Update gitHubArchive test to use HEAD instead of main
Update the test expectation to reflect the change from hardcoded 'main'
branch to using HEAD for the repository's default branch.
2025-08-26 01:01:01 +09:00
Kazuki Yamada 9708e1cf38 perf(test): Optimize gitHubArchive test execution time
Reduced test execution time from ~23s to ~3.5s (85% improvement) by:

- Reducing retry counts from 3 to 1-2 for error handling tests
- Shortening timeout values (100ms → 50ms) for timeout tests
- Removing unnecessary imports (createWriteStream, fs, Readable, pipeline)
- Fixing unused parameter warnings with an underscore prefix (_data)

Performance improvements:
- "should retry on failure": 3005ms → 1002ms
- "should throw error after all retries fail": 2007ms → 2005ms (minor)
- "should handle ZIP extraction error": 6011ms → 3ms
- "should handle timeout": 408ms → 208ms
- Other error tests: 6000ms+ → 1-3ms each

The tests still validate all critical functionality including:
- Retry logic and exponential backoff behavior
- Error handling for network/ZIP failures
- Security protections and edge cases
- Timeout handling mechanisms

Trade-off: Slightly reduced retry testing depth for much faster CI/development cycles.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-16 00:20:43 +09:00
Kazuki Yamada f4f911e637 test(git): Add comprehensive test suite for gitHubArchive.ts
Created thorough unit tests covering all functionality of the GitHub archive
download and extraction module. Tests include:

- Successful download and extraction flow
- Progress callback handling
- Retry logic with exponential backoff
- URL fallback strategies (main → master → tag)
- Error handling for network failures, ZIP corruption, timeouts
- Security validations for path traversal and absolute paths
- Archive cleanup on both success and failure
- Multiple response scenarios (404, timeout, missing body)

Test coverage includes:
- downloadGitHubArchive function with various scenarios
- isArchiveDownloadSupported function
- All edge cases and error conditions
- Security protection mechanisms

Uses proper mocking with vitest for external dependencies:
- fetch API for HTTP requests
- fflate library for ZIP extraction
- Node.js fs operations
- Stream processing components

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-16 00:15:10 +09:00
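The path traversal and absolute path validations that f4f911e637 tests could be approximated by a check like the one below. This is a hypothetical sketch, not the module's actual validation: the name `isSafeEntryPath` and its rules are assumptions, and the later tar-based implementation relies on the tar package's built-in protection instead.

```typescript
/**
 * Returns true when an archive entry path is relative and contains no ".."
 * segment, so that joining it to the extraction directory cannot escape it.
 */
export function isSafeEntryPath(entryPath: string): boolean {
  // Treat backslashes as separators so Windows-style paths are checked too.
  const normalized = entryPath.replace(/\\/g, "/");
  if (normalized === "") return false;
  // Reject POSIX absolute paths and drive-letter paths such as "C:/...".
  if (normalized.charAt(0) === "/" || /^[A-Za-z]:/.test(normalized)) return false;
  // Reject any ".." segment; names that merely contain dots are fine.
  return normalized.split("/").indexOf("..") === -1;
}
```

A test in the spirit of the suite would assert that `isSafeEntryPath("../etc/passwd")` and `isSafeEntryPath("/etc/passwd")` are both rejected while ordinary relative paths pass.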