Commit Graph

14 Commits

Author SHA1 Message Date
Kazuki Yamada fa06e5059c fix(website): Address PR review feedback on siteverify metric
Seven items from gemini, claude initial review, and claude follow-up:

- turnstile.ts: Update misleading comment that claimed the metric filters
  on `event=turnstile_siteverify` and `outcome=success`. The actual
  Cloud Monitoring metrics in `monitoring/metrics/` filter on
  `siteverifyDurationMs` field presence, which uniformly captures both
  the parallel success log (event=turnstile_siteverify) and the four
  rejectAndLog failure paths (event=pack_completed). The comment
  contradicted README and YAML and would mislead future readers.
- turnstile.ts: Wrap rejectAndLog in a local `rejectWithDuration` helper
  so every post-siteverify branch automatically carries
  `siteverifyDurationMs`. Prevents drift if a fifth reject reason gets
  added later.
- client.ts: Split the wire-protocol `PackProgressStage` (server-emitted
  SSE values) from the display-only `DisplayProgressStage` superset that
  adds `verifying`. Keeping the synthetic stage out of the wire type
  prevents silent divergence with the server's `PackProgressStage`.
- usePackRequest.ts, TryItLoading.vue, TryItResult.vue: Switch the
  display-side type to `DisplayProgressStage`. `onProgress` callbacks
  still take the wire `PackProgressStage`.
- usePackRequest.ts: Clear `progressStage` on token-acquisition failure
  branches (aborted / error). Functionally invisible since loading=false
  hides the loading UI, but prevents the next submit's verifying flash
  from briefly showing the previous run's stale state.
- monitoring/metrics/turnstile_siteverify_duration.yaml: Retune the
  exponential bucket layout for the 100ms-1s SLO band where decisions
  get made. Doubling buckets only placed ~3 boundaries between 100ms
  and 1s; growthFactor=1.5 with scale=10 places ~8 boundaries there.
  18 finite buckets cover 10ms to ~9.85s, comfortably above the 5s
  siteverify timeout so timeouts don't land in overflow.
- monitoring/README.md: Document that pre-network rejections
  (secret_missing, missing_token, token_too_long) intentionally don't
  carry siteverifyDurationMs, so they're excluded from both metrics
  but still appear in the existing pack_requests metric.
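
The bucket arithmetic in the `turnstile_siteverify_duration.yaml` item can be checked with a short TypeScript sketch. It assumes boundaries of the form scale * growthFactor^i for i = 0..17, and that the earlier doubling layout used the same scale of 10 (the commit message does not say). `exponentialBounds` and `countInBand` are hypothetical helper names, not code from the repository.

```typescript
// Hypothetical helpers for checking the bucket arithmetic; not repository code.
// Assumption: boundary i is scale * growthFactor^i, for i = 0 .. count - 1.
function exponentialBounds(scale: number, growthFactor: number, count: number): number[] {
  return Array.from({ length: count }, (_, i) => scale * growthFactor ** i);
}

// Count boundaries that fall strictly inside the (lowMs, highMs) band.
function countInBand(bounds: number[], lowMs: number, highMs: number): number {
  return bounds.filter((b) => b > lowMs && b < highMs).length;
}

const SITEVERIFY_TIMEOUT_MS = 5000; // the 5s siteverify timeout mentioned above

const retuned = exponentialBounds(10, 1.5, 18);
const doubling = exponentialBounds(10, 2, 18); // assumed scale for the old layout

const largestRetuned = retuned[retuned.length - 1];
console.log(largestRetuned.toFixed(1), largestRetuned > SITEVERIFY_TIMEOUT_MS);
console.log(countInBand(doubling, 100, 1000), countInBand(retuned, 100, 1000));
```

Under these assumptions the largest boundary lands near 9.85s, above the 5s timeout, and the 1.5 growth factor places more boundaries inside the 100ms-1s band than doubling does. The exact in-band count depends on the indexing convention, so the printed counts are indicative only.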

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 21:42:21 +09:00
Kazuki Yamada 35c56abd02 perf(website): Show verifying step + emit siteverify duration metric
Two changes targeting the visible "..." gap between Pack click and the
first SSE progress event observed after PR #1544 landed:

- Client: add a synthetic `verifying` PackProgressStage so the loading
  UI displays "Verifying request..." while the server runs Turnstile
  siteverify (typically 100-1000ms before the first 'cache-check' SSE
  event arrives). The first onProgress callback from handlePackRequest
  overwrites it with the real server-reported stage.

- Server: time the siteverify round-trip in `turnstileMiddleware` and
  emit `siteverifyDurationMs` on every outcome (success / network
  failure / rejected / action mismatch / hostname mismatch). Success
  path adds a structured log with `event: turnstile_siteverify` so
  Cloud Monitoring can build a log-based distribution metric for
  p50/p95/p99 latency and alert on regressions during Cloudflare
  incidents.
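
The server-side change can be pictured with a minimal TypeScript sketch. It is not the code in `turnstileMiddleware`: `timeSiteverify`, `VerifyOutcome`, and the injected `verify` callback are hypothetical names, the outcome set is reduced to three for brevity, and the real logging calls are not shown in this commit message.

```typescript
// Hypothetical sketch, not the repository's turnstileMiddleware.
type VerifyOutcome = "success" | "network_failure" | "rejected";

interface TimedResult {
  outcome: VerifyOutcome;
  siteverifyDurationMs: number; // attached to every outcome, success or failure
}

// Times an injected siteverify call so the duration is recorded on all branches.
async function timeSiteverify(
  verify: () => Promise<boolean>,
  now: () => number = () => Date.now(),
): Promise<TimedResult> {
  const startedAt = now();
  let outcome: VerifyOutcome;
  try {
    outcome = (await verify()) ? "success" : "rejected";
  } catch {
    // A thrown error stands in for a network failure in this sketch.
    outcome = "network_failure";
  }
  return { outcome, siteverifyDurationMs: Math.round(now() - startedAt) };
}
```

Computing the duration once, after the `try`/`catch`, is what keeps the field present on every branch. A caller would merge `siteverifyDurationMs` into whatever structured log entry it emits for that outcome.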

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 19:19:01 +09:00
Kazuki Yamada 523fb07111 feat(website): Add Cloudflare Turnstile verification to /api/pack
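intent(pack-defense): Countermeasure against anonymous crawlers hammering Repomix as a "code extraction API". An investigation of GA + Cloud Run logs confirmed an incident in late 2026-04 in which 21K+ unique GitHub repos were systematically enumerated over 7 days. The IP-based rate limit (3/min, 30/day) was working, but the traffic was spread across 2,256+ unique IPs via residential proxies + Tencent Cloud SG, which effectively neutralized it.
decision(turnstile-vs-asn): Adopt Turnstile (JS challenge). ASN blocking is powerless against residential proxies, and a tighter daily limit can be diluted by adding IPs. An invisible JS challenge cannot be passed even through a residential proxy, so it is the most cost-asymmetric option.
constraint(scope): /api/pack only. Docs / health / homepage browsing are unaffected: Googlebot / GPTBot / ClaudeBot etc. only issue GET requests for HTML, so there is no SEO/LLMO impact.
decision(fail-policy): If TURNSTILE_SECRET_KEY is unset, fail open (does not break dev/preview; a warning log is emitted). If it is set, a missing token, a siteverify failure, and a network failure all fail closed with a 403.
decision(token-transport): Send the token in the X-Turnstile-Token header. Putting it in a FormData field would pollute packRequestSchema (valibot), so the layer is kept separate as a cross-cutting concern.
decision(client-widget): Render the invisible widget when TryIt.vue mounts, and obtain a one-shot token with turnstile.execute() just before submit. Tokens are valid for 5 minutes and single-use, so every pack does reset → execute.
rejected(form-field-token): Putting cfTurnstileToken in FormData: packRequestSchema is a strict object, so a new field would have to be added, and business logic would get mixed with authentication.
rejected(asn-block): WAF block on Tencent Cloud SG (AS132203): effective against the bulk portion, but the residential proxy portion (home ISPs, mobile, university networks) is scattered worldwide and cannot be blocked per ASN, with a risk of catching legitimate users.
rejected(daily-limit-tightening): 30 → 10/day per IP: pointless against an adversary who can dilute the limit by IP count; it only degrades the experience for human users.
constraint(observability): Reported on the existing pack_completed event as outcome="turnstile_failed". No new metric is needed; it shows up automatically as a reject reason on existing dashboards.
learned(rate-limit-effectiveness): During the spike, the SG pack success rate was 0.15% (32/21,501). The app-level rate limit did stop processing, but the ingress load (TCP/TLS/Upstash check) was still incurred. Turnstile sits closer to the CDN layer and has the advantage of rejecting earlier.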
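
The fail-policy and token-transport decisions above can be sketched as a small function in TypeScript. This is an illustration under stated assumptions, not the middleware itself: `decideTurnstile`, `TurnstileDecision`, the reason strings, and the injected `siteverify` callback are hypothetical. Only the `X-Turnstile-Token` header name, the `TURNSTILE_SECRET_KEY` setting, and the 403 status come from this commit message.

```typescript
// Hypothetical sketch of the fail-open / fail-closed policy; not repository code.
type TurnstileDecision =
  | { allow: true; reason: "verified" | "secret_unset_fail_open" }
  | { allow: false; status: 403; reason: "missing_token" | "siteverify_failed" | "network_failure" };

async function decideTurnstile(
  secretKey: string | undefined, // value of TURNSTILE_SECRET_KEY
  headers: Headers,
  siteverify: (secret: string, token: string) => Promise<boolean>,
): Promise<TurnstileDecision> {
  if (!secretKey) {
    // Fail open so dev/preview keeps working; the caller should log a warning.
    return { allow: true, reason: "secret_unset_fail_open" };
  }
  const token = headers.get("X-Turnstile-Token");
  if (!token) {
    return { allow: false, status: 403, reason: "missing_token" };
  }
  try {
    const ok = await siteverify(secretKey, token);
    return ok
      ? { allow: true, reason: "verified" }
      : { allow: false, status: 403, reason: "siteverify_failed" };
  } catch {
    // Network errors also fail closed once a secret is configured.
    return { allow: false, status: 403, reason: "network_failure" };
  }
}
```

Reading the token from a header keeps the request body schema untouched, which is the separation the token-transport decision describes.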

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 21:41:04 +09:00
Kazuki Yamada 902f87d353 feat(website): Stream pack progress via NDJSON
Add real-time progress streaming to the pack endpoint using Hono's
stream() helper with NDJSON format. Users now see stage-specific
messages during processing instead of a static "Processing repository..." message.

Server changes:
- packAction uses Hono stream() with NDJSON (one JSON per line)
- processRemoteRepo split into git clone + runDefaultAction for
  separate cloning/processing stages
- processZipFile reports extracting/processing stages
- Content-Encoding: identity skips compression so progress is delivered in real time

Client changes:
- packRepository parses NDJSON stream with onProgress callback
- TryItLoading displays stage messages (Checking cache, Cloning
  repository, Processing files, etc.)
- message field prepared for future detailed progress from pack()

Progress stages:
- URL: cache-check → cloning → processing → result
- ZIP: extracting → processing → result
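
The client-side parsing described above can be sketched as follows. This is an assumption-laden TypeScript illustration, not the actual `packRepository` code: `createNdjsonParser` is a hypothetical helper, and the `{ type, stage }` message shape is invented for the example because the commit message does not spell out the wire format.

```typescript
// Hypothetical NDJSON line splitter; not the repository's packRepository code.
// It buffers partial lines so a JSON object split across chunks is parsed once, whole.
function createNdjsonParser(onMessage: (message: unknown) => void) {
  let buffer = "";
  return {
    push(chunk: string): void {
      buffer += chunk;
      const lines = buffer.split("\n");
      buffer = lines.pop() ?? ""; // keep the trailing partial line for the next chunk
      for (const line of lines) {
        if (line.trim() !== "") onMessage(JSON.parse(line));
      }
    },
    flush(): void {
      // Handle a final line that was not terminated by a newline.
      if (buffer.trim() !== "") onMessage(JSON.parse(buffer));
      buffer = "";
    },
  };
}

// Usage with an assumed message shape of { type, stage }.
const received: unknown[] = [];
const parser = createNdjsonParser((m) => received.push(m));
parser.push('{"type":"progress","stage":"cache-check"}\n{"type":"pro');
parser.push('gress","stage":"cloning"}\n');
parser.flush();
console.log(received.length); // 2
```

In a browser the chunks would typically come from reading `response.body` and decoding with `TextDecoder` in streaming mode. The buffering matters because network chunk boundaries do not align with line boundaries.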

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 22:08:48 +09:00
Kazuki Yamada cf959039fe perf(website): Use character counts instead of token counts for file selection
Remove tokenCountTree option from website server to avoid expensive
token count calculation for all files. The file selection UI now uses
character counts (which are computed from content.length at no cost)
instead of token counts. Summary totalTokens and top files token counts
remain accurate as they are calculated independently.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 19:15:43 +09:00
Kazuki Yamada a19518e916 feat(website): Display security alert for suspicious files in pack results
When security check detects files containing potentially sensitive
information in zip uploads, show a warning section in the result metadata
panel listing the excluded files and detection reasons.
2026-02-26 22:43:43 +09:00
Kazuki Yamada 7b79ea8292 fix(website): revert API_BASE_URL configuration to standard pattern
Reverts the DEV/PROD logic change in API_BASE_URL back to the standard
pattern used throughout the codebase. The original change was
unintentional and inconsistent with project conventions.
2025-08-24 13:02:50 +09:00
spandan-kumar 27626d42dd feat(website): add file selection checkboxes for selective re-packing
- Add FileInfo interface for individual file metadata
- Extend PackResult to include allFiles array with complete file information
- Create TryItFileSelection component with checkboxes for each file
- Add bulk selection controls (Select All/Deselect All)
- Implement re-packing functionality for selected files only
- Add live statistics showing selected files and token counts
- Fix TypeScript configuration for proper import.meta.env support
- Add responsive design for mobile and desktop
- Include scrollable file list with proper overflow handling

Resolves the GitHub issue requesting checkboxes for file inclusion/exclusion
on the website UI. Provides tree-based selection interface after initial
packing process as suggested by maintainer.

# Conflicts:
#	website/client/components/Home/TryItResult.vue
2025-08-24 13:02:50 +09:00
Kazuki Yamada 5109f3529d feat(website): implement compress functionality on website
2025-04-02 23:23:34 +09:00
paperboardofficial 87d7204be0 lint changes
2025-02-19 00:04:27 +09:00
paperboardofficial 9dd7407b88 added zip file processing feature
2025-02-19 00:04:27 +09:00
Yamada Dev 14aaf980c6 feat(website): Improve performance of large repository output using Ace Editor
2025-01-25 17:17:31 +09:00
Yamada Dev 34301ebda0 feat(website): Add parsableStyle option
2025-01-20 22:30:37 +09:00
Kazuki Yamada 6a3704363d refact(website): Changed folder structure for multi-language support in documentation
2025-01-14 00:05:30 +09:00