Six items from claude's incremental review (`12:48:43Z`):
- monitoring/dashboard.json: Group the outcomes widget by both
`metric.label.outcome` and `metric.label.reason`. Previously all
failures collapsed into a single `turnstile_failed` series, which
contradicted the README claim that the `reason` label drives the
breakdown.
- monitoring/metrics/*.yaml: Narrow the metric filter to
`jsonPayload.event=("turnstile_siteverify" OR "pack_completed")`.
Without this anchor, any future code path attaching
`siteverifyDurationMs` to an unrelated log silently joins the
distribution and creates new metric label values.
- usePackRequest.ts: Mirror `progressMessage.value = null` alongside
the `progressStage.value = null` clear on token-acquisition aborted /
error branches. Prevents a future edit setting a verifying message
from leaking prior-run state.
- turnstile.test.ts: Add a focused `describe` block with five tests
asserting `siteverifyDurationMs` is attached to every post-siteverify
log (one success path + four reject branches). The metric YAML
filters on field presence, so a refactor that drops the field on any
branch would silently break the metric without other tests failing.
Uses the existing `vi.spyOn(logger, ...)` pattern; no clock injection
needed.
- monitoring/README.md: Note that the metric filter pins
`service_name="repomix-server-us"`, so future regions (`-eu`,
`-asia`) silently drop out until the filter is broadened or
per-region counterparts applied.
- monitoring/README.md: Add a `gcloud logging metrics describe` snippet
for verifying a YAML edit was actually applied (gcloud update is
silent on no-op vs effective change).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Six items from gemini, claude initial review, and claude follow-up:
- turnstile.ts: Update misleading comment that claimed the metric filters
on `event=turnstile_siteverify` and `outcome=success`. The actual
Cloud Monitoring metrics in `monitoring/metrics/` filter on
`siteverifyDurationMs` field presence, which uniformly captures both
the parallel success log (event=turnstile_siteverify) and the four
rejectAndLog failure paths (event=pack_completed). The comment
contradicted README and YAML and would mislead future readers.
- turnstile.ts: Wrap rejectAndLog in a local `rejectWithDuration` helper
so every post-siteverify branch automatically carries
`siteverifyDurationMs`. Prevents drift if a fifth reject reason gets
added later.
- client.ts: Split the wire-protocol `PackProgressStage` (server-emitted
SSE values) from the display-only `DisplayProgressStage` superset that
adds `verifying`. Keeping the synthetic stage out of the wire type
prevents silent divergence with the server's `PackProgressStage`.
- usePackRequest.ts, TryItLoading.vue, TryItResult.vue: Switch the
display-side type to `DisplayProgressStage`. `onProgress` callbacks
still take the wire `PackProgressStage`.
- usePackRequest.ts: Clear `progressStage` on token-acquisition failure
branches (aborted / error). Functionally invisible since loading=false
hides the loading UI, but prevents the next submit's verifying flash
from briefly showing the previous run's stale state.
- monitoring/metrics/turnstile_siteverify_duration.yaml: Retune the
exponential bucket layout for the 100ms-1s SLO band where decisions
get made. Doubling buckets only placed ~3 boundaries between 100ms
and 1s; growthFactor=1.5 with scale=10 places ~8 boundaries there.
18 finite buckets cover 10ms to ~9.85s, comfortably above the 5s
siteverify timeout so timeouts don't land in overflow.
- monitoring/README.md: Document that pre-network rejections
(secret_missing, missing_token, token_too_long) intentionally don't
carry siteverifyDurationMs, so they're excluded from both metrics
but still appear in the existing pack_requests metric.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move the metric setup from prose-only README instructions to checked-in
YAML files under `monitoring/metrics/` so the dashboard, the metrics it
depends on, and the apply commands all live next to each other.
- `turnstile_siteverify_duration.yaml`: distribution metric on
`jsonPayload.siteverifyDurationMs`, exponential buckets 1ms-32s.
- `turnstile_siteverify_outcomes.yaml`: counter metric with `outcome`
and `reason` labels for the success-vs-failure breakdown widget.
README updated with the gcloud commands.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wire up two new tiles to surface the `siteverifyDurationMs` field added
in the previous commit:
- "Turnstile siteverify latency P50 / P95 / P99" — line chart with a
1s threshold marker so a steady regression jumps off the chart.
- "Turnstile siteverify outcomes (by outcome)" — stacked area
breaking down success vs turnstile_failed counts over time.
Both depend on log-based metrics `turnstile_siteverify_duration`
(distribution) and `turnstile_siteverify_outcomes` (counter) that need
to be created once in the GCP Console — README documents the filter,
field, and label extractors so the setup is reproducible.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
intent(readability): dashboard panels for options usage and cache hit
ratio were rendering with empty legend values (e.g. "compress=",
"removeComments=") making it impossible to distinguish true vs false
series. Every other signal on the dashboard was fine — this was a
subtle API mismatch
decision(labels-plural): GCP Cloud Monitoring widget legend templates
use `${metric.labels.X}` (plural). The dashboard had
`${metric.label.X}` (singular), which silently resolves to the empty
string instead of erroring out. Replaced all 5 occurrences
Applied to the live GCP dashboard via `gcloud monitoring dashboards
update` before committing — legends now render correctly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses PR #1493 review comments from gemini-code-assist,
devin-ai-integration, coderabbitai, and claude.
intent(classifier-fix): fix classifyRejectReason returning 'unknown'
for every validation_error — gemini and devin both flagged that
validateRequest wraps ZodError in AppError, which strips the
.issues array the classifier needs. This made the new "Validation
rejections (by reason)" dashboard panel effectively useless, all
rejections collapsing into one 'unknown' bucket
decision(cause-chain): use the native `Error.cause` option
(ES2022) to attach the original ZodError when wrapping in AppError.
Less invasive than adding a custom field; classifyRejectReason now
checks both `.issues` directly on the error AND on `.cause`, so
callers don't need to know which layer wrapped the error
decision(json-catch-preserve): change the JSON.parse bare `catch {}`
to `catch (jsonError)` and pass the original SyntaxError to
logError. Without this the `invalid_json` bucket shows only a
count, leaving an operator no breadcrumbs when it spikes. Also
drops the classifyRejectReason round-trip — rejectReason is
statically known at this call site
rejected(packOptions-snake-case): gemini suggested renaming the
packOptions log fields to snake_case so they literally match the
GCP metric label names. Keeping camelCase in TS code (project
convention) + snake_case in GCP labels (platform convention) is
deliberate — the `EXTRACT(...)` path on the metric bridges them
explicitly. Forcing one to match the other breaks one of the two
ecosystems' idioms
improve(options-panels): coderabbit noted only `compress` had a
dedicated panel even though pack_options_usage exposes 4 labels.
Split into four quarter-width tiles covering compress,
removeComments, outputParsable, and hasIncludePatterns. Also
dropped the "(Tree-sitter)" parenthetical from the compress title
— meaningful to the author, confusing to an on-caller
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
intent(observability): answer three questions that the initial
observability PR couldn't:
(1) what drives OOM risk — size distribution heatmap needs the
data out of pack_completed's `metrics.totalTokens` as a
distribution metric, not just a scalar log field
(2) which differentiating features users actually adopt
(compress / removeComments / outputParsable / custom
include-patterns)
(3) what inputs validation rejects and why — surfaces feature
requests (non-github hosts, oversized files) and abuse patterns
decision(packOptions-in-log): flatten option booleans under a
`packOptions` nested object on pack_completed success logs so a
log-based metric with option labels can extract them without
exploding the existing pack_requests cardinality. Patterns are
logged as `hasIncludePatterns` / `hasIgnorePatterns` booleans only —
raw user input never enters metric labels
decision(rejectReason-enum): classifyRejectReason is a switch on zod
error message strings since the messages are stable (defined in
packRequestSchema.ts in this repo). Non-zod errors (notably the
pre-zod `Invalid JSON in options` path) get an explicit branch so
they don't fall into the generic `unknown` bucket
decision(invalid-json-logging): the JSON-parse catch previously
returned 400 without logging, so `rejectReason=invalid_json` would
have been invisible. Added an explicit logError call on that path
rejected(per-option-metrics): creating separate metrics like
`pack_with_compress` was considered and rejected — 1 metric with
4 boolean labels gives 16 series total (manageable) and lets the
dashboard group by any option without schema sprawl
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
intent(scope-reduction): this repo operates a single GCP project — IaC
for log-based metrics adds maintenance cost (Console drift, stale
YAMLs) without payoff. Metrics are recreated ad-hoc from the Console
when needed
decision(keep-dashboard): the dashboard definition is still worth
committing because it's the user-facing layout, easy to damage by
accidental Console edit, and trivial to restore via
`gcloud monitoring dashboards create --config-from-file=dashboard.json`
decision(readme-slim): strip README to just the restore command —
longer prose belonged to the earlier IaC-lite layout that's now gone
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
intent(observability): make the new pack_completed / source / durationMs
log fields queryable as time series in Cloud Monitoring without
re-running ad-hoc gcloud logging read pipelines during incidents
decision(iac-lite): store metric definitions as standalone YAMLs under
website/server/monitoring/metrics/ + an idempotent bash apply script,
rather than Terraform. The project has no existing IaC for monitoring
and adding a full Terraform setup would dwarf the actual change
decision(labels): constrain pack_requests labels to
outcome/source/input_type/cached/format (max 96 time series). Omitted
repoHost because its cardinality is unbounded — raw value remains in
the log jsonPayload for ad-hoc queries
decision(oom-split): separate oom_terminations and container_killed
counters. They fire on distinct textPayloads from Cloud Run's memory
enforcement paths; keeping them separate makes alert tuning easier
rejected(repoHost-as-label): too many unique hostnames — would create a
time series per repo host and cost scale with traffic diversity
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>