Kazuki Yamada 54c6a3d238 fix(website): Address claude third-pass review on siteverify metric
Six items from claude's incremental review (`12:48:43Z`):

- monitoring/dashboard.json: Group the outcomes widget by both
  `metric.label.outcome` and `metric.label.reason`. Previously all
  failures collapsed into a single `turnstile_failed` series, which
  contradicted the README claim that the `reason` label drives the
  breakdown.
- monitoring/metrics/*.yaml: Narrow the metric filter to
  `jsonPayload.event=("turnstile_siteverify" OR "pack_completed")`.
  Without this anchor, any future code path attaching
  `siteverifyDurationMs` to an unrelated log silently joins the
  distribution and creates new metric label values.
- usePackRequest.ts: Mirror `progressMessage.value = null` alongside
  the `progressStage.value = null` clear on token-acquisition aborted /
  error branches. Prevents a future edit setting a verifying message
  from leaking prior-run state.
- turnstile.test.ts: Add a focused `describe` block with five tests
  asserting `siteverifyDurationMs` is attached to every post-siteverify
  log (one success path + four reject branches). The metric YAML
  filters on field presence, so a refactor that drops the field on any
  branch would silently break the metric without other tests failing.
  Uses the existing `vi.spyOn(logger, ...)` pattern; no clock injection
  needed.
- monitoring/README.md: Note that the metric filter pins
  `service_name="repomix-server-us"`, so future regions (`-eu`,
  `-asia`) silently drop out until the filter is broadened or
  per-region counterparts applied.
- monitoring/README.md: Add a `gcloud logging metrics describe` snippet
  for verifying a YAML edit was actually applied (gcloud update is
  silent on no-op vs effective change).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 21:59:53 +09:00

# Repomix server monitoring

Cloud Monitoring dashboard definition for `repomix-server-us`.

Some log-based metrics used by the dashboard
(`logging.googleapis.com/user/oom_terminations` and `container_killed`) are
managed directly in the GCP Console. They persist on the project and do not
need to be redefined here.

## Turnstile siteverify metrics

The dashboard's "Turnstile siteverify latency" and "Turnstile siteverify
outcomes" widgets depend on two log-based metrics. Definitions live in
`metrics/` and are applied once per project:

```bash
gcloud logging metrics create turnstile_siteverify_duration \
  --config-from-file=metrics/turnstile_siteverify_duration.yaml \
  --project=repomix

gcloud logging metrics create turnstile_siteverify_outcomes \
  --config-from-file=metrics/turnstile_siteverify_outcomes.yaml \
  --project=repomix
```

To update an existing metric (e.g. after editing the filter or buckets),
swap `create` for `update`. Both metrics filter on the presence of the
`siteverifyDurationMs` field, so success and failure paths are captured
uniformly; the `outcome` and `reason` labels on the counter metric
(`turnstile_siteverify_outcomes`) drive the breakdown in the "Turnstile
siteverify outcomes" widget.

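
The create/update choice can be folded into one helper. This is a minimal sketch, not something that ships in this repository: `apply_metric` is a hypothetical name, and it assumes each definition lives at `metrics/<name>.yaml` and that `gcloud logging metrics describe` exits non-zero when the metric does not exist yet.

```bash
# Create the metric if it is missing, otherwise update it in place.
# Usage: apply_metric <metric-name> [project]
apply_metric() {
  local name="$1"
  local project="${2:-repomix}"
  local config="metrics/${name}.yaml"  # assumed layout: one YAML file per metric
  local verb

  # Any `describe` failure (including auth errors) is treated as "not found",
  # so check credentials first if `create` fails unexpectedly.
  if gcloud logging metrics describe "$name" --project="$project" >/dev/null 2>&1; then
    verb=update
  else
    verb=create
  fi

  gcloud logging metrics "$verb" "$name" \
    --config-from-file="$config" \
    --project="$project"
}
```

Run it from this directory, e.g. `apply_metric turnstile_siteverify_duration` followed by `apply_metric turnstile_siteverify_outcomes`.
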

Pre-network rejections (`secret_missing`, `missing_token`,
`token_too_long`) intentionally don't carry `siteverifyDurationMs`:
they short-circuit before the timer starts, so they're excluded from
both metrics. Those reject reasons still appear in the existing
`pack_requests` metric (under `outcome=turnstile_failed`) for
operational counting, just not in the latency distribution.


The metric filter pins `service_name="repomix-server-us"`, so requests
served from any future region (`-eu`, `-asia`) silently drop out of the
distribution until either the filter is broadened or per-region
counterparts are applied. Surface this in the migration plan when
adding the next region.

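
One way to notice that drop-out is to look for log entries that carry `siteverifyDurationMs` but come from a different service. The sketch below is an assumption-laden starting point: both function names are hypothetical, and the `resource.labels.service_name` path assumes the logs use the Cloud Run revision resource type. It relies on the Logging query language's `:*` field-presence test.

```bash
# Build a Logging query for siteverify logs that the pinned filter would miss.
# Usage: unmatched_siteverify_filter [pinned-service-name]
unmatched_siteverify_filter() {
  local service="${1:-repomix-server-us}"
  # Assumption: the service name is exposed as resource.labels.service_name.
  printf 'jsonPayload.siteverifyDurationMs:* AND NOT resource.labels.service_name="%s"' "$service"
}

# Show a few entries from the last day that match the query above.
# Usage: list_unmatched_siteverify_logs [pinned-service-name] [project]
list_unmatched_siteverify_logs() {
  gcloud logging read "$(unmatched_siteverify_filter "${1:-}")" \
    --project="${2:-repomix}" \
    --freshness=1d \
    --limit=5 \
    --format='value(resource.labels.service_name, jsonPayload.event)'
}
```

An empty result from `list_unmatched_siteverify_logs` suggests no other service emitted the field in that window; any rows it prints are requests currently missing from both metrics.
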

To verify a YAML edit was actually applied to the live metric
(`gcloud logging metrics update` prints the same output whether the
update was a no-op or an effective change):

```bash
gcloud logging metrics describe turnstile_siteverify_duration --project=repomix
```
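
To turn that check into a yes/no answer, the definition can be snapshotted before and after the update and compared. This is a sketch under assumptions: `snapshot_metric` and `update_and_check` are hypothetical helpers, definitions are assumed to live at `metrics/<name>.yaml`, and the `updateTime` line is dropped on the assumption that it may change even when nothing else does.

```bash
# Print the live metric definition without its updateTime line.
# Usage: snapshot_metric <metric-name> [project]
snapshot_metric() {
  gcloud logging metrics describe "$1" --project="${2:-repomix}" --format=yaml \
    | grep -v '^updateTime:'
}

# Apply metrics/<name>.yaml and report whether the live definition changed.
# Usage: update_and_check <metric-name> [project]
update_and_check() {
  local name="$1"
  local project="${2:-repomix}"
  local before
  local after

  before="$(snapshot_metric "$name" "$project")"
  gcloud logging metrics update "$name" \
    --config-from-file="metrics/${name}.yaml" \
    --project="$project" >/dev/null
  after="$(snapshot_metric "$name" "$project")"

  if [ "$before" = "$after" ]; then
    echo "unchanged: $name"
  else
    echo "changed: $name"
  fi
}
```

For example, `update_and_check turnstile_siteverify_duration` prints `changed: turnstile_siteverify_duration` only when the two snapshots differ.
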
## Apply the dashboard
```bash
# Create
gcloud monitoring dashboards create --config-from-file=dashboard.json --project=repomix

# Update (use the dashboard ID from `gcloud monitoring dashboards list`)
gcloud monitoring dashboards update projects/repomix/dashboards/<ID> \
  --config-from-file=dashboard.json --project=repomix
```
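
Looking up the dashboard by display name avoids copying the ID by hand. A minimal sketch, assuming the display name passed in matches the `displayName` field in `dashboard.json`; `dashboard_name` is a hypothetical helper, and it uses only the generic `--filter` and `--format` flags of `gcloud`.

```bash
# Print the resource name of the dashboard with the given display name.
# Usage: dashboard_name <display-name> [project]
dashboard_name() {
  gcloud monitoring dashboards list \
    --project="${2:-repomix}" \
    --filter="displayName=\"$1\"" \
    --format='value(name)'
}
```

The printed resource name can be passed to `gcloud monitoring dashboards update` in place of the hand-written `projects/repomix/dashboards/<ID>` path. If several dashboards share a display name, the function prints one line per match, so check the output before using it.
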