fix(ce-doc-review): tighten suggested_fix and why_it_matters rules

Personas were emitting suggested_fix as `(a)/(b)/(c)` menus that punt the choice to Apply time, and why_it_matters as quote-sandwich openings that buried the consequence. Both surface in the per-finding walk-through where the user has to decide Apply / Defer / Skip without a clear picture of what lands. Adds an emit-time rule requiring suggested_fix to commit to one recommendation -- single, multi-facet, or composite (A+C) -- and forbids alternative menus. Strengthens the consequence-first rule for why_it_matters to call out quote-sandwich structure and cap embedded quotes at ~30 words.
2026-06-19 15:41:46 +02:00 · 2026-04-26 14:26:18 -07:00
parent e8c118e28f
commit fd99019b2b
1 changed files with 8 additions and 1 deletions
@@ -94,6 +94,13 @@ Rules:
  - Completeness additions mechanically implied by the document's own explicit decisions (not high-level goals — a goal can be satisfied by multiple valid requirements)

 - `suggested_fix` is required for `safe_auto` and `gated_auto` findings. For `manual` findings, include only when the fix is obvious.
+
+- **`suggested_fix` commits to one recommendation — no menus of alternatives.** The user's decision at the walk-through is binary (Apply / Defer / Skip), so the fix text must describe what specifically lands when they pick Apply — not a list of possibilities for the agent to choose from afterward. The committed recommendation can be:
+  - A single action — `Drop the Advisory tier from the enum.`
+  - A multi-facet action where one fix touches several named pieces — `Add a Validation section enumerating correction-vs-confirm rate, redirect rate, and PR-size shift.`
+  - A composite where you considered alternatives and concluded the right move combines two or more (e.g., A+C, not A alone) — name the combination as the fix without framing the elements as options.
+
+  What's not allowed is an alternative menu that punts the choice to Apply time: `(a)/(b)/(c)` lists, "either X or Y", "consider A, B, or C", "add A or, alternatively, B." The test: at Apply time, would the agent still need to pick which sub-option to implement? If yes, rewrite as the committed choice (single, multi-facet, or composite). If the alternatives are genuinely independent and each worth taking on its own, emit N findings instead. Negative example to avoid: `Add a Validation section that (a) confirms the mechanism works, (b) flags ritualization, and (c) gates Phase B` — leaves the user guessing whether Apply will write all three, pick one, or paraphrase. If the persona's actual recommendation is "do (a) and (c) together," the fix should say so directly: `Add a Validation section that names correction-vs-confirm rate as the working signal and gates Phase B on Phase A's observed value.`
 - If you find no issues, return an empty findings array. Still populate residual_risks and deferred_questions if applicable.
 - Use your suppress conditions. Do not flag issues that belong to other personas.

@@ -101,7 +108,7 @@ Writing `why_it_matters` (required field, every finding):

 The `why_it_matters` field is how the reader — a developer triaging findings, a reader returning to the doc months later, a downstream automated surface — understands the problem without re-reading the file. Treat it as the most important prose field in your output; every downstream surface (walk-through questions, bulk-action previews, Open Questions entries, headless output) depends on it being good.

- **Lead with observable consequence.** Describe what goes wrong from the reader's or implementer's perspective — what breaks, what gets misread, what decision gets made wrong, what the downstream audience experiences. Do not lead with document structure ("Section X on line Y says..."). Start with the effect ("Implementers will disagree on which tier applies when..."). Section references appear later, only when the reader needs them to locate the issue.
+- **Lead with observable consequence.** Describe what goes wrong from the reader's or implementer's perspective — what breaks, what gets misread, what decision gets made wrong, what the downstream audience experiences. Do not lead with document structure ("Section X on line Y says...") or with quoted document text — a "The plan says X. The brainstorm says Y. Despite this, [problem]" structure buries the consequence behind a quote sandwich, even when the consequence eventually appears later in the field. Start with the effect ("Implementers will disagree on which tier applies when..."), and cite document quotes only as supporting evidence after the consequence is named. Cap embedded quotes at roughly 30 words combined; paraphrase or summarize beyond that. Section references and quotes appear later, only when the reader needs them to locate the issue.
 - **Explain why the fix resolves the problem.** If you include a `suggested_fix`, the `why_it_matters` should make clear why that specific fix addresses the root cause. When a similar pattern exists elsewhere in the document or codebase (a parallel section, an established convention, a cited code pattern), reference it so the recommendation is grounded in what the team has already chosen.
 - **Keep it tight.** Approximately 2-4 sentences. Longer framings are a regression — downstream surfaces have narrow display budgets, and verbose content gets truncated or skimmed.
 - **Always produce substantive content.** `why_it_matters` is required by the schema. Empty strings, nulls, and single-phrase entries are validation failures. If you found something worth flagging at anchor `50` or higher, you can explain it — the field exists because every finding needs a reason.