# Feature Specification: Operator Reason Code Translation and Humanization Contract

**Feature Branch**: `157-reason-code-translation`  
**Created**: 2026-03-22  
**Status**: Draft  
**Input**: User description: "Operator Reason Code Translation and Humanization Contract"

## Spec Scope Fields *(mandatory)*

- **Scope**: workspace + tenant + canonical-view
- **Primary Routes**:
  - `/admin/operations`
  - `/admin/operations/{run}`
  - `/admin/t/{tenant}/...`: adopted tenant governance surfaces that currently show blocked, denied, degraded, skipped, or failed reasons
  - `/system/...`: adopted system-console health and onboarding surfaces that expose execution or prerequisite reasons to platform operators
- **Data Ownership**:
  - This feature does not introduce a new business-domain record. It defines the shared operator-facing translation contract for reason-bearing outcomes already produced by existing workspace-owned and tenant-owned workflows.
  - Workspace-owned records affected include operation runs, system-console summaries, notification payloads, and workspace-scoped diagnostic surfaces.
  - Tenant-owned records affected include tenant governance records whose blocked, degraded, validation, or readiness states already carry reason codes.
  - Internal machine-readable reason codes remain stable; only their operator-facing translation and resolution shape changes.
- **RBAC**:
  - Existing workspace membership, tenant entitlement, platform access, and capability rules remain the access boundary for all adopted surfaces.
  - This feature changes explanation quality, not access rights.
  - Non-members and cross-scope actors remain deny-as-not-found. Humanized labels, next-step hints, filter values, and summaries must not reveal hidden tenant or workspace state.

For canonical-view specs, the spec MUST define:

- **Default filter behavior when tenant-context is active**: Canonical workspace views open prefiltered to the current tenant when entered from tenant context, but only for records the actor is entitled to inspect. Reason labels and next-step facets must respect the same prefilter.
- **Explicit entitlement checks preventing cross-tenant leakage**: Humanized reason labels, next-step links, severity groupings, notification text, and summary counts must be derived only from authorized records. Unauthorized reasons must not become inferable through shared wording such as `Blocked`, `Permission required`, `Reconnect provider`, or `Retry later`.
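
The two canonical-view rules above can be sketched as a filter that runs before any label or count is produced. This is a minimal illustration, not the product's implementation: `RunRecord`, `LABELS`, `summarize_reasons`, the reason codes, and the generic `Needs attention` fallback label are all hypothetical names introduced for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class RunRecord:
    """Hypothetical stand-in for any reason-bearing record."""
    tenant_id: str
    reason_code: str


# Hypothetical label table; real labels would come from the shared contract.
LABELS = {
    "provider.token_expired": "Reconnect provider",
    "rbac.capability_missing": "Permission required",
}


def summarize_reasons(records, entitled_tenants, active_tenant=None):
    """Count humanized reason labels using only records the actor may inspect."""
    # Entitlement is applied first, so unauthorized records never reach a count.
    visible = [r for r in records if r.tenant_id in entitled_tenants]
    if active_tenant is not None:
        # The tenant-context prefilter narrows the entitled set; it never widens it.
        visible = [r for r in visible if r.tenant_id == active_tenant]
    return Counter(LABELS.get(r.reason_code, "Needs attention") for r in visible)


records = [
    RunRecord("tenant-a", "provider.token_expired"),
    RunRecord("tenant-a", "rbac.capability_missing"),
    RunRecord("tenant-b", "provider.token_expired"),
]
summary = summarize_reasons(records, entitled_tenants={"tenant-a"}, active_tenant="tenant-a")
```

Because the count is built from the filtered list, the `tenant-b` record cannot inflate the `Reconnect provider` total for an actor entitled only to `tenant-a`.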

## Operator Surface Contract *(mandatory when operator-facing surfaces are changed)*

| Surface | Primary Persona | Surface Type | Primary Operator Question | Default-visible Information | Diagnostics-only Information | Status Dimensions Used | Mutation Scope | Primary Actions | Dangerous Actions |
|---|---|---|---|---|---|---|---|---|---|
| Operations list and run detail | Tenant or workspace operator | List/detail | Why did this run block, fail, or degrade, and what should I do next? | Human-readable reason label, short explanation, retryability or next-step signal, affected scope | Raw machine reason code, low-level payload fragments, internal identifiers | execution outcome, operator actionability, retryability | TenantPilot only / Microsoft tenant / simulation only depending on the adopted run type | View run, follow next step, retry when allowed | Existing dangerous follow-up actions only; no new dangerous action added by this spec |
| Provider and tenant governance surfaces using reason-bearing states | Tenant operator | Detail/list/summary | What is preventing progress or reducing trust on this record? | Human-readable reason label, concise explanation, whether action is required | Raw technical code, provider metadata, internal classification detail | readiness, prerequisite state, operator actionability | TenantPilot only or Microsoft tenant depending on existing workflow | Resolve prerequisite, inspect detail | Existing dangerous actions remain unchanged |
| Adopted system-console health and onboarding surfaces | Platform operator | Detail/triage | Is this a transient issue, a configuration issue, or a missing prerequisite? | Human-readable reason class, short explanation, next-step guidance | Raw code, stack-oriented detail, internal IDs | execution outcome, retryability, actionability | TenantPilot only / Microsoft tenant / simulation only depending on adopted action | Inspect diagnostics, follow remediation path | Existing dangerous actions remain unchanged |

## User Scenarios & Testing *(mandatory)*

### User Story 1 - Understand why work was blocked or failed (Priority: P1)

As an operator, I want blocked, denied, degraded, and failed states to explain themselves in plain language, so that I can understand the cause and next step without decoding internal reason strings.

**Why this priority**: This is the direct operator-trust gap. When reason strings leak through unchanged, the product already knows the truth but fails to communicate it.

**Independent Test**: Can be fully tested by inspecting adopted blocked and failed examples on operations and governance surfaces and verifying that the primary surface shows a human-readable label, a short explanation, and action guidance rather than only a raw internal code.

**Acceptance Scenarios**:

1. **Given** an adopted surface currently carries a machine-readable blocked or failed reason, **When** the operator opens that surface, **Then** the primary message uses a human-readable label and short explanation instead of the raw internal code.
2. **Given** an adopted reason represents a recoverable prerequisite problem, **When** the operator views it, **Then** the surface tells the operator what to do next or where to go next.
3. **Given** an adopted reason is non-actionable or diagnostic-only, **When** the operator views it, **Then** the surface makes that clear instead of presenting it like an unexplained warning.

---

### User Story 2 - Preserve backend precision without polluting the primary surface (Priority: P1)

As a maintainer and senior operator, I want internal reason codes to remain stable and available for diagnostics, so that logs, tests, and audits keep their precision while the primary UI stays operator-first.

**Why this priority**: The solution must not trade away machine contracts or audit clarity. The requirement is translation, not lossy replacement.

**Independent Test**: Can be fully tested by verifying that adopted surfaces show humanized labels by default while raw reason codes remain available through diagnostics, logs, or secondary detail.

**Acceptance Scenarios**:

1. **Given** an adopted record has a machine-readable reason code, **When** the operator views the primary surface, **Then** the raw code is not the primary label.
2. **Given** a maintainer or advanced operator needs diagnostic precision, **When** they inspect secondary detail, **Then** the original internal reason code remains available and unchanged.
3. **Given** an audit or regression test already depends on the machine-readable reason, **When** this feature is adopted, **Then** the internal code contract remains stable.

---

### User Story 3 - Reuse one translation contract across domains (Priority: P2)

As a product owner, I want one shared reason translation contract used across operations, provider, baseline, verification, RBAC, restore, and onboarding surfaces, so that each domain does not invent a separate explanation format.

**Why this priority**: Per-domain cleanup would recreate inconsistency. The strategic value is the common contract.

**Independent Test**: Can be fully tested by reviewing a bounded first-slice adoption set across multiple domains and confirming that each adopted reason provides the same minimum resolution shape: label, explanation, actionability, and next-step semantics where applicable.

**Acceptance Scenarios**:

1. **Given** two different adopted domains expose reasons for blocked or degraded states, **When** the operator compares them, **Then** both use the same translation pattern rather than domain-specific ad-hoc wording.
2. **Given** an adopted reason is transient and retryable in one domain and permanent in another, **When** the operator views each result, **Then** the translation contract distinguishes the retryability class clearly.

### Edge Cases

- A reason has a valid internal code but no operator-facing translation yet; the fallback must remain understandable and must not leak a bare internal code as the only message.
- A reason is shared by multiple surfaces but needs different surrounding context; the core translated label must stay consistent while the explanation may be surface-specific.
- A reason is non-actionable and should explicitly say that no action is required.
- A summary surface aggregates multiple reasons; the display must remain human-readable without exposing raw internal keys.
- A translated next step points to a protected surface; the product must not reveal inaccessible remediation paths to unauthorized users.
- A transient reason later becomes permanent because the underlying state changed; the translation must be recalculated from current classification rather than cached as stale prose.
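
The protected-surface edge case above can be sketched as a guard applied to the next step before it is rendered. This is an illustrative sketch under stated assumptions: `NextStep`, `visible_next_step`, the capability names, and the example route are hypothetical, and what the surface shows in place of a withheld step is left open here.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class NextStep:
    """Hypothetical next-step hint attached to a translated reason."""
    label: str
    route: str
    required_capability: str


def visible_next_step(step: Optional[NextStep], actor_capabilities: Set[str]) -> Optional[NextStep]:
    """Return the next step only when the actor may open its destination."""
    if step is None:
        return None
    if step.required_capability not in actor_capabilities:
        # Withhold both the label and the route so the hint reveals nothing.
        return None
    return step


# Hypothetical route and capability, used only for this example.
reconnect = NextStep(
    label="Reconnect provider",
    route="/example/provider-connections",
    required_capability="provider.manage",
)

shown = visible_next_step(reconnect, {"provider.manage", "runs.view"})
hidden = visible_next_step(reconnect, {"runs.view"})
```

The check runs on the server-side view of the actor's capabilities, which keeps authorization independent from translation.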

## Requirements *(mandatory)*

**Constitution alignment (required):** This feature introduces no new Microsoft Graph call path and no new mutation workflow. It defines the shared explanation contract for existing reason-bearing outcomes across current workflows. Existing write and queue safety rules remain unchanged. If adopted surfaces are DB-only, they still remain auditable through current audit and operations records; this spec changes how reasons are translated and surfaced, not whether those records exist.

**Constitution alignment (OPS-UX):** This feature reuses existing operation and notification surfaces without changing the Ops-UX three-surface contract. Start toasts remain intent-only. Progress remains confined to active-run surfaces. Terminal notifications remain driven by the existing run lifecycle. This spec changes how terminal and diagnostic reasons are translated for operators, not how runs are created or transitioned. `OperationRun.status` and `OperationRun.outcome` remain service-owned. Any humanized summary labels derived from numeric counts must keep the underlying numeric contract stable.

**Constitution alignment (RBAC-UX):** This feature affects both tenant/admin and platform/system explanation surfaces. Cross-plane access remains deny-as-not-found. Non-members and wrong-scope actors continue to receive `404`; in-scope actors missing required capabilities continue to receive `403`. Humanized labels, explanations, filter values, summaries, next-step hints, and notification text must not reveal inaccessible records or remediation surfaces. Authorization remains server-side and independent from translation.

**Constitution alignment (OPS-EX-AUTH-001):** Not applicable. This feature does not touch authentication handshakes.

**Constitution alignment (BADGE-001):** Where adopted reason translations affect status-like labels, they must consume the centralized outcome taxonomy rather than introduce page-local severity mappings. The translation contract must classify actionability and retryability in a way that stays aligned with the shared operator outcome taxonomy.

**Constitution alignment (UI-NAMING-001):** The target objects are operator-facing reason labels shown in run details, notifications, banners, summaries, and governance surfaces. Operator-facing wording must preserve the shared vocabulary established by the outcome taxonomy. Implementation-first terms and raw internal code names remain diagnostics-only.

**Constitution alignment (OPSURF-001):** This feature materially refines existing operator-facing surfaces by moving raw reason strings out of the default-visible path. The primary surface must show operator-first meaning: label, short explanation, and next action or explicit no-action-needed guidance. Raw internal codes, low-level payloads, and engineering detail remain secondary diagnostics. Existing mutation scope language for adopted actions remains unchanged.

**Constitution alignment (Filament Action Surfaces):** This feature does not introduce a new Filament action family. Existing action surfaces remain responsible for capability gating, confirmation, and audit. The Action Surface Contract remains satisfied because the change is limited to how reasons are explained on existing surfaces.

**Constitution alignment (UX-001 — Layout & Information Architecture):** This feature does not introduce a new screen category. Adopted surfaces keep their current layouts while replacing raw or overly technical reason text in the default-visible hierarchy with operator-first explanation. Diagnostics remain explicitly secondary.

### Functional Requirements

- **FR-157-001**: The system MUST define one shared reason translation contract for adopted reason-bearing states across operations, provider, baseline, execution, operability, verification, RBAC, restore, onboarding, and system-console surfaces.
- **FR-157-002**: The shared contract MUST preserve machine-readable internal reason codes as stable backend and audit contracts.
- **FR-157-003**: The primary operator-facing surface for an adopted reason MUST use a human-readable label rather than exposing the raw internal reason code as the primary message.
- **FR-157-004**: Each adopted translated reason MUST provide, at minimum, a human-readable label and a short explanation suitable for operator-facing surfaces.
- **FR-157-005**: Each adopted translated reason MUST also declare whether it is retryable-transient, permanent-configuration, prerequisite-missing, or intentionally non-actionable.
- **FR-157-006**: When a translated reason implies a useful next step, the contract MUST provide actionable guidance, a destination, or an explicit instruction.
- **FR-157-007**: When a translated reason does not require operator action, the contract MUST say so explicitly rather than leaving the operator to infer that from silence.
- **FR-157-008**: Internal reason codes MUST remain available in diagnostic or secondary detail areas for logs, support, and audit-oriented troubleshooting.
- **FR-157-009**: The system MUST NOT rename internal reason codes for cosmetic operator-facing wording changes.
- **FR-157-010**: Adopted notification payloads MUST use the translated label and explanation contract rather than raw or heuristically sanitized reason fragments.
- **FR-157-011**: Adopted run-detail, banner, and summary surfaces MUST consume the shared contract rather than building local ad-hoc string formatting rules for reasons.
- **FR-157-012**: The system MUST humanize operator-facing summary labels derived from internal metric or reason keys so raw backend keys do not appear as the default-visible wording.
- **FR-157-013**: The system MUST provide one consistent fallback behavior for adopted reasons that have not yet received a domain-specific translation, and that fallback MUST remain understandable to operators by producing a sentence-case label that is not the raw internal code, a concise explanation, and either an explicit next step or an explicit no-action-needed marker.
- **FR-157-014**: The first implementation slice MUST cover a bounded adoption set that includes operations, notifications, and at least two additional reason-bearing domain families beyond operations.
- **FR-157-015**: The first implementation slice MUST define migration guidance for the existing reason families so downstream domains can adopt the shared contract without inventing parallel translation patterns.
- **FR-157-016**: The system MUST retire heuristic, string-matching-only operator reason formatting as the primary translation path on adopted surfaces.
- **FR-157-017**: Humanized reason text MUST use the shared vocabulary established by the operator outcome taxonomy and MUST NOT introduce conflicting synonyms for blocked, partial, missing, stale, unsupported, denied, or retry states.
- **FR-157-018**: Humanized next-step guidance MUST remain entitlement-safe and MUST NOT reveal inaccessible remediation surfaces or protected tenant information.
- **FR-157-019**: The feature MUST include regression coverage proving that translated labels appear on adopted surfaces while raw internal codes remain available in diagnostics.
- **FR-157-020**: The feature MUST include regression coverage for retryable, permanent, prerequisite, and non-actionable reason classes.
- **FR-157-021**: The feature MUST include at least one positive and one negative authorization regression test proving that translation-backed summaries and next-step hints do not leak unauthorized records or scopes.
- **FR-157-022**: The feature MUST allow domain-owned translations to vary in explanation detail when necessary, but the minimum contract shape and operator vocabulary MUST remain consistent across all adopted domains.
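
One way to read FR-157-004 through FR-157-007 and FR-157-013 together is a lookup keyed by the stable reason code, with the current classification passed in separately and a single shared fallback. The sketch below is a hedged illustration only: `ReasonClass`, `Wording`, `TranslatedReason`, `translate`, the reason codes, and all label text are hypothetical and not taken from the codebase.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ReasonClass(Enum):
    """The four classes named in FR-157-005."""
    RETRYABLE_TRANSIENT = "retryable-transient"
    PERMANENT_CONFIGURATION = "permanent-configuration"
    PREREQUISITE_MISSING = "prerequisite-missing"
    NON_ACTIONABLE = "non-actionable"


@dataclass(frozen=True)
class Wording:
    label: str
    explanation: str
    next_step: Optional[str] = None


@dataclass(frozen=True)
class TranslatedReason:
    code: str  # stable internal code, kept for diagnostics
    label: str
    explanation: str
    reason_class: ReasonClass
    next_step: str


# Hypothetical translation table keyed by the unchanged internal code.
WORDING = {
    "provider.rate_limited": Wording(
        label="Provider is temporarily busy",
        explanation="The provider asked for fewer requests for a short time.",
        next_step="Retry later.",
    ),
}

# Single shared fallback: sentence-case label that is not the raw code.
FALLBACK = Wording(
    label="Reason not yet described",
    explanation="A cause was recorded, but it has no operator-facing description yet.",
    next_step="Review the diagnostic detail for this record.",
)

NO_ACTION_NEEDED = "No action needed."


def translate(code: str, reason_class: ReasonClass) -> TranslatedReason:
    """Build the operator-facing reason from the code and its current class."""
    wording = WORDING.get(code, FALLBACK)
    if reason_class is ReasonClass.NON_ACTIONABLE:
        next_step = NO_ACTION_NEEDED  # explicit, never silent
    else:
        next_step = wording.next_step or FALLBACK.next_step
    return TranslatedReason(code, wording.label, wording.explanation, reason_class, next_step)
```

Passing the class in at call time, rather than storing it with the wording, is one possible way to let the same code be retryable in one domain and permanent in another, and to recompute the result when the underlying state changes.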

## UI Action Matrix *(mandatory when Filament is changed)*

| Surface | Location | Header Actions | Inspect Affordance (List/Table) | Row Actions (max 2 visible) | Bulk Actions (grouped) | Empty-State CTA(s) | View Header Actions | Create/Edit Save+Cancel | Audit log? | Notes / Exemptions |
|---|---|---|---|---|---|---|---|---|---|---|
| Operations list and run detail | Existing operations surfaces | Existing controls unchanged | Existing run inspection remains primary | None added by this feature | None added by this feature | Existing CTA unchanged | Existing run actions unchanged | N/A | Existing audit model unchanged | This feature changes the explanation contract for reasons, not the action set |
| Adopted tenant governance surfaces | Existing tenant-context detail and summary surfaces with reason-bearing states | Existing controls unchanged | Existing row or detail inspection unchanged | None added by this feature | None added by this feature | Existing CTA unchanged | Existing actions unchanged | N/A | Existing audit model unchanged | Applies to reason labels, explanation text, and next-step wording only |
| Adopted system-console health and onboarding surfaces | Existing system/operator triage surfaces | Existing controls unchanged | Existing diagnostic drill-in unchanged | None added by this feature | None added by this feature | Existing CTA unchanged | Existing actions unchanged | N/A | Existing audit model unchanged | The change is operator-first reason wording with diagnostics boundary preserved |

### Key Entities *(include if feature involves data)*

- **Reason Code**: The stable machine-readable identifier that captures why a workflow was blocked, denied, degraded, skipped, or failed.
- **Reason Translation**: The operator-facing label and short explanation derived from a stable reason code.
- **Reason Resolution Envelope**: The shared operator-facing shape that combines label, explanation, retryability class, and next-step guidance.
- **Diagnostic Reason Detail**: Secondary information that preserves the original internal code and low-level context for troubleshooting without becoming the default-visible message.
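
The relationship between these entities can be sketched as one envelope that exposes two views: a default-visible view without the raw code, and a diagnostics view that preserves it unchanged. Field and method names below (`primary_view`, `diagnostics_view`, `diagnostic_context`) and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class ReasonResolutionEnvelope:
    """Hypothetical shape combining translation and diagnostic detail."""
    reason_code: str                 # Reason Code: stable and never renamed
    label: str                       # Reason Translation: operator-facing label
    explanation: str                 # Reason Translation: short explanation
    retryability_class: str
    next_step: Optional[str]
    diagnostic_context: Dict[str, Any] = field(default_factory=dict)

    def primary_view(self) -> Dict[str, Any]:
        """Default-visible fields; the raw code is deliberately absent."""
        return {
            "label": self.label,
            "explanation": self.explanation,
            "retryability_class": self.retryability_class,
            "next_step": self.next_step,
        }

    def diagnostics_view(self) -> Dict[str, Any]:
        """Secondary detail; the original code is returned unchanged."""
        return {"reason_code": self.reason_code, **self.diagnostic_context}


envelope = ReasonResolutionEnvelope(
    reason_code="restore.snapshot_missing",  # hypothetical code
    label="Snapshot not available",
    explanation="The restore needs a snapshot that has not been captured yet.",
    retryability_class="prerequisite-missing",
    next_step="Capture a snapshot, then start the restore again.",
    diagnostic_context={"run_id": 42},
)
```

Keeping both views on one object means the primary surface and the diagnostics panel cannot drift apart on which code a label belongs to.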

## Success Criteria *(mandatory)*

### Measurable Outcomes

- **SC-157-001**: In the first implementation slice, 100% of adopted reason-bearing primary surfaces show a translated human-readable label instead of a raw internal code as the primary message.
- **SC-157-002**: In focused regression coverage, 100% of adopted translated reasons are classified into one of the declared operator-facing actionability classes.
- **SC-157-003**: In focused regression coverage, 100% of adopted notification messages use translated reason wording rather than raw or heuristically sanitized reason fragments.
- **SC-157-004**: The manual review protocol defined in `quickstart.md` MUST score 12 curated adopted examples (4 operations cases, 4 provider-guidance cases, 2 tenant-operability cases, and 2 adopted system-console RBAC or onboarding cases) against a pass/fail checklist for cause clarity and next-step clarity, and at least 11 of the 12 examples MUST pass.
- **SC-157-005**: In focused regression coverage, 100% of adopted surfaces preserve access-safe behavior so that translated reasons and next-step hints do not reveal unauthorized tenant or workspace state.
- **SC-157-006**: In the first implementation slice, no adopted surface relies on heuristic free-form string matching as its primary reason-humanization mechanism.
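
The SC-157-004 threshold can be expressed as a small check over the review results. This sketch assumes an example passes only when both checklist items pass; that rule, the family keys, and the `review_passes` helper are hypothetical, and the authoritative protocol is the one in `quickstart.md`.

```python
from collections import Counter

# Required mix of curated examples from SC-157-004 (4 + 4 + 2 + 2 = 12).
REQUIRED_MIX = {
    "operations": 4,
    "provider-guidance": 4,
    "tenant-operability": 2,
    "system-console": 2,
}
PASS_THRESHOLD = 11


def review_passes(results):
    """results: iterable of (family, cause_clear, next_step_clear) tuples."""
    results = list(results)
    mix = Counter(family for family, _, _ in results)
    if dict(mix) != REQUIRED_MIX:
        raise ValueError("review set does not match the required 4/4/2/2 mix")
    # Assumption: an example passes only if both checklist items pass.
    passed = sum(1 for _, cause, step in results if cause and step)
    return passed >= PASS_THRESHOLD
```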

## Assumptions

- Spec 156 provides the shared operator vocabulary this feature translates into.
- Existing machine-readable reason codes across domains are worth preserving as stable contracts for logs, tests, and audit trails.
- Not every domain must adopt the contract in one release; the value comes from a shared contract plus a bounded first-slice rollout.
- Some adopted surfaces may need surface-specific explanation text, but they should not diverge on label meaning, retryability class, or next-step semantics.

## Dependencies

- Spec 156 - Operator Outcome Taxonomy and Cross-Domain State Separation
- Existing reason-bearing workflows across operations, provider, baseline, verification, restore, onboarding, RBAC, and system-console surfaces
- Existing notification and summary surfaces that currently expose raw or overly technical reason wording

## Non-Goals

- Creating new business-domain reason codes
- Changing the semantic meaning of existing reason codes
- Renaming backend machine contracts for cosmetic reasons
- Redesigning the visual component system or badge infrastructure
- Reworking the broader operation naming taxonomy
- Extending provider preflight or dispatch gating itself; that remains a downstream consumer of this contract

## Final Direction

This spec is the next strategic step after the operator outcome taxonomy because it turns backend truth into operator-usable language without sacrificing backend precision. It defines one shared contract for translating reason codes into label, explanation, retryability, and next-step guidance, so later domain work can explain problems consistently instead of leaking raw internal fragments or inventing one-off wording rules.