ahmido 92f39d9749 feat: add shared reason translation contract (#187)

## Summary
- introduce a shared reason-translation contract with envelopes, presenter helpers, fallback handling, and provider translation support
- adopt translated operator-facing reason presentation across operation runs, notifications, provider guidance, tenant operability, and RBAC-related surfaces
- add Spec 157 design artifacts and targeted regression coverage for translation quality, diagnostics retention, and authorization-safe guidance

## Validation
- `vendor/bin/sail bin pint --dirty --format agent`
- `vendor/bin/sail artisan test --compact tests/Architecture/ReasonTranslationPrimarySurfaceGuardTest.php tests/Unit/Support/ReasonTranslation/ReasonResolutionEnvelopeTest.php tests/Unit/Support/ReasonTranslation/ExecutionDenialReasonTranslationTest.php tests/Unit/Support/ReasonTranslation/TenantOperabilityReasonTranslationTest.php tests/Unit/Support/ReasonTranslation/RbacReasonTranslationTest.php tests/Unit/Support/ReasonTranslation/ProviderReasonTranslationTest.php tests/Feature/Notifications/OperationRunNotificationTest.php tests/Feature/Operations/OperationRunBlockedExecutionPresentationTest.php tests/Feature/Operations/TenantlessOperationRunViewerTest.php tests/Feature/ReasonTranslation/GovernanceReasonPresentationTest.php tests/Feature/Authorization/ReasonTranslationScopeSafetyTest.php tests/Feature/Monitoring/OperationRunBlockedSpec081Test.php tests/Feature/ProviderConnections/ProviderOperationBlockedGuidanceSpec081Test.php tests/Feature/ProviderConnections/ProviderGatewayRuntimeSmokeSpec081Test.php`

## Notes
- Livewire v4.0+ compliance remains unchanged within the existing Filament v5 stack.
- No new panel was added; provider registration remains in `bootstrap/providers.php`.
- No new globally searchable resource was introduced.
- No new destructive action family was introduced.
- No new assets were added; the existing `filament:assets` deployment behavior remains unchanged.

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #187

2026-03-22 20:19:43 +00:00


Research: Operator Reason Code Translation and Humanization Contract

Decision 1: Preserve internal reason codes and translate them through a shared envelope

Decision: Keep stable internal reason codes as machine contracts for logs, audits, tests, and existing records, and add one shared operator-facing resolution envelope that derives label, explanation, actionability, retryability, and next-step guidance from those codes.
Rationale: The repo already stores and compares raw reason codes in operations, onboarding, provider resolution, and verification flows. Renaming them for operator copy would create unnecessary churn and risk breaking audit semantics.
Alternatives considered:
- Rename all reason codes to more human-readable strings: rejected because backend precision and compatibility matter more than cosmetic internal naming.
- Keep reason translation purely page-local: rejected because that would reproduce inconsistency across operations, provider, baseline, RBAC, and onboarding flows.
- Collapse raw codes into human prose only: rejected because diagnostics and tests need stable machine-readable contracts.
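
The envelope described in Decision 1 can be pictured with a minimal sketch. The class name, field names, and the example reason code below are assumptions made for illustration; the repo's actual envelope may be shaped differently.

```php
<?php

declare(strict_types=1);

// Hypothetical envelope shape; names are illustrative, not the repo's real API.
final class SketchReasonEnvelope
{
    /** @param list<string> $nextSteps */
    public function __construct(
        public readonly string $internalCode, // stable machine contract, never renamed
        public readonly string $label,
        public readonly string $explanation,
        public readonly bool $actionable,
        public readonly bool $retryable,
        public readonly array $nextSteps = [],
    ) {
    }

    /** Fields intended for primary operator-facing surfaces (no raw code). */
    public function toPrimary(): array
    {
        return [
            'label' => $this->label,
            'explanation' => $this->explanation,
            'actionable' => $this->actionable,
            'retryable' => $this->retryable,
            'next_steps' => $this->nextSteps,
        ];
    }

    /** Diagnostics keep the original internal code untouched. */
    public function toDiagnostics(): array
    {
        return ['reason_code' => $this->internalCode];
    }
}

// 'provider_consent_missing' is a made-up code used only for this example.
$envelope = new SketchReasonEnvelope(
    internalCode: 'provider_consent_missing',
    label: 'Provider consent is missing',
    explanation: 'The provider connection has not been granted the required consent.',
    actionable: true,
    retryable: false,
    nextSteps: ['Grant consent in the provider connection settings.'],
);
```

The point of the split is that the internal code stays byte-for-byte stable for logs, audits, and tests, while operator copy can change freely.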

Decision 2: Treat current reason-code families as one structural problem with multiple shapes

Decision: Model the current repo as a set of distinct artifact families that all need the same translation contract despite structural differences.
Rationale: The codebase uses at least four structural patterns today: string-constant registries (ProviderReasonCodes, BaselineReasonCodes), enums without translation methods (TenantOperabilityReasonCode, RbacReason), enums with message() (BaselineCompareReasonCode, ExecutionDenialReasonCode), and localized helper or options patterns (RunbookReason). The contract must span all of them.
Alternatives considered:
- Standardize only enum-based families first and ignore string-constant registries: rejected because provider and baseline preconditions are some of the most visible raw-code leak sources.
- Build separate contracts per family: rejected because the feature's value is one shared operator-facing explanation shape.
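
As a rough sketch of how one entry point could span these shapes, the example below normalizes a string constant, a plain backed enum, and an enum with message() into the same output. Every class, case, and function name here is a hypothetical stand-in, and the localized helper pattern is left out for brevity.

```php
<?php

declare(strict_types=1);

// Shape 1 (hypothetical): string-constant registry.
final class SketchProviderCodes
{
    public const CONSENT_MISSING = 'consent_missing';
}

// Shape 2 (hypothetical): enum without translation methods.
enum SketchRbacReason: string
{
    case RoleMissing = 'role_missing';
}

// Shape 3 (hypothetical): enum that already owns a message() method.
enum SketchDenialReason: string
{
    case WriteGateClosed = 'write_gate_closed';

    public function message(): string
    {
        return 'Writes are currently blocked for this tenant.';
    }
}

/**
 * Normalize any supported shape to the same [reason_code, label] pair.
 *
 * @param array<string, string> $labels shared label map keyed by internal code
 * @return array{reason_code: string, label: string}
 */
function sketchTranslate(string|BackedEnum $reason, array $labels): array
{
    $code = $reason instanceof BackedEnum ? (string) $reason->value : $reason;

    // Prefer wording the family already owns, then fall back to the shared map.
    if ($reason instanceof BackedEnum && method_exists($reason, 'message')) {
        $label = $reason->message();
    } else {
        $label = $labels[$code] ?? 'Unrecognized reason';
    }

    return ['reason_code' => $code, 'label' => $label];
}
```

Callers never need to know which structural pattern a family uses; they only see the shared output shape.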

Decision 3: Use existing central seams as the first adoption slice

Decision: Start adoption where the repo already centralizes operator-facing reason UX: OperationUxPresenter, OperationRunCompleted, ProviderNextStepsRegistry, and enum-backed families such as ExecutionDenialReasonCode.
Rationale: These seams already shape cross-domain notifications and blocked-prerequisite guidance, which makes them high-leverage proof points for the contract.
Alternatives considered:
- Start with low-level job classes: rejected because that would spread translation logic outward instead of centralizing it.
- Start with only one domain such as baseline compare: rejected because it would not prove the cross-domain contract.
- Start by rewriting every reason-bearing surface in one pass: rejected because the rollout would be too large to validate safely.

Decision 4: Reduce heuristic string matching to fallback-only status on adopted surfaces

Decision: RunFailureSanitizer should remain available for sanitization and bounded normalization fallback, but it must stop being the primary explanation path on first-slice adopted surfaces.
Rationale: The current normalizeReasonCode() method relies on heuristic string matching for throttling, auth, timeout, permission, validation, and conflict patterns. That is useful as a compatibility layer, but it is too weak and opaque to remain the product's primary operator explanation mechanism.
Alternatives considered:
- Delete all heuristics immediately: rejected because the repo still receives raw throwable messages from multiple jobs and services.
- Keep heuristics as the main translation path: rejected because the feature explicitly exists to move beyond heuristic operator wording.
- Replace sanitization and normalization together: rejected because sanitization still has a valid security purpose even after structured translation is introduced.
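
The ordering in Decision 4 can be sketched as "structured code first, heuristic last". The function names, match patterns, and all codes other than unknown_error are assumptions; this is not the actual RunFailureSanitizer logic.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical heuristic normalizer standing in for message-based matching.
 * The patterns are illustrative only.
 */
function sketchHeuristicCode(string $rawMessage): string
{
    $message = strtolower($rawMessage);

    return match (true) {
        str_contains($message, 'throttl') || str_contains($message, '429') => 'throttled',
        str_contains($message, 'timed out') || str_contains($message, 'timeout') => 'timeout',
        str_contains($message, 'forbidden') || str_contains($message, 'permission') => 'permission_denied',
        default => 'unknown_error',
    };
}

/**
 * Resolve a reason: a structured code always wins; the heuristic runs only
 * when no structured code was supplied.
 *
 * @param array<string, string> $labels
 * @return array{reason_code: string, label: string, source: string}
 */
function sketchResolve(?string $structuredCode, string $rawMessage, array $labels): array
{
    if ($structuredCode !== null) {
        return [
            'reason_code' => $structuredCode,
            'label' => $labels[$structuredCode] ?? 'Unexpected failure',
            'source' => 'structured',
        ];
    }

    $fallbackCode = sketchHeuristicCode($rawMessage);

    return [
        'reason_code' => $fallbackCode,
        'label' => $labels[$fallbackCode] ?? 'Unexpected failure',
        'source' => 'heuristic',
    ];
}
```

Recording the source makes it possible to measure how often adopted surfaces still depend on the heuristic path.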

Decision 5: Next-step guidance belongs inside the same contract as label and explanation

Decision: The reason-translation contract must include next-step guidance or an explicit no-action-needed marker rather than leaving next steps to ad-hoc presenter logic.
Rationale: Provider flows already prove the need through ProviderNextStepsRegistry, and operations surfaces already try to infer next steps through OperationUxPresenter. The operator-facing contract is incomplete if it explains the cause but not the expected action.
Alternatives considered:
- Keep next steps in separate registries and translate only labels: rejected because surfaces would still need to stitch together multiple inconsistent sources.
- Always require a navigation link: rejected because some reasons need instruction text or an explicit no-action-needed signal instead of a link.
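
One way to model the three outcomes named in Decision 5 (a link, instruction text, or an explicit no-action-needed marker) is a small value object with named constructors. The type and method names are hypothetical.

```php
<?php

declare(strict_types=1);

// Hypothetical next-step kinds; the real contract may name these differently.
enum SketchNextStepKind: string
{
    case Link = 'link';
    case Instruction = 'instruction';
    case NoActionNeeded = 'no_action_needed';
}

final class SketchNextStep
{
    private function __construct(
        public readonly SketchNextStepKind $kind,
        public readonly string $text,
        public readonly ?string $url = null,
    ) {
    }

    /** A step that navigates the operator somewhere. */
    public static function link(string $text, string $url): self
    {
        return new self(SketchNextStepKind::Link, $text, $url);
    }

    /** A step that is plain instruction text with no destination. */
    public static function instruction(string $text): self
    {
        return new self(SketchNextStepKind::Instruction, $text);
    }

    /** An explicit marker that the operator does not need to act. */
    public static function none(): self
    {
        return new self(SketchNextStepKind::NoActionNeeded, 'No action needed.');
    }
}
```

Making "no action needed" an explicit value, rather than an empty list, lets a surface distinguish "nothing to do" from "guidance is missing".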

Decision 6: The first slice should prove both translation and diagnostic boundary behavior

Decision: The first slice must assert both that primary surfaces show translated labels and that diagnostics still preserve the original internal reason code.
Rationale: The feature is about explanation quality without losing backend truth. A rollout that only improves the UI but discards raw diagnostic precision would fail the spec's second P1 story.
Alternatives considered:
- Hide raw codes entirely: rejected because support, audit, and regression use cases still need them.
- Leave raw codes primary on some surfaces while humanizing others: rejected because that would keep the current inconsistency alive.

Decision 7: Start with enum-backed families before adapting string-constant registries deeply

Decision: Begin implementation with enum-backed families such as ExecutionDenialReasonCode, TenantOperabilityReasonCode, and RbacReason, then extend the same contract to string-constant registries such as ProviderReasonCodes and BaselineReasonCodes through adapters or registries.
Rationale: Enums already package reason identity in one type and some already have behavior (message(), denialClass()), so they provide the safest and fastest proving ground for the shared contract.
Alternatives considered:
- Start with provider constants first: rejected because it requires a broader adapter decision before the contract is proven.
- Delay provider adoption entirely: rejected because provider reasons are among the highest-volume operator-facing blocked states.

Decision 8: The first slice should cover operations, provider guidance, tenant-operability, and adopted system-console governance

Decision: The bounded first slice should include operations notifications and run detail, provider next-step guidance, tenant-operability governance, and adopted system-console RBAC or onboarding health surfaces.
Rationale: This proves the contract across different explanation patterns: run lifecycle messaging, prerequisite guidance, tenant-context governance logic, and platform or system-surface health messaging.
Alternatives considered:
- Operations only: rejected because the feature would look too local and would not prove cross-domain reuse.
- Provider only: rejected because it would leave the highest-leverage notification path unchanged.

Decision 9: Summary humanization must stay aligned with existing numeric contracts

Decision: Adopted summary labels should become more operator-readable, but summary metrics remain governed by OperationSummaryKeys::all() and numeric-only normalization.
Rationale: SummaryCountsNormalizer already humanizes labels such as "Failed items" and "Completed successfully". This feature should extend clarity around reason-bearing summaries without changing the operations metrics contract.
Alternatives considered:
- Introduce free-form summary prose per operation: rejected because it would weaken determinism and complicate testing.
- Leave summary wording untouched: rejected because raw or overly technical labels are part of the same operator-trust problem.

Decision 10: Authorization-safe translation is part of the contract, not a follow-up concern

Decision: The translation contract must treat next-step hints, labels, summaries, and notification wording as authorization-sensitive output, and the first slice must include explicit non-leakage regression coverage.
Rationale: The repo uses both tenant-context and canonical workspace views, and translated reason hints can become a leak surface if they reveal inaccessible remediation paths or hidden tenant state.
Alternatives considered:
- Treat authorization as purely the page layer's concern: rejected because the translated payload itself can leak information.
- Defer non-leakage testing to a later hardening pass: rejected because the spec explicitly spans canonical and tenant-context surfaces now.
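
A minimal sketch of treating next steps as authorization-sensitive output follows. It assumes each step carries an optional ability name and that the caller supplies an authorization check; both the array shape and the ability strings are hypothetical.

```php
<?php

declare(strict_types=1);

/**
 * Drop next steps the current viewer is not allowed to act on, so the
 * translated payload itself cannot reveal inaccessible remediation paths.
 *
 * @param list<array{text: string, ability: ?string}> $nextSteps
 * @param callable(string): bool $can authorization check supplied by the caller
 * @return list<array{text: string, ability: ?string}>
 */
function sketchVisibleNextSteps(array $nextSteps, callable $can): array
{
    return array_values(array_filter(
        $nextSteps,
        // Steps without an ability are public; others must pass the check.
        static fn (array $step): bool => $step['ability'] === null || $can($step['ability']),
    ));
}
```

Filtering inside the contract, rather than in each page, keeps notifications and canonical workspace views from diverging in what they reveal.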

Decision 11: Fallback output and shared vocabulary need explicit quality floors

Decision: The first slice should define a minimum fallback quality floor and regression guards for shared operator vocabulary, not just raw-code suppression.
Rationale: A fallback that avoids raw codes but still emits inconsistent wording, such as using "blocked" and "denied" interchangeably for the same state, would satisfy the letter of translation while still degrading operator trust across domains.
Alternatives considered:
- Validate only that raw internal codes are hidden: rejected because that still allows drift in blocked, missing, stale, unsupported, denied, partial, and retry phrasing.
- Leave fallback readability to reviewer judgment alone: rejected because the spec needs deterministic quality thresholds that future adopters can follow.
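
A deterministic quality floor could be approximated by a guard like the one below. The snake_case check and the discouraged-to-preferred word map are assumptions chosen for illustration; the spec's actual thresholds and vocabulary are not reproduced here.

```php
<?php

declare(strict_types=1);

/**
 * Return a list of vocabulary problems found in an operator-facing label.
 *
 * @param array<string, string> $preferred map of discouraged word => canonical word
 * @return list<string>
 */
function sketchVocabularyViolations(string $label, array $preferred): array
{
    $violations = [];

    // A lowercase snake_case token in operator copy is treated as a leaked raw code.
    if (preg_match('/\b[a-z0-9]+(?:_[a-z0-9]+)+\b/', $label, $match) === 1) {
        $violations[] = "raw code leaked: {$match[0]}";
    }

    // Flag synonyms that drift from the shared vocabulary.
    foreach ($preferred as $discouraged => $canonical) {
        if (preg_match('/\b' . preg_quote($discouraged, '/') . '\b/i', $label) === 1) {
            $violations[] = "use \"{$canonical}\" instead of \"{$discouraged}\"";
        }
    }

    return $violations;
}
```

A regression test can then assert that every label the fallback path produces yields an empty violation list.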

Implementation Notes For Future Adopters

- Adopted surfaces should resolve reasons through App\Support\ReasonTranslation\ReasonTranslator or App\Support\ReasonTranslation\ReasonPresenter rather than formatting raw reason strings inline.
- Provider-domain next-step links should continue to flow through ProviderNextStepsRegistry; it now delegates to the shared translation contract instead of owning wording itself.
- Operation-run surfaces should prefer the persisted context.reason_translation envelope when present, while still keeping context.reason_code and failure_summary[*].reason_code stable for diagnostics.
- New domain families should add translation behavior close to the reason-code source type first, then wire the resulting envelope into presenters or notifications.
- unknown_error should remain a bounded fallback for explicitly reason-bearing flows only; generic failure codes from unrelated domains should continue to use their existing follow-up messaging until they adopt the shared contract directly.
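
The preference order for operation-run surfaces can be sketched as follows. The context keys reason_translation and reason_code come from the notes above; the function name, the label key inside the envelope, and the translator callback are assumptions.

```php
<?php

declare(strict_types=1);

/**
 * Pick the reason envelope for a run: use the persisted envelope when it is
 * present, otherwise translate the stable reason code on the fly.
 *
 * @param array<string, mixed> $context the run's context payload
 * @param callable(string): array $translate hypothetical translator callback
 */
function sketchReasonForRun(array $context, callable $translate): ?array
{
    $persisted = $context['reason_translation'] ?? null;

    // Assumes a usable envelope always carries at least a label.
    if (is_array($persisted) && isset($persisted['label'])) {
        return $persisted;
    }

    $code = $context['reason_code'] ?? null;

    // The raw code in the context is read but never modified.
    return is_string($code) && $code !== '' ? $translate($code) : null;
}
```

Older runs that predate the envelope still resolve through their reason code, so no backfill is required for them to render translated copy.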
