TenantAtlas/specs/373-diagnostic-surface-separation/plan.md
ahmido 94877c9a66 feat(ui): implement diagnostic surface separation (#444)
Applied the decision-first diagnostic surface IA contract to EnvironmentDiagnostics and SupportDiagnostics bundles. Added recommended_first_check and separated technical metadata as per Spec 373.

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #444
2026-06-12 20:31:17 +00:00

267 lines
18 KiB
Markdown

# Implementation Plan: Spec 373 - Diagnostic Surface Separation v1
**Branch**: `373-diagnostic-surface-separation` | **Date**: 2026-06-12 | **Spec**: `specs/373-diagnostic-surface-separation/spec.md`
**Input**: User-provided Spec 373 draft narrowed by repo truth, completed Spec 353, Spec 368 Candidate D, and Spec 370 contract artifacts.
## Summary
Productize the remaining diagnostic/support surfaces after the completed provider-readiness work. The implementation should make Environment Diagnostics and support diagnostics modal content lead with:
- what failed or whether no action is needed
- why it matters
- what the operator/support user should check next
- which existing provider, permission, operation, evidence, or audit context is related
- where technical details remain available
Provider Connections and Required Permissions are not primary implementation targets. Spec 353 already implemented their readiness-guidance layer and UI reports, so this spec uses them only as completed context and regression constraints.
## Technical Context
**Language/Version**: PHP 8.4.15, Laravel 12.x
**Primary Dependencies**: Filament v5, Livewire v4, Pest v4, Laravel Sail
**Storage**: PostgreSQL; no schema change planned
**Testing**: Pest Feature/Livewire and bounded Browser smoke
**Validation Lanes**: confidence + browser + static diff/format checks
**Target Platform**: Laravel monolith under `apps/platform`
**Project Type**: web application
**Performance Goals**: DB-local rendering; no provider/Graph calls during page or modal render
**Constraints**: no new persistence, no provider/permission/backend logic changes, no `/system` auth/fixture repair
**Scale/Scope**: one environment diagnostic page, existing support diagnostics modal/bundle, completed provider-readiness regression context only
## Current Repo Truth That Constrains The Slice
- `EnvironmentDiagnostics` is a small Filament page with two existing destructive repair actions:
- `bootstrapOwner`
- `mergeDuplicateMemberships`
- Those actions already use `Action::make(...)->action(...)`, `->requiresConfirmation()`, `UiEnforcement`, `Capabilities::TENANT_MANAGE`, and destructive UI treatment. Implementation must preserve and verify this posture.
- `environment-diagnostics.blade.php` currently renders a simple header, missing-owner card, duplicate-membership card, or "All good" card. It does not yet compose one guided diagnostic summary.
- `SupportDiagnosticBundleBuilder` already derives redacted support context from existing tenant, provider connection, OperationRun, findings, stored reports, reviews, review packs, and audit logs.
- `support-diagnostic-bundle.blade.php` already shows headline, dominant issue, context, redaction, notes, and sections. The intended change is hierarchy/copy/order, not a data model change.
- Spec 353 already completed Provider Connections and Required Permissions readiness guidance:
- `docs/ui-ux-enterprise-audit/page-reports/ui-009-provider-connections.md`
- `docs/ui-ux-enterprise-audit/page-reports/ui-077-required-permissions.md`
- `specs/353-provider-connections-resolution-guidance-v1/tasks.md` has checked tasks and browser-oriented proof.
- `docs/ui-ux-enterprise-audit/route-inventory.md` already has UI-012 Environment Diagnostics, UI-072 Provider Connections, and UI-077 Required Permissions entries.
## Domain / Model Implications
- No new model, table, enum, status family, provider health engine, permission engine, or persisted diagnostic artifact is planned.
- Environment diagnostic cases remain derived from existing booleans and service results:
- missing owner
- duplicate memberships for current user
- no known issue
- Support diagnostics remains derived from `SupportDiagnosticBundleBuilder` output and existing related records.
- If implementation needs a helper, prefer a narrow page-local derived summary helper that replaces duplicated Blade conditionals. Do not introduce a generic diagnostic resolver/framework.
## UI / Surface Guardrail Plan
- **Guardrail scope**: existing Environment Diagnostics page and existing support diagnostics modal/bundle.
- **Affected routes/pages/actions/states/navigation/panel/provider surfaces**:
- `/admin/workspaces/{workspace}/environments/{environment}/diagnostics`
- `openSupportDiagnostics` modal on existing authorized host surfaces
- completed Provider Connections/Required Permissions only if a shared change creates regression risk
- **No-impact class, if applicable**: N/A; existing UI surfaces change.
- **Native vs custom classification summary**: Filament page/action/modal with Blade content; preserve native action primitives.
- **Shared-family relevance**: diagnostics summary, support diagnostics, redaction, contextual help, OperationRun/reference links.
- **State layers in scope**: page derived state and modal/bundle presentation state.
- **Audience modes in scope**: operator-MSP and support-platform.
- **Decision/diagnostic/raw hierarchy plan**: guidance first, technical/support references second, raw/provider/debug data third or unavailable.
- **Raw/support gating plan**: support diagnostics remains capability-gated and default-redacted; technical details remain lower-priority or collapsed where added.
- **One-primary-action / duplicate-truth control**: one top diagnostic summary owns the current blocker/no-action state; lower cards/sections add proof and action detail.
- **Handling modes by drift class or surface**:
- Environment Diagnostics: implementation target.
- Support diagnostics modal: implementation target if reachable with existing actions/fixtures.
- Provider Connections and Required Permissions: report-only/regression context because Spec 353 completed them.
- System panel: deferred, not implementation target.
- **Repository-signal treatment**: completed-spec guardrail is mandatory for Spec 353, 371, and 372.
- **Special surface test profiles**: `standard-native-filament` for Environment Diagnostics actions; `shared-detail-family` for support diagnostics modal.
- **Required tests or manual smoke**: Feature/Livewire tests for page/modal hierarchy; browser smoke for first viewport/modal reachability.
- **Exception path and spread control**: if support diagnostics modal cannot be reached with existing fixture, record blocked status and stop rather than building fixture/auth infrastructure.
- **Active feature PR close-out entry**: Guardrail / Exception / Smoke Coverage.
- **UI/Productization coverage decision**: coverage artifacts should be updated only for UI-012 or support diagnostics if implementation materially changes registry status/evidence. Do not churn UI-009/UI-077 unless a regression is found.
- **Coverage artifacts to update**:
- `docs/ui-ux-enterprise-audit/route-inventory.md` if UI-012 screenshot/report references change
- `docs/ui-ux-enterprise-audit/page-reports/ui-012-environment-diagnostics.md` if created or refreshed
- `docs/ui-ux-enterprise-audit/unresolved-pages.md` only for documented blocked scope
- **Screenshot or page-report need**: yes for Environment Diagnostics; yes for support diagnostics modal if reachable.
## Shared Pattern And System Fit
- **Cross-cutting feature marker**: yes.
- **Systems touched**:
- `App\Filament\Pages\EnvironmentDiagnostics`
- `apps/platform/resources/views/filament/pages/environment-diagnostics.blade.php`
- `App\Support\SupportDiagnostics\SupportDiagnosticBundleBuilder`
- `apps/platform/resources/views/filament/modals/support-diagnostic-bundle.blade.php`
- existing support diagnostics action hosts/tests
- `ManagedEnvironmentLinks` / `OperationRunLinks` only for existing related links
- **Shared abstractions reused**:
- `UiEnforcement`
- `Capabilities`
- `SupportDiagnosticBundleBuilder`
- `RedactionIntegrity`
- `ContextualHelpResolver`
- `OperationRunLinks`
- `ManagedEnvironmentLinks`
- `ProviderReasonTranslator`
- **New abstraction introduced? why?**: none planned. A page-local helper is allowed only if it simplifies Environment Diagnostics view state without becoming a framework.
- **Why the existing abstraction was sufficient or insufficient**: Existing services expose truth and authorization. The missing piece is presentation hierarchy.
- **Bounded deviation / spread control**: no completed Spec 353 code path should be reused as a generic diagnostic engine; use it as pattern/context only.
## OperationRun UX Impact
- **Touches OperationRun start/completion/link UX?**: no start/completion behavior.
- **Central contract reused**: existing `OperationRunLinks` for references only.
- **Delegated UX behaviors**: N/A.
- **Surface-owned behavior kept local**: deciding when to show an existing related operation link as diagnostic context.
- **Queued DB-notification policy**: N/A.
- **Terminal notification path**: N/A.
- **Exception path**: none.
## Provider Boundary And Portability Fit
- **Shared provider/platform boundary touched?**: yes, presentation only.
- **Provider-owned seams**: provider connection reason codes, required permissions, Microsoft Graph permission names, provider target identifiers.
- **Platform-core seams**: environment diagnostics, support diagnostics summary, related operation/evidence links, provider-neutral top-level copy.
- **Neutral platform terms / contracts preserved**: provider connection, required permissions, managed environment, operation, evidence, diagnostics.
- **Retained provider-specific semantics and why**: provider-specific permission/reason detail may appear in lower technical/support sections because operators/support users need exact provider context when troubleshooting.
- **Bounded extraction or follow-up path**: no extraction in this spec; deeper provider-health/onboarding redesign remains follow-up.
## Constitution Check
*GATE: Must pass before implementation. Re-check after implementation if requirements change.*
- Inventory-first / snapshots-second: N/A; no inventory/snapshot truth changes.
- Read/write separation: Environment Diagnostics has existing repair actions; preserve confirmation, capability gating, authorization, and audit ownership. No new write action.
- Graph contract path: no Graph calls added; page/modal render must remain DB-local.
- Deterministic capabilities: existing capability checks remain authoritative.
- RBAC-UX: workspace/environment entitlement and support diagnostics capability remain server-side truth; non-member route/context access remains deny-as-not-found.
- Workspace/Tenant isolation: environment route and modal context must remain workspace/environment scoped.
- Run observability: no new OperationRun type/start behavior; existing operation links remain links only.
- OperationRun start UX: N/A beyond existing links.
- Data minimization: no secrets, tokens, raw provider payloads, or unrestricted logs in default-visible content.
- Test governance: confidence + browser lanes are explicit and bounded; no hidden heavy lane default.
- Proportionality: no new persisted truth or framework.
- No premature abstraction: page-local helper only if strictly needed; no diagnostic framework.
- Persisted truth: none added.
- Behavioral state: no new status/reason family.
- UI semantics: direct mapping from existing diagnostic truth to UI.
- Shared pattern first: reuse `SupportDiagnosticBundleBuilder`, redaction, contextual help, OperationRun links, and Filament action helpers.
- Provider boundary: provider-specific values stay in provider-owned or technical/support detail.
- UI-COV-001: reachable UI surface impact is declared; coverage artifact handling is proportional.
- DECIDE-001 / DECIDE-AUD-001: diagnostic/support surfaces are tertiary; default-visible content is guided and raw detail remains secondary.
- Filament v5 / Livewire v4: required; no legacy APIs.
- Panel provider registration: unchanged in `apps/platform/bootstrap/providers.php`.
- Global search: no resource global-search behavior changes; `ProviderConnectionResource` remains disabled as completed Spec 353 context.
- Destructive actions: existing Environment Diagnostics repair actions stay `->action(...)` + `->requiresConfirmation()` + capability-gated + server-authorized.
- Asset strategy: no new asset registration planned; no `filament:assets` deployment change expected.
## Test Governance Check
- **Test purpose / classification by changed surface**:
- Environment Diagnostics: Feature/Livewire + Browser.
- Support diagnostics modal: Feature/Livewire + Browser if reachable.
- Completed provider surfaces: regression/source audit only unless shared changes touch them.
- **Affected validation lanes**: confidence and browser.
- **Why this lane mix is the narrowest sufficient proof**: DOM/render behavior and modal action state are provable with Feature/Livewire tests; first-viewport signal hierarchy needs browser proof because the source finding is visual/productization-oriented.
- **Narrowest proving commands**:
- `cd apps/platform && ./vendor/bin/sail artisan test --compact --filter=EnvironmentDiagnostics`
- `cd apps/platform && ./vendor/bin/sail artisan test --compact --filter=SupportDiagnostics`
- bounded Browser smoke for Spec 373 surfaces
- `git diff --check`
- `cd apps/platform && ./vendor/bin/sail pint --dirty` if PHP changes
- **Fixture / helper / factory / seed / context cost risks**: keep support/browser fixture reuse explicit; do not widen shared smoke-login setup.
- **Expensive defaults or shared helper growth introduced?**: no.
- **Heavy-family additions, promotions, or visibility changes**: one bounded browser smoke only.
- **Surface-class relief / special coverage rule**: standard-native-filament for Environment Diagnostics action safety; shared-detail-family for support diagnostics modal.
- **Closing validation and reviewer handoff**: reviewers should inspect source audit, affected files, browser screenshots/report, diagnostic safety checklist, and completed-spec guardrail notes.
- **Budget / baseline / trend follow-up**: none expected.
- **Review-stop questions**: Does the implementation introduce a framework, persistence, or hidden fixture cost? Does it reopen Spec 353? Does it weaken destructive-action safety?
- **Escalation path**: document-in-feature for bounded fixture limitations; follow-up-spec for system-panel reachability or UI bloat automation.
- **Active feature PR close-out entry**: Guardrail / Exception / Smoke Coverage.
- **Why no dedicated follow-up spec is needed**: Spec 373 is the dedicated diagnostic surface productization slice. Only UI bloat guard and fixture coverage remain separate follow-ups.
## Project Structure
### Documentation (this feature)
```text
specs/373-diagnostic-surface-separation/
├── spec.md
├── plan.md
├── tasks.md
├── checklists/
│ └── requirements.md
└── artifacts/
├── source-audit-summary.md
├── affected-files.md
├── diagnostic-surface-contracts.md
├── before-after-screenshot-index.md
├── implementation-notes.md
├── browser-verification-report.md
├── diagnostic-safety-checklist.md
├── validation-report.md
└── screenshots/
```
### Source Code (repository root)
Likely affected runtime surfaces for later implementation:
```text
apps/platform/app/Filament/Pages/EnvironmentDiagnostics.php
apps/platform/resources/views/filament/pages/environment-diagnostics.blade.php
apps/platform/app/Support/SupportDiagnostics/SupportDiagnosticBundleBuilder.php
apps/platform/resources/views/filament/modals/support-diagnostic-bundle.blade.php
apps/platform/tests/Feature/TenantRBAC/TenantDiagnosticsAccessTest.php
apps/platform/tests/Feature/SupportDiagnostics/
apps/platform/tests/Browser/
docs/ui-ux-enterprise-audit/route-inventory.md
docs/ui-ux-enterprise-audit/page-reports/
```
**Structure Decision**: Single Laravel web application. Implementation should stay in existing page/view/support-diagnostics/test locations and spec-local artifacts. Do not create new base folders.
## Complexity Tracking
No constitutional violation is planned.
| Violation | Why Needed | Simpler Alternative Rejected Because |
|---|---|---|
| N/A | N/A | N/A |
## Proportionality Review
- **Current operator problem**: diagnostics/support surfaces need guided first-viewport troubleshooting rather than sparse technical cards or raw references first.
- **Existing structure is insufficient because**: the truth exists but the current presentation does not organize it into outcome, impact, and next check.
- **Narrowest correct implementation**: reorder/copy existing page/modal output and add focused tests/browser proof.
- **Ownership cost created**: focused tests, screenshots, and spec-local artifacts only.
- **Alternative intentionally rejected**: generic diagnostic framework or provider-readiness engine reuse beyond pattern/context.
- **Release truth**: current-release productization of existing surfaces.
## Implementation Phases
### Phase 0 - Repo Truth And Source Audit
Re-read active spec artifacts, Spec 368/370/371/372/353 context, current Environment Diagnostics/support diagnostics code, route inventory, and relevant tests. Create source audit and surface contract artifacts before runtime edits.
### Phase 1 - Tests First
Add or update focused tests for Environment Diagnostics states and support diagnostics modal hierarchy. Preserve existing authorization, telemetry, audit, and redaction tests.
### Phase 2 - Environment Diagnostics Guidance
Recompose the Environment Diagnostics first viewport around one diagnostic summary. Preserve existing repair actions and avoid new backend logic.
### Phase 3 - Support Diagnostics Modal Guidance
Adjust modal/bundle presentation so the dominant issue, redaction/completeness, and first check precede lower reference sections. Keep raw/support detail secondary and redacted.
### Phase 4 - Browser Smoke And Artifacts
Use existing browser harness to capture reachable scoped pages/modals. Document blocked surfaces honestly. Populate spec-local artifacts and any proportional UI audit registry updates.
### Phase 5 - Validation And Close-Out
Run targeted tests, browser smoke, `git diff --check`, and Pint if PHP changed. The approval/prep report must state when no app implementation happened during that prep step; the later implementation report must include the standard Filament v5/Livewire v4, provider location, global search, destructive action, asset, test, and deployment impact summary.