TenantAtlas/specs/186-tenant-registry-recovery-triage/plan.md
ahmido 9fbd3e5ec7 Spec 186: implement tenant registry recovery triage (#217)
## Summary
- turn the tenant registry into a workspace-scoped recovery triage surface with backup posture and recovery evidence columns
- preserve workspace overview backup and recovery drilldown intent by routing multi-tenant cases into filtered tenant registry slices
- add the Spec 186 planning artifacts, focused regression coverage, and shared triage presentation helpers

## Testing
- `cd apps/platform && ./vendor/bin/sail bin pint --dirty --format agent`
- `cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Filament/TenantRegistryRecoveryTriageTest.php tests/Feature/Filament/WorkspaceOverviewSummaryMetricsTest.php tests/Feature/Filament/WorkspaceOverviewDrilldownContinuityTest.php tests/Feature/Filament/TenantResourceIndexIsWorkspaceScopedTest.php tests/Feature/Filament/WorkspaceOverviewAuthorizationTest.php tests/Feature/Guards/ActionSurfaceContractTest.php tests/Feature/Guards/FilamentTableStandardsGuardTest.php`

## Notes
- no schema change
- no new persisted recovery truth
- branch includes the full Spec 186 spec, plan, research, data model, contract, quickstart, and tasks artifacts

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #217
2026-04-09 19:20:48 +00:00


# Implementation Plan: Tenant Registry Recovery Triage
**Branch**: `186-tenant-registry-recovery-triage` | **Date**: 2026-04-09 | **Spec**: `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/186-tenant-registry-recovery-triage/spec.md`
**Input**: Feature specification from `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/186-tenant-registry-recovery-triage/spec.md`
## Summary
Turn the existing tenant registry into the operator's portfolio recovery-triage surface without creating a second recovery-truth system. The implementation keeps `TenantBackupHealthResolver` and `RestoreSafetyResolver` as the only truth sources for tenant backup posture and tenant recovery evidence, and projects those states onto `TenantResource` as bounded list columns. It adds exact posture filters and deterministic worst-first ordering, and changes workspace backup and recovery multi-tenant drilldowns from `ChooseTenant` to the filtered tenant registry while preserving single-tenant direct-to-dashboard behavior. The slice stays read-only: it introduces no schema change, no new persistence, no new resolver truth, no new panel or asset registration, and no broader dashboard or chooser redesign.
Key approach: work inside the existing `TenantResource`, `ListTenants`, `WorkspaceOverviewBuilder`, and workspace widget seams; batch-load posture state via `assessMany()` and `dashboardRecoveryEvidenceForTenants()` across the current visible workspace tenant set; preserve intent through exact query-string filter parameters on `/admin/tenants`; keep the Filament list action-surface contract intact; and validate the result with focused Pest, Livewire, truthfulness, and RBAC coverage.
## Technical Context
**Language/Version**: PHP 8.4, Laravel 12, Blade, Filament v5, Livewire v4
**Primary Dependencies**: Filament v5 resources and table filters, Livewire v4 `ListRecords`, Pest v4, Laravel Sail, existing `TenantResource`, `ListTenants`, `WorkspaceOverviewBuilder`, `TenantBackupHealthResolver`, `TenantBackupHealthAssessment`, `RestoreSafetyResolver`, `RecoveryReadiness`, and shared badge infrastructure
**Storage**: PostgreSQL with existing tenant-owned `tenants`, `backup_sets`, `backup_items`, `restore_runs`, `policies`, and membership records; no schema change planned
**Testing**: Pest feature tests, Livewire resource-page tests, and focused unit coverage only if a narrow shared presentation or ranking seam is extracted, all run through Sail
**Target Platform**: Laravel web application in Sail locally and containerized Linux deployment in staging and production
**Project Type**: Laravel monolith web application rooted at `apps/platform`
**Performance Goals**: Keep registry rendering DB-only and query-bounded, avoid per-row resolver N+1 behavior, preserve current table pagination and session-persisted filter or sort behavior, and keep the operator's first-scan triage answer within 5 to 10 seconds
**Constraints**: No new persisted portfolio posture table, no new recovery score, no chooser redesign, no tenant-level resolver truth change, no cross-tenant leakage, no new dashboard widgets, no extra registry inspect action, no page-local badge language, and no new Filament assets
**Scale/Scope**: One workspace overview builder, one tenant registry resource and page pair, one existing workspace summary or attention destination contract, optional shared badge or copy extension for existing posture states, and focused regression coverage across filters, ordering, drilldowns, truthfulness, and RBAC
## Constitution Check
*GATE: Passed before Phase 0 research. Re-checked after Phase 1 design and still passing.*
| Principle | Status | Notes |
|-----------|--------|-------|
| Inventory-first | Pass | The feature reads existing tenant backup and restore evidence only; no new source of truth or snapshot behavior is introduced |
| Read/write separation | Pass | This is a read-first list and drilldown hardening slice with no new write path |
| Graph contract path | Pass | No Microsoft Graph call or `config/graph_contracts.php` change is required |
| Deterministic capabilities | Pass | Existing capability registry and current tenant or workspace scope rules remain authoritative |
| RBAC-UX planes and 404 vs 403 | Pass | Workspace surfaces stay under `/admin`, tenant follow-up stays under `/admin/t/{tenant}`, non-members remain `404`, and deeper capability checks remain server-side |
| Workspace isolation | Pass | Only current-workspace and in-scope tenants may appear in the registry or drilldowns |
| Tenant isolation | Pass | Registry posture is still derived per tenant and filtered to visible memberships only |
| Destructive confirmation standard | Pass | No new destructive action is introduced; existing destructive tenant actions remain unchanged and confirmed |
| Global search safety | Pass | TenantResource already has view and edit pages; this slice does not broaden global-search scope or semantics |
| Run observability | Pass | No new queued work, no new `OperationRun`, and no lifecycle transition is added |
| Ops-UX 3-surface feedback | Pass | No run feedback surface is added or changed |
| Ops-UX lifecycle ownership | Pass | `OperationRun.status` and `OperationRun.outcome` remain untouched |
| Ops-UX summary counts | Pass | No `summary_counts` change is needed |
| Data minimization | Pass | Registry posture uses existing derived summaries and does not expose deeper payload detail |
| Proportionality (PROP-001) | Pass | Changes stay inside existing list and builder seams; no new persistence, no new page shell, and no new portfolio framework |
| No premature abstraction (ABSTR-001) | Pass | The plan prefers local resource or page logic plus existing resolvers; any shared badge or copy extraction stays within already-existing seams |
| Persisted truth (PERSIST-001) | Pass | No new table, column, cache, or materialized posture mirror is introduced |
| Behavioral state (STATE-001) | Pass | Registry reuses existing posture states only; no new domain state family is added |
| UI semantics (UI-SEM-001) | Pass | The registry projects existing domain truth directly; no new explanation or confidence framework is introduced |
| Badge semantics (BADGE-001) | Pass | Status-like registry badges must use shared badge or copy semantics rather than ad-hoc local colors or labels |
| Filament-native UI (UI-FIL-001) | Pass | Existing Filament table columns, filters, and widgets remain the implementation seams |
| UI Action Surface Contract | Pass | `TenantResource` remains a list-first resource with row-click inspect and at most one inline safe shortcut; no redundant View action is added |
| Filament UX-001 | Pass with documented variance | No create or edit layout change is involved; this slice refines list scanability, filters, and drillthrough continuity only |
| Filament v5 / Livewire v4 compliance | Pass | The implementation stays inside the current Filament v5 and Livewire v4 stack |
| Provider registration location | Pass | No panel or provider change is required; Laravel 11+ provider registration remains in `bootstrap/providers.php` |
| Asset strategy | Pass | No new assets are planned; existing deployment `filament:assets` behavior remains unchanged |
## Phase 0 Research
Research outcomes are captured in `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/186-tenant-registry-recovery-triage/research.md`.
Key decisions:
- Use workspace-scoped batch posture maps built from `TenantBackupHealthResolver::assessMany()` and `RestoreSafetyResolver::dashboardRecoveryEvidenceForTenants()` instead of per-row resolver calls or a persisted registry summary.
- Preserve workspace drilldown intent through exact `/admin/tenants` query parameters and `ListTenants` initialization rather than `ChooseTenant`, hidden session-only state, or a new page shell.
- Use multi-select posture filters so workspace backup attention can preselect `absent`, `stale`, and `degraded`, and workspace recovery attention can preselect `weakened` and `unvalidated`, without inventing a pseudo `needs attention` filter value.
- Keep the TenantResource action-surface contract intact: full-row click remains inspect, and the existing one-inline-shortcut pattern remains the fast next click.
- Reuse or extract one shared mapping seam from `RecoveryReadiness` for label, tone, and fallback-URL semantics so the registry and dashboard cannot drift into parallel local mapping tables for the same posture states.
- Extend the current workspace summary-metric tests, workspace continuity tests, tenant registry scope tests, and table-guard coverage rather than introducing a browser-first harness.
## Phase 1 Design
Design artifacts are created under `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/186-tenant-registry-recovery-triage/`:
- `research.md`: implementation and design decisions for posture loading, exact filter intent, and continuity behavior
- `data-model.md`: existing persisted inputs plus the derived registry row and triage-intent model
- `contracts/tenant-registry-recovery-triage.openapi.yaml`: internal route contract for filtered registry and single-tenant dashboard drilldowns
- `quickstart.md`: focused automated and manual validation workflow for registry posture triage
Design decisions:
- No schema migration is required. The registry will consume existing tenant truth and derive row-level posture and rank at render time.
- `TenantResource` remains workspace-scoped and continues to own the canonical `/admin/tenants` surface. The feature adds columns, filters, and triage ordering, not a new resource or a new shell.
- Triage filtering should be exact, not vague. Backup drilldowns preselect `absent`, `stale`, and `degraded`; recovery drilldowns preselect `weakened` and `unvalidated`; both default to worst-first sorting.
- Worst-first ordering must operate over the filtered visible tenant set before pagination, using a deterministic tier map and a stable secondary order.
- `WorkspaceOverviewBuilder` is the only place where backup and recovery multi-tenant drilldown destinations need to change. Existing workspace widgets already honor `destination_url` and should not need a new interaction model.
- The tenant dashboard remains the safe fallback next step whenever deeper backup or restore surfaces would lose the same posture reason or fail permission checks.
## Project Structure
### Documentation (this feature)
```text
specs/186-tenant-registry-recovery-triage/
├── spec.md
├── plan.md
├── research.md
├── data-model.md
├── quickstart.md
├── contracts/
│   └── tenant-registry-recovery-triage.openapi.yaml
├── checklists/
│   └── requirements.md
└── tasks.md
```
### Source Code (repository root)
```text
apps/platform/
├── app/
│   ├── Filament/
│   │   ├── Pages/
│   │   │   └── WorkspaceOverview.php
│   │   ├── Resources/
│   │   │   ├── TenantResource.php
│   │   │   └── TenantResource/
│   │   │       └── Pages/
│   │   │           └── ListTenants.php
│   │   └── Widgets/
│   │       ├── Dashboard/
│   │       │   └── RecoveryReadiness.php
│   │       └── Workspace/
│   │           ├── WorkspaceNeedsAttention.php
│   │           └── WorkspaceSummaryStats.php
│   └── Support/
│       ├── BackupHealth/
│       │   ├── TenantBackupHealthAssessment.php
│       │   └── TenantBackupHealthResolver.php
│       ├── RestoreSafety/
│       │   └── RestoreSafetyResolver.php
│       ├── Workspaces/
│       │   └── WorkspaceOverviewBuilder.php
│       └── Badges/
│           ├── BadgeCatalog.php
│           ├── BadgeDomain.php
│           └── BadgeRenderer.php
└── tests/
    └── Feature/
        ├── Filament/
        │   ├── TenantResourceIndexIsWorkspaceScopedTest.php
        │   ├── WorkspaceOverviewAuthorizationTest.php
        │   ├── WorkspaceOverviewDrilldownContinuityTest.php
        │   ├── WorkspaceOverviewSummaryMetricsTest.php
        │   └── TenantRegistryRecoveryTriageTest.php
        └── Guards/
            ├── ActionSurfaceContractTest.php
            └── FilamentTableStandardsGuardTest.php
```
**Structure Decision**: Standard Laravel monolith under `apps/platform`. The implementation stays inside the current tenant registry resource, current workspace overview builder, existing workspace widgets, and existing tests. If a shared presentation seam is required for the new registry badges, it must extend an existing badge or copy surface rather than introducing a new framework.
## Implementation Strategy
### Phase A — Build Workspace-Scoped Posture Snapshots For The Registry
**Goal**: Make backup posture, recovery evidence, filter sets, and triage rank available to the registry without per-row resolver calls or a new persisted summary.
| Step | File | Change |
|------|------|--------|
| A.1 | `apps/platform/app/Filament/Resources/TenantResource.php` | Add one request-scoped posture snapshot path over the already scoped tenant set using `assessMany()` and `dashboardRecoveryEvidenceForTenants()`, keyed by tenant ID and reusable by columns, filters, and worst-first ordering |
| A.2 | `apps/platform/app/Filament/Resources/TenantResource.php` | Derive exact filter buckets and triage rank buckets from the snapshot map so filtering and ordering operate on the same truth source |
| A.3 | `apps/platform/app/Filament/Widgets/Dashboard/RecoveryReadiness.php` and `apps/platform/app/Filament/Resources/TenantResource.php` | Reuse or extract one shared bounded label or tone mapping seam for `absent`, `stale`, `degraded`, `healthy`, `weakened`, `unvalidated`, and `no recent issues visible` rather than local page-only mappings |
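
As a rough illustration of step A.1, the snapshot could be a small memoized map that calls each batch resolver once per request. This is only a sketch in plain PHP: the class name `TenantPostureSnapshot`, its methods, and the closure-based stand-ins for `assessMany()` and `dashboardRecoveryEvidenceForTenants()` are hypothetical, and the real resolver signatures and return shapes may differ.

```php
<?php

declare(strict_types=1);

// Hypothetical request-scoped snapshot. The two closures stand in for
// TenantBackupHealthResolver::assessMany() and
// RestoreSafetyResolver::dashboardRecoveryEvidenceForTenants(); their real
// signatures and return shapes are assumptions here.
final class TenantPostureSnapshot
{
    /** @var array<int, array{backup: ?string, recovery: ?string}>|null */
    private ?array $rows = null;

    /** @param list<int> $tenantIds IDs already scoped to visible workspace tenants */
    public function __construct(
        private readonly array $tenantIds,
        private readonly Closure $assessBackups,
        private readonly Closure $loadRecoveryEvidence,
    ) {
    }

    /** Build the map once; later calls reuse the memoized result. */
    public function rows(): array
    {
        if ($this->rows !== null) {
            return $this->rows;
        }

        // One batch call per resolver, never one call per row.
        $backup = ($this->assessBackups)($this->tenantIds);
        $recovery = ($this->loadRecoveryEvidence)($this->tenantIds);

        $rows = [];
        foreach ($this->tenantIds as $tenantId) {
            $rows[$tenantId] = [
                'backup' => $backup[$tenantId] ?? null,
                'recovery' => $recovery[$tenantId] ?? null,
            ];
        }

        return $this->rows = $rows;
    }

    /** Posture row for one tenant, or null when the tenant is out of scope. */
    public function forTenant(int $tenantId): ?array
    {
        return $this->rows()[$tenantId] ?? null;
    }
}

// Usage with stubbed resolvers that count how often they are invoked.
$batchCalls = 0;
$snapshot = new TenantPostureSnapshot(
    [1, 2],
    function (array $ids) use (&$batchCalls): array {
        $batchCalls++;
        return [1 => 'stale', 2 => 'healthy'];
    },
    function (array $ids) use (&$batchCalls): array {
        $batchCalls++;
        return [1 => 'weakened', 2 => 'no recent issues visible'];
    },
);

$snapshot->forTenant(1);
$snapshot->forTenant(2);
```

Because columns, filters, and ordering would all read from the same `rows()` result, the number of resolver calls stays constant regardless of how many tenants are rendered.
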
### Phase B — Turn TenantResource Into A Working Triage Surface
**Goal**: Add visible posture truth, exact filters, and deterministic worst-first ordering while keeping the current list action surface stable.
| Step | File | Change |
|------|------|--------|
| B.1 | `apps/platform/app/Filament/Resources/TenantResource.php` | Add default-visible `Backup posture` and `Recovery evidence` columns near tenant identity and lifecycle, keeping metadata and recovery truth visually separate |
| B.2 | `apps/platform/app/Filament/Resources/TenantResource.php` | Add multi-select backup-posture and recovery-evidence filters using exact state values and workspace-scoped `whereIn` filtering over snapshot-derived tenant ID sets |
| B.3 | `apps/platform/app/Filament/Resources/TenantResource.php` | Add deterministic worst-first ordering using the triage-rank map and stable secondary ordering by tenant name while preserving current default calm browsing when triage ordering is not requested |
| B.4 | `apps/platform/app/Filament/Resources/TenantResource.php` | Keep row click as the canonical inspect model and reuse the existing safe inline shortcut for fast next-step clarity instead of adding a new View action |
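
One way to picture step B.2 is a pure function that turns the selected posture states into a set of tenant IDs, which could then feed the `whereIn` constraint on the registry query. The function name `matchingTenantIds` and the snapshot row shape are hypothetical; treating an empty selection as "no constraint" is also an assumption, not something the plan states.

```php
<?php

declare(strict_types=1);

/**
 * Return the tenant IDs whose posture matches every active filter.
 * An empty selection for a dimension is treated as "no constraint" (assumption).
 *
 * @param array<int, array{backup: string, recovery: string}> $snapshot keyed by tenant ID
 * @param list<string> $backupStates   exact backup posture values selected
 * @param list<string> $recoveryStates exact recovery evidence values selected
 * @return list<int>
 */
function matchingTenantIds(array $snapshot, array $backupStates, array $recoveryStates): array
{
    $ids = [];

    foreach ($snapshot as $tenantId => $row) {
        $backupOk = $backupStates === [] || in_array($row['backup'], $backupStates, true);
        $recoveryOk = $recoveryStates === [] || in_array($row['recovery'], $recoveryStates, true);

        // Both dimensions must match; strict comparison keeps the filter exact.
        if ($backupOk && $recoveryOk) {
            $ids[] = $tenantId;
        }
    }

    return $ids;
}
```

Since the snapshot only contains tenants that are already in scope, the derived ID set cannot widen visibility; it can only narrow the scoped query.
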
### Phase C — Preserve Drilldown Intent On `/admin/tenants`
**Goal**: Ensure workspace backup and recovery drilldowns open the registry in a visibly filtered, worst-first state rather than losing cause in `ChooseTenant`.
| Step | File | Change |
|------|------|--------|
| C.1 | `apps/platform/app/Filament/Resources/TenantResource/Pages/ListTenants.php` | Add narrow query-string initialization for exact posture filter arrays and `triage_sort=worst_first`, translating them into the page's table filter or sort state without changing the core page model |
| C.2 | `apps/platform/app/Filament/Resources/TenantResource/Pages/ListTenants.php` | Preserve existing session-backed list state while allowing explicit query-string intent to override it on first load for workspace drilldowns |
| C.3 | `apps/platform/app/Filament/Resources/TenantResource.php` and `ListTenants.php` | Add a bounded empty-state or subheading strategy for zero-match filtered triage views so the registry stays scoped and honest even when no visible tenant matches the current filter |
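
Steps C.1 and C.2 could be sketched as two small functions: one that reads triage intent from the query string and discards anything that is not an exact known state, and one that lets explicit intent win over session state on first load. The parameter names `backup_posture` and `recovery_evidence`, the function names, and the allowed-value lists are hypothetical; only `triage_sort=worst_first` is taken from the plan.

```php
<?php

declare(strict_types=1);

// Assumed allowed values, taken from the posture states named in this plan.
const BACKUP_STATES = ['absent', 'stale', 'degraded', 'healthy'];
const RECOVERY_STATES = ['weakened', 'unvalidated', 'no recent issues visible'];

/** Parse a raw query string into validated triage intent. */
function triageIntentFromQuery(string $queryString): array
{
    parse_str($queryString, $params);

    // Keep only string values that exactly match an allowed state.
    $pick = static function (mixed $raw, array $allowed): array {
        $values = is_array($raw) ? $raw : [];

        return array_values(array_unique(array_filter(
            $values,
            static fn (mixed $v): bool => is_string($v) && in_array($v, $allowed, true),
        )));
    };

    return [
        'backup_posture' => $pick($params['backup_posture'] ?? null, BACKUP_STATES),
        'recovery_evidence' => $pick($params['recovery_evidence'] ?? null, RECOVERY_STATES),
        'worst_first' => ($params['triage_sort'] ?? null) === 'worst_first',
    ];
}

/** Explicit query-string intent overrides session-backed state; otherwise the session wins. */
function initialTableState(array $sessionState, array $intent): array
{
    $explicit = $intent['backup_posture'] !== []
        || $intent['recovery_evidence'] !== []
        || $intent['worst_first'];

    return $explicit ? $intent : $sessionState;
}
```

Validating against a fixed list means an unknown or tampered value simply drops out of the filter instead of producing an undefined table state.
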
### Phase D — Change Workspace Backup And Recovery Multi-Tenant Destinations
**Goal**: Move multi-tenant backup and recovery attention drilldowns from `ChooseTenant` to the filtered registry while keeping exact single-tenant behavior.
| Step | File | Change |
|------|------|--------|
| D.1 | `apps/platform/app/Support/Workspaces/WorkspaceOverviewBuilder.php` | Replace multi-tenant `choose_tenant` destinations for `backup_attention_tenants` and `recovery_attention_tenants` with exact `/admin/tenants` URLs carrying posture filter arrays and `triage_sort=worst_first` |
| D.2 | `apps/platform/app/Support/Workspaces/WorkspaceOverviewBuilder.php` | Keep single-tenant backup or recovery drilldowns routed directly to the tenant dashboard when only one visible tenant is affected |
| D.3 | `apps/platform/app/Support/Workspaces/WorkspaceOverviewBuilder.php` and existing workspace widgets | Keep the existing `destination_url` and destination-kind contract unchanged and do not introduce a new `tenant_registry` kind label for this slice |
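
The destination rule in D.1 and D.2 can be illustrated with a single function: one affected tenant goes straight to its dashboard, several go to the filtered registry. This is a sketch only; the function name, the `backup_posture` filter key, and the use of a route key under `/admin/t/{tenant}` as the dashboard URL are assumptions, and the builder's real URL generation may rely on named routes instead of string concatenation.

```php
<?php

declare(strict_types=1);

/**
 * Build the drilldown URL for an attention metric.
 *
 * @param non-empty-list<string> $tenantKeys visible tenants affected by the metric
 * @param string $filterKey                  hypothetical filter parameter name
 * @param list<string> $states               exact posture states to preselect
 */
function attentionDestination(array $tenantKeys, string $filterKey, array $states): string
{
    // Single affected tenant: skip the registry and go directly to the tenant.
    if (count($tenantKeys) === 1) {
        return '/admin/t/' . rawurlencode($tenantKeys[0]);
    }

    // Several affected tenants: open the registry pre-filtered and worst-first.
    return '/admin/tenants?' . http_build_query([
        $filterKey => array_values($states),
        'triage_sort' => 'worst_first',
    ]);
}

$single = attentionDestination(['acme'], 'backup_posture', ['absent', 'stale', 'degraded']);
$multi = attentionDestination(['acme', 'globex'], 'backup_posture', ['absent', 'stale', 'degraded']);
```

Only the URL changes between the two cases, which is consistent with leaving the existing `destination_url` contract and widgets untouched.
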
### Phase E — Lock Semantics With Focused Regression Coverage
**Goal**: Protect list truth, triage ordering, workspace continuity, and scope safety against regression.
| Step | File | Change |
|------|------|--------|
| E.1 | `apps/platform/tests/Feature/Filament/TenantRegistryRecoveryTriageTest.php` | Add focused coverage for row posture rendering, exact backup filters, exact recovery filters, worst-first ordering, overclaim avoidance, and metadata separation |
| E.2 | `apps/platform/tests/Feature/Filament/WorkspaceOverviewSummaryMetricsTest.php` | Update the backup-attention multi-tenant expectation from `choose_tenant` to the filtered tenant registry and add equivalent multi-tenant recovery coverage where needed |
| E.3 | `apps/platform/tests/Feature/Filament/WorkspaceOverviewDrilldownContinuityTest.php` | Extend continuity coverage so backup and recovery attention preserve meaning through the tenant-registry destination as well as single-tenant dashboard destinations |
| E.4 | `apps/platform/tests/Feature/Filament/WorkspaceOverviewAuthorizationTest.php` and `TenantResourceIndexIsWorkspaceScopedTest.php` | Verify no hidden-tenant leakage across posture filters, triage ordering, or workspace drilldowns |
| E.5 | `apps/platform/tests/Feature/Guards/ActionSurfaceContractTest.php` and `FilamentTableStandardsGuardTest.php` | Keep existing list-surface and table-standards guards passing after the registry column and filter changes |
| E.6 | `cd apps/platform && ./vendor/bin/sail bin pint --dirty --format agent` and focused Pest runs | Required formatting and targeted verification before implementation is considered complete |
## Key Design Decisions
### D-001 — Registry posture stays fully derived
The tenant registry consumes existing backup and recovery truth at render time. There is no new tenant summary table, no cached portfolio score, and no new recovery-confidence persistence.
### D-002 — Exact states beat vague attention flags
Workspace drilldowns should preselect exact posture states, not a generic `needs attention` filter. That keeps the registry honest and makes the visible filter state itself the explanation.
### D-003 — Weak-first ordering must happen before pagination
If ranking happens only on the current page, the worst tenant may remain hidden on another page. The rank map therefore needs to shape the filtered query order over the full scoped tenant set before pagination.
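
A small in-memory sketch shows why the order of operations matters. The tier values in `BACKUP_TIER` are hypothetical (the plan does not fix which state ranks worst), and the real implementation would shape the database query rather than sort an array, but the principle is the same: rank the whole scoped set, then slice.

```php
<?php

declare(strict_types=1);

// Hypothetical tier map: lower number means worse posture.
const BACKUP_TIER = ['absent' => 0, 'stale' => 1, 'degraded' => 2, 'healthy' => 3];

/**
 * Sort the full tenant set worst-first, then take one page.
 *
 * @param list<array{name: string, backup: string}> $tenants
 * @return list<array{name: string, backup: string}>
 */
function worstFirstPage(array $tenants, int $page, int $perPage): array
{
    // Primary key: tier; secondary key: tenant name, so ties resolve deterministically.
    usort($tenants, static fn (array $a, array $b): int =>
        [BACKUP_TIER[$a['backup']] ?? PHP_INT_MAX, $a['name']]
        <=> [BACKUP_TIER[$b['backup']] ?? PHP_INT_MAX, $b['name']]);

    // Pagination happens only after the full set has been ranked.
    return array_slice($tenants, ($page - 1) * $perPage, $perPage);
}

$tenants = [
    ['name' => 'Alpha', 'backup' => 'healthy'],
    ['name' => 'Bravo', 'backup' => 'healthy'],
    ['name' => 'Zulu', 'backup' => 'absent'],
];

$firstPage = worstFirstPage($tenants, 1, 2);
```

With a name-ordered default and two rows per page, `Zulu` would sit on page two; ranking first moves it to the top of page one.
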
### D-004 — Existing list interaction model remains intact
`TenantResource` already satisfies the list-first inspect model. The feature must not turn it into a multi-action dashboard or add a redundant View affordance.
### D-005 — Recovery-readiness semantics should not fork
The codebase already has one place that renders backup posture and recovery evidence together: `RecoveryReadiness`. The registry should call the same shared bounded mapping seam, either by reusing existing methods or by extracting a narrow helper into the same seam, rather than creating another local mapping language inside `TenantResource`.
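
To make the idea of a single mapping seam concrete, the sketch below uses one `match` expression that both the dashboard widget and the registry could call. The function name, labels, and tone names are hypothetical; the actual seam lives in or near `RecoveryReadiness` and the shared badge infrastructure, whose APIs are not shown in this plan.

```php
<?php

declare(strict_types=1);

/**
 * Map a posture state to its label and tone in one place.
 * Labels and tones here are illustrative assumptions.
 *
 * @return array{label: string, tone: string}
 */
function posturePresentation(string $state): array
{
    // No default arm: an unmapped state throws UnhandledMatchError instead of
    // silently receiving a page-local fallback label.
    return match ($state) {
        'absent' => ['label' => 'Absent', 'tone' => 'danger'],
        'stale' => ['label' => 'Stale', 'tone' => 'warning'],
        'degraded' => ['label' => 'Degraded', 'tone' => 'warning'],
        'healthy' => ['label' => 'Healthy', 'tone' => 'success'],
        'weakened' => ['label' => 'Weakened', 'tone' => 'warning'],
        'unvalidated' => ['label' => 'Unvalidated', 'tone' => 'warning'],
        'no recent issues visible' => ['label' => 'No recent issues visible', 'tone' => 'gray'],
    };
}
```

Failing loudly on an unknown state is a deliberate choice in this sketch: it makes drift between the registry and the dashboard show up in tests rather than in the UI.
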
### D-006 — Workspace widgets keep the existing destination contract
The workspace widgets already consume `destination_url` correctly. This slice changes those URLs for multi-tenant backup and recovery drilldowns, but it does not add a new destination-kind value or require a widget interaction redesign.
### D-007 — Workspace overview only changes destination logic
The workspace overview already computes backup and recovery attention correctly. This slice changes where multi-tenant clicks land, not what those metrics mean.
## Risk Assessment
| Risk | Impact | Likelihood | Mitigation |
|------|--------|------------|------------|
| Registry filtering or ranking is computed only for the current page and hides weaker tenants on later pages | High | Medium | Build filter buckets and triage rank across the full scoped tenant set before pagination and cover with ordering tests |
| Registry adds new local badge or tone mappings that drift from existing dashboard posture semantics | High | Medium | Reuse existing shared badge or copy seams and add truthfulness coverage for labels and tones |
| Multi-tenant workspace drilldowns still lose cause because filter intent is not visibly applied on first load | High | Medium | Initialize ListTenants from exact query parameters and cover continuity through summary-metric and drilldown tests |
| Metadata such as lifecycle status or last sync visually reads like recovery truth after the new columns are added | Medium | Medium | Place backup posture and recovery evidence explicitly, keep them separately labeled, and add no-metadata-substitution coverage |
| Batch posture loading over the scoped tenant set becomes too expensive in larger workspaces | Medium | Medium | Reuse existing batch resolvers, keep query shape bounded to the current scoped tenant set, and cover the feature with query-bounded regression expectations |
## Test Strategy
- Add a focused tenant-registry triage feature test as the primary acceptance harness for row rendering, filters, worst-first ordering, and truthfulness.
- Add explicit query-bounded regression coverage so registry posture loading does not degrade into uncontrolled per-row resolver fanout.
- Extend current workspace summary-metric and drilldown continuity tests so multi-tenant backup and recovery destinations move from `ChooseTenant` to the filtered registry without regressing the single-tenant dashboard path.
- Extend workspace authorization and tenant registry scope coverage so posture filters and registry drilldowns remain bounded to visible tenants only.
- Keep existing action-surface and table-standards guards green so the tenant registry stays compliant with the Filament list contract after the new columns and filters are added.
- Prefer Livewire or feature-level verification over browser-first testing because the codebase already has direct coverage seams for `ListTenants`, `WorkspaceOverviewBuilder`, and related widgets.
- Run all focused tests through Sail and finish with `cd apps/platform && ./vendor/bin/sail bin pint --dirty --format agent`.