TenantAtlas/specs/186-tenant-registry-recovery-triage/research.md
ahmido 9fbd3e5ec7 Spec 186: implement tenant registry recovery triage (#217)
## Summary
- turn the tenant registry into a workspace-scoped recovery triage surface with backup posture and recovery evidence columns
- preserve workspace overview backup and recovery drilldown intent by routing multi-tenant cases into filtered tenant registry slices
- add the Spec 186 planning artifacts, focused regression coverage, and shared triage presentation helpers

## Testing
- `cd apps/platform && ./vendor/bin/sail bin pint --dirty --format agent`
- `cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Filament/TenantRegistryRecoveryTriageTest.php tests/Feature/Filament/WorkspaceOverviewSummaryMetricsTest.php tests/Feature/Filament/WorkspaceOverviewDrilldownContinuityTest.php tests/Feature/Filament/TenantResourceIndexIsWorkspaceScopedTest.php tests/Feature/Filament/WorkspaceOverviewAuthorizationTest.php tests/Feature/Guards/ActionSurfaceContractTest.php tests/Feature/Guards/FilamentTableStandardsGuardTest.php`

## Notes
- no schema change
- no new persisted recovery truth
- branch includes the full Spec 186 spec, plan, research, data model, contract, quickstart, and tasks artifacts

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #217
2026-04-09 19:20:48 +00:00

Research: Tenant Registry Recovery Triage

Decision 1: Use workspace-scoped batch posture maps from existing resolvers

  • Decision: Build tenant-registry posture data from TenantBackupHealthResolver::assessMany() and RestoreSafetyResolver::dashboardRecoveryEvidenceForTenants() over the already-scoped set of visible tenants instead of resolving posture per row.
  • Rationale: Both resolvers already provide the exact truth the registry needs and are already used together in WorkspaceOverviewBuilder. Reusing them keeps backup posture and recovery evidence aligned with existing workspace and tenant surfaces and avoids N+1 resolver calls during list rendering.
  • Alternatives considered:
    • Per-row resolver calls inside table column closures: rejected because it creates predictable N+1 behavior and duplicates work between columns, filters, and sorting.
    • Reimplement posture logic in SQL joins or subqueries: rejected because it duplicates domain truth outside the existing resolver path and risks semantic drift.
    • Persist a tenant posture summary table: rejected because the spec explicitly forbids a second persisted truth.
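
The batch approach in Decision 1 can be sketched in plain PHP as follows. This is a minimal illustration, not the actual implementation: the two resolver functions are hypothetical stand-ins for TenantBackupHealthResolver::assessMany() and RestoreSafetyResolver::dashboardRecoveryEvidenceForTenants(), whose real signatures and return shapes are not shown in this document, and the 'unknown' fallback value is an assumption.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the batch backup resolver.
// Assumed shape: one call returns [tenantId => posture state].
function assessManyBackupHealth(array $tenantIds): array
{
    $fixture = [1 => 'degraded', 2 => 'stale', 3 => 'absent'];

    return array_intersect_key($fixture, array_flip($tenantIds));
}

// Hypothetical stand-in for the batch recovery-evidence resolver.
function recoveryEvidenceForTenants(array $tenantIds): array
{
    $fixture = [1 => 'weakened', 2 => 'weakened', 3 => 'unvalidated'];

    return array_intersect_key($fixture, array_flip($tenantIds));
}

// Resolve both maps once for the visible tenant set, then let every
// row, filter, and sort read from the maps instead of calling a
// resolver per row.
function buildPostureRows(array $visibleTenants): array
{
    $ids = array_column($visibleTenants, 'id');
    $backup = assessManyBackupHealth($ids);
    $recovery = recoveryEvidenceForTenants($ids);

    $rows = [];
    foreach ($visibleTenants as $tenant) {
        $id = $tenant['id'];
        $rows[] = [
            'id' => $id,
            'name' => $tenant['name'],
            // 'unknown' is an assumed fallback for tenants missing from a map.
            'backup_posture' => $backup[$id] ?? 'unknown',
            'recovery_evidence' => $recovery[$id] ?? 'unknown',
        ];
    }

    return $rows;
}
```

With this shape, the number of resolver calls stays at two regardless of how many tenants the list renders.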

Decision 2: Preserve multi-tenant drilldown intent through exact query parameters on /admin/tenants

  • Decision: Change workspace backup and recovery multi-tenant destinations to /admin/tenants with query parameters that encode exact posture selections and triage_sort=worst_first.
  • Rationale: The codebase already uses query-string intent on list and page surfaces, and this approach preserves meaning directly in the URL without adding a new page, hidden state, or session-only transfer mechanism.
  • Alternatives considered:
    • Keep ChooseTenant for multi-tenant drilldowns: rejected because it drops the recovery or backup cause and sends the operator back to manual tenant-by-tenant inspection.
    • Use only session state and no URL signal: rejected because it makes drilldown state invisible and harder to test.
    • Add a dedicated portfolio recovery page: rejected because it introduces a larger IA change than the current release needs.
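
A drilldown URL of the kind Decision 2 describes could be built roughly like this. Only the /admin/tenants path and triage_sort=worst_first come from the decision itself; the function name and the backup_posture and recovery_evidence parameter names are hypothetical, and the real registry may encode filter state in a different query shape.

```php
<?php
declare(strict_types=1);

// Build a tenant registry URL that carries the drilldown intent in
// the query string. Parameter names other than triage_sort are
// assumptions made for illustration.
function tenantRegistryDrilldownUrl(array $backupPostures = [], array $recoveryEvidence = []): string
{
    // array_filter drops families with no selected states so the URL
    // only states what the operator actually drilled into.
    $query = array_filter([
        'backup_posture' => array_values($backupPostures),
        'recovery_evidence' => array_values($recoveryEvidence),
    ]);
    $query['triage_sort'] = 'worst_first';

    return '/admin/tenants?' . http_build_query($query);
}

$url = tenantRegistryDrilldownUrl(['absent', 'stale', 'degraded']);

// Decoding the query string recovers the exact selection, which is
// what makes the intent visible and testable.
parse_str((string) parse_url($url, PHP_URL_QUERY), $decoded);
```

Because the full selection round-trips through the URL, a feature test can assert on the link target alone without inspecting session state.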

Decision 3: Use multi-select posture filters with exact state values

  • Decision: Represent triage filters as exact state arrays rather than introducing a "needs attention" value or other pseudo-filter values.
  • Rationale: The spec explicitly requires clear posture-based filtering and rejects semantically vague attention filters. Multi-select filters allow workspace backup drilldowns to preselect absent, stale, and degraded, and recovery drilldowns to preselect weakened and unvalidated, while keeping the filter chips themselves truthful.
  • Alternatives considered:
    • Single-select filters plus an attention option: rejected because it hides the underlying state composition.
    • Custom tabs or page variants per posture family: rejected because they expand the surface model unnecessarily.
    • Hard-coded workspace-only registry variants: rejected because the registry should stay one canonical collection route.
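
The exact-state filtering in Decision 3 might look like the sketch below. The state list contains only the backup posture values named in this decision, so the real set may be larger; the function names and the row shape are hypothetical.

```php
<?php
declare(strict_types=1);

// Backup posture states named in this document; the full set used by
// the resolvers may contain more values.
const BACKUP_POSTURE_STATES = ['absent', 'stale', 'degraded'];

// Reduce raw filter input to a list of known exact states. Pseudo
// values such as "needs_attention" are dropped, never expanded.
function normalizeStateFilter(mixed $raw, array $allowed): array
{
    if (!is_array($raw)) {
        return [];
    }

    $strings = array_filter($raw, 'is_string');

    return array_values(array_unique(array_intersect($strings, $allowed)));
}

// Keep rows whose state for the given key is one of the selected
// states. An empty selection means "no filter applied".
function filterByStates(array $rows, string $key, array $states): array
{
    if ($states === []) {
        return $rows;
    }

    return array_values(array_filter(
        $rows,
        fn (array $row): bool => in_array($row[$key] ?? null, $states, true)
    ));
}
```

Since each selected value maps to exactly one state, the filter chips shown to the operator match the states that were actually applied.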

Decision 4: Keep the TenantResource list action surface contract intact

  • Decision: Preserve full-row click as the one primary inspect model and keep at most one inline safe shortcut for fast next-step navigation.
  • Rationale: TenantResource is already governed by the current action-surface contract and guard tests. The triage enhancement should be expressed through visible posture truth, filters, and ordering rather than additional row actions.
  • Alternatives considered:
    • Add a dedicated View action: rejected because it duplicates the existing row-click inspect model.
    • Add multiple inline shortcuts to backup sets, restore runs, and dashboard: rejected because it would turn the registry into a crowded multi-action surface and violate the list hierarchy rules.

Decision 5: Reuse existing recovery-readiness presentation semantics instead of inventing registry-local posture language

  • Decision: Reuse or absorb the label, tone, and fallback-navigation semantics already present in RecoveryReadiness for Backup posture and Recovery evidence when the registry renders the same states.
  • Rationale: The codebase already has one operator-facing composition that normalizes these exact states. Reusing those semantics avoids drift between dashboard and registry surfaces and keeps any new status-like rendering inside existing shared seams.
  • Alternatives considered:
    • Local match statements inside TenantResource: rejected because they would create another page-local status language.
    • A new presentation framework or registry-specific presenter: rejected because there are only two concrete surfaces and the constitution prefers small extensions of existing seams over new frameworks.
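
The shared-semantics idea in Decision 5 can be illustrated with one lookup that both surfaces call. This is a sketch under assumptions: the class name, labels, tones, and fallback are all hypothetical and do not reflect the actual contents of RecoveryReadiness; only the state names come from this document.

```php
<?php
declare(strict_types=1);

// One shared mapping from posture state to label and tone, so the
// dashboard and the registry cannot drift apart. All labels and
// tones here are illustrative placeholders.
final class PosturePresentation
{
    private const FALLBACK = ['label' => 'Unknown', 'tone' => 'gray'];

    private const STATES = [
        'backup_posture' => [
            'absent' => ['label' => 'No backup', 'tone' => 'danger'],
            'stale' => ['label' => 'Stale backup', 'tone' => 'warning'],
            'degraded' => ['label' => 'Degraded backup', 'tone' => 'warning'],
        ],
        'recovery_evidence' => [
            'weakened' => ['label' => 'Weakened evidence', 'tone' => 'warning'],
            'unvalidated' => ['label' => 'Unvalidated', 'tone' => 'gray'],
        ],
    ];

    // Return the label and tone for a state, or the fallback when the
    // family or state is not known.
    public static function describe(string $family, string $state): array
    {
        return self::STATES[$family][$state] ?? self::FALLBACK;
    }
}
```

A registry column and a dashboard card would both call PosturePresentation::describe(), which is the opposite of the rejected page-local match statements.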

Decision 6: Use existing feature and guard tests as the primary acceptance harness

  • Decision: Extend current workspace overview and tenant resource tests and add one focused registry triage test file instead of introducing a browser-first harness.
  • Rationale: The repo already has precise seams for workspace summary metrics, drilldown continuity, tenant registry scope, and Filament table standards. Those seams are sufficient to prove truthfulness, scope safety, and ordering.
  • Alternatives considered:
    • New browser tests: rejected because the relevant behaviors are already observable at the builder and Livewire list layers.
    • Manual-only QA: rejected because the spec explicitly requires regression-safe automated coverage.