TenantAtlas/specs/334-nested-filament-context-contract-hardening/spec.md
ahmido f967db7983 Spec 334: harden nested Filament Livewire context contract (#395)
## Summary
- harden nested Filament and Livewire tenant-context handling across the backup schedule operation runs relation manager, managed-environment triage arrival continuity, the backup set policy picker table, and the Operate Hub shell
- add architecture, feature, and browser coverage for nested Filament tenant-context continuity and restore-run resource behavior
- add the Spec 334 artifacts (`spec.md`, `plan.md`, `tasks.md`, and the requirements checklist)

## Testing
- Not run as part of this commit/push/PR workflow

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #395
2026-05-24 21:33:19 +00:00

# Feature Specification: Spec 334 - Nested Filament / Livewire Context Contract Hardening
- Feature Branch: `334-nested-filament-context-contract-hardening`
- Created: 2026-05-24
- Status: Draft
- Type: Platform integrity / Livewire lifecycle hardening / Filament context contract / productization blocker removal
- Runtime posture: Narrow platform hardening. No tenancy rewrite.
- Input: User-provided Spec 334 draft + repo inspection for path truth.
## Dependencies And Historical Context
This spec exists because two independent operator flows exposed the same underlying defect class:
1) **Backup Set → “Add Policies” picker modal**
   - `Filament::getTenant()` can be `null` in nested/modal contexts.
   - Authorization for the bulk action / row selection then returns a false negative.
   - Filament hides the row selection checkbox column.
   - The operator sees policies as non-selectable.
2) **Restore Run → Create wizard (Livewire update lifecycle)**
   - A Livewire update request is a `POST /livewire-.../update` request with no route parameters.
   - Option closures can be evaluated during that lifecycle.
   - Context resolution that hard-requires the ambient Filament tenant can throw, blocking the wizard.
Repo truth note: the user draft references a “Spec 332 Restore Flow Productization WIP” as the direct dependency. On `platform-dev` there is currently no `specs/332-*` directory. This spec still stands as a narrow platform hardening slice, and should be linked to the active restore productization spec/branch once it exists in-repo.
Related existing work (context only; do not modify closed specs here):
- `specs/152-livewire-context-locking` (draft): broader “trusted state” hardening. This spec is narrower and explicitly tenant-context-centric.
- `specs/302-tenant-owned-surface-route-audit`: established the current `ResolvesPanelTenantContext` seam and route-owned environment posture.
- `docs/research/filament-v5-notes.md`: source of truth when Filament/Livewire behavior is uncertain.
## Spec Candidate Check *(mandatory - SPEC-GATE-001)*
- **Problem**: Nested Filament / Livewire surfaces sometimes execute without reliable ambient tenant context, causing false-negative authorization, hidden selection controls, or hard runtime failures in operator-critical flows.
- **Today's failure**: Authorized operators cannot select policies in the Add Policies picker (checkbox column missing) and restore create flows can crash during Livewire updates with “no tenant context selected”.
- **User-visible improvement**: Operator-critical nested surfaces remain stable (no missing checkboxes, no wizard crash) and fail closed with honest UI states when context is genuinely unavailable.
- **Smallest enterprise-capable version**: Harden only confirmed high-risk nested surfaces and the shared context seam; add guardrails/tests so unsafe patterns do not reappear. No route architecture rewrite, no tenancy model redesign.
- **Explicit non-goals**: No new tenancy concept, no workspace model redesign, no panel rewrite, no broad repo-wide replacement of `Filament::getTenant()`, no RBAC rebuild, no speculative general framework beyond what at least two confirmed surfaces require.
- **Permanent complexity imported**: Targeted context resolver adjustments, a small optional helper (only if reused by ≥2 confirmed surfaces), focused Pest tests (Feature/Livewire + one architecture guard), and browser smoke steps for the two originally user-visible bugs.
- **Why now**: This is a platform-level productization blocker for restore workflows and breaks a visible day-to-day operator action (“Add policies”).
- **Why not local**: The same defect class occurs across independent surfaces (modal table, wizard update, relation manager, widget). Without an explicit contract + guardrails, it will recur.
- **Approval class**: Core Enterprise.
- **Red flags triggered**: (2) new meta-infrastructure risk (context helper + guard tests), (3) multiple surfaces. **Defense**: scope is explicitly limited to confirmed surfaces; helper is optional and must satisfy ABSTR-001 (≥2 real consumers); no new persisted truth, routes, or global “magic tenant switch”.
- **Score**: Benefit: 2 | Urgency: 2 | Scope: 2 | Complexity: 1 | Product proximity: 2 | Reuse: 2 | **Total: 11/12**
- **Decision**: approve.
## Spec Scope Fields *(mandatory)*
- **Scope**: tenant (managed-environment scoped operator surfaces under `/admin/workspaces/{workspace}/environments/{environment}/...`) + Livewire update lifecycle hardening.
- **Primary Routes / Surfaces** (representative, non-exhaustive):
  - Backup Set detail → “Add policies” modal table (Livewire component).
  - Restore Runs → Create wizard (Filament resource create page).
  - Backup Schedule detail → Operation Runs relation manager table.
  - Managed Environment dashboard → triage widget.
- **Data Ownership**:
  - Tenant/environment-owned records: `ManagedEnvironment`, `BackupSet`, `BackupSchedule`, `RestoreRun`, related policy inventory models.
  - Workspace ownership remains authoritative; context recovery must validate workspace membership/entitlement.
  - No new persisted entity/table is introduced by this spec.
- **RBAC**:
  - Workspace membership + environment entitlement required to view tenant-scoped surfaces.
  - Mutations (attach policies, restore run create/confirm, etc.) require explicit capability checks and policy/gate enforcement; UI visibility is not authorization.
For canonical-view specs: N/A. This spec hardens environment/tenant-scoped nested surfaces, not a workspace-owned canonical viewer.
## UI Surface Impact *(mandatory - UI-COV-001)*
Does this spec add, remove, rename, or materially change any reachable UI surface?
- [ ] No UI surface impact
- [x] Existing page changed
- [ ] New page/route added
- [ ] Navigation changed
- [ ] Filament panel/provider surface changed
- [x] New modal/drawer/wizard/action added
- [x] New table/form/state added
- [ ] Customer-facing surface changed
- [x] Dangerous action changed
- [ ] Status/evidence/review presentation changed
- [x] Workspace/environment context presentation changed
## UI/Productization Coverage
- **Route/page/surface**: Add Policies modal table, Restore Run Create wizard, Backup Schedule Operation Runs relation manager, Managed Environment triage widget.
- **Current or new page archetype**: Domain Pattern Surface (nested CRUD tooling and wizards), not a new strategic page archetype.
- **Design depth**: Internal/Hidden (but operator-reachable and workflow-critical).
- **Repo-truth level**: repo-verified for file paths and current helper seams; runtime behavior must be validated by tests + browser smoke.
- **Existing pattern reused**: current `ResolvesPanelTenantContext` seam, route-owned environment posture, `UiEnforcement`, policies/gates, and owner-record scoping patterns.
- **New pattern required**: explicit nested context resolution order for high-risk surfaces; optional small helper only if used by ≥2 surfaces.
- **Screenshot required**: no new design screenshots required; browser smoke is required for the two user-visible bugs.
- **Page audit required**: no UI audit registry update expected unless implementation changes navigation or surface archetype.
- **Customer-safe review required**: no. These are operator surfaces; still must fail closed and never broaden scope.
- **Dangerous-action review required**: yes. Restore-related and attach-related actions are high-impact and must keep explicit confirmation + authorization + audit posture.
- **Coverage files updated or explicitly not needed**:
  - [ ] `docs/ui-ux-enterprise-audit/route-inventory.md`
  - [ ] `docs/ui-ux-enterprise-audit/design-coverage-matrix.md`
  - [ ] `docs/ui-ux-enterprise-audit/page-reports/...`
  - [ ] `docs/ui-ux-enterprise-audit/strategic-surfaces.md`
  - [ ] `docs/ui-ux-enterprise-audit/grouped-follow-up-candidates.md`
  - [ ] `docs/ui-ux-enterprise-audit/unresolved-pages.md`
  - [x] `N/A - no new routes/pages; coverage handled via tests + browser smoke + PR close-out note`
## Cross-Cutting / Shared Pattern Reuse
- **Cross-cutting feature?**: yes.
- **Interaction class(es)**: authorization/visibility, table row selection, bulk actions, wizard step transitions, select option closures, widget visibility/mutation.
- **Systems touched**: Filament tenant resolution seam, route-owned environment posture, remembered environment context (validated only), and nested Livewire lifecycle behavior.
- **Existing pattern(s) to extend**: owner-record scoping, `UiEnforcement`, deny-as-not-found (404) vs forbidden (403) semantics, and existing environment context middleware/helpers.
- **Shared contract / presenter / builder / renderer to reuse**: `apps/platform/app/Filament/Concerns/ResolvesPanelTenantContext.php`, `apps/platform/app/Support/OperateHub/OperateHubShell.php`, `apps/platform/app/Support/Workspaces/WorkspaceContext.php`, `apps/platform/app/Support/Rbac/UiEnforcement.php`, and existing policies.
- **Why the existing shared path is sufficient or insufficient**: sufficient for route-owned page loads; insufficient for nested Livewire update requests and nested surfaces that currently assume ambient tenant is always present.
- **Allowed deviation and why**: bounded “nested context contract” logic may be added only where confirmed surfaces prove it is needed; no global ambient-tenant-as-truth fallback is allowed.
- **Review focus**: fail-closed semantics (no broadening), no unsafe `Filament::setTenant($model->...)` without validated ownership/capability, and stable UX for authorized operators when context is recoverable.
## Summary
Harden TenantPilot's nested Filament and Livewire context handling so critical nested surfaces do not depend on unreliable ambient `Filament::getTenant()` state during modal, wizard, relation manager, widget, and Livewire update lifecycles.
This spec does **not** rewrite tenancy, routing, RBAC, or the workspace model. It introduces targeted hardening for confirmed high-risk surfaces and installs guardrails so the same issue does not keep reappearing.
## Problem Statement
Nested Filament/Livewire surfaces can execute outside the full page routing lifecycle:
- modal tables
- relation managers
- wizard step transitions
- select option closures
- Livewire update requests (`POST /livewire-.../update`)
- widgets
- bulk actions
- row selection
- table refresh/search/filter/pagination
In those contexts:
- route parameters may be missing
- ambient Filament tenant may be `null`
- authorization can return a false negative
- UI controls may disappear
- option closures may hard-fail
Root problem:
> Ambient framework context is being used where explicit domain context is required.
Desired nested context contract:
1. Owner/domain context first.
2. Validated route/workspace/environment context second.
3. Remembered/session environment only if validated.
4. Ambient Filament tenant only as fallback convenience.
5. Fail closed with a clear product state.
6. Never switch tenant from a model before ownership/capability is proven.
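
The resolution order above can be sketched in plain PHP. This is a minimal illustration, not existing repo code: `EnvironmentRef`, `NestedContextResolver`, the source keys, and the `$isEntitled` callback are hypothetical names standing in for the real models and the workspace membership / environment entitlement check. The sketch also assumes one possible reading of the contract, namely that the first source to yield a candidate is authoritative, so a candidate that fails validation ends resolution instead of falling through to a weaker source.

```php
<?php
// Hypothetical value object; the real application would use Eloquent models
// such as ManagedEnvironment.
final class EnvironmentRef
{
    public function __construct(
        public readonly int $id,
        public readonly int $workspaceId,
    ) {}
}

// Hypothetical resolver. The name, constructor shape, and source keys are
// assumptions made for this sketch.
final class NestedContextResolver
{
    /**
     * @param array<string, Closure(): ?EnvironmentRef> $sources candidates in contract order
     * @param Closure(EnvironmentRef): bool $isEntitled stand-in for the membership/entitlement check
     */
    public function __construct(
        private array $sources,
        private Closure $isEntitled,
    ) {}

    public function resolve(): ?EnvironmentRef
    {
        foreach ($this->sources as $source) {
            $candidate = $source();
            if ($candidate === null) {
                continue; // this source has nothing to offer; try the next one
            }
            // The first source that yields a candidate decides the outcome.
            // If it fails validation we stop instead of falling through to a
            // weaker source, so a wrong ambient tenant can never replace it.
            return ($this->isEntitled)($candidate) ? $candidate : null;
        }

        return null; // nothing recoverable: fail closed
    }
}

// Stand-in entitlement check: the actor may use environments 7 and 9.
$entitled = fn (EnvironmentRef $env): bool => in_array($env->id, [7, 9], true);

// Livewire update: no owner record and no route parameters are available.
$recovered = (new NestedContextResolver([
    'owner' => fn () => null,
    'route' => fn () => null,
    'remembered' => fn () => new EnvironmentRef(7, 1),
    'ambient' => fn () => new EnvironmentRef(9, 1),
], $entitled))->resolve(); // $recovered->id is 7

// Owner record points at an environment the actor is not entitled to.
$denied = (new NestedContextResolver([
    'owner' => fn () => new EnvironmentRef(5, 2),
    'ambient' => fn () => new EnvironmentRef(9, 1),
], $entitled))->resolve(); // $denied is null
```

In the second call the ambient candidate would pass validation, but it is never consulted because the owner record already named a different environment. That keeps item 6 intact: the sketch returns a validated reference or nothing, and leaves any tenant switch to the caller.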
## Goals
### G1 — Define and enforce a nested context contract
Create a consistent rule for nested Filament/Livewire surfaces:
> Domain context is authoritative. Filament tenant is runtime convenience.
The contract must cover: table queries, option closures, visibility/disabled checks, row selection, bulk actions, mutation handlers, relation managers, widgets, and wizard transitions.
### G2 — Fix the confirmed Add Policies / BackupSet picker class
`apps/platform/app/Livewire/BackupSetPolicyPickerTable.php` must be self-protecting:
- does not trust a passed BackupSet ID before validating workspace/environment ownership and access
- authorized operators can select policies even if ambient tenant is initially null
- forged cross-workspace/cross-environment IDs fail closed
- no unguarded `Filament::setTenant(...)`
### G3 — Fix the confirmed Restore Run Create Wizard Livewire context loss
`apps/platform/app/Filament/Resources/RestoreRunResource.php` and shared context resolution must not hard-fail during Livewire update requests when context is recoverable from a validated source.
### G4 — Harden owner-derived relation/widget surfaces
At minimum:
- `apps/platform/app/Filament/Resources/BackupScheduleResource/RelationManagers/BackupScheduleOperationRunsRelationManager.php`
- `apps/platform/app/Filament/Widgets/ManagedEnvironment/ManagedEnvironmentTriageArrivalContinuity.php`
must resolve context from owner record / widget record before using ambient tenant.
### G5 — Install guardrails
Add guardrails/tests that prevent reintroducing unsafe patterns such as model-derived `Filament::setTenant(...)` in nested surfaces without prior ownership/capability validation.
### G6 — Unblock restore productization validation
After this spec is implemented, restore flow productization work should be able to run restore create wizard browser validation without a tenantless Livewire update crash.
## Non-Goals
- no workspace model redesign
- no managed environment model redesign
- no Filament panel rewrite
- no route architecture rewrite
- no RBAC rebuild
- no broad resource authorization audit
- no restore UX redesign beyond what is needed to make the wizard stable
- no broad repo-wide replacement of `Filament::getTenant()` usage
## Core Rules
### R1 — Domain context before Filament tenant
Nested surfaces must use owner/domain records first.
### R2 — No unguarded model-derived tenant switch
Switching tenant based on a loaded model is forbidden unless ownership/capability is validated and mismatch handling fails closed.
### R3 — UI visibility is not authorization
Hidden checkboxes/buttons are not security boundaries. Mutations must be authorization-checked at execution time.
### R4 — Fail closed, but not mysteriously
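If context is invalid: no data leak, no mutation, no unscoped query, and no silent cross-tenant switch. The surface must still present an honest product state (for example, a disabled or empty control) rather than failing silently.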
### R5 — Livewire update requests need recoverable context
During Livewire update (`POST /livewire-.../update`), route params may be absent. Recovery must come from validated component state / owner record / validated remembered context. Referer may be a candidate only, never authority.
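
The "candidate only, never authority" rule for Referer could look like the sketch below. It is plain PHP under stated assumptions: both function names are hypothetical, the path pattern is taken from the route shape named in Spec Scope Fields, and the `$memberships` array stands in for the DB-backed workspace membership and environment entitlement lookup.

```php
<?php
/**
 * Hypothetical helper: extracts workspace/environment route keys from a
 * Referer URL. The result is an unvalidated candidate and nothing more.
 *
 * @return array{workspace: string, environment: string}|null
 */
function environmentCandidateFromReferer(?string $referer): ?array
{
    if ($referer === null || $referer === '') {
        return null;
    }

    // parse_url() yields null or false when no usable path is present.
    $path = parse_url($referer, PHP_URL_PATH);
    if (!is_string($path)) {
        return null;
    }

    $pattern = '#^/admin/workspaces/([^/]+)/environments/([^/]+)(?:/|$)#';
    if (preg_match($pattern, $path, $matches) !== 1) {
        return null;
    }

    return ['workspace' => $matches[1], 'environment' => $matches[2]];
}

/**
 * Hypothetical validation step. $memberships maps a workspace key to the
 * environment keys the actor is entitled to; in the application this would
 * be a database check, not an in-memory array.
 *
 * @param array<string, string[]> $memberships
 * @return array{workspace: string, environment: string}|null
 */
function validatedEnvironmentFromReferer(?string $referer, array $memberships): ?array
{
    $candidate = environmentCandidateFromReferer($referer);
    if ($candidate === null) {
        return null;
    }

    $entitled = $memberships[$candidate['workspace']] ?? [];

    // Only a candidate confirmed by membership data is returned.
    return in_array($candidate['environment'], $entitled, true) ? $candidate : null;
}

$memberships = ['acme' => ['prod']];

$ok = validatedEnvironmentFromReferer(
    'https://example.test/admin/workspaces/acme/environments/prod/backup-sets/3',
    $memberships
); // ['workspace' => 'acme', 'environment' => 'prod']

$forged = validatedEnvironmentFromReferer(
    'https://example.test/admin/workspaces/other/environments/prod',
    $memberships
); // null
```

A forged or stale Referer therefore degrades to "no context", which the caller must treat as fail closed.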
### R6 — Mutation-time recheck
Every mutation handler must re-resolve and re-check context and authorization.
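
A mutation-time recheck for the attach flow might be shaped like this. It is a sketch in plain PHP, not the picker's actual handler: `AttachDenied`, `attachPolicies`, the array shapes, and the capability string are all hypothetical, and the real implementation would take its capability constant from the existing capability registry. Rejecting the whole request when any policy ID belongs to another environment is one fail-closed choice assumed here.

```php
<?php
final class AttachDenied extends RuntimeException {}

// Placeholder only; a real handler must use the existing capability registry.
const HYPOTHETICAL_ATTACH_CAPABILITY = 'hypothetical.attach-policies';

/**
 * Re-resolves and re-checks everything at execution time instead of trusting
 * what the UI showed earlier.
 *
 * @param array{id: int, environmentId: int} $backupSet
 * @param array{environments: int[], capabilities: string[]} $actor
 * @param int[] $policyIds requested by the client, therefore untrusted
 * @param array<int, int> $policyEnvironment policy ID => owning environment ID
 * @return int[] policy IDs that are safe to attach
 */
function attachPolicies(array $backupSet, array $actor, array $policyIds, array $policyEnvironment): array
{
    // 1. The actor must be entitled to the backup set's own environment.
    if (!in_array($backupSet['environmentId'], $actor['environments'], true)) {
        throw new AttachDenied('environment not accessible');
    }

    // 2. The capability is checked again, regardless of button visibility.
    if (!in_array(HYPOTHETICAL_ATTACH_CAPABILITY, $actor['capabilities'], true)) {
        throw new AttachDenied('missing capability');
    }

    // 3. Every requested policy must belong to the same environment.
    foreach ($policyIds as $policyId) {
        if (($policyEnvironment[$policyId] ?? null) !== $backupSet['environmentId']) {
            throw new AttachDenied('policy outside backup set environment');
        }
    }

    return array_values(array_unique($policyIds));
}

$backupSet = ['id' => 3, 'environmentId' => 7];
$actor = ['environments' => [7], 'capabilities' => [HYPOTHETICAL_ATTACH_CAPABILITY]];
$policyEnvironment = [10 => 7, 11 => 7, 99 => 8];

$attached = attachPolicies($backupSet, $actor, [10, 11, 10], $policyEnvironment); // [10, 11]
```

Passing `[10, 99]` instead throws `AttachDenied`, because policy 99 belongs to environment 8.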
## Confirmed Scope
### S1 — BackupSetPolicyPickerTable
- File: `apps/platform/app/Livewire/BackupSetPolicyPickerTable.php`
- Issue class: tenantless modal context → false-negative authorization → checkbox column hidden.
- Required outcome: authorized users can select and attach in-scope policies; cross-scope mounts fail closed; no unsafe tenant switching.
### S2 — RestoreRunResource Livewire update context loss
- Files:
  - `apps/platform/app/Filament/Resources/RestoreRunResource.php`
  - `apps/platform/app/Filament/Concerns/ResolvesPanelTenantContext.php`
- Issue class: Livewire update has no route params → tenant resolution throws.
- Required outcome: wizard does not crash; option closures do not depend exclusively on ambient tenant; context recovery is validated.
### S3 — BackupScheduleOperationRunsRelationManager
- File: `apps/platform/app/Filament/Resources/BackupScheduleResource/RelationManagers/BackupScheduleOperationRunsRelationManager.php`
- Required outcome: owner schedule context drives query and URLs; missing/wrong ambient tenant does not broaden scope.
### S4 — ManagedEnvironmentTriageArrivalContinuity
- File: `apps/platform/app/Filament/Widgets/ManagedEnvironment/ManagedEnvironmentTriageArrivalContinuity.php`
- Required outcome: record context first, ambient tenant fallback only; missing context fails closed; visibility and mutation use the same resolver.
### S5 — Guardrail tests
New or existing guard tests must live alongside existing architecture tests at:
- `apps/platform/tests/Architecture/FilamentTenantContextContractTest.php` (new)
## Acceptance Criteria
### AC1 — Restore Run Create Wizard no longer crashes on Livewire update
- Livewire updates do not throw “no tenant context selected” when context is recoverable.
- Backup set options remain environment-scoped.
- Cross-workspace or cross-environment options are never visible.
- If context is not recoverable, the UI fails closed (disabled/empty) without leaking scope.
### AC2 — BackupSet Policy Picker is self-protecting
- Authorized tenantless modal mount shows selectable in-scope policies.
- Checkbox column remains visible for authorized users.
- Cross-scope or forged mounts fail closed.
- Cross-environment policy IDs cannot be attached.
### AC3 — BackupSchedule Operation Runs relation is owner-scoped
- Relation uses owner BackupSchedule context.
- Missing ambient tenant does not crash.
- Wrong ambient tenant does not broaden query or URLs.
### AC4 — Triage widget uses explicit context
- Record context is preferred; ambient tenant is fallback only.
- Missing context fails closed.
- Authorized operators do not lose actions due to ambient context loss.
### AC5 — Guardrails exist
- Unsafe model-derived `Filament::setTenant(...)` in nested surfaces is detected.
- Allowlist is explicit and limited to infrastructure.
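
The core of such a guard could be a source scan like the one below. This is a deliberately coarse sketch in plain PHP: the function name and the allowlisted middleware path are hypothetical, and it flags every `Filament::setTenant(` call outside the allowlist rather than proving that validation is missing. An actual architecture test would read the files under the application directory instead of using in-memory strings.

```php
<?php
/**
 * Hypothetical scanner: returns the paths whose source calls
 * Filament::setTenant( and that are not explicitly allowlisted.
 *
 * @param array<string, string> $sources file path => PHP source code
 * @param string[] $allowlist infrastructure files permitted to switch tenant
 * @return string[]
 */
function findUnguardedSetTenantCalls(array $sources, array $allowlist): array
{
    $violations = [];

    foreach ($sources as $path => $code) {
        if (in_array($path, $allowlist, true)) {
            continue; // explicit, reviewed exception
        }

        // Matches both the imported facade and the fully qualified name.
        if (preg_match('/\bFilament::setTenant\s*\(/', $code) === 1) {
            $violations[] = $path;
        }
    }

    return $violations;
}

// In-memory stand-ins for real files; the middleware path is invented.
$sources = [
    'apps/platform/app/Livewire/BackupSetPolicyPickerTable.php'
        => '<?php Filament::setTenant($backupSet->environment);',
    'apps/platform/app/Http/Middleware/HypotheticalSetTenant.php'
        => '<?php \Filament\Facades\Filament::setTenant($validated);',
    'apps/platform/app/Filament/Resources/RestoreRunResource.php'
        => '<?php $tenant = Filament::getTenant();',
];

$violations = findUnguardedSetTenantCalls(
    $sources,
    ['apps/platform/app/Http/Middleware/HypotheticalSetTenant.php']
);
// ['apps/platform/app/Livewire/BackupSetPolicyPickerTable.php']
```

Reads via `Filament::getTenant()` are intentionally not flagged, in line with the mitigation of classifying `getTenant` usage before failing hard on it.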
### AC6 — No broad unrelated changes
- No panel/routing rewrite, no workspace/managed-environment redesign, no new persisted truth, and no unrelated UX/productization expansion.
## Proportionality Review
- **New source of truth?**: no.
- **New persisted entity/table/artifact?**: no.
- **New abstraction?**: optional small helper only if it is reused by ≥2 confirmed surfaces; otherwise local resolvers only.
- **New enum/status family?**: no.
- **Current operator problem**: missing row selection and crashing restore wizard due to context loss.
- **Narrowest correct implementation**: validate and re-resolve context from owner/domain and canonical workspace/environment truth; fall back only when safe; add guard tests.
- **Ownership cost**: limited helper/tests + browser smoke for two user-visible issues.
- **Alternative rejected**: tenancy rewrite, route architecture changes, broad repo sweep of `Filament::getTenant()`, or “always set tenant from model” fallback.
## Testing / Lane / Runtime Impact *(mandatory for runtime behavior changes)*
- **Test purpose / classification**: Feature + Livewire + Architecture (guard); browser smoke for the two user-visible bugs.
- **Validation lane(s)**: confidence + browser (only for the named smoke flows).
- **Why this lane mix is sufficient**: Feature/Livewire tests prove scoped behavior and prevent regressions; browser smoke verifies the original “checkbox missing” and “wizard crash” end-to-end.
- **No Graph calls during UI render**: must remain true; context recovery must be DB-only and authorization-safe.
## Risks & Mitigations
- **Risk**: Resolver becomes too broad or “magic”.
  - **Mitigation**: Fix only confirmed seams; keep helper optional and bounded; add explicit tests and allowlists.
- **Risk**: Referer parsing introduces security risk.
  - **Mitigation**: Referer can only be a candidate; any extracted identifiers must be validated against DB membership and scope.
- **Risk**: Guardrail is too noisy.
  - **Mitigation**: Start by guarding unsafe `setTenant` patterns; classify `getTenant` usage before hard failing.
## Open Questions
- Which exact capability constants gate the following? (Implementation must use the existing capability registry; no new strings.)
  - attaching policies to a backup set
  - creating a restore run
  - viewing schedule operation runs
  - acting in triage widget actions
- Does the repo already have a safe “current environment” helper for nested Livewire updates beyond `ResolvesPanelTenantContext`? If yes, reuse it.
## Follow-Up Spec Candidates *(out of scope for Spec 334)*
- Broader “environment-resource-context-follow-through” rollout beyond the 4 confirmed surfaces.
- Harmonize with Spec 152 “trusted state” hardening if both proceed; avoid duplicated helper layers.