TenantAtlas/specs/055-ops-ux-rollout/spec.md
# Feature Specification: Ops-UX Constitution Rollout (v1.3.0 Alignment)
**Feature Branch**: `055-ops-ux-rollout`
**Created**: 2026-01-18
**Status**: Draft
**Input**: Repo-wide migration to align all existing operation feedback with the Operations UX Constitution (v1.3.0).
## Clarifications
### Session 2026-01-18
- Q: For the Progress Widget, what should be the visibility scope? → A: All active runs for the current tenant (visible to users who can access Monitoring → Operations).
- Q: For R7 metrics/summary contract, which field is the canonical source across the app? → A: Treat `operation_runs.summary_counts` as the canonical “metrics” source for this rollout.
- Q: If an existing run record contains an unknown `operation_type`, what should the UI do at runtime? → A: Soft fail: show `Unknown operation` (tests/CI still fail fast for code-produced operation types).
- Q: Who should receive the terminal DB notification for a run? → A: Only the initiator.
- Q: For this rollout, do we ban queued DB notifications entirely in favor of queued toast + terminal DB notification? → A: Yes, ban queued DB notifications.
## User Scenarios & Testing *(mandatory)*
<!--
IMPORTANT: User stories should be PRIORITIZED as user journeys ordered by importance.
Each user story/journey must be INDEPENDENTLY TESTABLE - meaning if you implement just ONE of them,
you should still have a viable MVP (Minimum Viable Product) that delivers value.
Assign priorities (P1, P2, P3, etc.) to each story, where P1 is the most critical.
Think of each story as a standalone slice of functionality that can be:
- Developed independently
- Tested independently
- Deployed independently
- Demonstrated to users independently
-->
### User Story 1 - Consistent “I started it” feedback (Priority: P1)
As a tenant admin who triggers a long-running operation, I want immediate confirmation that my action was accepted and a single, consistent way to follow progress, so I don’t retry actions or lose track.
**Why this priority**: Prevents duplicate operations and reduces confusion/support load.
**Independent Test**: Starting any operation produces a queued-only intent feedback with a canonical “View run” destination.
**Acceptance Scenarios**:
1. **Given** an operation is started and the run is created or reused in `queued`, **When** feedback is shown, **Then** it is a queued-only toast with title `{OperationLabel} queued` and body `Running in the background.`
2. **Given** an operation completes quickly (<2 seconds), **When** feedback is shown, **Then** the queued toast may be suppressed but the terminal DB notification still appears.
---
### User Story 2 - Live awareness of active operations (Priority: P2)
As a tenant admin, I want a single progress widget that shows only active operations (queued/running) with strict, predictable wording, so I can understand what’s happening without noise.
**Why this priority**: Creates a unified “what’s running?” view and eliminates feature-specific progress UIs.
**Independent Test**: The progress widget lists only active runs, with strict Queued/Running text and canonical “View run” links.
**Acceptance Scenarios**:
1. **Given** there are active operations in `queued` or `running`, **When** the widget is visible, **Then** it shows at most 5 runs and each row includes a canonical “View run” link.
2. **Given** an operation is terminal (`succeeded|partial|failed`), **When** the widget queries its data, **Then** that run is never included.
---
### User Story 3 - Audit + outcome without spam (Priority: P3)
As a tenant admin, I want exactly one persistent notification when an operation finishes (success/partial/failure), with a consistent title/body and safe summary, so I can audit outcomes and troubleshoot.
**Why this priority**: Delivers reliable outcomes and reduces notification noise.
**Independent Test**: Terminal runs always create exactly one DB notification with canonical copy and safe summary rules.
**Acceptance Scenarios**:
1. **Given** an operation run transitions into a terminal outcome, **When** notifications are emitted, **Then** exactly one terminal DB notification exists for that run.
2. **Given** valid numeric metrics exist for an operation, **When** the notification body includes a summary, **Then** the summary renders only whitelisted numeric keys and never renders free-text.
---
### User Story 4 - Regression-safe by default (Priority: P4)
As a maintainer, I want automated guards that fail fast when the app deviates from the constitution (labels, links, surfaces, summary rules), so drift is prevented across future features.
**Why this priority**: This is a migration; without guards, the codebase will regress quickly.
**Independent Test**: A test suite enforces invariants (catalog coverage, canonical “View run”, terminal notification idempotency, widget filtering, summary whitelist).
**Acceptance Scenarios**:
1. **Given** a new/unknown `operation_type` is introduced, **When** tests run, **Then** the build fails until the OperationCatalog is updated.
2. **Given** any “View run” link is generated, **When** it is resolved, **Then** it matches the canonical Monitoring → Operations → Run Detail destination.
---
### Edge Cases
- Unknown `operation_type` appears in an existing run record.
- Multiple operations start simultaneously (more than 5 active runs).
- Runs with missing/invalid metrics (nested objects, strings, non-whitelisted keys, negative values).
- Runs that transition `queued → terminal` quickly (<2 seconds).
- UI is backgrounded/hidden while operations are active.
- A modal dialog is open while an operation is active.
## Requirements *(mandatory)*
**Constitution alignment (required):** If this feature introduces any Microsoft Graph calls, any write/change behavior,
or any long-running/queued/scheduled work, the spec MUST describe contract registry updates, safety gates
(preview/confirmation/audit), tenant isolation, run observability (`OperationRun` type/identity/visibility), and tests.
If security-relevant DB-only actions intentionally skip `OperationRun`, the spec MUST describe `AuditLog` entries.
**Operations UX alignment (required when applicable):** If this feature creates/reuses `OperationRun` records or affects
operations feedback (toasts, progress widget, DB notifications, Monitoring → Operations, run detail), the spec MUST
explicitly confirm:
- Three surfaces only (toast + progress widget + DB notification); no feature-specific patterns
- DB is source of truth: UI renders from `operation_runs` + structured fields (`metrics`, `reason_code`, `message`)
- Labels come from a central OperationCatalog (no embedded labels/strings in feature code)
- “View run” links always target the canonical route (Monitoring → Operations → Run Detail)
- Dedupe/noise control (max 1 queued toast; exactly 1 terminal DB notification; no running notifications)
- Calm polling constraints (no polling while modals are open; pause when tab hidden; stop on terminal)
- Test invariants for notifications, summary whitelist, and canonical navigation
### Assumptions
- The application already records tenant-scoped operations as OperationRuns.
- Monitoring → Operations → Run Detail is the canonical destination for viewing a run.
- Operation feedback is intended for tenant admins with access to Monitoring/Operations.
- The progress widget is tenant-wide (within the current tenant) and respects the same access constraints as Monitoring/Operations.
- The constitution’s “metrics” terminology maps to `operation_runs.summary_counts` for this rollout (no schema rename required).
- Persistent notifications are user-scoped to the run initiator; tenant-wide audit remains the Monitoring → Operations hub.
- Queued feedback is provided via toast only; persistent DB notifications are terminal-only.
- The constitution (v1.3.0) is the authoritative definition for copy/behavior; feature-specific variants are not allowed.
### Dependencies
- A single shared OperationCatalog exists and can be treated as the source of truth for operation labels.
- A canonical “View run” helper can be used by all operation feedback surfaces.
- Existing operation producers can be migrated without changing the operation status model.
### Functional Requirements
**FR-001 (Three surfaces only)**: The system MUST express operation feedback via exactly three surfaces: toast (intent), progress widget (active awareness), and persistent notification (audit + terminal outcome).
**FR-002 (OperationCatalog label source of truth)**: The system MUST provide a central OperationCatalog mapping `operation_type → label`, and all operation labels shown in UI MUST be resolved from it.
**FR-003 (Fail-fast catalog coverage)**: The system MUST fail fast (via automated checks) if any code-produced `operation_type` used in the application is not present in the OperationCatalog.
**FR-003b (Runtime behavior for unknown types)**: If an existing run record contains an unknown `operation_type`, the UI MUST render the label `Unknown operation` (and MUST NOT render the raw type string).
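
As a rough illustration of FR-002, FR-003, and FR-003b together, the sketch below resolves labels from a single catalog, falls back to `Unknown operation` at runtime, and exposes a helper that a CI guard could assert on. The catalog entries and the function names (`operation_label`, `missing_from_catalog`) are hypothetical; this spec does not list the real operation types, and the sketch is in Python purely for illustration.

```python
# Hypothetical catalog entries; the real operation types are not listed in this spec.
OPERATION_CATALOG = {
    "inventory.sync": "Inventory sync",
    "backup.create": "Backup",
}

UNKNOWN_LABEL = "Unknown operation"


def operation_label(operation_type: str) -> str:
    """Resolve a label from the catalog; never echo the raw type string (FR-003b)."""
    return OPERATION_CATALOG.get(operation_type, UNKNOWN_LABEL)


def missing_from_catalog(produced_types: set) -> set:
    """Return code-produced types absent from the catalog.

    A test/CI guard would fail when this set is non-empty (FR-003).
    """
    return set(produced_types) - set(OPERATION_CATALOG)
```

The split mirrors the clarification above: runtime rendering soft-fails, while the coverage check is the part that fails fast in tests.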
**FR-004 (Canonical “View run” everywhere)**: The system MUST generate “View run” links exclusively via one canonical helper, and that destination MUST always be Monitoring → Operations → Run Detail.
**FR-005 (Centralized presentation)**: The system MUST centralize the user-facing copy for operation toasts, widget status text, and persistent notifications. Feature code MUST NOT define operation feedback strings.
**FR-006 (Toast queued-only)**: The system MUST show a toast only when a run is created or reused in `queued`, MUST NOT show toasts for `running` or any terminal outcome, MUST auto-dismiss within 3–5 seconds, and MUST use:
- Title: `{OperationLabel} queued`
- Body: `Running in the background.`
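
A minimal sketch of the queued-only rule in FR-006, assuming the label has already been resolved from the OperationCatalog. The function name `queued_toast` and the dict shape are assumptions for illustration, not part of the constitution.

```python
from typing import Optional


def queued_toast(operation_label: str, status: str) -> Optional[dict]:
    """Return the toast payload for a queued run, or None for any other status."""
    if status != "queued":
        # No toasts for running or terminal outcomes.
        return None
    return {
        "title": f"{operation_label} queued",
        "body": "Running in the background.",
    }
```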
**FR-007 (Progress widget queued/running only)**: The progress widget MUST display only active runs (`queued`, `running`) for the current tenant (not just the initiating user) and MUST never display terminal runs. Status text MUST be exactly `Queued` or `Running`.
**FR-008 (Progress calculation)**: The widget MUST show a deterministic progress percentage only when numeric `total` and `processed` counts are present and valid in `summary_counts`. Otherwise it MUST show indeterminate progress. Deterministic progress MUST be clamped to 0–100%.
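
One way the FR-008 calculation could look, as a sketch. It assumes that "valid" means non-negative integers with `total` greater than zero, and that the percentage is rounded down; neither detail is fixed by this spec. `None` stands in for indeterminate progress.

```python
from typing import Optional


def _is_count(value) -> bool:
    """Assumption: a valid count is a non-negative int (booleans excluded)."""
    return isinstance(value, int) and not isinstance(value, bool) and value >= 0


def progress_percent(summary_counts: dict) -> Optional[int]:
    """Return a 0-100 percentage, or None when progress is indeterminate."""
    total = summary_counts.get("total")
    processed = summary_counts.get("processed")
    if not (_is_count(total) and _is_count(processed)) or total == 0:
        return None
    # Clamp so that processed > total never renders above 100%.
    return max(0, min(100, processed * 100 // total))
```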
**FR-009 (Widget run limit + overflow)**: The widget MUST show at most 5 active runs. If more exist, it MUST show a single overflow row `+N more operations running` linking to the operations index.
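
The filtering in FR-007 and the cap in FR-009 can be sketched together. The run dicts and their `tenant_id`/`status` keys are hypothetical stand-ins for `operation_runs` rows, and the sketch keeps input order because the spec does not state a sort order.

```python
ACTIVE_STATUSES = ("queued", "running")
MAX_ROWS = 5


def widget_rows(runs: list, tenant_id: int):
    """Return (visible_rows, overflow_label) for the current tenant's active runs."""
    active = [
        run for run in runs
        if run["tenant_id"] == tenant_id and run["status"] in ACTIVE_STATUSES
    ]
    overflow = len(active) - MAX_ROWS
    # The overflow row appears only when more than MAX_ROWS runs are active.
    label = f"+{overflow} more operations running" if overflow > 0 else None
    return active[:MAX_ROWS], label
```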
**FR-010 (Terminal persistent notifications only)**: Each run MUST produce exactly one persistent notification when it becomes terminal (`succeeded|partial|failed`).
**FR-010b (Notification audience)**: Terminal persistent notifications MUST be delivered only to the run initiator (no tenant-wide notification fan-out).
**FR-010c (No queued DB notifications)**: The system MUST NOT emit queued DB notifications as part of this rollout. Queued feedback MUST be provided via the queued-only toast surface.
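
The "exactly one, initiator only, terminal only" rules in FR-010 through FR-010c could be guarded roughly as follows. This is an in-memory sketch; the class name `TerminalNotifier` and the `id`/`status`/`initiator_id` keys are assumptions, and a real implementation would more likely rely on a persistent uniqueness check than on process memory.

```python
TERMINAL_STATUSES = ("succeeded", "partial", "failed")


class TerminalNotifier:
    """Tracks which runs have already produced their terminal notification."""

    def __init__(self):
        self.sent = {}  # run id -> recipient (initiator) user id

    def notify(self, run: dict) -> bool:
        """Record one notification for a terminal run; return True if newly sent."""
        if run["status"] not in TERMINAL_STATUSES:
            return False  # no queued or running DB notifications
        if run["id"] in self.sent:
            return False  # idempotent: never a second notification per run
        self.sent[run["id"]] = run["initiator_id"]  # initiator only, no fan-out
        return True
```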
**FR-010d (Status normalization for Ops-UX (compatibility))**: Ops-UX surfaces MUST render terminal outcomes using the canonical statuses: `succeeded | partial | failed`.
If a run record contains legacy values (e.g. `status=completed` with `outcome=partially_succeeded`), the UI MUST normalize as follows:
- completed + outcome=succeeded -> succeeded
- completed + outcome=partially_succeeded -> partial
- failed (or outcome=failed) -> failed
This is a presentation/normalization rule for the rollout; it does not mandate a schema refactor.
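
The mapping above can be written as a small pure function. The sketch below checks the `failed` rule first so that `outcome=failed` wins regardless of `status`; passing already-canonical values through and returning `None` for non-terminal runs are assumptions, not rules stated in this spec.

```python
from typing import Optional


def normalize_terminal(status: str, outcome: Optional[str] = None) -> Optional[str]:
    """Map legacy status/outcome pairs to succeeded | partial | failed."""
    if status == "failed" or outcome == "failed":
        return "failed"
    if status == "completed" and outcome == "succeeded":
        return "succeeded"
    if status == "completed" and outcome == "partially_succeeded":
        return "partial"
    if status in ("succeeded", "partial"):
        return status  # assumption: canonical values pass through unchanged
    return None  # not a terminal outcome this sketch recognizes
```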
**FR-011 (Notification copy templates)**: Persistent notifications MUST use canonical titles and bodies:
- succeeded: `{OperationLabel} completed` / `Completed successfully.`
- partial: `{OperationLabel} completed with warnings` / `Completed with warnings.`
- failed: `{OperationLabel} failed` / `Failed.` + optional sanitized message
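
A sketch of how the FR-011 templates might be centralized. It assumes the optional message has already been sanitized by the caller and is appended after `Failed.` with a single space; the spec does not define the joining format, and the names here are hypothetical.

```python
from typing import Optional

# status -> (title template, body)
NOTIFICATION_TEMPLATES = {
    "succeeded": ("{label} completed", "Completed successfully."),
    "partial": ("{label} completed with warnings", "Completed with warnings."),
    "failed": ("{label} failed", "Failed."),
}


def terminal_notification(label: str, status: str, sanitized_message: Optional[str] = None):
    """Return (title, body) for a terminal run using the canonical copy."""
    title_template, body = NOTIFICATION_TEMPLATES[status]
    if status == "failed" and sanitized_message:
        body = f"{body} {sanitized_message}"
    return title_template.format(label=label), body
```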
**FR-012 (Metrics/summaries are structured and safe)**: Operation summary counts (`operation_runs.summary_counts`) MUST be flat, numeric-only, and limited to whitelisted keys. Summary rendering MUST use only normalized/validated `summary_counts` and MUST NOT render free-text.
### Canonical allowed summary keys (single source of truth)
The following keys are the ONLY allowed summary keys for Ops-UX rendering:
`total, processed, succeeded, failed, skipped, compliant, noncompliant, unknown, created, updated, deleted, items, tenants`
All normalizers/renderers MUST reference this canonical list (no duplicated lists in multiple places).
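
To illustrate FR-012 against the canonical key list, the sketch below keeps a single tuple of allowed keys and filters a raw `summary_counts` value down to valid entries. Treating "numeric" as non-negative integers and dropping invalid entries one by one (rather than rejecting the whole summary) are assumptions made for this example.

```python
ALLOWED_SUMMARY_KEYS = (
    "total", "processed", "succeeded", "failed", "skipped",
    "compliant", "noncompliant", "unknown",
    "created", "updated", "deleted", "items", "tenants",
)


def normalize_summary_counts(raw) -> dict:
    """Keep only whitelisted keys whose values are non-negative integers."""
    if not isinstance(raw, dict):
        return {}
    clean = {}
    for key in ALLOWED_SUMMARY_KEYS:
        value = raw.get(key)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, int) and not isinstance(value, bool) and value >= 0:
            clean[key] = value
    return clean
```

An empty result would mean no summary is rendered, which matches the edge cases listed earlier (nested objects, strings, non-whitelisted keys, negative values).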
**FR-013 (Calm polling policy)**: Polling is allowed only for the progress widget (when visible) and run detail (only while active). Polling MUST pause when a modal is open, pause when the tab is hidden, follow the backoff schedule (1s for first 10s, then 5s, then 10s after 60s), and stop immediately on terminal.
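
The FR-013 schedule can be expressed as a function of elapsed time. This sketch reads the schedule as 1s while under 10 seconds, 5s from 10 up to 60 seconds, and 10s from 60 seconds onward; the exact boundary handling is an assumption. `None` means "do not poll now", covering both the pause cases and the terminal stop.

```python
from typing import Optional


def next_poll_interval(
    elapsed_seconds: float,
    *,
    terminal: bool = False,
    modal_open: bool = False,
    tab_hidden: bool = False,
) -> Optional[int]:
    """Return the next polling delay in seconds, or None when polling must not run."""
    if terminal or modal_open or tab_hidden:
        return None
    if elapsed_seconds < 10:
        return 1
    if elapsed_seconds < 60:
        return 5
    return 10
```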
**FR-014 (Migration scope)**: All existing operation feedback across the application MUST be migrated to these shared rules without introducing new operation types or changing the run status model.
### Key Entities *(include if feature involves data)*
- **OperationRun**: A tenant-scoped record of an operation’s type, status, timestamps, outcome, and structured metrics.
- **OperationCatalog**: A central registry of valid `operation_type` values and their user-facing labels.
- **Operation Feedback Surfaces**:
- **Toast**: short-lived intent confirmation for queued runs.
- **Progress Widget**: live awareness of active runs.
- **Persistent Notification**: audit + terminal outcome notification with canonical “View run”.
## Success Criteria *(mandatory)*
### Measurable Outcomes
- **SC-001**: 100% of tenant-scoped operations use exactly the three approved surfaces (toast, widget, persistent notification), with no feature-specific alternatives.
- **SC-002**: 100% of “View run” links resolve to Monitoring → Operations → Run Detail.
- **SC-003**: For terminal runs, 100% produce exactly one persistent notification (no duplicates) and 0% produce “running” notifications.
- **SC-004**: For active runs, the progress widget returns 0 terminal runs and shows strict status text (`Queued`/`Running`) 100% of the time.
- **SC-005**: Summary rendering shows only whitelisted numeric keys; invalid metrics render no summary in 100% of tested cases.
- **SC-006**: Automated guards fail fast when any new `operation_type` is not registered in OperationCatalog.