## Summary
Implements and polishes the Platform Ops Runbooks feature (Spec 113) — the operator control plane for safe backfills and data repair from `/system`.
## Changes
### UX Polish (Phase 7 — US4)
- **Filament-native components**: Rewrote `runbooks.blade.php` and `view-run.blade.php` using `<x-filament::section>` instead of raw Tailwind div cards. Cards now render correctly with Filament's built-in borders, shadows and dark mode.
- **System panel theme**: Created `resources/css/filament/system/theme.css` and registered `->viteTheme()` on `SystemPanelProvider`. The system panel previously had no theme CSS registered — Tailwind utility classes weren't compiled for its views, causing the warning icon SVG to expand to full container size.
- **Live scope selector**: Added `->live()` to the scope `Radio` field so "Single tenant" immediately reveals the tenant search dropdown without requiring a Submit first.
### Core Feature (Phases 1–6, previously shipped)
- `/system/ops/runbooks` — runbook catalog, preflight, run with typed confirmation + reason
- `/system/ops/runs` — run history table with status/outcome badges
- `/system/ops/runs/{id}` — run detail view with summary counts, failures, collapsible context
- `FindingsLifecycleBackfillRunbookService` — preflight + execution logic
- AllowedTenantUniverse — scopes tenant picker to non-platform tenants only
- RBAC: `platform.ops.view`, `platform.runbooks.view`, `platform.runbooks.run`, `platform.runbooks.findings.lifecycle_backfill`
- Rate-limited `/system/login` (10/min per IP+username)
- Distinct session cookie for `/system` isolation
## Test Coverage
- 16 tests / 141 assertions — all passing
- Covers: page access, RBAC, preflight, run dispatch, scope selector, run detail, run list
## Checklist
- [x] Filament v5 / Livewire v4 compliant
- [x] Provider registered in `bootstrap/providers.php`
- [x] Destructive actions require confirmation (`->requiresConfirmation()`)
- [x] System panel theme registered (`viteTheme`)
- [x] Pint clean
- [x] Tests pass
Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #137
# Data Model - Spec 113: Platform Ops Runbooks
This design describes the data we will read/write to implement the /system operator runbooks, grounded in the existing schema.
## Core persisted entities
### OperationRun (existing)
- Table: `operation_runs`
- Ownership:
  - Workspace-owned (always has `workspace_id`)
  - Tenant association is optional (`tenant_id` nullable) to support workspace/canonical runs
- Fields (existing):
  - `id`
  - `workspace_id` (FK, NOT NULL)
  - `tenant_id` (FK, nullable)
  - `user_id` (FK to `users`, nullable)
  - `initiator_name` (string)
  - `type` (string; for this feature: `findings.lifecycle.backfill`)
  - `status` (`queued` | `running` | `completed`)
  - `outcome` (`pending` | `succeeded` | `failed` | `blocked` | ...)
  - `run_identity_hash` (string; active-run idempotency)
  - `summary_counts` (json)
  - `failure_summary` (json)
  - `context` (json)
  - `started_at`, `completed_at`
### Summary counts contract
- Must only use keys from `App\Support\OpsUx\OperationSummaryKeys::all()`.
- v1 keys for this runbook:
  - `total` (findings scanned)
  - `processed` (findings processed)
  - `updated` (findings updated + duplicate consolidations)
  - `skipped` (findings unchanged)
  - `failed` (per-tenant job failures)
  - `tenants` (for all-tenants orchestrator: tenants targeted)
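The key contract above can be enforced with a small guard before counts are written. This is a minimal sketch, not the project's code: `allowedSummaryKeys()` is a hypothetical stand-in for `OperationSummaryKeys::all()` (whose real key list may be larger than the six v1 keys named here), and `assertValidSummaryCounts()` is an illustrative name.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-in for App\Support\OpsUx\OperationSummaryKeys::all().
 * Only the six v1 keys from this spec are listed; the real class may allow more.
 *
 * @return list<string>
 */
function allowedSummaryKeys(): array
{
    return ['total', 'processed', 'updated', 'skipped', 'failed', 'tenants'];
}

/**
 * Rejects unknown keys and values that are not non-negative integers.
 *
 * @param array<string, mixed> $counts
 * @return array<string, int> the same counts, once validated
 */
function assertValidSummaryCounts(array $counts): array
{
    $unknown = array_diff(array_keys($counts), allowedSummaryKeys());
    if ($unknown !== []) {
        throw new InvalidArgumentException('Unknown summary keys: ' . implode(', ', $unknown));
    }

    foreach ($counts as $key => $value) {
        if (!is_int($value) || $value < 0) {
            throw new InvalidArgumentException("Summary count '{$key}' must be a non-negative integer.");
        }
    }

    return $counts;
}
```

Validating at the write boundary keeps a typo such as `proccessed` from silently creating a new counter in `summary_counts`.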
### Context shape (for this feature)
Store these values in `operation_runs.context`:
- `runbook`:
  - `key`: `findings.lifecycle.backfill`
  - `scope`: `all_tenants` | `single_tenant`
  - `target_tenant_id`: int|null
  - `source`: `system_ui` | `cli` | `deploy_hook`
- `preflight`:
  - `affected_count`: int (findings that would change)
  - `total_count`: int (findings scanned)
  - `estimated_tenants`: int|null (for all tenants)
- `reason` (required for all-tenants and break-glass):
  - `reason_code`: `DATA_REPAIR` | `INCIDENT` | `SUPPORT` | `SECURITY`
  - `reason_text`: string
- `platform_initiator` (when started from `/system`):
  - `platform_user_id`: int
  - `email`: string
  - `name`: string
  - `is_break_glass`: bool
Notes:
- We intentionally do not store secrets/PII beyond operator email/name already used in auditing.
- `failure_summary` should store sanitized messages + stable reason codes, as already done by `RunFailureSanitizer`.
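The context shape above could be assembled by a single builder that also enforces the "reason required" rule. This is a sketch under stated assumptions: `buildBackfillContext()` and its parameter names are hypothetical, only the array keys and allowed values come from the spec, and it assumes all-tenants runs always store `target_tenant_id` as `null`.

```php
<?php

declare(strict_types=1);

/**
 * Builds the array stored in operation_runs.context (hypothetical helper).
 *
 * @param array{affected_count:int,total_count:int,estimated_tenants:?int} $preflight
 * @param array{platform_user_id:int,email:string,name:string,is_break_glass:bool}|null $initiator
 * @param array{reason_code:string,reason_text:string}|null $reason
 */
function buildBackfillContext(
    string $scope,
    ?int $targetTenantId,
    string $source,
    array $preflight,
    ?array $initiator = null,
    ?array $reason = null,
): array {
    if (!in_array($scope, ['all_tenants', 'single_tenant'], true)) {
        throw new InvalidArgumentException("Unsupported scope: {$scope}");
    }
    if (!in_array($source, ['system_ui', 'cli', 'deploy_hook'], true)) {
        throw new InvalidArgumentException("Unsupported source: {$source}");
    }
    if ($scope === 'single_tenant' && $targetTenantId === null) {
        throw new InvalidArgumentException('single_tenant scope needs a target_tenant_id.');
    }

    // The spec requires a reason for all-tenants and break-glass runs.
    $reasonRequired = $scope === 'all_tenants' || ($initiator['is_break_glass'] ?? false) === true;
    if ($reasonRequired && $reason === null) {
        throw new InvalidArgumentException('A reason is required for this run.');
    }
    $codes = ['DATA_REPAIR', 'INCIDENT', 'SUPPORT', 'SECURITY'];
    if ($reason !== null && !in_array($reason['reason_code'], $codes, true)) {
        throw new InvalidArgumentException('Unsupported reason_code.');
    }

    $context = [
        'runbook' => [
            'key' => 'findings.lifecycle.backfill',
            'scope' => $scope,
            // Assumption: all-tenants runs carry no target tenant.
            'target_tenant_id' => $scope === 'single_tenant' ? $targetTenantId : null,
            'source' => $source,
        ],
        'preflight' => [
            'affected_count' => $preflight['affected_count'],
            'total_count' => $preflight['total_count'],
            'estimated_tenants' => $preflight['estimated_tenants'],
        ],
    ];

    // Optional blocks are only written when the caller supplied them.
    if ($reason !== null) {
        $context['reason'] = [
            'reason_code' => $reason['reason_code'],
            'reason_text' => $reason['reason_text'],
        ];
    }
    if ($initiator !== null) {
        $context['platform_initiator'] = $initiator;
    }

    return $context;
}
```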
### All-tenants run modeling (v1)
- All-tenants executes as a single workspace-scoped run (`tenant_id = null`).
- Implementation fans out to multiple tenant jobs, but they all update the same workspace run via:
  - `OperationRunService::incrementSummaryCounts()`
  - `OperationRunService::appendFailures()`
  - `OperationRunService::maybeCompleteBulkRun()`
- Per-tenant `OperationRun` rows are not required for v1 (avoids parent/child coordination).
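The fan-out model above can be illustrated with an in-memory stand-in for the shared workspace run. Everything here is an assumption made for illustration: the class `WorkspaceRunSketch`, the method signatures, the completion rule (complete once every targeted tenant job has reported) and the `job_failed` reason code are hypothetical, and the real `OperationRunService` would need atomic database updates rather than plain property writes.

```php
<?php

declare(strict_types=1);

/** In-memory stand-in for the single workspace-scoped run (hypothetical). */
final class WorkspaceRunSketch
{
    public string $status = 'running';
    public string $outcome = 'pending';
    /** @var array<string, int> */
    public array $summaryCounts = [];
    /** @var list<array{tenant_id: int, reason_code: string}> */
    public array $failures = [];
    private int $finishedJobs = 0;

    public function __construct(private readonly int $targetedTenants)
    {
        $this->summaryCounts['tenants'] = $targetedTenants;
    }

    /** @param array<string, int> $delta */
    public function incrementSummaryCounts(array $delta): void
    {
        foreach ($delta as $key => $amount) {
            $this->summaryCounts[$key] = ($this->summaryCounts[$key] ?? 0) + $amount;
        }
    }

    /** @param list<array{tenant_id: int, reason_code: string}> $failures */
    public function appendFailures(array $failures): void
    {
        $this->failures = array_merge($this->failures, array_values($failures));
    }

    /** Called once per finished tenant job; completes the run after the last one. */
    public function maybeCompleteBulkRun(): bool
    {
        $this->finishedJobs++;
        if ($this->finishedJobs < $this->targetedTenants) {
            return false;
        }

        $this->status = 'completed';
        // Assumption: any tenant failure marks the whole run as failed.
        $this->outcome = $this->failures === [] ? 'succeeded' : 'failed';

        return true;
    }
}

// Simulated per-tenant results: tenant id => counts, or null for a failed job.
$results = [
    11 => ['total' => 5, 'processed' => 5, 'updated' => 2, 'skipped' => 3],
    12 => null,
    13 => ['total' => 4, 'processed' => 4, 'updated' => 0, 'skipped' => 4],
];

$run = new WorkspaceRunSketch(count($results));
foreach ($results as $tenantId => $counts) {
    if ($counts === null) {
        $run->incrementSummaryCounts(['failed' => 1]);
        $run->appendFailures([['tenant_id' => $tenantId, 'reason_code' => 'job_failed']]);
    } else {
        $run->incrementSummaryCounts($counts);
    }
    $run->maybeCompleteBulkRun();
}
```

Because every tenant job writes to the same row, no parent/child bookkeeping is needed, which is the trade-off the v1 design names.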
## Audit log (existing infrastructure)
- Existing: `App\Services\Intune\AuditLogger` is already used for System login auditing.
- New audit actions (stable action IDs):
  - `platform.ops.runbooks.preflight`
  - `platform.ops.runbooks.start`
  - `platform.ops.runbooks.completed`
  - `platform.ops.runbooks.failed`
- Audit context should include:
  - runbook key, scope, affected_count, operation_run_id, platform_user_id/email, ip/user_agent.
## Alerts (existing infrastructure)
- Use `AlertDispatchService` to create `alert_deliveries` for operators.
- New alert event:
  - `event_type`: `operations.run.failed`
  - `tenant_id`: platform tenant id (to route via workspace rules)
  - `metadata`: run id, run type, scope, view-run URL
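The alert event above might be assembled as a plain array before it is handed to `AlertDispatchService`. This is only a sketch: the function name, the metadata key names (`run_id`, `run_type`, `scope`, `view_run_url`) and the `$baseUrl` parameter are assumptions, and the dispatch call itself is left out because its signature is not described here. The URL path follows the `/system/ops/runs/{id}` route.

```php
<?php

declare(strict_types=1);

/**
 * Builds the payload for an operations.run.failed alert (hypothetical helper).
 *
 * @return array{event_type:string,tenant_id:int,metadata:array<string,int|string>}
 */
function buildRunFailedAlert(
    int $platformTenantId,
    int $runId,
    string $runType,
    string $scope,
    string $baseUrl,
): array {
    return [
        'event_type' => 'operations.run.failed',
        // Routed through the platform tenant so workspace rules apply.
        'tenant_id' => $platformTenantId,
        'metadata' => [
            'run_id' => $runId,
            'run_type' => $runType,
            'scope' => $scope,
            'view_run_url' => rtrim($baseUrl, '/') . "/system/ops/runs/{$runId}",
        ],
    ];
}
```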
## Derived / non-persisted
### Runbook catalog
- Implementation as a PHP catalog (no DB table) with:
  - key, label, description, capability required, estimated duration (can reuse `OperationCatalog`).
## State transitions
- `OperationRun.status/outcome` transitions are owned by `OperationRunService`.
- Expected transitions (per run):
  - `queued` → `running` → `completed` (`succeeded` | `failed` | `blocked`)
- Locks:
  - Tenant runs: already implemented via `Cache::lock('tenantpilot:findings:lifecycle_backfill:tenant:{id}', 900)`
  - All-tenants orchestration: add a scope-level lock to prevent duplicate fan-out.
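The transition rule and the tenant lock key above can be expressed as small pure functions. This is a sketch, not the logic inside `OperationRunService`: the `RunStatus` enum and the helper names are hypothetical, and only the three completion outcomes named in this spec are checked. The second argument to `Cache::lock()` is the lock lifetime in seconds, so `900` is a 15-minute lock.

```php
<?php

declare(strict_types=1);

/** Run statuses from the spec (hypothetical enum; the project may use strings). */
enum RunStatus: string
{
    case Queued = 'queued';
    case Running = 'running';
    case Completed = 'completed';
}

/** Allows only queued -> running -> completed, one step at a time. */
function canTransition(RunStatus $from, RunStatus $to): bool
{
    return match ($from) {
        RunStatus::Queued => $to === RunStatus::Running,
        RunStatus::Running => $to === RunStatus::Completed,
        RunStatus::Completed => false,
    };
}

/** True for the outcomes this spec pairs with a completed run. */
function isCompletionOutcome(string $outcome): bool
{
    return in_array($outcome, ['succeeded', 'failed', 'blocked'], true);
}

/** Builds the per-tenant lock name in the format quoted in the spec. */
function tenantLockKey(int $tenantId): string
{
    return "tenantpilot:findings:lifecycle_backfill:tenant:{$tenantId}";
}
```

A key for the scope-level all-tenants lock is not shown because the spec does not name one yet.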