---
description: "Task list for Spec 113 implementation"
---

# Tasks: Platform Ops Runbooks (Operator Control Plane)

**Input**: Design documents from `specs/113-platform-ops-runbooks/`
**Prerequisites**: `specs/113-platform-ops-runbooks/plan.md`, `specs/113-platform-ops-runbooks/spec.md`, plus `specs/113-platform-ops-runbooks/research.md`, `specs/113-platform-ops-runbooks/data-model.md`, `specs/113-platform-ops-runbooks/contracts/system-ops-runbooks.openapi.yaml`, `specs/113-platform-ops-runbooks/quickstart.md`.

**Tests**: REQUIRED (Pest) for all runtime behavior changes.

---

## Phase 1: Setup (Shared Infrastructure)

**Purpose**: Confirm touch points and keep spec artifacts aligned.

- [X] T001 Confirm spec UI Action Matrix is complete in specs/113-platform-ops-runbooks/spec.md
- [X] T002 Confirm System panel provider registration in bootstrap/providers.php (Laravel 11+/12 provider registration)
- [X] T003 [P] Capture current legacy /admin trigger location in app/Filament/Resources/FindingResource/Pages/ListFindings.php ("Backfill findings lifecycle" header action)
- [X] T004 [P] Review existing single-tenant backfill pipeline entry points in app/Console/Commands/TenantpilotBackfillFindingLifecycle.php and app/Jobs/BackfillFindingLifecycleJob.php

---

## Phase 2: Foundational (Blocking Prerequisites)

**Purpose**: Security semantics, session isolation, and auth hardening that block all user stories.

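The 404 versus 403 split that this phase hardens (see T007 and T008) can be sketched in plain PHP. This is a minimal illustration, not the project's middleware: the function `systemPlaneStatus()` and the guard names `platform` and `tenant` are assumptions introduced here; only the capability string comes from T005.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: maps a request against /system/* to an HTTP status.
 *
 * @param string   $guard        Guard the user authenticated with (names assumed for illustration)
 * @param string[] $capabilities Capabilities granted to the user
 * @param string   $required     Capability the requested page requires
 */
function systemPlaneStatus(string $guard, array $capabilities, string $required): int
{
    // Wrong plane: answer as if the route does not exist.
    if ($guard !== 'platform') {
        return 404;
    }

    // Right plane, but the capability is missing: explicit denial.
    if (!in_array($required, $capabilities, true)) {
        return 403;
    }

    return 200;
}
```

Checking the plane before the capability keeps the existence of `/system/*` routes hidden from tenant users, while platform users get a clear denial they can act on.
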
- [X] T005 Add platform runbook capability constants to app/Support/Auth/PlatformCapabilities.php (e.g., platform.ops.view, platform.runbooks.view, platform.runbooks.run, platform.runbooks.findings.lifecycle_backfill)
- [X] T006 Update System panel access control to use capability registry constants in app/Providers/Filament/SystemPanelProvider.php (keep ACCESS_SYSTEM_PANEL gate, add per-page capability checks)
- [X] T007 Change platform capability denial semantics to 403 (member-but-missing-capability) in app/Http/Middleware/EnsurePlatformCapability.php (keep wrong-plane 404 handled by ensure-correct-guard)
- [X] T008 [P] Add SR-002 regression tests for 404 vs 403 semantics in tests/Feature/System/Spec113/AuthorizationSemanticsTest.php (tenant user -> 404 on /system/*, platform user without capability -> 403, platform user with capability -> 200)
- [X] T009 Define and enforce the “allowed tenant universe” for System runbooks in app/Services/System/AllowedTenantUniverse.php (v1: exclude platform tenant; provide tenant query for pickers and runtime guard)
- [X] T010 [P] Add allowed tenant universe tests in tests/Feature/System/Spec113/AllowedTenantUniverseTest.php (picker excludes platform tenant; attempts to target excluded tenant are rejected; no OperationRun created)
- [X] T011 Create System session cookie isolation middleware in app/Http/Middleware/UseSystemSessionCookie.php (set dedicated session cookie name before StartSession)
- [X] T012 Wire System session cookie middleware before StartSession in app/Providers/Filament/SystemPanelProvider.php (SR-004)
- [X] T013 [P] Add System session isolation test in tests/Feature/System/Spec113/SystemSessionIsolationTest.php (assert response sets the System session cookie name for /system)
- [X] T014 Implement /system/login throttling (10/min per IP + username key) in app/Filament/System/Pages/Auth/Login.php (SR-003; use RateLimiter and clear on success)
- [X] T015 [P] Add /system/login throttling tests in tests/Feature/System/Spec113/SystemLoginThrottleTest.php (assert throttled after N failures; ensure failures still emit audit via AuditLogger)

---

## Phase 3: User Story 1 — Operator runs a runbook safely (Priority: P1) 🎯 MVP

**Goal**: `/system/ops/runbooks` supports preflight + explicit confirmation + reason capture + typed confirmation for all-tenants; starts a tracked `OperationRun` and links to “View run”.

**Independent Test**: Visit `/system/ops/runbooks`, run preflight, start run, follow “View run” to `/system/ops/runs/{id}`, and confirm audit/run records exist.

### Tests for User Story 1

- [X] T016 [P] [US1] Add runbook preflight tests in tests/Feature/System/OpsRunbooks/FindingsLifecycleBackfillPreflightTest.php (single tenant + all tenants preflight returns affected_count)
- [X] T017 [P] [US1] Add runbook start/confirmation tests in tests/Feature/System/OpsRunbooks/FindingsLifecycleBackfillStartTest.php (typed confirmation + reason required for all_tenants; disabled when affected_count=0)
- [X] T018 [P] [US1] Add break-glass reason enforcement + recording tests in tests/Feature/System/OpsRunbooks/FindingsLifecycleBackfillBreakGlassTest.php (reason required when break-glass active; break-glass marker and reason recorded on run + audit)
- [X] T019 [P] [US1] Add Ops-UX feedback contract test for start surface in tests/Feature/System/OpsRunbooks/OpsUxStartSurfaceContractTest.php (toast intent-only + “View run” link; no DB queued/running notifications)
- [X] T020 [P] [US1] Add audit fail-safe test in tests/Feature/System/OpsRunbooks/FindingsLifecycleBackfillAuditFailSafeTest.php (audit logger failure does not crash run; run still records failure outcome)

### Implementation for User Story 1

- [X] T021 [US1] Create runbook service app/Services/Runbooks/FindingsLifecycleBackfillRunbookService.php with methods preflight(scope) and start(scope, initiator, reason, source)
- [X] T022 [P] [US1] Create runbook scope/value objects in app/Services/Runbooks/FindingsLifecycleBackfillScope.php and app/Services/Runbooks/RunbookReason.php (validate reason_code and reason_text max 500 chars; include break-glass reason requirements)
- [X] T023 [US1] Add audit events for preflight/start/completed/failed using AuditLogger in app/Services/Runbooks/FindingsLifecycleBackfillRunbookService.php (action IDs per specs/113-platform-ops-runbooks/data-model.md; must be fail-safe)
- [X] T024 [US1] Record break-glass marker + reason on OperationRun context and audit in app/Services/Runbooks/FindingsLifecycleBackfillRunbookService.php (SR-005)
- [X] T025 [US1] Implement all-tenants orchestration job in app/Jobs/BackfillFindingLifecycleWorkspaceJob.php (create/lock workspace-scoped OperationRun; dispatch tenant fan-out; set summary_counts[tenants/total/processed])
- [X] T026 [US1] Implement tenant worker job that updates the shared workspace run in app/Jobs/BackfillFindingLifecycleTenantIntoWorkspaceRunJob.php (chunk writes; increment summary_counts keys from OperationSummaryKeys::all(); append failures; call maybeCompleteBulkRun())
- [X] T027 [US1] Ensure scope-level lock prevents concurrent all-tenants runs in app/Services/Runbooks/FindingsLifecycleBackfillRunbookService.php (lock key includes workspace + scope)
- [X] T028 [US1] Enable platform in-app notifications for run completion/failure by turning on database notifications in app/Providers/Filament/SystemPanelProvider.php (ensure terminal notification is OperationRunCompleted, initiator-only)
- [X] T029 [P] [US1] Add System “View run” URL helper in app/Support/System/SystemOperationRunLinks.php and use it for UI + alerts/notifications (avoid admin-plane links)
- [X] T030 [US1] Dispatch Alerts event on failure using app/Services/Alerts/AlertDispatchService.php from app/Services/Runbooks/FindingsLifecycleBackfillRunbookService.php (event_type operations.run.failed; include System “View run” URL)
- [X] T031 [US1] Create System runbooks page class app/Filament/System/Pages/Ops/Runbooks.php (capability-gated; scope selector uses AllowedTenantUniverse; Preflight action; Run action with confirmation + typed confirm + reason)
- [X] T032 [P] [US1] Create System runbooks page view resources/views/filament/system/pages/ops/runbooks.blade.php (operator warning; show preflight results + disable Run when nothing to do)
- [X] T033 [US1] Create System runs list page class app/Filament/System/Pages/Ops/Runs.php (table listing operation runs for runbook types; default sort newest)
- [X] T034 [P] [US1] Create System runs list view resources/views/filament/system/pages/ops/runs.blade.php (record inspection affordance: clickable row -> run detail)
- [X] T035 [US1] Create System run detail page class app/Filament/System/Pages/Ops/ViewRun.php (infolist rendering of OperationRun; show scope/actor/counts/failures)
- [X] T036 [P] [US1] Create System run detail view resources/views/filament/system/pages/ops/view-run.blade.php

---

## Phase 4: User Story 2 — Customers never see maintenance actions (Priority: P1)

**Goal**: No `/admin` maintenance/backfill affordances by default; tenant users cannot access `/system/*` (404).

**Independent Test**: As a tenant user, `/system/*` returns 404; in `/admin` Findings list there is no backfill action when the feature flag is defaulted off.

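The default-off behavior in the Independent Test above can be sketched in plain PHP. The helper name `adminMaintenanceActionsEnabled()` and the array standing in for `config/tenantpilot.php` are assumptions made for illustration; the key `allow_admin_maintenance_actions` is the example name suggested in T040.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: decides whether the legacy /admin backfill action is shown.
 *
 * @param array<string, mixed> $tenantpilotConfig Stand-in for the values in config/tenantpilot.php
 */
function adminMaintenanceActionsEnabled(array $tenantpilotConfig): bool
{
    // A missing key counts as "off", so a fresh environment never exposes the action.
    // Only a literal boolean true enables it; truthy strings do not.
    return ($tenantpilotConfig['allow_admin_maintenance_actions'] ?? false) === true;
}
```

Treating anything other than a literal `true` as off is a deliberately strict choice for this sketch; a real implementation may prefer to cast environment values instead.
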
### Tests for User Story 2

- [X] T037 [P] [US2] Add regression test asserting /admin Findings list has no backfill action by default in tests/Feature/Filament/Spec113/AdminFindingsNoMaintenanceActionsTest.php (targets app/Filament/Resources/FindingResource/Pages/ListFindings.php)
- [X] T038 [P] [US2] Add tenant-plane 404 test for /system/ops/runbooks in tests/Feature/System/Spec113/TenantPlaneCannotAccessSystemTest.php

### Implementation for User Story 2

- [X] T039 [US2] Remove or feature-flag off the legacy header action in app/Filament/Resources/FindingResource/Pages/ListFindings.php (FR-001; default off in production-like envs)
- [X] T040 [US2] Add a config-backed feature flag defaulting to false in config/tenantpilot.php (e.g., allow_admin_maintenance_actions) and wire it in app/Filament/Resources/FindingResource/Pages/ListFindings.php

---

## Phase 5: User Story 3 — Same logic for deploy-time and operator re-run (Priority: P2)

**Goal**: One implementation path for preflight/start that is reused by System UI, CLI, and deploy-time automation.

**Independent Test**: Run the runbook twice with the same scope; second run produces updated_count=0; deploy-time entry point calls the same service.

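The idempotency expectation in the Independent Test above (a second run with the same scope updates nothing) can be illustrated with a small in-memory sketch. Everything here is hypothetical: the `lifecycle_state` field, the default value `open`, and both function names are invented for the example and are not taken from the spec's data model.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical preflight: counts findings that a backfill would touch.
 *
 * @param array<int, array<string, mixed>> $findings
 */
function preflightAffectedCount(array $findings): int
{
    return count(array_filter(
        $findings,
        static fn (array $finding): bool => ($finding['lifecycle_state'] ?? null) === null
    ));
}

/**
 * Hypothetical backfill: fills in a missing lifecycle state and reports how many rows changed.
 *
 * @param array<int, array<string, mixed>> $findings Modified in place
 */
function backfillLifecycle(array &$findings): int
{
    $updated = 0;

    foreach ($findings as &$finding) {
        // Only rows without a state are written, which is what makes a re-run a no-op.
        if (($finding['lifecycle_state'] ?? null) === null) {
            $finding['lifecycle_state'] = 'open';
            $updated++;
        }
    }
    unset($finding); // break the reference left behind by foreach

    return $updated;
}

$findings = [
    ['id' => 1],
    ['id' => 2, 'lifecycle_state' => 'resolved'],
];

$first = backfillLifecycle($findings);  // updates the one finding without a state
$second = backfillLifecycle($findings); // nothing left to update
```

Because the write is guarded by the same predicate the preflight counts, a preflight after the first run reports zero affected findings, which matches the `affected_count=0` alternative accepted by T041.
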
### Tests for User Story 3

- [X] T041 [P] [US3] Add idempotency test in tests/Feature/System/OpsRunbooks/FindingsLifecycleBackfillIdempotencyTest.php (second run updated=0 and/or preflight affected_count=0)
- [X] T042 [P] [US3] Add deploy-time entry point test in tests/Feature/Console/Spec113/DeployRunbooksCommandTest.php (command delegates to FindingsLifecycleBackfillRunbookService)

### Implementation for User Story 3

- [X] T043 [US3] Refactor CLI command to call shared runbook service in app/Console/Commands/TenantpilotBackfillFindingLifecycle.php (single-tenant scope, source=cli)
- [X] T044 [US3] Add deploy-time runbooks command in app/Console/Commands/TenantpilotRunDeployRunbooks.php (source=deploy_hook; initiator null; uses FindingsLifecycleBackfillRunbookService)
- [X] T045 [US3] Ensure System UI uses the same runbook service start() call path in app/Filament/System/Pages/Ops/Runbooks.php (source=system_ui)
- [X] T046 [US3] Ensure initiator-null runs do not emit terminal DB notification in app/Services/OperationRunService.php (system-run behavior; audit/alerts still apply)

---

## Phase 6: Polish & Cross-Cutting Concerns

- [X] T047 [P] Run new Spec 113 tests via vendor/bin/sail artisan test --compact tests/Feature/System/Spec113/ (ensure all new tests pass)
- [X] T048 [P] Run Ops Runbooks tests via vendor/bin/sail artisan test --compact tests/Feature/System/OpsRunbooks/ (ensure US1/US3 tests pass)
- [X] T049 [P] Run formatting on touched files via vendor/bin/sail bin pint --dirty --format agent (targets app/Http/Middleware/, app/Filament/System/Pages/, app/Services/Runbooks/, tests/Feature/System/)

---

## Phase 7: UX Polish — Enterprise-grade Ops surfaces (User Story 4)

**Purpose**: Elevate operator-facing views from functional MVP to enterprise-grade UX with proper visual hierarchy, alert banners, card layouts, badge indicators, and metadata.

- [X] T050 [US4] Upgrade operator warning from plain text to styled alert banner with icon in resources/views/filament/system/pages/ops/runbooks.blade.php (FR-010)
- [X] T051 [US4] Restructure runbook entry as a card with title, description, scope badge, and "Last run" metadata in resources/views/filament/system/pages/ops/runbooks.blade.php + app/Filament/System/Pages/Ops/Runbooks.php (FR-011)
- [X] T052 [US4] Upgrade preflight stat values to prominent stat-card styling in resources/views/filament/system/pages/ops/runbooks.blade.php (FR-012)
- [X] T053 [US4] Render status/outcome as BadgeRenderer badges on run detail page in resources/views/filament/system/pages/ops/view-run.blade.php (FR-013)
- [X] T054 [US4] Render summary_counts as labeled key-value grid with JSON fallback on run detail in resources/views/filament/system/pages/ops/view-run.blade.php (FR-014)
- [X] T055 [US4] "Recovery" nav group with "Repair workspace owners" already exists (pre-existing; no change needed)
- [X] T056 [P] Run formatting via vendor/bin/sail bin pint --dirty --format agent
- [X] T057 [P] Run existing Spec 113 tests to verify no regressions (16 passed, 141 assertions)

---

## Dependencies & Execution Order

### Phase Dependencies

- **Setup (Phase 1)**: no dependencies
- **Foundational (Phase 2)**: depends on Setup; BLOCKS all story work
- **US1 (Phase 3)**: depends on Foundational
- **US2 (Phase 4)**: depends on Foundational
- **US3 (Phase 5)**: depends on US1 shared runbook service (T021) + Foundational
- **Polish (Phase 6)**: depends on desired stories being complete

### User Story Dependencies

- **US1 (P1)**: foundational security + session isolation + login throttle must be in place first
- **US2 (P1)**: can be implemented after Foundational; independent of US1 UI
- **US3 (P2)**: depends on the shared runbook service created in US1

---

## Parallel Execution Examples

### US1 parallelizable tasks

- T016, T017, T018, T019, T020 can be drafted in parallel (tests in separate files under tests/Feature/System/OpsRunbooks/)
- T031/T032, T033/T034, and T035/T036 can be built in parallel (separate System page classes/views)
- T025 and T026 can be built in parallel once the service contract (T021) is agreed

### US2 parallelizable tasks

- T037 and T038 can run in parallel (tests)
- T039 and T040 can run in parallel if T040 lands first (feature flag), otherwise keep sequential

### US3 parallelizable tasks

- T041 and T042 can run in parallel (tests)
- T043 and T044 can be implemented in parallel once T021 exists

---

## Implementation Strategy (MVP First)

1) Complete Phase 2 (security semantics + session isolation + login throttle)
2) Deliver US1 (System runbooks page + OperationRun tracking + System runs detail)
3) Deliver US2 (remove/disable /admin maintenance UI)
4) Deliver US3 (shared logic reused by CLI + deploy-time automation)