| description |
|---|
| Task list for Spec 113 implementation |
Tasks: Platform Ops Runbooks (Operator Control Plane)
Input: Design documents from specs/113-platform-ops-runbooks/
Prerequisites: specs/113-platform-ops-runbooks/plan.md, specs/113-platform-ops-runbooks/spec.md, plus specs/113-platform-ops-runbooks/research.md, specs/113-platform-ops-runbooks/data-model.md, specs/113-platform-ops-runbooks/contracts/system-ops-runbooks.openapi.yaml, specs/113-platform-ops-runbooks/quickstart.md.
Tests: REQUIRED (Pest) for all runtime behavior changes.
Phase 1: Setup (Shared Infrastructure)
Purpose: Confirm touch points and keep spec artifacts aligned.
- T001 Confirm spec UI Action Matrix is complete in specs/113-platform-ops-runbooks/spec.md
- T002 Confirm System panel provider registration in bootstrap/providers.php (Laravel 11+/12 provider registration)
- T003 [P] Capture current legacy /admin trigger location in app/Filament/Resources/FindingResource/Pages/ListFindings.php ("Backfill findings lifecycle" header action)
- T004 [P] Review existing single-tenant backfill pipeline entry points in app/Console/Commands/TenantpilotBackfillFindingLifecycle.php and app/Jobs/BackfillFindingLifecycleJob.php
Phase 2: Foundational (Blocking Prerequisites)
Purpose: Security semantics, session isolation, and auth hardening that block all user stories.
- T005 Add platform runbook capability constants to app/Support/Auth/PlatformCapabilities.php (e.g., platform.ops.view, platform.runbooks.view, platform.runbooks.run, platform.runbooks.findings.lifecycle_backfill)
- T006 Update System panel access control to use capability registry constants in app/Providers/Filament/SystemPanelProvider.php (keep ACCESS_SYSTEM_PANEL gate, add per-page capability checks)
- T007 Change platform capability denial semantics to 403 (member-but-missing-capability) in app/Http/Middleware/EnsurePlatformCapability.php (keep wrong-plane 404 handled by ensure-correct-guard)
- T008 [P] Add SR-002 regression tests for 404 vs 403 semantics in tests/Feature/System/Spec113/AuthorizationSemanticsTest.php (tenant user -> 404 on /system/*, platform user without capability -> 403, platform user with capability -> 200)
- T009 Define and enforce the “allowed tenant universe” for System runbooks in app/Services/System/AllowedTenantUniverse.php (v1: exclude platform tenant; provide tenant query for pickers and runtime guard)
- T010 [P] Add allowed tenant universe tests in tests/Feature/System/Spec113/AllowedTenantUniverseTest.php (picker excludes platform tenant; attempts to target excluded tenant are rejected; no OperationRun created)
- T011 Create System session cookie isolation middleware in app/Http/Middleware/UseSystemSessionCookie.php (set dedicated session cookie name before StartSession)
- T012 Wire System session cookie middleware before StartSession in app/Providers/Filament/SystemPanelProvider.php (SR-004)
- T013 [P] Add System session isolation test in tests/Feature/System/Spec113/SystemSessionIsolationTest.php (assert response sets the System session cookie name for /system)
- T014 Implement /system/login throttling (10/min per IP + username key) in app/Filament/System/Pages/Auth/Login.php (SR-003; use RateLimiter and clear on success)
- T015 [P] Add /system/login throttling tests in tests/Feature/System/Spec113/SystemLoginThrottleTest.php (assert throttled after N failures; ensure failures still emit audit via AuditLogger)
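The 404 vs 403 semantics in T007/T008 can be summarized as a small decision function. This is a framework-free sketch for orientation only: `systemPlaneStatus` and its parameters are hypothetical names, and in the real code the wrong-plane check lives in ensure-correct-guard while the capability check lives in EnsurePlatformCapability.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of the codebase): maps the caller's plane and
 * capabilities to the HTTP status expected for a /system/* request (SR-002).
 *
 * @param list<string> $grantedCapabilities
 */
function systemPlaneStatus(bool $isPlatformUser, array $grantedCapabilities, string $requiredCapability): int
{
    // Wrong plane (for example a tenant user): 404.
    if (! $isPlatformUser) {
        return 404;
    }

    // Right plane, but the required capability is missing: 403.
    if (! in_array($requiredCapability, $grantedCapabilities, true)) {
        return 403;
    }

    // Platform user holding the capability: request proceeds.
    return 200;
}
```

The three branches line up with the three cases T008 asserts, so the regression test can be read as a truth table over this function.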
Phase 3: User Story 1 — Operator runs a runbook safely (Priority: P1) 🎯 MVP
Goal: /system/ops/runbooks supports preflight + explicit confirmation + reason capture + typed confirmation for all-tenants; starts a tracked OperationRun and links to “View run”.
Independent Test: Visit /system/ops/runbooks, run preflight, start run, follow “View run” to /system/ops/runs/{id}, and confirm audit/run records exist.
Tests for User Story 1
- T016 [P] [US1] Add runbook preflight tests in tests/Feature/System/OpsRunbooks/FindingsLifecycleBackfillPreflightTest.php (single tenant + all tenants preflight returns affected_count)
- T017 [P] [US1] Add runbook start/confirmation tests in tests/Feature/System/OpsRunbooks/FindingsLifecycleBackfillStartTest.php (typed confirmation + reason required for all_tenants; disabled when affected_count=0)
- T018 [P] [US1] Add break-glass reason enforcement + recording tests in tests/Feature/System/OpsRunbooks/FindingsLifecycleBackfillBreakGlassTest.php (reason required when break-glass active; break-glass marker and reason recorded on run + audit)
- T019 [P] [US1] Add Ops-UX feedback contract test for start surface in tests/Feature/System/OpsRunbooks/OpsUxStartSurfaceContractTest.php (toast intent-only + “View run” link; no DB queued/running notifications)
- T020 [P] [US1] Add audit fail-safe test in tests/Feature/System/OpsRunbooks/FindingsLifecycleBackfillAuditFailSafeTest.php (audit logger failure does not crash run; run still records failure outcome)
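The start rules exercised by T017 and T018 can be sketched as one pure guard function. Everything named here is an assumption for illustration: the function name, the `single_tenant` scope value, the error codes, and the confirmation phrase `BACKFILL` do not come from the spec. Only the rules themselves (reason and typed confirmation for all_tenants, reason under break-glass, 500-character reason_text limit, no start when affected_count=0) are taken from the tasks.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical guard: returns the list of reasons that block "Run".
 * An empty list means the run may start.
 *
 * @return list<string>
 */
function runbookStartErrors(
    string $scope,               // 'single_tenant' or 'all_tenants' (assumed values)
    int $affectedCount,          // result of the preceding preflight
    ?string $reasonCode,
    ?string $reasonText,
    ?string $typedConfirmation,
    bool $breakGlassActive = false,
    string $expectedPhrase = 'BACKFILL', // assumed confirmation phrase
): array {
    $errors = [];

    // Nothing to do: the Run action stays disabled.
    if ($affectedCount === 0) {
        $errors[] = 'nothing_to_do';
    }

    // A reason is mandatory for all-tenants runs and whenever break-glass is active.
    $reasonRequired = $scope === 'all_tenants' || $breakGlassActive;
    $reasonMissing = trim((string) $reasonCode) === '' || trim((string) $reasonText) === '';

    if ($reasonRequired && $reasonMissing) {
        $errors[] = 'reason_required';
    }

    // Byte length is used here for simplicity; a real check would likely count characters.
    if ($reasonText !== null && strlen($reasonText) > 500) {
        $errors[] = 'reason_text_too_long';
    }

    // All-tenants runs additionally need the typed confirmation phrase.
    if ($scope === 'all_tenants' && $typedConfirmation !== $expectedPhrase) {
        $errors[] = 'typed_confirmation_mismatch';
    }

    return $errors;
}
```

Keeping the rules in a pure function like this would let the Pest tests cover each combination without rendering the Filament page, though the page-level tests in T017 and T018 are still needed to prove the UI wires them in.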
Implementation for User Story 1
- T021 [US1] Create runbook service app/Services/Runbooks/FindingsLifecycleBackfillRunbookService.php with methods preflight(scope) and start(scope, initiator, reason, source)
- T022 [P] [US1] Create runbook scope/value objects in app/Services/Runbooks/FindingsLifecycleBackfillScope.php and app/Services/Runbooks/RunbookReason.php (validate reason_code and reason_text max 500 chars; include break-glass reason requirements)
- T023 [US1] Add audit events for preflight/start/completed/failed using AuditLogger in app/Services/Runbooks/FindingsLifecycleBackfillRunbookService.php (action IDs per specs/113-platform-ops-runbooks/data-model.md; must be fail-safe)
- T024 [US1] Record break-glass marker + reason on OperationRun context and audit in app/Services/Runbooks/FindingsLifecycleBackfillRunbookService.php (SR-005)
- T025 [US1] Implement all-tenants orchestration job in app/Jobs/BackfillFindingLifecycleWorkspaceJob.php (create/lock workspace-scoped OperationRun; dispatch tenant fan-out; set summary_counts[tenants/total/processed])
- T026 [US1] Implement tenant worker job that updates the shared workspace run in app/Jobs/BackfillFindingLifecycleTenantIntoWorkspaceRunJob.php (chunk writes; increment summary_counts keys from OperationSummaryKeys::all(); append failures; call maybeCompleteBulkRun())
- T027 [US1] Ensure scope-level lock prevents concurrent all-tenants runs in app/Services/Runbooks/FindingsLifecycleBackfillRunbookService.php (lock key includes workspace + scope)
- T028 [US1] Enable platform in-app notifications for run completion/failure by turning on database notifications in app/Providers/Filament/SystemPanelProvider.php (ensure terminal notification is OperationRunCompleted, initiator-only)
- T029 [P] [US1] Add System “View run” URL helper in app/Support/System/SystemOperationRunLinks.php and use it for UI + alerts/notifications (avoid admin-plane links)
- T030 [US1] Dispatch Alerts event on failure using app/Services/Alerts/AlertDispatchService.php from app/Services/Runbooks/FindingsLifecycleBackfillRunbookService.php (event_type operations.run.failed; include System “View run” URL)
- T031 [US1] Create System runbooks page class app/Filament/System/Pages/Ops/Runbooks.php (capability-gated; scope selector uses AllowedTenantUniverse; Preflight action; Run action with confirmation + typed confirm + reason)
- T032 [P] [US1] Create System runbooks page view resources/views/filament/system/pages/ops/runbooks.blade.php (operator warning; show preflight results + disable Run when nothing to do)
- T033 [US1] Create System runs list page class app/Filament/System/Pages/Ops/Runs.php (table listing operation runs for runbook types; default sort newest)
- T034 [P] [US1] Create System runs list view resources/views/filament/system/pages/ops/runs.blade.php (record inspection affordance: clickable row -> run detail)
- T035 [US1] Create System run detail page class app/Filament/System/Pages/Ops/ViewRun.php (infolist rendering of OperationRun; show scope/actor/counts/failures)
- T036 [P] [US1] Create System run detail view resources/views/filament/system/pages/ops/view-run.blade.php
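The fan-out bookkeeping in T025/T026 (one shared workspace run, per-tenant workers that bump counters, append failures, and then check for completion) can be pictured with an in-memory stand-in. This is only a sketch under assumptions: `WorkspaceRunSketch`, the idea that total/processed count tenants, and the `succeeded`/`failed` outcome names are all hypothetical, and the real jobs would persist to OperationRun with atomic or locked updates rather than mutate an object.

```php
<?php

declare(strict_types=1);

/** Hypothetical in-memory stand-in for the shared workspace-scoped run. */
final class WorkspaceRunSketch
{
    /** @var array<string, int> */
    public array $summaryCounts;

    /** @var list<array{tenant_id: int, message: string}> */
    public array $failures = [];

    public ?string $outcome = null;

    public function __construct(int $tenantCount)
    {
        // Set once by the orchestration job before dispatching the fan-out.
        $this->summaryCounts = [
            'tenants' => $tenantCount,
            'total' => $tenantCount,
            'processed' => 0,
        ];
    }

    /** Called by each tenant worker when it finishes (or fails). */
    public function recordTenantResult(int $tenantId, ?string $error = null): void
    {
        $this->summaryCounts['processed']++;

        if ($error !== null) {
            $this->failures[] = ['tenant_id' => $tenantId, 'message' => $error];
        }

        $this->maybeComplete();
    }

    /** Marks the run terminal once every tenant has reported in. */
    private function maybeComplete(): void
    {
        if ($this->outcome !== null) {
            return; // already terminal
        }

        if ($this->summaryCounts['processed'] < $this->summaryCounts['total']) {
            return; // workers still outstanding
        }

        $this->outcome = $this->failures === [] ? 'succeeded' : 'failed';
    }
}
```

The point of the shape is that any worker may be the last one to finish, so each worker runs the completion check itself instead of relying on the orchestration job to poll.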
Phase 4: User Story 2 — Customers never see maintenance actions (Priority: P1)
Goal: No /admin maintenance/backfill affordances by default; tenant users cannot access /system/* (404).
Independent Test: As a tenant user, /system/* returns 404; the /admin Findings list shows no backfill action while the feature flag is at its default (off).
Tests for User Story 2
- T037 [P] [US2] Add regression test asserting /admin Findings list has no backfill action by default in tests/Feature/Filament/Spec113/AdminFindingsNoMaintenanceActionsTest.php (targets app/Filament/Resources/FindingResource/Pages/ListFindings.php)
- T038 [P] [US2] Add tenant-plane 404 test for /system/ops/runbooks in tests/Feature/System/Spec113/TenantPlaneCannotAccessSystemTest.php
Implementation for User Story 2
- T039 [US2] Remove or feature-flag off the legacy header action in app/Filament/Resources/FindingResource/Pages/ListFindings.php (FR-001; default off in production-like envs)
- T040 [US2] Add a config-backed feature flag defaulting to false in config/tenantpilot.php (e.g., allow_admin_maintenance_actions) and wire it in app/Filament/Resources/FindingResource/Pages/ListFindings.php
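A possible shape for the T040 flag is sketched below. The key name follows the example given in the task; the environment variable name is an assumption, and the fragment shows only the new entry rather than the full config file.

```php
<?php

// config/tenantpilot.php (fragment; existing keys omitted)
return [
    // Legacy /admin maintenance actions stay hidden unless explicitly enabled.
    // The env variable name below is hypothetical.
    'allow_admin_maintenance_actions' => (bool) env('TENANTPILOT_ALLOW_ADMIN_MAINTENANCE_ACTIONS', false),
];
```

In ListFindings.php the header action could then be gated with something like `->visible(fn (): bool => (bool) config('tenantpilot.allow_admin_maintenance_actions', false))`, so an unset flag leaves the action hidden.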
Phase 5: User Story 3 — Same logic for deploy-time and operator re-run (Priority: P2)
Goal: One implementation path for preflight/start that is reused by System UI, CLI, and deploy-time automation.
Independent Test: Run the runbook twice with the same scope; second run produces updated_count=0; deploy-time entry point calls the same service.
Tests for User Story 3
- T041 [P] [US3] Add idempotency test in tests/Feature/System/OpsRunbooks/FindingsLifecycleBackfillIdempotencyTest.php (second run reports updated_count=0 and/or preflight reports affected_count=0)
- T042 [P] [US3] Add deploy-time entry point test in tests/Feature/Console/Spec113/DeployRunbooksCommandTest.php (command delegates to FindingsLifecycleBackfillRunbookService)
Implementation for User Story 3
- T043 [US3] Refactor CLI command to call shared runbook service in app/Console/Commands/TenantpilotBackfillFindingLifecycle.php (single-tenant scope, source=cli)
- T044 [US3] Add deploy-time runbooks command in app/Console/Commands/TenantpilotRunDeployRunbooks.php (source=deploy_hook; initiator null; uses FindingsLifecycleBackfillRunbookService)
- T045 [US3] Ensure System UI uses the same runbook service start() call path in app/Filament/System/Pages/Ops/Runbooks.php (source=system_ui)
- T046 [US3] Ensure initiator-null runs do not emit terminal DB notification in app/Services/OperationRunService.php (system-run behavior; audit/alerts still apply)
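The idempotency expectation behind T041 (a second run over the same scope updates nothing) can be illustrated with plain arrays. This is a sketch, not the backfill logic itself: the `lifecycle_status` field, the `new` default, and both function names are hypothetical stand-ins for whatever the shared runbook service actually fills in.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical preflight: counts findings that still lack a lifecycle value.
 *
 * @param list<array<string, mixed>> $findings
 */
function preflightAffectedCount(array $findings): int
{
    return count(array_filter(
        $findings,
        fn (array $finding): bool => ($finding['lifecycle_status'] ?? null) === null,
    ));
}

/**
 * Hypothetical backfill: fills the missing value and reports how many rows changed.
 *
 * @param list<array<string, mixed>> $findings
 * @return array{findings: list<array<string, mixed>>, updated_count: int}
 */
function backfillLifecycle(array $findings): array
{
    $updated = 0;

    foreach ($findings as $index => $finding) {
        // Only touch rows that are actually missing the value, so re-runs are no-ops.
        if (($finding['lifecycle_status'] ?? null) === null) {
            $findings[$index]['lifecycle_status'] = 'new';
            $updated++;
        }
    }

    return ['findings' => $findings, 'updated_count' => $updated];
}
```

Because the UI, the CLI, and the deploy hook all go through the same service, the same property should hold regardless of which entry point triggered the first run.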
Phase 6: Polish & Cross-Cutting Concerns
- T047 [P] Run new Spec 113 tests via vendor/bin/sail artisan test --compact tests/Feature/System/Spec113/ (ensure all new tests pass)
- T048 [P] Run Ops Runbooks tests via vendor/bin/sail artisan test --compact tests/Feature/System/OpsRunbooks/ (ensure US1/US3 tests pass)
- T049 [P] Run formatting on touched files via vendor/bin/sail bin pint --dirty --format agent (targets app/Http/Middleware/, app/Filament/System/Pages/, app/Services/Runbooks/, tests/Feature/System/)
Dependencies & Execution Order
Phase Dependencies
- Setup (Phase 1): no dependencies
- Foundational (Phase 2): depends on Setup; BLOCKS all story work
- US1 (Phase 3): depends on Foundational
- US2 (Phase 4): depends on Foundational
- US3 (Phase 5): depends on US1 shared runbook service (T021) + Foundational
- Polish (Phase 6): depends on desired stories being complete
User Story Dependencies
- US1 (P1): foundational security + session isolation + login throttle must be in place first
- US2 (P1): can be implemented after Foundational; independent of US1 UI
- US3 (P2): depends on the shared runbook service created in US1
Parallel Execution Examples
US1 parallelizable tasks
- T016, T017, T018, T019, T020 can be drafted in parallel (tests in separate files under tests/Feature/System/OpsRunbooks/)
- T031/T032, T033/T034, and T035/T036 can be built in parallel (separate System page classes/views)
- T025 and T026 can be built in parallel once the service contract (T021) is agreed
US2 parallelizable tasks
- T037 and T038 can run in parallel (tests)
- T039 and T040 both touch ListFindings.php: land T040 (feature flag) first, then T039; work them in parallel only if the flag name is agreed up front
US3 parallelizable tasks
- T041 and T042 can run in parallel (tests)
- T043 and T044 can be implemented in parallel once T021 exists
Implementation Strategy (MVP First)
- Complete Phase 2 (security semantics + session isolation + login throttle)
- Deliver US1 (System runbooks page + OperationRun tracking + System runs list and run detail)
- Deliver US2 (remove/disable /admin maintenance UI)
- Deliver US3 (shared logic reused by CLI + deploy-time automation)