# Implementation Plan: Operational Controls **Branch**: `242-operational-controls` | **Date**: 2026-04-26 | **Spec**: `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/242-operational-controls/spec.md` **Input**: Feature specification from `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/242-operational-controls/spec.md` **Note**: This template is filled in by the `/speckit.plan` command. See `.specify/scripts/` for helper scripts. ## Summary - Replace the ad-hoc `allow_admin_maintenance_actions` environment gate with one product-owned operational-control path for the first-slice keys `findings.lifecycle.backfill` and `restore.execute`. - Introduce one platform-operated activation record plus one shared evaluator that plugs into the existing system runbook, tenant findings-maintenance, and restore-execution start seams without becoming a generic experimentation platform. - Reuse existing enforcement and UX seams - `UiEnforcement`, `ProviderOperationStartGate`, `OperationRunService`, `OperationUxPresenter`, `ProviderOperationStartResultPresenter`, `AuditRecorder`, `WorkspaceAuditLogger`, and `AuditActionId` - so the slice stays small, auditable, and server-side enforced. ## Technical Context **Language/Version**: PHP 8.4 (Laravel 12) **Primary Dependencies**: Laravel 12 + Filament v5 + Livewire v4 + Pest; existing `UiEnforcement`, `ProviderOperationStartGate`, `OperationRunService`, `AuditRecorder`, `WorkspaceAuditLogger`, `AuditActionId`, `PlatformCapabilities` **Storage**: PostgreSQL via existing product tables plus one new platform-operated `operational_control_activations` table; no tenant-owned control tables **Testing**: Pest unit + feature tests only **Validation Lanes**: fast-feedback, confidence **Target Platform**: Sail-backed Laravel admin surfaces under `/admin/t/{tenant}` and system surfaces under `/system` **Project Type**: web **Performance Goals**: effective-control resolution remains DB-only and cheap at action start time, adds no outbound HTTP, and blocks in-scope starts before queue or provider execution begins **Constraints**: no generic feature-flag platform, no new browser or heavy-governance suite, no break-glass bypass in v1, no parallel env gate for in-scope controls, global pauses win over workspace pauses, preserve 404 vs 403 semantics, keep provider-specific restore behavior out of platform-core control vocabulary **Scale/Scope**: 2 control keys, 2 scope levels (global and workspace), 1 system management surface, and 3 concrete enforcement families across 4 touched UI surfaces ## UI / Surface Guardrail Plan - **Guardrail scope**: changed surfaces - **Native vs custom classification summary**: native Filament + shared start/result primitives - **Shared-family relevance**: header actions, runbook launch actions, provider-backed start results, audit-backed control changes - **State layers in scope**: page, detail, action/modal - **Handling modes by drift class or surface**: review-mandatory - **Repository-signal treatment**: review-mandatory - **Special surface test profiles**: standard-native-filament, monitoring-state-page - **Required tests or manual smoke**: functional-core, state-contract - **Exception path and spread control**: none; v1 must not allow a second local runtime-control dialect - **Active feature PR close-out entry**: Guardrail ## Shared Pattern & System Fit - **Cross-cutting feature marker**: yes - **Systems touched**: `App\Filament\System\Pages\Ops\Runbooks`, new system ops controls page, `App\Filament\Resources\FindingResource\Pages\ListFindings`, `App\Filament\Resources\RestoreRunResource`, `App\Support\Rbac\UiEnforcement`, `App\Services\Providers\ProviderOperationStartGate`, `App\Support\OpsUx\OperationUxPresenter`, `App\Support\OpsUx\ProviderOperationStartResultPresenter`, `App\Services\Audit\AuditRecorder`, `App\Services\Audit\WorkspaceAuditLogger`, `App\Support\Audit\AuditActionId` - **Shared abstractions reused**: `UiEnforcement`, `ProviderOperationStartGate`, `ProviderOperationStartResultPresenter`, `OperationRunService`, `OperationUxPresenter`, `OpsUxBrowserEvents`, `OperationRunLinks`, `SystemOperationRunLinks`, `AuditRecorder`, `WorkspaceAuditLogger` - **New abstraction introduced? why?**: one bounded `OperationalControlCatalog` plus one `OperationalControlEvaluator` are justified because the feature now has two real concrete control keys that must evaluate consistently across system-plane and tenant-plane start paths. No registry lattice, provider strategy system, or customer-facing flag DSL is introduced. - **Why the existing abstraction was sufficient or insufficient**: existing abstractions already own auth, queue start UX, and audit writing; they are insufficient because none presently carries a reusable runtime-safety decision that can pause an action before it starts, and `WorkspaceAuditLogger` alone cannot truthfully own global platform-plane mutations. - **Bounded deviation / spread control**: no deviation is allowed for in-scope controls; every affected surface must route through the shared evaluator rather than direct `config(...)` reads or page-local booleans. ## OperationRun UX Impact - **Touches OperationRun start/completion/link UX?**: yes - **Central contract reused**: shared OperationRun start UX plus provider-start result helpers - **Delegated UX behaviors**: queued toast, `Open operation` / `View run` links, run-enqueued browser event, dedupe-or-blocked messaging, and tenant/workspace-safe URL resolution remain on existing shared paths - **Surface-owned behavior kept local**: initiation inputs, confirmation copy, and control-management forms only - **Queued DB-notification policy**: unchanged explicit opt-in only - **Terminal notification path**: existing central lifecycle mechanism for starts that are allowed - **Exception path**: none ## Provider Boundary & Portability Fit - **Shared provider/platform boundary touched?**: yes - **Provider-owned seams**: provider-backed `restore.execute` dispatch, provider binding resolution, provider reason translation, existing restore safety and dry-run behavior - **Platform-core seams**: operational-control vocabulary, scope/effective-state evaluation, control management surface, audit labels, blocked-state semantics - **Neutral platform terms / contracts preserved**: operational control, activation, effective state, scope, reason, expiry, blocked execution - **Retained provider-specific semantics and why**: `restore.execute` remains Microsoft-specific provider behavior in the current release because the control feature governs only start allowance, not provider execution semantics - **Bounded extraction or follow-up path**: none in this slice; future catalog growth or provider-neutral expansions require a follow-up spec instead of implicit widening here ## Constitution Check *GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.* - Read/write separation: PASS - control management is an explicit platform-plane mutation with confirmation, audit, and focused tests; blocked execution paths remain non-mutating except for audit logging. - RBAC-UX: PASS - platform management stays on `/system`; tenant/admin execution surfaces stay on `/admin/t/{tenant}`; cross-plane access remains 404; entitled-but-paused users get explicit control feedback while membership and capability failures keep 404/403 semantics. - Workspace isolation / tenant isolation: PASS - workspace-targeted controls apply only within the chosen workspace; tenant surfaces still resolve tenant/workspace entitlement before control-state disclosure. - Run observability / Ops-UX: PASS - allowed starts reuse existing `OperationRun` paths; blocked starts create no run and no new lifecycle dialect; later control activation does not retroactively mutate already accepted runs; shared start/result helpers remain authoritative. - Shared path reuse / `XCUT-001`: PASS - the design extends existing UI enforcement, provider-start gating, audit logging, and operation start UX instead of introducing page-local flags. - Provider boundary / `PROV-001`: PASS - control language stays provider-neutral while restore execution remains provider-owned. - Proportionality / `PROP-001` and `ABSTR-001`: PASS - the only new structure is justified by two current-release controls and three existing enforcement surfaces; no experimentation platform or generalized remote-config system is planned. - Persisted truth / `PERSIST-001`: PASS - active control activations represent independent runtime-safety truth with their own scope, reason, expiry, and audit obligations; convenience UI state remains derived. - Behavioral state / `STATE-001`: PASS - paused/enabled semantics change whether execution may start and therefore justify one bounded effective-state model. - Filament-native UI / `UI-FIL-001`: PASS - all touched surfaces remain native Filament pages/resources/actions; no custom UI framework is introduced. - Global search rule: N/A - no new globally searchable resource is added. - Panel/provider registration: PASS - Filament v5 remains on Livewire v4 and no new panel/provider registration is required; Laravel 12 provider registration stays in `bootstrap/providers.php` if any provider change becomes necessary. - Test governance / `TEST-GOV-001`: PASS - proof stays in focused unit and feature lanes with no browser or heavy-governance expansion. ## Test Governance Check - **Test purpose / classification by changed surface**: Unit for catalog/evaluator/scope precedence/expiry logic; Feature for system control management, runbook enforcement, findings header-action enforcement, restore-execution enforcement, audit logging, and `404`/`403` semantics - **Affected validation lanes**: fast-feedback, confidence - **Why this lane mix is the narrowest sufficient proof**: the business truth is server-side effective-state resolution plus enforcement at existing Filament and service seams. Browser tests would duplicate modal choreography without proving additional runtime safety truth. - **Narrowest proving command(s)**: - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Unit/Support/OperationalControls/OperationalControlCatalogTest.php tests/Unit/Support/OperationalControls/OperationalControlEvaluatorTest.php tests/Unit/Support/OperationalControls/OperationalControlScopeResolutionTest.php` - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/System/OpsControls/OperationalControlManagementTest.php tests/Feature/System/OpsRunbooks/OperationalControlRunbookGateTest.php` - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Findings/OperationalControlFindingsBackfillGateTest.php tests/Feature/Restore/OperationalControlRestoreExecutionGateTest.php tests/Feature/OperationalControls/OperationalControlAuthorizationSemanticsTest.php tests/Feature/OperationalControls/NoAdHocOperationalControlBypassTest.php` - **Fixture / helper / factory / seed / context cost risks**: add one local factory for active control activations plus platform-user and workspace-scoped setup helpers reused only by operational-control tests; avoid new shared browser or provider-fixture defaults - **Expensive defaults or shared helper growth introduced?**: no; control fixtures stay opt-in and local to the new test family - **Heavy-family additions, promotions, or visibility changes**: none - **Surface-class relief / special coverage rule**: standard-native-filament and monitoring-state-page relief are sufficient; assert disabled/blocked behavior and no side effects instead of browser-only choreography - **Closing validation and reviewer handoff**: reviewers should rerun the targeted unit/feature commands, verify the env gate is removed from the in-scope findings action, confirm restore execution is blocked before queue/provider start, confirm blocked-execution audit entries exist for runbook/findings/restore paths, confirm global control changes audit without false workspace ownership, confirm `/system/ops/controls` returns 403 for system users missing `platform.ops.controls.manage`, and confirm non-members still receive 404 while missing capabilities still receive 403 with the existing capability-denied UX rather than paused-state helper text - **Budget / baseline / trend follow-up**: low-to-moderate increase in focused unit/feature coverage only - **Review-stop questions**: did implementation add a second control persistence shape, leave the env gate in place, introduce a local blocked-state dialect, or widen into browser/heavy-governance lanes? - **Escalation path**: `reject-or-split` if the implementation widens into generic feature-flagging or customer-managed controls; `document-in-feature` for small shared-helper extensions that remain local to this slice - **Active feature PR close-out entry**: Guardrail - **Why no dedicated follow-up spec is needed**: the planned new model, evaluator, and tests stay local to the first-slice control family; recurring growth beyond the two bounded control keys would require its own follow-up spec ## Project Structure ### Documentation (this feature) ```text specs/242-operational-controls/ ├── plan.md ├── research.md ├── data-model.md ├── quickstart.md ├── checklists/ │ └── requirements.md ├── contracts/ │ └── operational-controls.contract.yaml └── tasks.md ``` ### Source Code (repository root) ```text apps/platform/ ├── app/ │ ├── Filament/System/Pages/Ops/ │ │ ├── Controls.php │ │ └── Runbooks.php │ ├── Filament/Resources/FindingResource/Pages/ListFindings.php │ ├── Filament/Resources/RestoreRunResource.php │ ├── Models/ │ │ └── OperationalControlActivation.php │ ├── Services/Audit/AuditRecorder.php │ ├── Services/Audit/WorkspaceAuditLogger.php │ ├── Services/Providers/ProviderOperationStartGate.php │ ├── Support/Audit/AuditActionId.php │ ├── Support/Auth/PlatformCapabilities.php │ └── Support/OperationalControls/ │ ├── OperationalControlCatalog.php │ ├── OperationalControlDecision.php │ └── OperationalControlEvaluator.php ├── database/ │ ├── factories/ │ │ └── OperationalControlActivationFactory.php │ └── migrations/ │ └── *_create_operational_control_activations_table.php └── tests/ ├── Feature/ │ ├── Findings/OperationalControlFindingsBackfillGateTest.php │ ├── OperationalControls/ │ │ ├── NoAdHocOperationalControlBypassTest.php │ │ └── OperationalControlAuthorizationSemanticsTest.php │ ├── Restore/OperationalControlRestoreExecutionGateTest.php │ ├── System/OpsControls/OperationalControlManagementTest.php │ └── System/OpsRunbooks/OperationalControlRunbookGateTest.php └── Unit/Support/OperationalControls/ ├── OperationalControlCatalogTest.php ├── OperationalControlEvaluatorTest.php └── OperationalControlScopeResolutionTest.php ``` **Structure Decision**: Single Laravel web application. The feature adds one bounded platform-operated model and one small support namespace for operational-control evaluation, then plugs that into existing system and tenant Filament surfaces. ## Complexity Tracking No unapproved constitution violations are required. The only new persistence and abstraction are the justified control-activation record plus evaluator/catalog pair described below. ## Proportionality Review - **Current operator problem**: founders and platform operators need a safe runtime way to pause already-existing risky actions without editing environment variables or relying on inconsistent per-surface logic. - **Existing structure is insufficient because**: `UiEnforcement` decides RBAC, `ProviderOperationStartGate` decides provider readiness, and env flags decide hidden page-local runtime behavior. None of those alone gives one auditable runtime-safety truth across both system and tenant surfaces. - **Narrowest correct implementation**: persist only explicit active control activations, derive the enabled state from absence of an activation, evaluate one effective decision through a shared catalog/evaluator, and wire that into the three concrete existing start paths. - **Ownership cost created**: one new table/model/factory, one small support namespace, one system page, new audit action IDs and capability constants, and focused unit/feature coverage. - **Alternative intentionally rejected**: keep env/config flags, reuse workspace settings, or build a generalized feature-flag system. Env/config flags are invisible product truth, workspace settings do not cleanly represent one global control truth, and a generic flag platform is far too broad. - **Release truth**: current-release truth ## Phase 0 — Research (output: `research.md`) See: `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/242-operational-controls/research.md` Goals: - Confirm the narrowest persistence shape for runtime-safety truth and explicitly reject env-only or workspace-settings-only alternatives. - Confirm the smallest shared seam where control evaluation belongs for system runbooks, tenant findings lifecycle backfill, and provider-backed restore execution. - Define v1 scoping, global-first precedence, expiry, and audit expectations without inventing a generic flag taxonomy. - Document the v1 decision that break-glass and broad platform capabilities do not bypass an active operational control. ## Phase 1 — Design & Contracts (outputs: `data-model.md`, `contracts/`, `quickstart.md`) See: - `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/242-operational-controls/data-model.md` - `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/242-operational-controls/contracts/operational-controls.contract.yaml` - `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/242-operational-controls/quickstart.md` Design focus: - Add one platform-operated activation record that can pause a control globally or for one workspace, with optional expiry, auditable reason, global-first precedence, and partial unique indexes that enforce one active global row per control and one active workspace row per control/workspace pair; the write path deletes expired conflicting rows before inserting a new activation, and this table is not used as an archive. - Add one new system ops controls page that lists the two bounded control keys, their effective state, scope, owner, expiry, change actions, and on-demand audit history links, and uses a staged scope-impact preview before control mutations are confirmed. - Use `OperationalControlDecision` as the shared control-state presentation primitive for controls, runbooks, findings, and restore surfaces. - Route `findings.lifecycle.backfill` through the new evaluator in both `ListFindings` and `Runbooks`, removing the existing env gate. - Route `findings.lifecycle.backfill` through `FindingsLifecycleBackfillRunbookService::start()` so the system runbooks page, tenant findings page, CLI command, and deploy-hook command all honor the same control decision. - Route `restore.execute` through the same evaluator before provider-backed or non-provider-backed queued restore execution is created. - Add dedicated audit action IDs and a dedicated platform capability for control management, using `AuditRecorder` for global control changes and blocked system-plane all-tenant attempts, and `WorkspaceAuditLogger` for workspace/tenant-scoped changes and blocked-execution evidence with concrete scope. - Keep blocked-state messaging on existing shared start/result helpers and avoid custom control-state UI frameworks. ## Phase 1 — Agent Context Update After Phase 1 artifacts are generated, update Copilot context from the plan: - `/Users/ahmeddarrazi/Documents/projects/wt-plattform/.specify/scripts/bash/update-agent-context.sh copilot` ## Phase 2 — Implementation Outline (tasks created in `/speckit.tasks`) - Add the `operational_control_activations` persistence, model, and local factory for active pause records. - Introduce the bounded operational-controls support namespace (`OperationalControlCatalog`, `OperationalControlDecision`, `OperationalControlEvaluator`) and keep enabled-state derived from active rows. - Add the dedicated controls-manage capability and its local grant path in the seeded platform operator setup. - Add the system-plane controls page and wire it into the existing system ops navigation with staged preview-plus-confirm pause/resume actions, audit logging, and on-demand audit history links. - Replace the findings env gate with evaluator-driven control checks on the tenant findings header action and the system runbooks start path. - Integrate the same evaluator into restore execution before any queued execution `OperationRun`, queued execution `RestoreRun`, queue dispatch, or provider-backed execution starts. - Add focused unit and feature tests, plus a guard test that blocks new ad-hoc runtime-control bypasses for in-scope controls and one proving path that activating a control does not rewrite previously accepted runs. ## Constitution Check (Post-Design) Re-check target: PASS. The post-design shape must still use one bounded control catalog, one active-row persistence model, one evaluator, existing auth/start/audit helpers, and no second runtime-control dialect. ## Implementation Close-out - Delivered the bounded operational-controls slice end-to-end: one `operational_control_activations` truth model, one catalog/evaluator/decision support path, a new `/system/ops/controls` management page, findings lifecycle enforcement through `FindingsLifecycleBackfillRunbookService::start()`, and restore execution blocking before any queued execution `OperationRun`, queued execution `RestoreRun`, job dispatch, or provider-backed start. - Runtime cleanup landed with the in-scope findings env gate removed from `config/tenantpilot.php`, a source-scanning guard against ad-hoc bypasses, and workspace-isolation proof showing a workspace-scoped pause blocks only the targeted workspace while a second workspace remains unaffected. - Validation passed on the narrow feature lane: `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Unit/Support/OperationalControls/OperationalControlCatalogTest.php tests/Unit/Support/OperationalControls/OperationalControlEvaluatorTest.php tests/Unit/Support/OperationalControls/OperationalControlScopeResolutionTest.php tests/Feature/Filament/Spec113/AdminFindingsNoMaintenanceActionsTest.php tests/Feature/System/OpsControls/OperationalControlManagementTest.php tests/Feature/System/OpsRunbooks/OperationalControlRunbookGateTest.php tests/Feature/Findings/OperationalControlFindingsBackfillGateTest.php tests/Feature/Restore/OperationalControlRestoreExecutionGateTest.php tests/Feature/OperationalControls/OperationalControlAuthorizationSemanticsTest.php tests/Feature/OperationalControls/NoAdHocOperationalControlBypassTest.php` with `20 passed (253 assertions)`. - Formatting passed with `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && cd apps/platform && ./vendor/bin/sail bin pint --dirty --format agent`. - Manual smoke passed in the integrated browser: the staged pause/resume flow on `/system/ops/controls` for `Findings lifecycle backfill` rendered scope-impact previews, applied the global pause, and returned to `Enabled` inside the SC-001 budget after bringing the local database up to date.