Removes the Findings lifecycle backfill from the Operational Controls UI and OperationalControlCatalog. This patch is a safe, controls-only change; runbooks, jobs and other runtime artifacts are NOT removed yet. Follow-up work will delete the runbook service/scope, jobs, commands, and update tests.
Files changed:
- apps/platform/app/Filament/System/Pages/Ops/Controls.php
- apps/platform/app/Support/OperationalControls/OperationalControlCatalog.php
- apps/platform/tests/Feature/System/OpsControls/OperationalControlManagementTest.php
- apps/platform/tests/Unit/Support/OperationalControls/OperationalControlCatalogTest.php
- apps/platform/tests/Unit/Support/OperationalControls/OperationalControlScopeResolutionTest.php
Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #280
Implementation Plan: Operational Controls
Branch: 242-operational-controls | Date: 2026-04-26 | Spec: /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/242-operational-controls/spec.md
Input: Feature specification from /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/242-operational-controls/spec.md
Note: This template is filled in by the /speckit.plan command. See .specify/scripts/ for helper scripts.
Summary
- Replace the ad-hoc allow_admin_maintenance_actions environment gate with one product-owned operational-control path for the first-slice keys findings.lifecycle.backfill and restore.execute.
- Introduce one platform-operated activation record plus one shared evaluator that plugs into the existing system runbook, tenant findings-maintenance, and restore-execution start seams without becoming a generic experimentation platform.
- Reuse existing enforcement and UX seams - UiEnforcement, ProviderOperationStartGate, OperationRunService, OperationUxPresenter, ProviderOperationStartResultPresenter, AuditRecorder, WorkspaceAuditLogger, and AuditActionId - so the slice stays small, auditable, and server-side enforced.
Technical Context
Language/Version: PHP 8.4 (Laravel 12)
Primary Dependencies: Laravel 12 + Filament v5 + Livewire v4 + Pest; existing UiEnforcement, ProviderOperationStartGate, OperationRunService, AuditRecorder, WorkspaceAuditLogger, AuditActionId, PlatformCapabilities
Storage: PostgreSQL via existing product tables plus one new platform-operated operational_control_activations table; no tenant-owned control tables
Testing: Pest unit + feature tests only
Validation Lanes: fast-feedback, confidence
Target Platform: Sail-backed Laravel admin surfaces under /admin/t/{tenant} and system surfaces under /system
Project Type: web
Performance Goals: effective-control resolution remains DB-only and cheap at action start time, adds no outbound HTTP, and blocks in-scope starts before queue or provider execution begins
Constraints: no generic feature-flag platform, no new browser or heavy-governance suite, no break-glass bypass in v1, no parallel env gate for in-scope controls, global pauses win over workspace pauses, preserve 404 vs 403 semantics, keep provider-specific restore behavior out of platform-core control vocabulary
Scale/Scope: 2 control keys, 2 scope levels (global and workspace), 1 system management surface, and 3 concrete enforcement families across 4 touched UI surfaces
UI / Surface Guardrail Plan
- Guardrail scope: changed surfaces
- Native vs custom classification summary: native Filament + shared start/result primitives
- Shared-family relevance: header actions, runbook launch actions, provider-backed start results, audit-backed control changes
- State layers in scope: page, detail, action/modal
- Handling modes by drift class or surface: review-mandatory
- Repository-signal treatment: review-mandatory
- Special surface test profiles: standard-native-filament, monitoring-state-page
- Required tests or manual smoke: functional-core, state-contract
- Exception path and spread control: none; v1 must not allow a second local runtime-control dialect
- Active feature PR close-out entry: Guardrail
Shared Pattern & System Fit
- Cross-cutting feature marker: yes
- Systems touched: App\Filament\System\Pages\Ops\Runbooks, new system ops controls page, App\Filament\Resources\FindingResource\Pages\ListFindings, App\Filament\Resources\RestoreRunResource, App\Support\Rbac\UiEnforcement, App\Services\Providers\ProviderOperationStartGate, App\Support\OpsUx\OperationUxPresenter, App\Support\OpsUx\ProviderOperationStartResultPresenter, App\Services\Audit\AuditRecorder, App\Services\Audit\WorkspaceAuditLogger, App\Support\Audit\AuditActionId
- Shared abstractions reused: UiEnforcement, ProviderOperationStartGate, ProviderOperationStartResultPresenter, OperationRunService, OperationUxPresenter, OpsUxBrowserEvents, OperationRunLinks, SystemOperationRunLinks, AuditRecorder, WorkspaceAuditLogger
- New abstraction introduced? why?: one bounded OperationalControlCatalog plus one OperationalControlEvaluator are justified because the feature now has two real concrete control keys that must evaluate consistently across system-plane and tenant-plane start paths. No registry lattice, provider strategy system, or customer-facing flag DSL is introduced.
- Why the existing abstraction was sufficient or insufficient: existing abstractions already own auth, queue start UX, and audit writing; they are insufficient because none presently carries a reusable runtime-safety decision that can pause an action before it starts, and WorkspaceAuditLogger alone cannot truthfully own global platform-plane mutations.
- Bounded deviation / spread control: no deviation is allowed for in-scope controls; every affected surface must route through the shared evaluator rather than direct config(...) reads or page-local booleans.
OperationRun UX Impact
- Touches OperationRun start/completion/link UX?: yes
- Central contract reused: shared OperationRun start UX plus provider-start result helpers
- Delegated UX behaviors: queued toast, Open operation/View run links, run-enqueued browser event, dedupe-or-blocked messaging, and tenant/workspace-safe URL resolution remain on existing shared paths
- Surface-owned behavior kept local: initiation inputs, confirmation copy, and control-management forms only
- Queued DB-notification policy: unchanged explicit opt-in only
- Terminal notification path: existing central lifecycle mechanism for starts that are allowed
- Exception path: none
Provider Boundary & Portability Fit
- Shared provider/platform boundary touched?: yes
- Provider-owned seams: provider-backed restore.execute dispatch, provider binding resolution, provider reason translation, existing restore safety and dry-run behavior
- Platform-core seams: operational-control vocabulary, scope/effective-state evaluation, control management surface, audit labels, blocked-state semantics
- Neutral platform terms / contracts preserved: operational control, activation, effective state, scope, reason, expiry, blocked execution
- Retained provider-specific semantics and why: restore.execute remains Microsoft-specific provider behavior in the current release because the control feature governs only start allowance, not provider execution semantics
- Bounded extraction or follow-up path: none in this slice; future catalog growth or provider-neutral expansions require a follow-up spec instead of implicit widening here
Constitution Check
GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.
- Read/write separation: PASS - control management is an explicit platform-plane mutation with confirmation, audit, and focused tests; blocked execution paths remain non-mutating except for audit logging.
- RBAC-UX: PASS - platform management stays on /system; tenant/admin execution surfaces stay on /admin/t/{tenant}; cross-plane access remains 404; entitled-but-paused users get explicit control feedback while membership and capability failures keep 404/403 semantics.
- Workspace isolation / tenant isolation: PASS - workspace-targeted controls apply only within the chosen workspace; tenant surfaces still resolve tenant/workspace entitlement before control-state disclosure.
- Run observability / Ops-UX: PASS - allowed starts reuse existing OperationRun paths; blocked starts create no run and no new lifecycle dialect; later control activation does not retroactively mutate already accepted runs; shared start/result helpers remain authoritative.
- Shared path reuse / XCUT-001: PASS - the design extends existing UI enforcement, provider-start gating, audit logging, and operation start UX instead of introducing page-local flags.
- Provider boundary / PROV-001: PASS - control language stays provider-neutral while restore execution remains provider-owned.
- Proportionality / PROP-001 and ABSTR-001: PASS - the only new structure is justified by two current-release controls and three existing enforcement surfaces; no experimentation platform or generalized remote-config system is planned.
- Persisted truth / PERSIST-001: PASS - active control activations represent independent runtime-safety truth with their own scope, reason, expiry, and audit obligations; convenience UI state remains derived.
- Behavioral state / STATE-001: PASS - paused/enabled semantics change whether execution may start and therefore justify one bounded effective-state model.
- Filament-native UI / UI-FIL-001: PASS - all touched surfaces remain native Filament pages/resources/actions; no custom UI framework is introduced.
- Global search rule: N/A - no new globally searchable resource is added.
- Panel/provider registration: PASS - Filament v5 remains on Livewire v4 and no new panel/provider registration is required; Laravel 12 provider registration stays in bootstrap/providers.php if any provider change becomes necessary.
- Test governance / TEST-GOV-001: PASS - proof stays in focused unit and feature lanes with no browser or heavy-governance expansion.
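The RBAC-UX check above implies an evaluation order: membership first (404), then capability (403), and only then the operational control (paused feedback). The following is a minimal, framework-free sketch of that ordering. The StartOutcome enum and resolveStartOutcome() function are hypothetical names used for illustration; the real checks are expected to stay inside the existing UiEnforcement and evaluator seams.

```php
<?php

declare(strict_types=1);

// Hypothetical outcome vocabulary; not part of the planned codebase.
enum StartOutcome: string
{
    case NotFound = 'not_found';   // 404: caller is not a member of the tenant/workspace
    case Forbidden = 'forbidden';  // 403: member, but missing the required capability
    case Paused = 'paused';        // entitled, but an operational control is active
    case Allowed = 'allowed';      // entitled and no active control
}

/**
 * Resolves the outcome in a fixed order so that control state is never
 * disclosed to callers who fail membership or capability checks.
 */
function resolveStartOutcome(bool $isMember, bool $hasCapability, bool $controlPaused): StartOutcome
{
    if (! $isMember) {
        return StartOutcome::NotFound;
    }

    if (! $hasCapability) {
        return StartOutcome::Forbidden;
    }

    return $controlPaused ? StartOutcome::Paused : StartOutcome::Allowed;
}
```

With this ordering, a non-member sees the same 404 whether or not the control is paused, which matches the requirement that entitlement is resolved before control-state disclosure.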
Test Governance Check
- Test purpose / classification by changed surface: Unit for catalog/evaluator/scope precedence/expiry logic; Feature for system control management, runbook enforcement, findings header-action enforcement, restore-execution enforcement, audit logging, and 404/403 semantics
- Affected validation lanes: fast-feedback, confidence
- Why this lane mix is the narrowest sufficient proof: the business truth is server-side effective-state resolution plus enforcement at existing Filament and service seams. Browser tests would duplicate modal choreography without proving additional runtime safety truth.
- Narrowest proving command(s):
  - export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Unit/Support/OperationalControls/OperationalControlCatalogTest.php tests/Unit/Support/OperationalControls/OperationalControlEvaluatorTest.php tests/Unit/Support/OperationalControls/OperationalControlScopeResolutionTest.php
  - export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/System/OpsControls/OperationalControlManagementTest.php tests/Feature/System/OpsRunbooks/OperationalControlRunbookGateTest.php
  - export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Findings/OperationalControlFindingsBackfillGateTest.php tests/Feature/Restore/OperationalControlRestoreExecutionGateTest.php tests/Feature/OperationalControls/OperationalControlAuthorizationSemanticsTest.php tests/Feature/OperationalControls/NoAdHocOperationalControlBypassTest.php
- Fixture / helper / factory / seed / context cost risks: add one local factory for active control activations plus platform-user and workspace-scoped setup helpers reused only by operational-control tests; avoid new shared browser or provider-fixture defaults
- Expensive defaults or shared helper growth introduced?: no; control fixtures stay opt-in and local to the new test family
- Heavy-family additions, promotions, or visibility changes: none
- Surface-class relief / special coverage rule: standard-native-filament and monitoring-state-page relief are sufficient; assert disabled/blocked behavior and no side effects instead of browser-only choreography
- Closing validation and reviewer handoff: reviewers should rerun the targeted unit/feature commands, verify the env gate is removed from the in-scope findings action, confirm restore execution is blocked before queue/provider start, confirm blocked-execution audit entries exist for runbook/findings/restore paths, confirm global control changes audit without false workspace ownership, confirm /system/ops/controls returns 403 for system users missing platform.ops.controls.manage, and confirm non-members still receive 404 while missing capabilities still receive 403 with the existing capability-denied UX rather than paused-state helper text
- Budget / baseline / trend follow-up: low-to-moderate increase in focused unit/feature coverage only
- Review-stop questions: did implementation add a second control persistence shape, leave the env gate in place, introduce a local blocked-state dialect, or widen into browser/heavy-governance lanes?
- Escalation path: reject-or-split if the implementation widens into generic feature-flagging or customer-managed controls; document-in-feature for small shared-helper extensions that remain local to this slice
- Active feature PR close-out entry: Guardrail
- Why no dedicated follow-up spec is needed: the planned new model, evaluator, and tests stay local to the first-slice control family; recurring growth beyond the two bounded control keys would require its own follow-up spec
Project Structure
Documentation (this feature)
specs/242-operational-controls/
├── plan.md
├── research.md
├── data-model.md
├── quickstart.md
├── checklists/
│ └── requirements.md
├── contracts/
│ └── operational-controls.contract.yaml
└── tasks.md
Source Code (repository root)
apps/platform/
├── app/
│ ├── Filament/System/Pages/Ops/
│ │ ├── Controls.php
│ │ └── Runbooks.php
│ ├── Filament/Resources/FindingResource/Pages/ListFindings.php
│ ├── Filament/Resources/RestoreRunResource.php
│ ├── Models/
│ │ └── OperationalControlActivation.php
│ ├── Services/Audit/AuditRecorder.php
│ ├── Services/Audit/WorkspaceAuditLogger.php
│ ├── Services/Providers/ProviderOperationStartGate.php
│ ├── Support/Audit/AuditActionId.php
│ ├── Support/Auth/PlatformCapabilities.php
│ └── Support/OperationalControls/
│ ├── OperationalControlCatalog.php
│ ├── OperationalControlDecision.php
│ └── OperationalControlEvaluator.php
├── database/
│ ├── factories/
│ │ └── OperationalControlActivationFactory.php
│ └── migrations/
│ └── *_create_operational_control_activations_table.php
└── tests/
├── Feature/
│ ├── Findings/OperationalControlFindingsBackfillGateTest.php
│ ├── OperationalControls/
│ │ ├── NoAdHocOperationalControlBypassTest.php
│ │ └── OperationalControlAuthorizationSemanticsTest.php
│ ├── Restore/OperationalControlRestoreExecutionGateTest.php
│ ├── System/OpsControls/OperationalControlManagementTest.php
│ └── System/OpsRunbooks/OperationalControlRunbookGateTest.php
└── Unit/Support/OperationalControls/
├── OperationalControlCatalogTest.php
├── OperationalControlEvaluatorTest.php
└── OperationalControlScopeResolutionTest.php
Structure Decision: Single Laravel web application. The feature adds one bounded platform-operated model and one small support namespace for operational-control evaluation, then plugs that into existing system and tenant Filament surfaces.
Complexity Tracking
No unapproved constitution violations are required. The only new persistence and abstraction are the justified control-activation record plus evaluator/catalog pair described below.
Proportionality Review
- Current operator problem: founders and platform operators need a safe runtime way to pause already-existing risky actions without editing environment variables or relying on inconsistent per-surface logic.
- Existing structure is insufficient because: UiEnforcement decides RBAC, ProviderOperationStartGate decides provider readiness, and env flags decide hidden page-local runtime behavior. None of those alone gives one auditable runtime-safety truth across both system and tenant surfaces.
- Narrowest correct implementation: persist only explicit active control activations, derive the enabled state from absence of an activation, evaluate one effective decision through a shared catalog/evaluator, and wire that into the three concrete existing start paths.
- Ownership cost created: one new table/model/factory, one small support namespace, one system page, new audit action IDs and capability constants, and focused unit/feature coverage.
- Alternative intentionally rejected: keep env/config flags, reuse workspace settings, or build a generalized feature-flag system. Env/config flags are invisible product truth, workspace settings do not cleanly represent one global control truth, and a generic flag platform is far too broad.
- Release truth: current-release truth
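The narrowest implementation described above (enabled derived from the absence of an active row, optional expiry, global pauses winning over workspace pauses) can be sketched without any framework code. This is an illustrative sketch only: the Activation and Decision classes and the evaluateControl() function are hypothetical stand-ins, and the plan does not specify the real shape of OperationalControlActivation, OperationalControlDecision, or OperationalControlEvaluator.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for one active pause row.
final class Activation
{
    public function __construct(
        public readonly string $controlKey,
        public readonly ?int $workspaceId,                     // null means global scope
        public readonly string $reason,
        public readonly ?DateTimeImmutable $expiresAt = null,  // null means no expiry
    ) {}

    public function isActiveAt(DateTimeImmutable $now): bool
    {
        return $this->expiresAt === null || $this->expiresAt > $now;
    }
}

// Hypothetical decision value returned to every start path.
final class Decision
{
    public function __construct(
        public readonly bool $paused,
        public readonly ?string $scope = null,   // 'global', 'workspace', or null when enabled
        public readonly ?string $reason = null,
    ) {}
}

/**
 * Resolves the effective state for one control key in one workspace context.
 * Pass null as $workspaceId for system-plane calls without a workspace.
 *
 * @param list<Activation> $activations
 */
function evaluateControl(
    string $controlKey,
    ?int $workspaceId,
    array $activations,
    DateTimeImmutable $now,
): Decision {
    $global = null;
    $workspace = null;

    foreach ($activations as $activation) {
        // Ignore rows for other controls and rows that have already expired.
        if ($activation->controlKey !== $controlKey || ! $activation->isActiveAt($now)) {
            continue;
        }

        if ($activation->workspaceId === null) {
            $global ??= $activation;
        } elseif ($activation->workspaceId === $workspaceId) {
            $workspace ??= $activation;
        }
    }

    // Global pauses win over workspace pauses.
    if ($global !== null) {
        return new Decision(true, 'global', $global->reason);
    }

    if ($workspace !== null) {
        return new Decision(true, 'workspace', $workspace->reason);
    }

    // No active row for this control: the control is enabled.
    return new Decision(false);
}
```

Because the function takes the rows and the current time as arguments, precedence and expiry can be covered by unit tests without a database, which fits the unit lane listed for catalog/evaluator logic.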
Phase 0 — Research (output: research.md)
See: /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/242-operational-controls/research.md
Goals:
- Confirm the narrowest persistence shape for runtime-safety truth and explicitly reject env-only or workspace-settings-only alternatives.
- Confirm the smallest shared seam where control evaluation belongs for system runbooks, tenant findings lifecycle backfill, and provider-backed restore execution.
- Define v1 scoping, global-first precedence, expiry, and audit expectations without inventing a generic flag taxonomy.
- Document the v1 decision that break-glass and broad platform capabilities do not bypass an active operational control.
Phase 1 — Design & Contracts (outputs: data-model.md, contracts/, quickstart.md)
See:
- /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/242-operational-controls/data-model.md
- /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/242-operational-controls/contracts/operational-controls.contract.yaml
- /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/242-operational-controls/quickstart.md
Design focus:
- Add one platform-operated activation record that can pause a control globally or for one workspace, with optional expiry, auditable reason, global-first precedence, and partial unique indexes that enforce one active global row per control and one active workspace row per control/workspace pair; the write path deletes expired conflicting rows before inserting a new activation, and this table is not used as an archive.
- Add one new system ops controls page that lists the two bounded control keys, their effective state, scope, owner, expiry, change actions, and on-demand audit history links, and uses a staged scope-impact preview before control mutations are confirmed.
- Use OperationalControlDecision as the shared control-state presentation primitive for controls, runbooks, findings, and restore surfaces.
- Route findings.lifecycle.backfill through the new evaluator in both ListFindings and Runbooks, removing the existing env gate.
- Route findings.lifecycle.backfill through FindingsLifecycleBackfillRunbookService::start() so the system runbooks page, tenant findings page, CLI command, and deploy-hook command all honor the same control decision.
- Route restore.execute through the same evaluator before provider-backed or non-provider-backed queued restore execution is created.
- Add dedicated audit action IDs and a dedicated platform capability for control management, using AuditRecorder for global control changes and blocked system-plane all-tenant attempts, and WorkspaceAuditLogger for workspace/tenant-scoped changes and blocked-execution evidence with concrete scope.
- Keep blocked-state messaging on existing shared start/result helpers and avoid custom control-state UI frameworks.
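The write-path rule in the first design bullet (one active row per control and scope, with expired conflicting rows deleted before a new activation is inserted) can be illustrated with an in-memory store. This is a sketch under stated assumptions, not the planned model or migration: ActivationStore, its slot key, and the DomainException on conflict are hypothetical and only stand in for what the partial unique indexes would enforce in PostgreSQL.

```php
<?php

declare(strict_types=1);

// Hypothetical in-memory stand-in for the operational_control_activations table.
final class ActivationStore
{
    /** @var array<string, array{reason: string, expiresAt: ?DateTimeImmutable}> */
    private array $rows = [];

    // One slot per control key and scope, mirroring "one active global row per
    // control and one active workspace row per control/workspace pair".
    private static function slot(string $controlKey, ?int $workspaceId): string
    {
        return $controlKey . '|' . ($workspaceId === null ? 'global' : 'workspace:' . $workspaceId);
    }

    public function pause(
        string $controlKey,
        ?int $workspaceId,
        string $reason,
        ?DateTimeImmutable $expiresAt,
        DateTimeImmutable $now,
    ): void {
        $slot = self::slot($controlKey, $workspaceId);
        $existing = $this->rows[$slot] ?? null;

        // Delete an expired conflicting row first; the table is not an archive.
        if ($existing !== null && $existing['expiresAt'] !== null && $existing['expiresAt'] <= $now) {
            unset($this->rows[$slot]);
            $existing = null;
        }

        // A still-active row in the same slot is a uniqueness conflict.
        if ($existing !== null) {
            throw new DomainException("An active activation already exists for {$slot}.");
        }

        $this->rows[$slot] = ['reason' => $reason, 'expiresAt' => $expiresAt];
    }

    public function resume(string $controlKey, ?int $workspaceId): void
    {
        unset($this->rows[self::slot($controlKey, $workspaceId)]);
    }

    public function count(): int
    {
        return count($this->rows);
    }
}
```

In the real write path the delete and insert would presumably run inside one database transaction so that the unique index never observes both rows; that detail is an assumption and is not stated in the plan.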
Phase 1 — Agent Context Update
After Phase 1 artifacts are generated, update Copilot context from the plan:
/Users/ahmeddarrazi/Documents/projects/wt-plattform/.specify/scripts/bash/update-agent-context.sh copilot
Phase 2 — Implementation Outline (tasks created in /speckit.tasks)
- Add the operational_control_activations persistence, model, and local factory for active pause records.
- Introduce the bounded operational-controls support namespace (OperationalControlCatalog, OperationalControlDecision, OperationalControlEvaluator) and keep enabled-state derived from active rows.
- Add the dedicated controls-manage capability and its local grant path in the seeded platform operator setup.
- Add the system-plane controls page and wire it into the existing system ops navigation with staged preview-plus-confirm pause/resume actions, audit logging, and on-demand audit history links.
- Replace the findings env gate with evaluator-driven control checks on the tenant findings header action and the system runbooks start path.
- Integrate the same evaluator into restore execution before any queued execution OperationRun, queued execution RestoreRun, queue dispatch, or provider-backed execution starts.
- Add focused unit and feature tests, plus a guard test that blocks new ad-hoc runtime-control bypasses for in-scope controls and one proving path that activating a control does not rewrite previously accepted runs.
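The restore-execution item above requires the control check to run before any run record or dispatch exists, so that a blocked start leaves only audit evidence behind. A minimal sketch of that ordering follows. RestoreStarter, StartResult, and the closure-based control lookup are hypothetical illustrations; the real integration is expected to go through OperationalControlEvaluator, OperationRunService, and the existing audit helpers.

```php
<?php

declare(strict_types=1);

// Hypothetical result value for a start attempt.
final class StartResult
{
    public function __construct(
        public readonly bool $started,
        public readonly ?int $runId,
        public readonly ?string $blockedReason,
    ) {}
}

// Hypothetical start path; arrays stand in for run persistence and the audit log.
final class RestoreStarter
{
    /** @var list<array{id: int, workspaceId: int}> */
    public array $runs = [];

    /** @var list<string> */
    public array $auditLog = [];

    /**
     * @param Closure(string, int): ?string $pausedReason returns the pause reason,
     *                                                     or null when the control is enabled
     */
    public function __construct(private readonly Closure $pausedReason) {}

    public function start(int $workspaceId): StartResult
    {
        // Evaluate the control first: nothing is created when the start is blocked.
        $reason = ($this->pausedReason)('restore.execute', $workspaceId);

        if ($reason !== null) {
            $this->auditLog[] = "blocked restore.execute for workspace {$workspaceId}: {$reason}";

            return new StartResult(false, null, $reason);
        }

        // Only an allowed start creates a run record (and would dispatch the job).
        $runId = count($this->runs) + 1;
        $this->runs[] = ['id' => $runId, 'workspaceId' => $workspaceId];

        return new StartResult(true, $runId, null);
    }
}
```

Runs that were accepted before a pause are left untouched in this sketch because the check happens only at start time, which corresponds to the requirement that activating a control does not rewrite previously accepted runs.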
Constitution Check (Post-Design)
Re-check target: PASS. The post-design shape must still use one bounded control catalog, one active-row persistence model, one evaluator, existing auth/start/audit helpers, and no second runtime-control dialect.
Implementation Close-out
- Delivered the bounded operational-controls slice end-to-end: one operational_control_activations truth model, one catalog/evaluator/decision support path, a new /system/ops/controls management page, findings lifecycle enforcement through FindingsLifecycleBackfillRunbookService::start(), and restore execution blocking before any queued execution OperationRun, queued execution RestoreRun, job dispatch, or provider-backed start.
- Runtime cleanup landed with the in-scope findings env gate removed from config/tenantpilot.php, a source-scanning guard against ad-hoc bypasses, and workspace-isolation proof showing a workspace-scoped pause blocks only the targeted workspace while a second workspace remains unaffected.
- Validation passed on the narrow feature lane: export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Unit/Support/OperationalControls/OperationalControlCatalogTest.php tests/Unit/Support/OperationalControls/OperationalControlEvaluatorTest.php tests/Unit/Support/OperationalControls/OperationalControlScopeResolutionTest.php tests/Feature/Filament/Spec113/AdminFindingsNoMaintenanceActionsTest.php tests/Feature/System/OpsControls/OperationalControlManagementTest.php tests/Feature/System/OpsRunbooks/OperationalControlRunbookGateTest.php tests/Feature/Findings/OperationalControlFindingsBackfillGateTest.php tests/Feature/Restore/OperationalControlRestoreExecutionGateTest.php tests/Feature/OperationalControls/OperationalControlAuthorizationSemanticsTest.php tests/Feature/OperationalControls/NoAdHocOperationalControlBypassTest.php with 20 passed (253 assertions).
- Formatting passed with export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && cd apps/platform && ./vendor/bin/sail bin pint --dirty --format agent.
- Manual smoke passed in the integrated browser: the staged pause/resume flow on /system/ops/controls for Findings lifecycle backfill rendered scope-impact previews, applied the global pause, and returned to Enabled inside the SC-001 budget after bringing the local database up to date.