Removes the Findings lifecycle backfill from the Operational Controls UI and OperationalControlCatalog. This patch is a safe, controls-only change; runbooks, jobs and other runtime artifacts are NOT removed yet. Follow-up work will delete the runbook service/scope, jobs, commands, and update tests.

Files changed:

- apps/platform/app/Filament/System/Pages/Ops/Controls.php
- apps/platform/app/Support/OperationalControls/OperationalControlCatalog.php
- apps/platform/tests/Feature/System/OpsControls/OperationalControlManagementTest.php
- apps/platform/tests/Unit/Support/OperationalControls/OperationalControlCatalogTest.php
- apps/platform/tests/Unit/Support/OperationalControls/OperationalControlScopeResolutionTest.php

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #280
# Implementation Plan: Operational Controls
**Branch**: `242-operational-controls` | **Date**: 2026-04-26 | **Spec**: `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/242-operational-controls/spec.md`
**Input**: Feature specification from `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/242-operational-controls/spec.md`

**Note**: This template is filled in by the `/speckit.plan` command. See `.specify/scripts/` for helper scripts.
## Summary
- Replace the ad-hoc `allow_admin_maintenance_actions` environment gate with one product-owned operational-control path for the first-slice keys `findings.lifecycle.backfill` and `restore.execute`.
- Introduce one platform-operated activation record plus one shared evaluator that plugs into the existing system runbook, tenant findings-maintenance, and restore-execution start seams without becoming a generic experimentation platform.
- Reuse existing enforcement and UX seams - `UiEnforcement`, `ProviderOperationStartGate`, `OperationRunService`, `OperationUxPresenter`, `ProviderOperationStartResultPresenter`, `AuditRecorder`, `WorkspaceAuditLogger`, and `AuditActionId` - so the slice stays small, auditable, and server-side enforced.
## Technical Context
**Language/Version**: PHP 8.4 (Laravel 12)
**Primary Dependencies**: Laravel 12 + Filament v5 + Livewire v4 + Pest; existing `UiEnforcement`, `ProviderOperationStartGate`, `OperationRunService`, `AuditRecorder`, `WorkspaceAuditLogger`, `AuditActionId`, `PlatformCapabilities`
**Storage**: PostgreSQL via existing product tables plus one new platform-operated `operational_control_activations` table; no tenant-owned control tables
**Testing**: Pest unit + feature tests only
**Validation Lanes**: fast-feedback, confidence
**Target Platform**: Sail-backed Laravel admin surfaces under `/admin/t/{tenant}` and system surfaces under `/system`
**Project Type**: web
**Performance Goals**: effective-control resolution remains DB-only and cheap at action start time, adds no outbound HTTP, and blocks in-scope starts before queue or provider execution begins
**Constraints**: no generic feature-flag platform, no new browser or heavy-governance suite, no break-glass bypass in v1, no parallel env gate for in-scope controls, global pauses win over workspace pauses, preserve 404 vs 403 semantics, keep provider-specific restore behavior out of platform-core control vocabulary
**Scale/Scope**: 2 control keys, 2 scope levels (global and workspace), 1 system management surface, and 3 concrete enforcement families across 4 touched UI surfaces
## UI / Surface Guardrail Plan
- **Guardrail scope**: changed surfaces
- **Native vs custom classification summary**: native Filament + shared start/result primitives
- **Shared-family relevance**: header actions, runbook launch actions, provider-backed start results, audit-backed control changes
- **State layers in scope**: page, detail, action/modal
- **Handling modes by drift class or surface**: review-mandatory
- **Repository-signal treatment**: review-mandatory
- **Special surface test profiles**: standard-native-filament, monitoring-state-page
- **Required tests or manual smoke**: functional-core, state-contract
- **Exception path and spread control**: none; v1 must not allow a second local runtime-control dialect
- **Active feature PR close-out entry**: Guardrail
## Shared Pattern & System Fit
- **Cross-cutting feature marker**: yes
- **Systems touched**: `App\Filament\System\Pages\Ops\Runbooks`, new system ops controls page, `App\Filament\Resources\FindingResource\Pages\ListFindings`, `App\Filament\Resources\RestoreRunResource`, `App\Support\Rbac\UiEnforcement`, `App\Services\Providers\ProviderOperationStartGate`, `App\Support\OpsUx\OperationUxPresenter`, `App\Support\OpsUx\ProviderOperationStartResultPresenter`, `App\Services\Audit\AuditRecorder`, `App\Services\Audit\WorkspaceAuditLogger`, `App\Support\Audit\AuditActionId`
- **Shared abstractions reused**: `UiEnforcement`, `ProviderOperationStartGate`, `ProviderOperationStartResultPresenter`, `OperationRunService`, `OperationUxPresenter`, `OpsUxBrowserEvents`, `OperationRunLinks`, `SystemOperationRunLinks`, `AuditRecorder`, `WorkspaceAuditLogger`
- **New abstraction introduced? why?**: one bounded `OperationalControlCatalog` plus one `OperationalControlEvaluator` are justified because the feature now has two real concrete control keys that must evaluate consistently across system-plane and tenant-plane start paths. No registry lattice, provider strategy system, or customer-facing flag DSL is introduced.
- **Why the existing abstraction was sufficient or insufficient**: existing abstractions already own auth, queue start UX, and audit writing; they are insufficient because none presently carries a reusable runtime-safety decision that can pause an action before it starts, and `WorkspaceAuditLogger` alone cannot truthfully own global platform-plane mutations.
- **Bounded deviation / spread control**: no deviation is allowed for in-scope controls; every affected surface must route through the shared evaluator rather than direct `config(...)` reads or page-local booleans.
## OperationRun UX Impact
- **Touches OperationRun start/completion/link UX?**: yes
- **Central contract reused**: shared OperationRun start UX plus provider-start result helpers
- **Delegated UX behaviors**: queued toast, `Open operation` / `View run` links, run-enqueued browser event, dedupe-or-blocked messaging, and tenant/workspace-safe URL resolution remain on existing shared paths
- **Surface-owned behavior kept local**: initiation inputs, confirmation copy, and control-management forms only
- **Queued DB-notification policy**: unchanged explicit opt-in only
- **Terminal notification path**: existing central lifecycle mechanism for starts that are allowed
- **Exception path**: none
## Provider Boundary & Portability Fit
- **Shared provider/platform boundary touched?**: yes
- **Provider-owned seams**: provider-backed `restore.execute` dispatch, provider binding resolution, provider reason translation, existing restore safety and dry-run behavior
- **Platform-core seams**: operational-control vocabulary, scope/effective-state evaluation, control management surface, audit labels, blocked-state semantics
- **Neutral platform terms / contracts preserved**: operational control, activation, effective state, scope, reason, expiry, blocked execution
- **Retained provider-specific semantics and why**: `restore.execute` remains Microsoft-specific provider behavior in the current release because the control feature governs only start allowance, not provider execution semantics
- **Bounded extraction or follow-up path**: none in this slice; future catalog growth or provider-neutral expansions require a follow-up spec instead of implicit widening here
## Constitution Check
*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*
- Read/write separation: PASS - control management is an explicit platform-plane mutation with confirmation, audit, and focused tests; blocked execution paths remain non-mutating except for audit logging.
- RBAC-UX: PASS - platform management stays on `/system`; tenant/admin execution surfaces stay on `/admin/t/{tenant}`; cross-plane access remains 404; entitled-but-paused users get explicit control feedback while membership and capability failures keep 404/403 semantics.
- Workspace isolation / tenant isolation: PASS - workspace-targeted controls apply only within the chosen workspace; tenant surfaces still resolve tenant/workspace entitlement before control-state disclosure.
- Run observability / Ops-UX: PASS - allowed starts reuse existing `OperationRun` paths; blocked starts create no run and no new lifecycle dialect; later control activation does not retroactively mutate already accepted runs; shared start/result helpers remain authoritative.
- Shared path reuse / `XCUT-001`: PASS - the design extends existing UI enforcement, provider-start gating, audit logging, and operation start UX instead of introducing page-local flags.
- Provider boundary / `PROV-001`: PASS - control language stays provider-neutral while restore execution remains provider-owned.
- Proportionality / `PROP-001` and `ABSTR-001`: PASS - the only new structure is justified by two current-release controls and three existing enforcement surfaces; no experimentation platform or generalized remote-config system is planned.
- Persisted truth / `PERSIST-001`: PASS - active control activations represent independent runtime-safety truth with their own scope, reason, expiry, and audit obligations; convenience UI state remains derived.
- Behavioral state / `STATE-001`: PASS - paused/enabled semantics change whether execution may start and therefore justify one bounded effective-state model.
- Filament-native UI / `UI-FIL-001`: PASS - all touched surfaces remain native Filament pages/resources/actions; no custom UI framework is introduced.
- Global search rule: N/A - no new globally searchable resource is added.
- Panel/provider registration: PASS - Filament v5 remains on Livewire v4 and no new panel/provider registration is required; Laravel 12 provider registration stays in `bootstrap/providers.php` if any provider change becomes necessary.
- Test governance / `TEST-GOV-001`: PASS - proof stays in focused unit and feature lanes with no browser or heavy-governance expansion.
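
The RBAC-UX and isolation checks above imply an ordering: entitlement is resolved before any control state is disclosed. A minimal, framework-free sketch of that ordering follows. The function name `resolveStartOutcome()` and the string outcomes are hypothetical illustrations, not identifiers from the repository; in the plan, these decisions stay with the existing enforcement seams and the shared evaluator.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical decision order for a tenant-plane start action.
 * Membership is checked first (404), then capability (403), and only an
 * entitled user ever learns that the control is paused.
 */
function resolveStartOutcome(bool $isMember, bool $hasCapability, bool $controlPaused): string
{
    if (!$isMember) {
        return '404'; // non-members must not learn the resource exists
    }

    if (!$hasCapability) {
        return '403'; // capability-denied UX, not paused-state helper text
    }

    if ($controlPaused) {
        return 'paused'; // explicit control feedback for entitled users
    }

    return 'allowed';
}
```

Under these assumptions, a non-member of a paused workspace still receives `404`, so the pause itself never leaks across the membership boundary.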
## Test Governance Check
- **Test purpose / classification by changed surface**: Unit for catalog/evaluator/scope precedence/expiry logic; Feature for system control management, runbook enforcement, findings header-action enforcement, restore-execution enforcement, audit logging, and `404`/`403` semantics
- **Affected validation lanes**: fast-feedback, confidence
- **Why this lane mix is the narrowest sufficient proof**: the business truth is server-side effective-state resolution plus enforcement at existing Filament and service seams. Browser tests would duplicate modal choreography without proving additional runtime safety truth.
- **Narrowest proving command(s)**:
  - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Unit/Support/OperationalControls/OperationalControlCatalogTest.php tests/Unit/Support/OperationalControls/OperationalControlEvaluatorTest.php tests/Unit/Support/OperationalControls/OperationalControlScopeResolutionTest.php`
  - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/System/OpsControls/OperationalControlManagementTest.php tests/Feature/System/OpsRunbooks/OperationalControlRunbookGateTest.php`
  - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Findings/OperationalControlFindingsBackfillGateTest.php tests/Feature/Restore/OperationalControlRestoreExecutionGateTest.php tests/Feature/OperationalControls/OperationalControlAuthorizationSemanticsTest.php tests/Feature/OperationalControls/NoAdHocOperationalControlBypassTest.php`
- **Fixture / helper / factory / seed / context cost risks**: add one local factory for active control activations plus platform-user and workspace-scoped setup helpers reused only by operational-control tests; avoid new shared browser or provider-fixture defaults
- **Expensive defaults or shared helper growth introduced?**: no; control fixtures stay opt-in and local to the new test family
- **Heavy-family additions, promotions, or visibility changes**: none
- **Surface-class relief / special coverage rule**: standard-native-filament and monitoring-state-page relief are sufficient; assert disabled/blocked behavior and no side effects instead of browser-only choreography
- **Closing validation and reviewer handoff**: reviewers should rerun the targeted unit/feature commands, verify the env gate is removed from the in-scope findings action, confirm restore execution is blocked before queue/provider start, confirm blocked-execution audit entries exist for runbook/findings/restore paths, confirm global control changes audit without false workspace ownership, confirm `/system/ops/controls` returns 403 for system users missing `platform.ops.controls.manage`, and confirm non-members still receive 404 while missing capabilities still receive 403 with the existing capability-denied UX rather than paused-state helper text
- **Budget / baseline / trend follow-up**: low-to-moderate increase in focused unit/feature coverage only
- **Review-stop questions**: did implementation add a second control persistence shape, leave the env gate in place, introduce a local blocked-state dialect, or widen into browser/heavy-governance lanes?
- **Escalation path**: `reject-or-split` if the implementation widens into generic feature-flagging or customer-managed controls; `document-in-feature` for small shared-helper extensions that remain local to this slice
- **Active feature PR close-out entry**: Guardrail
- **Why no dedicated follow-up spec is needed**: the planned new model, evaluator, and tests stay local to the first-slice control family; recurring growth beyond the two bounded control keys would require its own follow-up spec
## Project Structure
### Documentation (this feature)
```text
specs/242-operational-controls/
├── plan.md
├── research.md
├── data-model.md
├── quickstart.md
├── checklists/
│   └── requirements.md
├── contracts/
│   └── operational-controls.contract.yaml
└── tasks.md
```
### Source Code (repository root)
```text
apps/platform/
├── app/
│   ├── Filament/System/Pages/Ops/
│   │   ├── Controls.php
│   │   └── Runbooks.php
│   ├── Filament/Resources/FindingResource/Pages/ListFindings.php
│   ├── Filament/Resources/RestoreRunResource.php
│   ├── Models/
│   │   └── OperationalControlActivation.php
│   ├── Services/Audit/AuditRecorder.php
│   ├── Services/Audit/WorkspaceAuditLogger.php
│   ├── Services/Providers/ProviderOperationStartGate.php
│   ├── Support/Audit/AuditActionId.php
│   ├── Support/Auth/PlatformCapabilities.php
│   └── Support/OperationalControls/
│       ├── OperationalControlCatalog.php
│       ├── OperationalControlDecision.php
│       └── OperationalControlEvaluator.php
├── database/
│   ├── factories/
│   │   └── OperationalControlActivationFactory.php
│   └── migrations/
│       └── *_create_operational_control_activations_table.php
└── tests/
    ├── Feature/
    │   ├── Findings/OperationalControlFindingsBackfillGateTest.php
    │   ├── OperationalControls/
    │   │   ├── NoAdHocOperationalControlBypassTest.php
    │   │   └── OperationalControlAuthorizationSemanticsTest.php
    │   ├── Restore/OperationalControlRestoreExecutionGateTest.php
    │   ├── System/OpsControls/OperationalControlManagementTest.php
    │   └── System/OpsRunbooks/OperationalControlRunbookGateTest.php
    └── Unit/Support/OperationalControls/
        ├── OperationalControlCatalogTest.php
        ├── OperationalControlEvaluatorTest.php
        └── OperationalControlScopeResolutionTest.php
```
**Structure Decision**: Single Laravel web application. The feature adds one bounded platform-operated model and one small support namespace for operational-control evaluation, then plugs that into existing system and tenant Filament surfaces.
## Complexity Tracking
No unapproved constitution violations are required. The only new persistence and abstraction are the justified control-activation record plus evaluator/catalog pair described below.
## Proportionality Review
- **Current operator problem**: founders and platform operators need a safe runtime way to pause already-existing risky actions without editing environment variables or relying on inconsistent per-surface logic.
- **Existing structure is insufficient because**: `UiEnforcement` decides RBAC, `ProviderOperationStartGate` decides provider readiness, and env flags decide hidden page-local runtime behavior. None of those alone gives one auditable runtime-safety truth across both system and tenant surfaces.
- **Narrowest correct implementation**: persist only explicit active control activations, derive the enabled state from absence of an activation, evaluate one effective decision through a shared catalog/evaluator, and wire that into the three concrete existing start paths.
- **Ownership cost created**: one new table/model/factory, one small support namespace, one system page, new audit action IDs and capability constants, and focused unit/feature coverage.
- **Alternative intentionally rejected**: keep env/config flags, reuse workspace settings, or build a generalized feature-flag system. Env/config flags are invisible product truth, workspace settings do not cleanly represent one global control truth, and a generic flag platform is far too broad.
- **Release truth**: current-release truth
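
The "narrowest correct implementation" above (only active activations are persisted, `enabled` is derived from their absence, and global pauses win over workspace pauses) can be illustrated with a small sketch. It is framework-free and hypothetical: the `Activation` class, the `evaluateControl()` function, and the returned array shape are assumptions made for illustration and are not the repository's `OperationalControlEvaluator` or `OperationalControlDecision`.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for one row of operational_control_activations.
final class Activation
{
    public function __construct(
        public readonly string $controlKey,
        public readonly ?int $workspaceId,              // null means global scope
        public readonly string $reason,
        public readonly ?DateTimeImmutable $expiresAt,  // null means no expiry
    ) {}

    public function isActiveAt(DateTimeImmutable $now): bool
    {
        return $this->expiresAt === null || $this->expiresAt > $now;
    }
}

/**
 * Resolve the effective state of one control for one workspace.
 *
 * @param list<Activation> $activations
 * @return array{state: string, scope: ?string, reason: ?string}
 */
function evaluateControl(array $activations, string $controlKey, int $workspaceId, DateTimeImmutable $now): array
{
    $global = null;
    $workspace = null;

    foreach ($activations as $activation) {
        // Ignore other controls and rows whose expiry has passed.
        if ($activation->controlKey !== $controlKey || !$activation->isActiveAt($now)) {
            continue;
        }

        if ($activation->workspaceId === null) {
            $global ??= $activation;
        } elseif ($activation->workspaceId === $workspaceId) {
            $workspace ??= $activation;
        }
    }

    // Global-first precedence: a global pause wins over a workspace pause.
    $winner = $global ?? $workspace;

    if ($winner === null) {
        // No active row means the control is enabled; nothing is stored for that.
        return ['state' => 'enabled', 'scope' => null, 'reason' => null];
    }

    return [
        'state' => 'paused',
        'scope' => $winner->workspaceId === null ? 'global' : 'workspace',
        'reason' => $winner->reason,
    ];
}
```

In this sketch an expired row behaves exactly like a missing row, which is one way to keep the enabled state purely derived.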
## Phase 0 — Research (output: `research.md`)
See: `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/242-operational-controls/research.md`

Goals:
- Confirm the narrowest persistence shape for runtime-safety truth and explicitly reject env-only or workspace-settings-only alternatives.
- Confirm the smallest shared seam where control evaluation belongs for system runbooks, tenant findings lifecycle backfill, and provider-backed restore execution.
- Define v1 scoping, global-first precedence, expiry, and audit expectations without inventing a generic flag taxonomy.
- Document the v1 decision that break-glass and broad platform capabilities do not bypass an active operational control.
## Phase 1 — Design & Contracts (outputs: `data-model.md`, `contracts/`, `quickstart.md`)
See:
- `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/242-operational-controls/data-model.md`
- `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/242-operational-controls/contracts/operational-controls.contract.yaml`
- `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/242-operational-controls/quickstart.md`

Design focus:
- Add one platform-operated activation record that can pause a control globally or for one workspace, with optional expiry, auditable reason, global-first precedence, and partial unique indexes that enforce one active global row per control and one active workspace row per control/workspace pair; the write path deletes expired conflicting rows before inserting a new activation, and this table is not used as an archive.
- Add one new system ops controls page that lists the two bounded control keys, their effective state, scope, owner, expiry, change actions, and on-demand audit history links, and uses a staged scope-impact preview before control mutations are confirmed.
- Use `OperationalControlDecision` as the shared control-state presentation primitive for controls, runbooks, findings, and restore surfaces.
- Route `findings.lifecycle.backfill` through the new evaluator in both `ListFindings` and `Runbooks`, removing the existing env gate.
- Route `findings.lifecycle.backfill` through `FindingsLifecycleBackfillRunbookService::start()` so the system runbooks page, tenant findings page, CLI command, and deploy-hook command all honor the same control decision.
- Route `restore.execute` through the same evaluator before provider-backed or non-provider-backed queued restore execution is created.
- Add dedicated audit action IDs and a dedicated platform capability for control management, using `AuditRecorder` for global control changes and blocked system-plane all-tenant attempts, and `WorkspaceAuditLogger` for workspace/tenant-scoped changes and blocked-execution evidence with concrete scope.
- Keep blocked-state messaging on existing shared start/result helpers and avoid custom control-state UI frameworks.
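
The activation write path in the first design-focus item (one active row per control and scope, with expired conflicting rows deleted before a new insert) can be sketched as follows. This is an in-memory illustration only: `ActivationStore`, its slot key format, and the integer timestamps are hypothetical, and the keyed array merely stands in for the partial unique indexes the plan places on the table.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory stand-in for the operational_control_activations table.
final class ActivationStore
{
    /** @var array<string, array{reason: string, expiresAt: ?int}> keyed by control and scope */
    private array $rows = [];

    public function activate(string $controlKey, ?int $workspaceId, string $reason, ?int $expiresAt, int $now): void
    {
        // One slot per control for global scope, one per control/workspace pair.
        $slot = $controlKey . '|' . ($workspaceId === null ? 'global' : 'workspace:' . $workspaceId);

        // Delete an expired conflicting row first; the table is not an archive.
        if (isset($this->rows[$slot]) && $this->isExpired($this->rows[$slot], $now)) {
            unset($this->rows[$slot]);
        }

        // Mirrors the uniqueness rule: a still-active row blocks a second insert.
        if (isset($this->rows[$slot])) {
            throw new DomainException("An active activation already exists for {$slot}");
        }

        $this->rows[$slot] = ['reason' => $reason, 'expiresAt' => $expiresAt];
    }

    public function count(): int
    {
        return count($this->rows);
    }

    /** @param array{reason: string, expiresAt: ?int} $row */
    private function isExpired(array $row, int $now): bool
    {
        return $row['expiresAt'] !== null && $row['expiresAt'] <= $now;
    }
}
```

Deleting the expired row inside the same write keeps the uniqueness rule simple: the constraint only ever has to reason about rows that are present, not about their expiry.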
## Phase 1 — Agent Context Update
After Phase 1 artifacts are generated, update Copilot context from the plan:
- `/Users/ahmeddarrazi/Documents/projects/wt-plattform/.specify/scripts/bash/update-agent-context.sh copilot`
## Phase 2 — Implementation Outline (tasks created in `/speckit.tasks`)
- Add the `operational_control_activations` persistence, model, and local factory for active pause records.
- Introduce the bounded operational-controls support namespace (`OperationalControlCatalog`, `OperationalControlDecision`, `OperationalControlEvaluator`) and keep enabled-state derived from active rows.
- Add the dedicated controls-manage capability and its local grant path in the seeded platform operator setup.
- Add the system-plane controls page and wire it into the existing system ops navigation with staged preview-plus-confirm pause/resume actions, audit logging, and on-demand audit history links.
- Replace the findings env gate with evaluator-driven control checks on the tenant findings header action and the system runbooks start path.
- Integrate the same evaluator into restore execution before any queued execution `OperationRun`, queued execution `RestoreRun`, queue dispatch, or provider-backed execution starts.
- Add focused unit and feature tests, plus a guard test that blocks new ad-hoc runtime-control bypasses for in-scope controls and one proving path that activating a control does not rewrite previously accepted runs.
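
Two behaviours in the outline above are easy to confuse: a blocked start must create no run at all, and a later pause must not rewrite a run that was already accepted. The sketch below shows both with a hypothetical `StartGate` class; its run array, audit strings, and closure-based pause check are assumptions for illustration and do not reflect `OperationRunService` or the real audit writers.

```php
<?php
declare(strict_types=1);

// Hypothetical start gate: checks the control before any run is created.
final class StartGate
{
    /** @var list<array{id: int, controlKey: string, status: string}> */
    public array $runs = [];

    /** @var list<string> */
    public array $auditLog = [];

    public function __construct(private readonly Closure $isPaused) {}

    /** Returns the new run id, or null when the start is blocked. */
    public function start(string $controlKey): ?int
    {
        if (($this->isPaused)($controlKey)) {
            // Blocked: audit evidence only, no run and no queue dispatch.
            $this->auditLog[] = "blocked:{$controlKey}";

            return null;
        }

        $id = count($this->runs) + 1;
        $this->runs[] = ['id' => $id, 'controlKey' => $controlKey, 'status' => 'queued'];

        return $id;
    }
}

// Usage: the first start is accepted, then the control is paused.
$paused = [];
$gate = new StartGate(function (string $key) use (&$paused): bool {
    return in_array($key, $paused, true);
});

$first = $gate->start('restore.execute');   // accepted while enabled
$paused[] = 'restore.execute';              // operator pauses the control
$second = $gate->start('restore.execute');  // blocked, no second run
```

After the pause, the earlier run is left untouched because the gate is only consulted at start time, never against runs that already exist.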
## Constitution Check (Post-Design)
Re-check target: PASS. The post-design shape must still use one bounded control catalog, one active-row persistence model, one evaluator, existing auth/start/audit helpers, and no second runtime-control dialect.
## Implementation Close-out
- Delivered the bounded operational-controls slice end-to-end: one `operational_control_activations` truth model, one catalog/evaluator/decision support path, a new `/system/ops/controls` management page, findings lifecycle enforcement through `FindingsLifecycleBackfillRunbookService::start()`, and restore execution blocking before any queued execution `OperationRun`, queued execution `RestoreRun`, job dispatch, or provider-backed start.
- Runtime cleanup landed with the in-scope findings env gate removed from `config/tenantpilot.php`, a source-scanning guard against ad-hoc bypasses, and workspace-isolation proof showing a workspace-scoped pause blocks only the targeted workspace while a second workspace remains unaffected.
- Validation passed on the narrow feature lane: `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Unit/Support/OperationalControls/OperationalControlCatalogTest.php tests/Unit/Support/OperationalControls/OperationalControlEvaluatorTest.php tests/Unit/Support/OperationalControls/OperationalControlScopeResolutionTest.php tests/Feature/Filament/Spec113/AdminFindingsNoMaintenanceActionsTest.php tests/Feature/System/OpsControls/OperationalControlManagementTest.php tests/Feature/System/OpsRunbooks/OperationalControlRunbookGateTest.php tests/Feature/Findings/OperationalControlFindingsBackfillGateTest.php tests/Feature/Restore/OperationalControlRestoreExecutionGateTest.php tests/Feature/OperationalControls/OperationalControlAuthorizationSemanticsTest.php tests/Feature/OperationalControls/NoAdHocOperationalControlBypassTest.php` with `20 passed (253 assertions)`.
- Formatting passed with `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && cd apps/platform && ./vendor/bin/sail bin pint --dirty --format agent`.
- Manual smoke passed in the integrated browser: the staged pause/resume flow on `/system/ops/controls` for `Findings lifecycle backfill` rendered scope-impact previews, applied the global pause, and returned to `Enabled` inside the SC-001 budget after bringing the local database up to date.
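
The source-scanning guard mentioned in the close-out can be pictured as a plain string scan over source files. The sketch below is an assumption about its general shape, not the contents of `NoAdHocOperationalControlBypassTest.php`: `findAdHocBypasses()` is a hypothetical helper, and it takes file contents as an array so it stays runnable without a filesystem.

```php
<?php
declare(strict_types=1);

/**
 * Report every source file that still mentions a forbidden gate name.
 *
 * @param array<string, string> $sources   map of path => file contents
 * @param list<string>          $forbidden needles such as removed env-gate names
 * @return list<string> one "path: needle" entry per violation
 */
function findAdHocBypasses(array $sources, array $forbidden): array
{
    $violations = [];

    foreach ($sources as $path => $code) {
        foreach ($forbidden as $needle) {
            if (str_contains($code, $needle)) {
                $violations[] = "{$path}: {$needle}";
            }
        }
    }

    return $violations;
}
```

A guard test built this way would pass `allow_admin_maintenance_actions` as a needle and assert that the returned list is empty for the in-scope surfaces.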