TenantAtlas/specs/253-remove-findings-backfill-runtime-surfaces/research.md
ahmido 2fa8fc0f87
refactor: remove findings lifecycle backfill runtime surfaces (#294)
## Summary
- decommission the legacy findings lifecycle backfill substrate across command, job, service, and UI layers
- remove related platform capabilities, operation catalog entries, and action surface exemptions
- add regression and removal verification tests to ensure runtime integrity and surface absence
- include spec, plan, tasks, and data-model artifacts for the removal slice

## Scope
- active spec: specs/253-remove-findings-backfill-runtime-surfaces
- target branch: dev

## Validation
- integrated regression and removal verification tests for console, findings, and system ops surfaces
- audit log and capability trace verification for the removal path

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #294
2026-04-28 22:00:51 +00:00

Research — Remove Findings Lifecycle Backfill Runtime Surfaces

Date: 2026-04-28
Spec: spec.md

This document records the repo-grounded planning decisions for the findings lifecycle backfill cleanup slice. All decisions assume the current pre-production LEAN-001 posture.

Decision 1 — Remove source traces, not only visible buttons

Decision: Delete the owning runtime sources for findings lifecycle backfill wherever the repo still starts, labels, or advertises the path. Do not treat the work as a page-local hide of the runbook card or the tenant findings header action.

Rationale:

  • The same path is currently sourced from Runbooks.php, ListFindings.php, TenantpilotBackfillFindingLifecycle, TenantpilotRunDeployRunbooks, FindingsLifecycleBackfillRunbookService, dedicated jobs, OperationCatalog, and OperationRunTriageService.
  • The product-truth problem is cross-surface. Hiding only the visible buttons would leave CLI, deploy/runtime, catalog, and monitoring traces alive.
  • FR-253-013 requires removing the source trace when a shared registry or helper family still emits lifecycle-backfill semantics.

Evidence:

  • apps/platform/app/Filament/System/Pages/Ops/Runbooks.php
  • apps/platform/app/Filament/Resources/FindingResource/Pages/ListFindings.php
  • apps/platform/app/Console/Commands/TenantpilotBackfillFindingLifecycle.php
  • apps/platform/app/Console/Commands/TenantpilotRunDeployRunbooks.php
  • apps/platform/app/Services/Runbooks/FindingsLifecycleBackfillRunbookService.php
  • apps/platform/app/Support/OperationCatalog.php
  • apps/platform/app/Services/SystemConsole/OperationRunTriageService.php

Alternatives considered:

  • Hide the system runbook only.
    • Rejected: tenant UI, CLI, deploy/runtime, and monitoring traces would still advertise supported behavior.
  • Hide the tenant findings action only.
    • Rejected: /system and runtime hooks would still keep the repair path productized.
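
As a rough aid for checking that source traces are gone and not just hidden, a standalone scan along these lines could be run against the runtime tree. It is a sketch only and not part of the repo: the function name `find_residual_traces` is hypothetical, the marker substrings are derived from the identifiers listed under Evidence, and the scan should target runtime code (for example `apps/platform/app`) because absence tests and spec documents legitimately name the removed key.

```python
from pathlib import Path

# Substrings derived from identifiers named in this document's evidence lists.
MARKERS = (
    "findings.lifecycle.backfill",
    "FindingsLifecycleBackfill",
    "BackfillFindingLifecycle",
)


def find_residual_traces(root, markers=MARKERS, suffixes=(".php",)):
    """Return (relative_path, line_number, marker) hits under root, in path order."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        # Only inspect regular files with one of the requested suffixes.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for line_number, line in enumerate(text.splitlines(), start=1):
            for marker in markers:
                if marker in line:
                    relative = path.relative_to(root).as_posix()
                    hits.append((relative, line_number, marker))
    return hits
```

An empty result for the runtime tree is the expected outcome after the cleanup. Any hit points at a file that still starts, labels, or advertises the path.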

Decision 2 — Delete the backfill-only runtime cluster; do not keep no-op compatibility shells

Decision: Delete TenantpilotBackfillFindingLifecycle, delete TenantpilotRunDeployRunbooks if lifecycle backfill is still its only shipped responsibility, and delete the dedicated backfill service and jobs instead of leaving dormant compatibility shells.

Rationale:

  • LEAN-001 explicitly prefers replacement or deletion over shims in this repo.
  • TenantpilotRunDeployRunbooks currently delegates only to the shared backfill service, so leaving it behind as a no-op would preserve false product truth.
  • The dedicated workspace and tenant job chain exists only for lifecycle backfill and has no independent product purpose after cleanup.

Evidence:

  • apps/platform/app/Console/Commands/TenantpilotRunDeployRunbooks.php
  • apps/platform/app/Console/Commands/TenantpilotBackfillFindingLifecycle.php
  • apps/platform/app/Services/Runbooks/FindingsLifecycleBackfillRunbookService.php
  • apps/platform/app/Jobs/BackfillFindingLifecycleJob.php
  • apps/platform/app/Jobs/BackfillFindingLifecycleWorkspaceJob.php
  • apps/platform/app/Jobs/BackfillFindingLifecycleTenantIntoWorkspaceRunJob.php

Alternatives considered:

  • Keep the commands as deprecated wrappers that print a skip message.
    • Rejected: still productizes the removed path.
  • Leave the service and jobs in place behind an always-false gate.
    • Rejected: dead runtime ballast is exactly what this cleanup is intended to remove.

Decision 3 — Preserve canonical findings workflows; defer deeper semantics cleanup

Decision: Keep canonical findings workflow behavior unchanged and limit this slice to removing the backfill path and any direct references that disappear with it. Continue to treat acknowledged-status cleanup and creation-time lifecycle invariants as explicit follow-up candidates.

Rationale:

  • spec-candidates.md lists Remove Findings Lifecycle Backfill Runtime Surfaces, Remove Legacy Acknowledged Finding Status Compatibility, and Enforce Creation-Time Finding Invariants as three distinct slices, with the latter two as follow-ups to this one.
  • The current backfill jobs mutate more than surface wiring: they normalize legacy acknowledged to triaged, fill lifecycle fields, fill SLA fields, and consolidate drift duplicates. Folding those semantics into this cleanup would widen scope beyond “remove shipped repair tooling”.
  • The spec and approval rubric both require a bounded cleanup slice.

Evidence:

  • docs/product/spec-candidates.md
  • docs/product/implementation-ledger.md
  • apps/platform/app/Jobs/BackfillFindingLifecycleJob.php
  • apps/platform/app/Jobs/BackfillFindingLifecycleTenantIntoWorkspaceRunJob.php

Alternatives considered:

  • Merge acknowledged-status cleanup into this slice.
    • Rejected: deeper workflow, badge, query, and RBAC consequences deserve their own bounded spec.
  • Merge creation-time invariant hardening into this slice.
    • Rejected: generator and reopen semantics hardening is broader than runtime-surface deletion and should follow after repair tooling is gone.

Decision 4 — Treat operational-control backfill traces as partial residue, not active product truth

Decision: Remove remaining operational-control-related lifecycle-backfill branches and tests rather than trying to make the control path “consistent again”.

Rationale:

  • The repo already partially removed the operational-control surface for this path. OperationalControlCatalogTest rejects findings.lifecycle.backfill, and OperationalControlManagementTest asserts the controls page no longer renders it.
  • The backfill service and some feature tests still carry OperationalControlBlockedException handling and blocked-start audit expectations for the removed control key.
  • Re-adding the control key would widen product truth in the wrong direction.

Evidence:

  • apps/platform/tests/Unit/Support/OperationalControls/OperationalControlCatalogTest.php
  • apps/platform/tests/Feature/System/OpsControls/OperationalControlManagementTest.php
  • apps/platform/tests/Feature/Findings/OperationalControlFindingsBackfillGateTest.php
  • apps/platform/tests/Feature/System/OpsRunbooks/OperationalControlRunbookGateTest.php
  • apps/platform/app/Services/Runbooks/FindingsLifecycleBackfillRunbookService.php

Alternatives considered:

  • Reintroduce findings.lifecycle.backfill to the operational-control catalog so all traces line up.
    • Rejected: that would reverse an already-desirable cleanup and keep the non-shipping feature alive.

Decision 5 — Historical OperationRun and audit rows remain tolerated legacy data without new aliases

Decision: Historical operation_runs.type = findings.lifecycle.backfill rows and prior audit rows may remain stored, but the cleanup must not add new alias handling, new UI guarantees, or special retry or cancel semantics solely for those historical rows.

Rationale:

  • LEAN-001 forbids compatibility layers without production data pressure.
  • OperationRunTriageService still treats findings.lifecycle.backfill as retryable and cancelable. That support is part of the shipped runtime story and should disappear with the runtime path rather than being preserved for historical records.
  • The spec explicitly says historical data migration and historical compatibility handling are out of scope.

Evidence:

  • apps/platform/app/Services/SystemConsole/OperationRunTriageService.php
  • apps/platform/app/Support/OperationCatalog.php
  • specs/253-remove-findings-backfill-runtime-surfaces/spec.md

Alternatives considered:

  • Preserve the operation type and alias so old runs keep a polished label forever.
    • Rejected: adds a compatibility obligation for non-shipping behavior.
  • Add a migration to scrub old rows.
    • Rejected: out of scope and not justified in pre-production.
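
The intended end state for historical rows can be illustrated with a small model. It is a sketch under stated assumptions and does not reproduce the repo's `OperationCatalog` or `OperationRunTriageService`: `OperationType`, `CATALOG`, `describe_run`, and the `example.sync` entry are all hypothetical placeholders, and the generic fallback to the raw type string is an assumption about how unknown types are rendered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationType:
    label: str
    retryable: bool = False
    cancelable: bool = False


# Hypothetical catalog; "example.sync" is a placeholder, not a real operation type.
CATALOG = {
    "example.sync": OperationType("Example sync", retryable=True, cancelable=True),
}


def describe_run(run_type, catalog=CATALOG):
    """Describe a stored run without special-casing removed operation types."""
    entry = catalog.get(run_type)
    if entry is None:
        # Unknown or removed type: no alias, no polished label, no triage actions.
        return {"label": run_type, "actions": []}
    actions = []
    if entry.retryable:
        actions.append("retry")
    if entry.cancelable:
        actions.append("cancel")
    return {"label": entry.label, "actions": actions}
```

Under this model a stored `findings.lifecycle.backfill` row stays readable through the same generic path as any other unknown type, and no retry or cancel action is offered for it.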

Decision 6 — Validation stays in fast-feedback and confidence lanes with absence-focused proof

Decision: Replace backfill-specific start, preflight, gate, and command tests with narrow absence and regression coverage. Keep representative findings workflow regression explicit and do not add browser or heavy-governance coverage.

Rationale:

  • The new business truth is absence of the repair path plus continuity of canonical findings workflows.
  • Existing backfill tests already prove the current path thoroughly; the replacement proof should be just as targeted, but around absence and unaffected workflows.
  • Browser coverage would mostly duplicate Filament action choreography and not improve confidence on the cleanup boundaries.

Evidence:

  • apps/platform/tests/Feature/System/OpsRunbooks/FindingsLifecycleBackfillStartTest.php
  • apps/platform/tests/Feature/Filament/Spec113/AdminFindingsNoMaintenanceActionsTest.php
  • apps/platform/tests/Feature/Console/Spec113/DeployRunbooksCommandTest.php
  • apps/platform/tests/Feature/Findings/OperationalControlFindingsBackfillGateTest.php

Alternatives considered:

  • Keep the existing backfill-only tests and just rename assertions.
    • Rejected: they would still preserve a product contract for a deleted runtime path.
  • Add browser smoke for the deleted buttons.
    • Rejected: the proof needs to show server-side absence and unchanged workflow behavior, not browser choreography.
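
The shape of an absence-plus-continuity check can be sketched independently of the test framework. It assumes per-surface snapshots of entry names taken before and after the cleanup. The function name `cleanup_problems`, the surface names, and the placeholder entries in the usage note are hypothetical, and the real coverage would live in the app's own feature tests.

```python
def cleanup_problems(before, after, removed):
    """List violations of 'removed entry is absent, everything else is unchanged'."""
    problems = []
    # Absence: the removed entry must not appear on any surface after cleanup.
    for surface in sorted(after):
        if removed in after[surface]:
            problems.append(f"{surface}: {removed} still present")
    # Continuity: every other entry that existed before must still exist.
    for surface in sorted(before):
        kept = set(after.get(surface, ()))
        for entry in sorted(set(before[surface]) - {removed}):
            if entry not in kept:
                problems.append(f"{surface}: {entry} went missing")
    return problems
```

For example, with `before = {"tenant_findings_header": ["findings.lifecycle.backfill", "placeholder.keep"]}` and `after = {"tenant_findings_header": ["placeholder.keep"]}`, the function returns an empty list. Dropping `placeholder.keep` as well would be reported as a regression.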

Decision 7 — No panel, global-search, or asset work is part of this cleanup

Decision: Keep the cleanup inside existing system and tenant surfaces. Do not change Filament panel registration, do not introduce or alter global-search behavior, and do not add asset work.

Rationale:

  • The affected surfaces already exist and already run on Filament v5 + Livewire v4.
  • The cleanup removes a header action and a system runbook card, but it does not add a new resource, page family, or asset bundle.
  • Provider registration changes in bootstrap/providers.php or filament:assets deployment work would be unrelated scope growth.

Evidence:

  • apps/platform/app/Filament/System/Pages/Ops/Runbooks.php
  • apps/platform/app/Filament/Resources/FindingResource/Pages/ListFindings.php
  • repo conventions in Agents.md and .github/copilot-instructions.md

Alternatives considered:

  • Add a replacement informational page or asset-backed empty state.
    • Rejected: the narrowest correct implementation is removal, not replacement UX.