TenantAtlas/specs/116-baseline-drift-engine/plan.md
ahmido 7620144ab6 Spec 116: Baseline drift engine v1 (meta fidelity + coverage guard) (#141)
Implements Spec 116 baseline drift engine v1 (meta fidelity) with coverage guard, stable finding identity, and Filament UI surfaces.

Highlights
- Baseline capture/compare jobs and supporting services (meta contract hashing via InventoryMetaContract + DriftHasher)
- Coverage proof parsing + compare partial outcome behavior
- Filament pages/resources/widgets for baseline compare + drift landing improvements
- Pest tests for capture/compare/coverage guard and UI start surfaces
- Research report: docs/research/golden-master-baseline-drift-deep-analysis.md

Validation
- `vendor/bin/sail bin pint --dirty`
- `vendor/bin/sail artisan test --compact --filter="Baseline"`

Notes
- No destructive user actions added; compare/capture remain queued jobs.
- Provider registration unchanged (Laravel 11+/12 uses bootstrap/providers.php for panel providers; not touched here).

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #141
2026-03-02 22:02:58 +00:00

Implementation Plan: 116 — Baseline Drift Engine (Final Architecture)

Branch: 116-baseline-drift-engine | Date: 2026-03-01 | Spec: specs/116-baseline-drift-engine/spec.md
Input: Feature specification from specs/116-baseline-drift-engine/spec.md

Summary

Align the existing baseline capture/compare pipeline to Spec 116 by (1) defining an explicit meta-fidelity hash contract, (2) enforcing the “coverage guard” based on the latest inventory sync run, and (3) switching baseline-compare findings to snapshot-scoped stable identities (recurrence keys) while preserving existing baseline-profile grouping for UI/stats and auto-close semantics.

Technical Context

Language/Version: PHP 8.4
Primary Dependencies: Laravel 12, Filament v5, Livewire v4
Storage: PostgreSQL (via Sail)
Testing: Pest v4 (PHPUnit 12)
Target Platform: Docker (Laravel Sail)
Project Type: Web application (Laravel)
Performance Goals: Compare jobs must remain bounded by scope size; avoid N+1 queries when loading snapshot + current inventory
Constraints:

  • Ops-UX: OperationRun lifecycle + 3-surface feedback (toast queued-only, progress in widget/run detail, terminal DB notification exactly-once)
  • Summary counts numeric-only and keys restricted to OperationSummaryKeys
  • Tenant/workspace isolation + RBAC deny-as-not-found rules

Scale/Scope: Tenant inventories may be large; baseline compare must be efficient on (tenant_id, policy_type) filtering

Constitution Check

GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.

  • Inventory-first: PASS — compare uses Inventory as last observed state; baselines are immutable snapshots.
  • Read/write separation: PASS — this feature is read-only analysis; no Graph writes.
  • Graph contract path: PASS — inventory sync already uses GraphClientInterface; baseline compare itself is DB-only at render time.
  • Deterministic capabilities: PASS — baseline capability checks use existing registries/policies; no new ad-hoc strings.
  • Workspace + tenant isolation: PASS — baseline profiles are workspace-owned; runs/findings are tenant-owned; authorization remains deny-as-not-found for non-members.
  • Run observability (OperationRun): PASS — capture/compare already use OperationRun + queued jobs.
  • Ops-UX 3-surface feedback: PASS — existing pages use canonical queued toast presenter.
  • Ops-UX lifecycle: PASS — transitions must remain inside OperationRunService.
  • Ops-UX summary counts: PASS — only numeric summary counts using canonical keys.
  • Filament UI contract: PASS — only small scope-picker adjustments; no new pages beyond what exists.
  • Filament UX-001 layout: PASS — Baseline Profile Create/Edit will be updated to a Main/Aside layout as part of the scope-picker work.

Project Structure

Documentation (this feature)

specs/116-baseline-drift-engine/
├── plan.md
├── research.md
├── data-model.md
├── quickstart.md
├── contracts/
│   └── openapi.yaml
└── checklists/
    └── requirements.md

Source Code (repository root)

app/
├── Filament/
│   ├── Pages/                # Baseline compare landing + run detail links (existing)
│   ├── Resources/            # BaselineProfileResource (existing)
│   └── Widgets/              # Baseline compare widgets (existing)
├── Jobs/
│   ├── CaptureBaselineSnapshotJob.php
│   └── CompareBaselineToTenantJob.php
├── Models/
│   ├── BaselineProfile.php
│   ├── BaselineSnapshot.php
│   ├── BaselineSnapshotItem.php
│   ├── Finding.php
│   ├── InventoryItem.php
│   └── OperationRun.php
├── Services/
│   ├── Baselines/
│   │   ├── BaselineCaptureService.php
│   │   ├── BaselineCompareService.php
│   │   ├── BaselineAutoCloseService.php
│   │   └── BaselineSnapshotIdentity.php
│   ├── Drift/
│   │   ├── DriftFindingGenerator.php
│   │   └── DriftHasher.php
│   ├── Inventory/
│   │   └── InventorySyncService.php
│   └── OperationRunService.php
└── Support/
    ├── Baselines/BaselineCompareStats.php
    └── OpsUx/OperationSummaryKeys.php

tests/
└── Feature/
    └── Baselines/
        ├── BaselineCompareFindingsTest.php
        ├── BaselineComparePreconditionsTest.php
        ├── BaselineCompareStatsTest.php
        └── BaselineOperabilityAutoCloseTest.php

Structure Decision: Web application (Laravel 12) — all work stays in existing app/ services/jobs/models and tests/Feature.

Complexity Tracking

No constitution violations are required for this feature.

Phase 0 — Outline & Research (DONE)

Outputs:

  • specs/116-baseline-drift-engine/research.md

Key reconciliations captured:

  • Baseline compare finding identity will move to recurrence-key based upsert (snapshot-scoped identity) aligned with the existing DriftFindingGenerator pattern.
  • Coverage guard requires persisting per-type coverage outcomes into the latest inventory sync run context.
  • Scope must include policy_types + foundation_types with correct empty-default semantics.

Phase 1 — Design & Contracts (DONE)

Outputs:

  • specs/116-baseline-drift-engine/data-model.md
  • specs/116-baseline-drift-engine/contracts/openapi.yaml
  • specs/116-baseline-drift-engine/quickstart.md

Design highlights:

  • Coverage lives in operation_runs.context for inventory sync runs (detailed lists), while summary_counts remain numeric-only.
  • Findings use recurrence_key and fingerprint = recurrence_key for idempotent upserts.
  • Findings remain grouped by scope_key = baseline_profile:{id} to preserve existing UI/stats and auto-close behavior.

Phase 1 — Agent Context Update (REQUIRED)

Run:

  • .specify/scripts/bash/update-agent-context.sh copilot

Phase 2 — Implementation Plan

Step 1 — Baseline scope schema + UI picker

Goal: implement FR-116v1-01 and FR-116v1-02.

Changes:

  • Update baseline scope handling (app/Support/Baselines/BaselineScope.php) to support:
    • policy_types: [] meaning “all supported policy types excluding foundations”
    • foundation_types: [] meaning “none”
  • Update BaselineProfile form schema (Filament Resource) to show multi-selects for Policy Types and Foundations.
  • Document selector-to-config mapping (source of truth for option lists + defaults):

| Selector | Form state path | Options source | Default semantics |
| --- | --- | --- | --- |
| Policy Types | scope_jsonb.policy_types | config('tenantpilot.supported_policy_types') via App\Support\Inventory\InventoryPolicyTypeMeta::supported() | Empty ⇒ all supported policy types (excluding foundations) |
| Foundations | scope_jsonb.foundation_types | config('tenantpilot.foundation_types') via App\Support\Inventory\InventoryPolicyTypeMeta::foundations() | Empty ⇒ none |

Notes:

  • Inventory sync selection uses App\Services\BackupScheduling\PolicyTypeResolver::supportedPolicyTypes() for policy types, and InventorySyncService::foundationTypes() (derived from config('tenantpilot.foundation_types')) when include_foundations=true.

Tests:

  • Update/add Pest tests around scope expansion defaults (prefer a focused unit-like test if an expansion helper exists).
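
The empty-default semantics in Step 1 can be illustrated with a small, framework-free sketch. The function name `expandBaselineScope()`, its signature, and the choice to silently drop unknown types are assumptions made for illustration; the real logic belongs in app/Support/Baselines/BaselineScope.php and may differ.

```php
<?php

declare(strict_types=1);

/**
 * Expand a stored baseline scope into the effective list of types.
 * Hypothetical helper: name, signature, and return shape are assumptions.
 *
 * @param array{policy_types?: list<string>, foundation_types?: list<string>} $scope
 * @param list<string> $supportedPolicyTypes e.g. config('tenantpilot.supported_policy_types')
 * @param list<string> $foundationTypes      e.g. config('tenantpilot.foundation_types')
 * @return array{policy_types: list<string>, foundation_types: list<string>}
 */
function expandBaselineScope(array $scope, array $supportedPolicyTypes, array $foundationTypes): array
{
    // Policy types offered by the picker never include foundations.
    $selectablePolicyTypes = array_values(array_diff($supportedPolicyTypes, $foundationTypes));

    $policyTypes = $scope['policy_types'] ?? [];
    $foundations = $scope['foundation_types'] ?? [];

    // Empty policy_types means "all supported policy types (excluding foundations)".
    // Assumption: unknown types in the stored scope are dropped rather than rejected.
    $effectivePolicyTypes = $policyTypes === []
        ? $selectablePolicyTypes
        : array_values(array_intersect($selectablePolicyTypes, $policyTypes));

    // Empty foundation_types means "none", so no default expansion happens here.
    $effectiveFoundations = array_values(array_intersect($foundationTypes, $foundations));

    // Sort for deterministic output in run context and tests.
    sort($effectivePolicyTypes);
    sort($effectiveFoundations);

    return [
        'policy_types' => $effectivePolicyTypes,
        'foundation_types' => $effectiveFoundations,
    ];
}
```

A focused test for this kind of helper would cover three cases: an empty scope, an explicit subset, and a scope that names a type no longer present in config.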

Step 2 — Inventory Meta Contract (explicit hash input)

Goal: implement FR-116v1-04, FR-116v1-05, FR-116v1-06, FR-116v1-06a.

Changes:

  • Introduce a dedicated contract builder (e.g. App\Services\Baselines\InventoryMetaContract) that returns a normalized array for hashing.
    • Contract output must be explicitly versioned (e.g., meta_contract.version = 1) so future additions do not retroactively change v1 semantics.
    • Contract signals are best-effort: missing signals are represented as null (not omitted) to keep hashing stable across partial inventories.
  • Update baseline capture hashing (BaselineSnapshotIdentity::hashItemContent() or the capture service) to hash the contract output only.
    • Persist the exact contract payload used for hashing to baseline_snapshot_items.meta_jsonb.meta_contract for auditability/reproducibility.
    • Persist observation metadata alongside the hash in baseline_snapshot_items.meta_jsonb (at minimum: fidelity, source, observed_at; when available: observed_operation_run_id).
  • Update baseline compare to compute current_hash using the same contract builder.
    • Current-state observed_at is derived from persisted inventory evidence (inventory_items.last_seen_at) and MUST NOT require per-item external hydration calls during compare.
  • Define “latest successful snapshot” (v1) as baseline_profiles.active_snapshot_id and ensure compare start is blocked when it is null (no “pick newest captured_at” fallback).

Tests:

  • Add a small Pest test for contract normalization stability (ordering, missing fields, nullability) in tests/Unit/Baselines/InventoryMetaContractTest.php.
  • Update baseline capture/compare tests if they currently assume hashing full meta_jsonb.
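
As a rough illustration of the contract rules in Step 2 (explicit version, missing signals kept as null, hash computed over the contract output only), the following plain-PHP sketch may help. The signal names in `META_CONTRACT_SIGNALS`, the function names, and the use of SHA-256 over canonical JSON are all assumptions; the spec and the existing DriftHasher define the real field list and hashing scheme.

```php
<?php

declare(strict_types=1);

const META_CONTRACT_VERSION = 1;

// Hypothetical signal names, chosen only for illustration.
const META_CONTRACT_SIGNALS = ['display_name', 'etag', 'last_modified_at', 'scope_tag_ids'];

/** Build the normalized contract; fields outside the contract are ignored. */
function buildMetaContract(string $policyType, string $subjectExternalId, array $meta): array
{
    $signals = [];
    foreach (META_CONTRACT_SIGNALS as $key) {
        // Missing signals become null instead of being omitted.
        $value = $meta[$key] ?? null;
        if (is_array($value)) {
            // Assumption: list-valued signals are order-insensitive.
            $value = array_values($value);
            sort($value);
        }
        $signals[$key] = $value;
    }

    return [
        'version' => META_CONTRACT_VERSION,
        'policy_type' => $policyType,
        'subject_external_id' => $subjectExternalId,
        'signals' => $signals,
    ];
}

/** Sort associative keys recursively so key order never affects the hash. */
function sortKeysRecursively(array $value): array
{
    foreach ($value as $key => $item) {
        if (is_array($item)) {
            $value[$key] = sortKeysRecursively($item);
        }
    }
    if (!array_is_list($value)) {
        ksort($value);
    }

    return $value;
}

/** Hash the contract payload only, never the full meta_jsonb. */
function hashMetaContract(array $contract): string
{
    $json = json_encode(
        sortKeysRecursively($contract),
        JSON_THROW_ON_ERROR | JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE
    );

    return hash('sha256', $json);
}
```

Because capture and compare would both call the same builder, a field that is absent from a partial inventory hashes the same as an explicit null on both sides, which is the stability property the normalization test is meant to pin down.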

Step 3 — Inventory sync coverage recording

Goal: record the per-type coverage data that the FR-116v1-07 coverage guard (Step 4) relies on.

Changes:

  • Extend inventory sync pipeline (in App\Services\Inventory\InventorySyncService and/or the job that orchestrates sync) to write a coverage payload into the inventory sync OperationRun.context:
    • Per policy type: status (succeeded|failed|skipped) and optional item_count.
    • Foundations can be included in the same shape if they are part of selection.
  • Ensure this is written even when some types fail, so downstream compare can determine uncovered types.

Tests:

  • Add/extend tests around inventory sync operation context writing (mocking Graph calls as needed; keep scope minimal).
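
One possible shape for the coverage payload described in Step 3 is sketched below. The context keys (`coverage`, `by_type`), the function name, and the rule that a selected type without a recorded result is written as `skipped` are assumptions for illustration; only the status values and the optional `item_count` come from this plan.

```php
<?php

declare(strict_types=1);

const COVERAGE_STATUSES = ['succeeded', 'failed', 'skipped'];

/**
 * Build the coverage fragment to merge into the sync run's context.
 * Hypothetical helper: key names and fallback behaviour are assumptions.
 *
 * @param list<string> $selectedTypes policy (and optionally foundation) types in the sync selection
 * @param array<string, array{status?: string, item_count?: int}> $resultsByType
 */
function buildCoveragePayload(array $selectedTypes, array $resultsByType): array
{
    $byType = [];

    foreach ($selectedTypes as $type) {
        // Every selected type gets an entry, even when the sync never reached it.
        $result = $resultsByType[$type] ?? [];
        $status = $result['status'] ?? 'skipped';

        if (!in_array($status, COVERAGE_STATUSES, true)) {
            throw new InvalidArgumentException("Unknown coverage status: {$status}");
        }

        $entry = ['status' => $status];
        if (isset($result['item_count'])) {
            $entry['item_count'] = (int) $result['item_count'];
        }

        $byType[$type] = $entry;
    }

    // Stable key order keeps stored context easy to diff.
    ksort($byType);

    return ['coverage' => ['by_type' => $byType]];
}
```

Writing an entry for every selected type, including failed ones, is what lets the compare step distinguish "synced and empty" from "never observed".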

Step 4 — Baseline compare coverage guard + outcome semantics

Goal: implement FR-116v1-07 and align to Ops-UX.

Changes:

  • In baseline compare job/service:
    • Resolve the latest inventory sync run for the tenant.
    • Compute covered_policy_types from sync run context.
    • Compute uncovered_policy_types = effective_scope.policy_types - covered_policy_types.
    • Skip emission of all finding types for uncovered policy types.
    • Record coverage details into the compare run context for auditability.
    • If uncovered types exist, set compare outcome to partially_succeeded via OperationRunService and set summary_counts.errors_recorded = count(uncovered_policy_types).
    • If effective scope expands to zero types, complete as partially_succeeded and set summary_counts.errors_recorded = 1 so the warning remains visible under numeric-only summary counts.
    • If there is no completed inventory sync run (or coverage proof is missing/unreadable), treat coverage as unproven for all effective-scope types (fail-safe): emit zero findings and complete as partially_succeeded.

Tests:

  • Add a new Pest test in tests/Feature/Baselines asserting:
    • uncovered types cause partial outcome
    • uncovered types produce zero findings (even if snapshot/current data would otherwise create missing/unexpected/different)
    • covered types still produce findings
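
The guard rules in Step 4 reduce to a small pure decision that can be tested without a database. The sketch below assumes that "covered" means a per-type status of `succeeded`, that the fully successful outcome is named `succeeded`, and that the coverage map has the per-type shape from Step 3; the function name and return keys are hypothetical.

```php
<?php

declare(strict_types=1);

/**
 * Decide coverage and outcome for a compare run.
 * Hypothetical helper: name, return keys, and the 'succeeded' outcome label are assumptions.
 *
 * @param list<string> $effectivePolicyTypes expanded scope for this compare
 * @param array<string, mixed>|null $coverageByType null when no sync run or no readable proof exists
 */
function evaluateCoverageGuard(array $effectivePolicyTypes, ?array $coverageByType): array
{
    // Fail-safe: with no coverage proof, nothing counts as covered.
    $covered = [];
    foreach ($coverageByType ?? [] as $type => $entry) {
        if (is_array($entry) && ($entry['status'] ?? null) === 'succeeded') {
            $covered[] = (string) $type;
        }
    }

    $coveredInScope = array_values(array_intersect($effectivePolicyTypes, $covered));
    $uncovered = array_values(array_diff($effectivePolicyTypes, $covered));

    if ($effectivePolicyTypes === []) {
        // Empty effective scope: keep the warning visible through a numeric count.
        $outcome = 'partially_succeeded';
        $errorsRecorded = 1;
    } elseif ($uncovered !== []) {
        $outcome = 'partially_succeeded';
        $errorsRecorded = count($uncovered);
    } else {
        $outcome = 'succeeded';
        $errorsRecorded = 0;
    }

    return [
        'covered_policy_types' => $coveredInScope,   // findings may be emitted for these only
        'uncovered_policy_types' => $uncovered,      // no findings of any change type
        'outcome' => $outcome,
        'errors_recorded' => $errorsRecorded,
    ];
}
```

The job would then filter snapshot and inventory items to `covered_policy_types` before diffing, and hand `outcome` and `errors_recorded` to OperationRunService rather than setting them directly.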

Step 5 — Snapshot-scoped stable finding identity

Goal: implement FR-116v1-09 and FR-116v1-10.

Changes:

  • Replace hash-evidence-based fingerprint generation in baseline compare with a stable recurrence key:
    • Inputs: tenant_id, baseline_snapshot_id, policy_type, subject_external_id, change_type
  • Persist:
    • findings.recurrence_key = <computed>
    • findings.fingerprint = <same computed>
  • Keep scope_key = baseline_profile:{baselineProfileId}.
  • Ensure retry idempotency: do not increment lifecycle counters more than once per run identity.

Tests:

  • Update tests/Feature/Baselines/BaselineCompareFindingsTest.php:
    • Ensure fingerprint no longer depends on baseline/current hash.
    • Assert stable identity across re-runs with changed evidence hashes.
  • Add coverage for “recapture uses new snapshot id → new finding identity”.
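
A recurrence key built from the five inputs listed in Step 5 could look like the sketch below. The function name and the choice of SHA-256 over a JSON-encoded list are assumptions; the existing DriftFindingGenerator pattern may derive its key differently and should take precedence.

```php
<?php

declare(strict_types=1);

/**
 * Compute a snapshot-scoped identity for a baseline compare finding.
 * Hypothetical helper: evidence hashes are deliberately not part of the input.
 */
function baselineFindingRecurrenceKey(
    int $tenantId,
    int $baselineSnapshotId,
    string $policyType,
    string $subjectExternalId,
    string $changeType
): string {
    // Encoding the parts as a JSON list avoids delimiter collisions between fields.
    $parts = [$tenantId, $baselineSnapshotId, $policyType, $subjectExternalId, $changeType];

    return hash('sha256', json_encode($parts, JSON_THROW_ON_ERROR));
}

// The same value would be stored in both columns for idempotent upserts.
$key = baselineFindingRecurrenceKey(1, 10, 'example_policy_type', 'subject-123', 'different');
$row = ['recurrence_key' => $key, 'fingerprint' => $key];
```

Since neither the baseline hash nor the current hash feeds into the key, a re-run with changed evidence updates the same finding, while a recapture (new `baseline_snapshot_id`) yields a new identity.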

Step 6 — Auto-close + stats compatibility

Goal: preserve existing operability expectations and keep UI stable.

Changes:

  • Ensure BaselineAutoCloseService still resolves stale findings after a fully successful compare, even though identities now include snapshot id.
  • Confirm BaselineCompareStats remains correct for grouping by scope_key = baseline_profile:{id}.

Tests:

  • Update/keep tests/Feature/Baselines/BaselineOperabilityAutoCloseTest.php passing.
  • Update tests/Feature/Baselines/BaselineCompareStatsTest.php only if scope semantics change.
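
One way to reason about Step 6 is that staleness is decided per scope_key, not per snapshot. The sketch below is an assumption about how that selection could work, not a description of BaselineAutoCloseService: the function name, the `succeeded` outcome label, and the set-difference approach are all hypothetical.

```php
<?php

declare(strict_types=1);

/**
 * Pick the open findings (within one scope_key) that the latest compare no longer reported.
 * Hypothetical helper: the real service may use different inputs and queries.
 *
 * @param list<string> $openRecurrenceKeys keys of open findings under scope_key = baseline_profile:{id}
 * @param list<string> $seenRecurrenceKeys keys upserted by the compare run that just finished
 * @return list<string> keys that are candidates for auto-close
 */
function findStaleRecurrenceKeys(string $outcome, array $openRecurrenceKeys, array $seenRecurrenceKeys): array
{
    // A partial run proves nothing about uncovered types, so it must not close anything.
    if ($outcome !== 'succeeded') {
        return [];
    }

    return array_values(array_diff($openRecurrenceKeys, $seenRecurrenceKeys));
}
```

Under this model, findings keyed to an older snapshot are never "seen" again after a recapture, so the first fully successful compare against the new snapshot would mark them stale within the same profile scope.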

Step 7 — Ops UX + auditability

Goal: implement FR-116v1-03 and FR-116v1-11.

Changes:

  • Ensure both capture and compare runs write:
    • effective_scope.* in run context
    • coverage summary and uncovered lists when partial
    • numeric summary counts using canonical keys only
    • per-change-type finding counts in operation_runs.context.findings.counts_by_change_type
  • Treat the operation_runs record as the canonical audit trail for this feature slice (do not add parallel “audit summary” persistence for the same data).

Tests:

  • Add a regression test that asserts summary_counts contains only allowed keys and numeric values (where a helper exists).
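
The regression check for summary_counts can be expressed as a pure function that a Pest test then asserts on. In the sketch below the function name is hypothetical and the allowed-key list is passed in as a parameter, since the actual keys live in OperationSummaryKeys; numeric strings are treated as invalid, which is an assumption about how strict "numeric-only" is meant to be.

```php
<?php

declare(strict_types=1);

/**
 * Return a map of offending key => reason; an empty array means the counts are valid.
 * Hypothetical helper: the canonical key list must come from OperationSummaryKeys.
 *
 * @param array<array-key, mixed> $summaryCounts
 * @param list<string> $allowedKeys
 * @return array<array-key, string>
 */
function invalidSummaryCountKeys(array $summaryCounts, array $allowedKeys): array
{
    $violations = [];

    foreach ($summaryCounts as $key => $value) {
        if (!in_array($key, $allowedKeys, true)) {
            $violations[$key] = 'key not allowed';
            continue;
        }

        // Lists and strings (such as uncovered type names) belong in context, not in counts.
        if (!is_int($value) && !is_float($value)) {
            $violations[$key] = 'value not numeric';
        }
    }

    return $violations;
}
```

Detailed data such as uncovered type lists or counts_by_change_type would fail this check by design, which is why the plan places them in operation_runs.context instead.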

Post-design Constitution Re-check

Expected: PASS (no changes introduce new Graph endpoints or bypass services; OperationRun lifecycle + 3-surface feedback remain intact; RBAC deny-as-not-found semantics preserved).