spec(116): baseline drift engine specs

Ahmed Darrazi 2026-03-02 02:08:28 +01:00
parent fdfb781144
commit add136cc3c
9 changed files with 1178 additions and 1 deletions


@ -39,6 +39,7 @@ ## Active Technologies
- PostgreSQL (existing tables: `workspaces`, `workspace_memberships`, `users`, `audit_logs`) (107-workspace-chooser)
- PHP 8.4 (Laravel 12) + Filament v5, Livewire v4, Laravel Framework v12 (109-review-pack-export)
- PostgreSQL (jsonb columns for summary/options), local filesystem (`exports` disk) for ZIP artifacts (109-review-pack-export)
- PHP 8.4 + Laravel 12, Filament v5, Livewire v4 (116-baseline-drift-engine)
- PHP 8.4.15 (feat/005-bulk-operations)
@ -58,8 +59,8 @@ ## Code Style
PHP 8.4.15: Follow standard conventions
## Recent Changes
- 116-baseline-drift-engine: Added PHP 8.4 + Laravel 12, Filament v5, Livewire v4
- 110-ops-ux-enforcement: Added PHP 8.4.x + Laravel 12, Filament v5, Livewire v4
- 109-review-pack-export: Added PHP 8.4 (Laravel 12) + Filament v5, Livewire v4, Laravel Framework v12
- 109-review-pack-export: Added [if applicable, e.g., PostgreSQL, CoreData, files or N/A]
<!-- MANUAL ADDITIONS START -->
<!-- MANUAL ADDITIONS END -->


@ -0,0 +1,35 @@
# Specification Quality Checklist: Baseline Drift Engine (Final Architecture)
**Purpose**: Validate specification completeness and quality before proceeding to planning
**Created**: 2026-03-01
**Feature**: [specs/116-baseline-drift-engine/spec.md](../spec.md)
## Content Quality
- [x] No implementation details (languages, frameworks, APIs)
- [x] Focused on user value and business needs
- [x] Written for non-technical stakeholders
- [x] All mandatory sections completed
## Requirement Completeness
- [x] No [NEEDS CLARIFICATION] markers remain
- [x] Requirements are testable and unambiguous
- [x] Success criteria are measurable
- [x] Success criteria are technology-agnostic (no implementation details)
- [x] All acceptance scenarios are defined
- [x] Edge cases are identified
- [x] Scope is clearly bounded
- [x] Dependencies and assumptions identified
## Feature Readiness
- [x] All functional requirements have clear acceptance criteria
- [x] User scenarios cover primary flows
- [x] Feature meets measurable outcomes defined in Success Criteria
- [x] No implementation details leak into specification
## Notes
- Items marked incomplete require spec updates before `/speckit.clarify` or `/speckit.plan`
- This spec uses internal domain terms like “Operation run”, “capability”, and “hash fidelity” intentionally; they are defined in-context and treated as product concepts rather than framework implementation.


@ -0,0 +1,157 @@
openapi: 3.0.3
info:
title: Spec 116 - Baseline Drift Engine
version: 0.1.0
description: |
Minimal contracts for Baseline capture/compare operations and finding summaries.
This repo is primarily Filament-driven; these endpoints represent conceptual contracts
or internal routes/services rather than guaranteed public APIs.
servers:
- url: /
paths:
/internal/baselines/{baselineProfileId}/snapshots:
post:
summary: Capture a baseline snapshot
parameters:
- in: path
name: baselineProfileId
required: true
schema:
type: integer
requestBody:
required: false
responses:
'202':
description: Snapshot capture queued
content:
application/json:
schema:
$ref: '#/components/schemas/OperationQueued'
/internal/baselines/{baselineProfileId}/compare:
post:
summary: Compare baseline snapshot to tenant inventory
parameters:
- in: path
name: baselineProfileId
required: true
schema:
type: integer
requestBody:
required: true
content:
application/json:
schema:
$ref: '#/components/schemas/BaselineCompareRequest'
responses:
'202':
description: Compare queued
content:
application/json:
schema:
$ref: '#/components/schemas/OperationQueued'
/internal/tenants/{tenantId}/findings:
get:
summary: List findings for a tenant (filtered)
parameters:
- in: path
name: tenantId
required: true
schema:
type: integer
- in: query
name: scope_key
required: false
schema:
type: string
- in: query
name: status
required: false
schema:
type: string
enum: [open, resolved]
responses:
'200':
description: Findings list
content:
application/json:
schema:
type: object
properties:
data:
type: array
items:
$ref: '#/components/schemas/Finding'
components:
schemas:
BaselineCompareRequest:
type: object
required: [tenant_id]
properties:
tenant_id:
type: integer
baseline_snapshot_id:
type: integer
nullable: true
description: Optional explicit snapshot selection. If omitted, latest successful snapshot is used.
OperationQueued:
type: object
required: [operation_run_id]
properties:
operation_run_id:
type: integer
Finding:
type: object
required: [id, tenant_id, fingerprint, scope_key, created_at]
properties:
id:
type: integer
tenant_id:
type: integer
fingerprint:
type: string
description: Stable identifier; for baseline drift equals recurrence_key.
recurrence_key:
type: string
nullable: true
scope_key:
type: string
change_type:
type: string
nullable: true
policy_type:
type: string
nullable: true
subject_external_id:
type: string
nullable: true
evidence:
type: object
additionalProperties: true
first_seen_at:
type: string
format: date-time
nullable: true
last_seen_at:
type: string
format: date-time
nullable: true
times_seen:
type: integer
nullable: true
resolved_at:
type: string
format: date-time
nullable: true
created_at:
type: string
format: date-time
updated_at:
type: string
format: date-time


@ -0,0 +1,122 @@
# Phase 1 — Data Model (Baseline Drift Engine)
This document identifies the data/entities involved in Spec 116 and the minimal schema/config changes needed to implement it in this repository.
## Existing Entities (Confirmed)
### BaselineProfile
Represents a baseline definition.
- Fields (expected): `id`, `name`, `description`, `scope` (jsonb), `created_by`, timestamps
- Relationships: has many snapshots; assigned to tenants via `BaselineTenantAssignment`
### BaselineSnapshot
Immutable capture of baseline state at a point in time.
- Fields (expected): `id`, `baseline_profile_id`, `captured_at`, `status`, `operation_run_id`, timestamps
- Relationships: has many items; belongs to baseline profile
### BaselineSnapshotItem
One item in a baseline snapshot.
- Fields (expected):
- `id`, `baseline_snapshot_id`
- `policy_type`
- `external_id`
- `subject_json` (jsonb) or subject fields
- `baseline_hash` (string)
- `meta_jsonb` (jsonb)
- timestamps
### Finding
Generic drift finding storage.
- Fields (confirmed by usage): `tenant_id`, `fingerprint` (unique with tenant), `recurrence_key` (nullable), `scope_key`, lifecycle fields (`first_seen_at`, `last_seen_at`, `times_seen`), evidence (jsonb)
### OperationRun
Tracks long-running operations.
- Fields (by convention): `type`, `status/outcome`, `summary_counts` (numeric map), `context` (jsonb)
## New / Adjusted Data Requirements
### 1) Inventory sync coverage context
**Goal:** Baseline compare must know which policy types were actually processed successfully by inventory sync.
**Where:** `operation_runs.context` for the latest inventory sync run.
**Shape (proposed):**
```json
{
"inventory": {
"coverage": {
"policy_types": {
"deviceConfigurations": {"status": "succeeded", "item_count": 123},
"compliancePolicies": {"status": "failed", "error": "..."}
},
"foundation_types": {
"securityBaselines": {"status": "succeeded", "item_count": 4}
}
}
}
}
```
**Notes:**
- Only `summary_counts` must remain numeric; detailed coverage lists live in `context`.
- For Spec 116 v1, it's sufficient to store `policy_types` coverage; adding `foundation_types` coverage at the same time keeps parity with scope rules.
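A minimal sketch of how a consumer might read this payload, written in Python purely for illustration (the repository itself is PHP/Laravel). The function name `covered_types` is hypothetical; only the `inventory.coverage.*` shape comes from the proposal above. The sketch assumes a type counts as covered only when its recorded status is `succeeded`.

```python
from typing import Any


def covered_types(context: dict[str, Any], group: str = "policy_types") -> set[str]:
    """Return the types in `group` whose recorded status is "succeeded".

    A missing or malformed payload yields an empty set, so callers treat
    coverage as unproven instead of guessing.
    """
    node: Any = context
    for key in ("inventory", "coverage", group):
        if not isinstance(node, dict):
            return set()
        node = node.get(key)
    if not isinstance(node, dict):
        return set()
    return {
        name
        for name, entry in node.items()
        if isinstance(entry, dict) and entry.get("status") == "succeeded"
    }


context = {
    "inventory": {
        "coverage": {
            "policy_types": {
                "deviceConfigurations": {"status": "succeeded", "item_count": 123},
                "compliancePolicies": {"status": "failed", "error": "..."},
            },
            "foundation_types": {
                "securityBaselines": {"status": "succeeded", "item_count": 4},
            },
        }
    }
}
print(covered_types(context))  # {'deviceConfigurations'}
```

Returning an empty set for unreadable input keeps the consumer on the fail-safe side: nothing is considered covered unless the payload says so explicitly.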
### 2) Baseline scope schema
**Goal:** Support both policy and foundation scope with correct defaults.
**Current:** `policy_types` only.
**Target:**
```json
{
"policy_types": ["deviceConfigurations", "compliancePolicies"],
"foundation_types": ["securityBaselines"]
}
```
**Default semantics:**
- Empty `policy_types` means “all supported policy types excluding foundations”.
- Empty `foundation_types` means “none”.
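The default semantics can be sketched as a small expansion helper. This is an illustrative Python sketch, not the `BaselineScope` implementation: the function name, its parameters, and the choice to silently drop unknown types are all assumptions.

```python
def effective_scope(scope, supported_policy_types, supported_foundation_types):
    """Expand a stored scope into the concrete types a run would process."""
    requested_policy = scope.get("policy_types") or []
    requested_foundation = scope.get("foundation_types") or []
    foundations = set(supported_foundation_types)

    if requested_policy:
        # Explicit selection: keep only supported, non-foundation types (assumption).
        policy = [
            t for t in requested_policy
            if t in supported_policy_types and t not in foundations
        ]
    else:
        # Empty means "all supported policy types excluding foundations".
        policy = [t for t in supported_policy_types if t not in foundations]

    # Empty means "none": foundations are opt-in only.
    foundation = [t for t in requested_foundation if t in foundations]

    return {"policy_types": sorted(policy), "foundation_types": sorted(foundation)}


supported = ["deviceConfigurations", "compliancePolicies"]
foundations = ["securityBaselines"]
print(effective_scope({"policy_types": [], "foundation_types": []}, supported, foundations))
```

Sorting the output keeps the expanded scope deterministic, which is convenient when it is later written into a run context for auditing.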
### 3) Findings recurrence strategy
**Goal:** Stable identity per snapshot and per subject.
- `findings.recurrence_key`: populated for baseline compare findings.
- `findings.fingerprint`: set to the same recurrence key (to satisfy existing uniqueness constraint).
**Recurrence key inputs:**
- `tenant_id`
- `baseline_snapshot_id`
- `policy_type`
- `subject_external_id`
- `change_type`
**Grouping (scope_key):**
- Keep `findings.scope_key = baseline_profile:{baselineProfileId}` for baseline compare findings.
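One way the recurrence key could be derived from those inputs is sketched below in Python. The use of SHA-256 over a JSON-encoded list is an assumption made for illustration; the spec only fixes the inputs, not the encoding or the hash function. The function name is hypothetical.

```python
import hashlib
import json


def recurrence_key(tenant_id, baseline_snapshot_id, policy_type, subject_external_id, change_type):
    """Derive a stable identity from the five documented inputs only.

    Evidence hashes (baseline/current) are deliberately not part of the key.
    """
    parts = [
        str(tenant_id),
        str(baseline_snapshot_id),
        policy_type,
        subject_external_id,
        change_type,
    ]
    # JSON-encoding the list avoids ambiguity that a plain delimiter could introduce.
    payload = json.dumps(parts, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


key = recurrence_key(1, 10, "deviceConfigurations", "abc-123", "different_version")
finding = {
    "recurrence_key": key,
    "fingerprint": key,  # same value, to satisfy unique(tenant_id, fingerprint)
    "scope_key": "baseline_profile:7",
}
```

Because `baseline_snapshot_id` is an input, a re-captured snapshot yields a different key for the same subject, while the `scope_key` stays profile-scoped.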
### 4) Inventory meta contract
**Goal:** Explicitly define what is hashed for v1 comparisons.
- Implemented as a dedicated builder class (no schema change required).
- Used by baseline capture to compute `baseline_hash` and by compare to compute `current_hash`.
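A rough Python sketch of such a builder follows. The field names in `CONTRACT_FIELDS` are hypothetical stand-ins for the whitelisted signals, and SHA-256 over canonical JSON is an assumed hashing choice; neither is taken from the codebase.

```python
import hashlib
import json

# Hypothetical whitelist; the real contract defines its own field names.
CONTRACT_FIELDS = ("etag", "last_modified", "scope_tag_ids", "assignment_target_count", "version")


def build_meta_contract(policy_type, external_id, meta):
    """Reduce arbitrary inventory meta to a fixed, normalized set of keys."""
    contract = {"policy_type": policy_type, "external_id": external_id}
    for field in CONTRACT_FIELDS:
        value = meta.get(field)  # missing fields become None
        if isinstance(value, list):
            value = sorted(str(v) for v in value)  # order-insensitive lists
        contract[field] = value
    return contract


def hash_contract(contract):
    """Hash the contract deterministically via canonical JSON."""
    canonical = json.dumps(contract, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


baseline_hash = hash_contract(
    build_meta_contract("deviceConfigurations", "abc-123", {"etag": "W/1", "scope_tag_ids": ["b", "a"]})
)
current_hash = hash_contract(
    build_meta_contract("deviceConfigurations", "abc-123", {"scope_tag_ids": ["a", "b"], "etag": "W/1", "noise": 1})
)
```

Since capture and compare both go through the same two functions, keys outside the whitelist (such as `noise` above) cannot cause a hash mismatch.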
## Potential Migrations (Likely)
- If `baseline_profiles.scope` is not already jsonb → add a migration to convert it. If it is jsonb but lacks foundation types, the column can stay as-is: add `foundation_types` support in code, and a DB change may be optional.
- If coverage needs to persist beyond the operation run context → avoid adding tables unless proven necessary; context-based storage is sufficient for v1.
## Index / Performance Notes
- Findings queries commonly filter by `tenant_id` + `scope_key`; ensure there is an index on `(tenant_id, scope_key)`.
- Baseline snapshot items must be efficiently loaded by `(baseline_snapshot_id, policy_type)`.


@ -0,0 +1,242 @@
# Implementation Plan: 116 — Baseline Drift Engine (Final Architecture)
**Branch**: `116-baseline-drift-engine` | **Date**: 2026-03-01 | **Spec**: `specs/116-baseline-drift-engine/spec.md`
**Input**: Feature specification from `specs/116-baseline-drift-engine/spec.md`
## Summary
Align the existing baseline capture/compare pipeline to Spec 116 by (1) defining an explicit meta-fidelity hash contract, (2) enforcing the “coverage guard” based on the latest inventory sync run, and (3) switching baseline-compare findings to snapshot-scoped stable identities (recurrence keys) while preserving existing baseline-profile grouping for UI/stats and auto-close semantics.
## Technical Context
**Language/Version**: PHP 8.4
**Primary Dependencies**: Laravel 12, Filament v5, Livewire v4
**Storage**: PostgreSQL (via Sail)
**Testing**: Pest v4 (PHPUnit 12)
**Target Platform**: Docker (Laravel Sail)
**Project Type**: Web application (Laravel)
**Performance Goals**: Compare jobs must remain bounded by scope size; avoid N+1 queries when loading snapshot + current inventory
**Constraints**:
- Ops-UX: OperationRun lifecycle + 3-surface feedback (toast queued-only, progress in widget/run detail, terminal DB notification exactly-once)
- Summary counts numeric-only and keys restricted to `OperationSummaryKeys`
- Tenant/workspace isolation + RBAC deny-as-not-found rules
**Scale/Scope**: Tenant inventories may be large; baseline compare must be efficient on `(tenant_id, policy_type)` filtering
## Constitution Check
*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*
- Inventory-first: PASS — compare uses Inventory as last observed state; baselines are immutable snapshots.
- Read/write separation: PASS — this feature is read-only analysis; no Graph writes.
- Graph contract path: PASS — inventory sync already uses `GraphClientInterface`; baseline compare itself is DB-only at render time.
- Deterministic capabilities: PASS — baseline capability checks use existing registries/policies; no new ad-hoc strings.
- Workspace + tenant isolation: PASS — baseline profiles are workspace-owned; runs/findings are tenant-owned; authorization remains deny-as-not-found for non-members.
- Run observability (OperationRun): PASS — capture/compare already use OperationRun + queued jobs.
- Ops-UX 3-surface feedback: PASS — existing pages use canonical queued toast presenter.
- Ops-UX lifecycle: PASS — transitions must remain inside `OperationRunService`.
- Ops-UX summary counts: PASS — only numeric summary counts using canonical keys.
- Filament UI contract: PASS — only small scope-picker adjustments; no new pages beyond what exists.
- Filament UX-001 layout: PASS — Baseline Profile Create/Edit will be updated to a Main/Aside layout as part of the scope-picker work.
## Project Structure
### Documentation (this feature)
```text
specs/116-baseline-drift-engine/
├── plan.md
├── research.md
├── data-model.md
├── quickstart.md
├── contracts/
│ └── openapi.yaml
└── checklists/
└── requirements.md
```
### Source Code (repository root)
```text
app/
├── Filament/
│ ├── Pages/ # Baseline compare landing + run detail links (existing)
│ ├── Resources/ # BaselineProfileResource (existing)
│ └── Widgets/ # Baseline compare widgets (existing)
├── Jobs/
│ ├── CaptureBaselineSnapshotJob.php
│ └── CompareBaselineToTenantJob.php
├── Models/
│ ├── BaselineProfile.php
│ ├── BaselineSnapshot.php
│ ├── BaselineSnapshotItem.php
│ ├── Finding.php
│ ├── InventoryItem.php
│ └── OperationRun.php
├── Services/
│ ├── Baselines/
│ │ ├── BaselineCaptureService.php
│ │ ├── BaselineCompareService.php
│ │ ├── BaselineAutoCloseService.php
│ │ └── BaselineSnapshotIdentity.php
│ ├── Drift/
│ │ ├── DriftFindingGenerator.php
│ │ └── DriftHasher.php
│ ├── Inventory/
│ │ └── InventorySyncService.php
│ └── OperationRunService.php
└── Support/
├── Baselines/BaselineCompareStats.php
└── OpsUx/OperationSummaryKeys.php
tests/
└── Feature/
└── Baselines/
├── BaselineCompareFindingsTest.php
├── BaselineComparePreconditionsTest.php
├── BaselineCompareStatsTest.php
└── BaselineOperabilityAutoCloseTest.php
```
**Structure Decision**: Web application (Laravel 12) — all work stays in existing `app/` services/jobs/models and `tests/Feature`.
## Complexity Tracking
No constitution violations are required for this feature.
## Phase 0 — Outline & Research (DONE)
Outputs:
- `specs/116-baseline-drift-engine/research.md`
Key reconciliations captured:
- Baseline compare finding identity will move to recurrence-key based upsert (snapshot-scoped identity) aligned with the existing `DriftFindingGenerator` pattern.
- Coverage guard requires persisting per-type coverage outcomes into the latest inventory sync run context.
- Scope must include `policy_types` + `foundation_types` with correct empty-default semantics.
## Phase 1 — Design & Contracts (DONE)
Outputs:
- `specs/116-baseline-drift-engine/data-model.md`
- `specs/116-baseline-drift-engine/contracts/openapi.yaml`
- `specs/116-baseline-drift-engine/quickstart.md`
Design highlights:
- Coverage lives in `operation_runs.context` for inventory sync runs (detailed lists), while `summary_counts` remain numeric-only.
- Findings use `recurrence_key` and `fingerprint = recurrence_key` for idempotent upserts.
- Findings remain grouped by `scope_key = baseline_profile:{id}` to preserve existing UI/stats and auto-close behavior.
## Phase 1 — Agent Context Update (REQUIRED)
Run:
- `.specify/scripts/bash/update-agent-context.sh copilot`
## Phase 2 — Implementation Plan
### Step 1 — Baseline scope schema + UI picker
Goal: implement FR-116v1-01 and FR-116v1-02.
Changes:
- Update baseline scope handling (`app/Support/Baselines/BaselineScope.php`) to support:
- `policy_types: []` meaning “all supported policy types excluding foundations”
- `foundation_types: []` meaning “none”
- Update `BaselineProfile` form schema (Filament Resource) to show multi-selects for Policy Types and Foundations.
Tests:
- Update/add Pest tests around scope expansion defaults (prefer a focused unit-like test if an expansion helper exists).
### Step 2 — Inventory Meta Contract (explicit hash input)
Goal: implement FR-116v1-04, FR-116v1-05, FR-116v1-06.
Changes:
- Introduce a dedicated contract builder (e.g. `App\Services\Baselines\InventoryMetaContract`) that returns a normalized array for hashing.
- Update baseline capture hashing (`BaselineSnapshotIdentity::hashItemContent()` or the capture service) to hash the contract output only.
- Update baseline compare to compute `current_hash` using the same contract builder.
Tests:
- Add a small Pest test for contract normalization stability (ordering, missing fields, nullability).
- Update baseline capture/compare tests if they currently assume hashing full `meta_jsonb`.
### Step 3 — Inventory sync coverage recording
Goal: record the per-type coverage data that FR-116v1-07 (the coverage guard in Step 4) relies on.
Changes:
- Extend inventory sync pipeline (in `App\Services\Inventory\InventorySyncService` and/or the job that orchestrates sync) to write a coverage payload into the inventory sync `OperationRun.context`:
- Per policy type: status (`succeeded|failed|skipped`) and optional `item_count`.
- Foundations can be included in the same shape if they are part of selection.
- Ensure this is written even when some types fail, so downstream compare can determine uncovered types.
Tests:
- Add/extend tests around inventory sync operation context writing (mocking Graph calls as needed; keep scope minimal).
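The write side of Step 3 might look roughly like the Python sketch below. It is not the `InventorySyncService` code: `build_coverage_payload` and the `sync_one` callback are hypothetical, and treating a `None` return as "skipped" is an assumption. The point it illustrates is that a failure in one type still produces an entry for that type.

```python
def build_coverage_payload(selected_types, sync_one):
    """Run `sync_one(policy_type)` per type and record an outcome for each.

    `sync_one` is assumed to return an item count, return None when the type
    was skipped, or raise on failure.
    """
    policy_types = {}
    for policy_type in selected_types:
        try:
            count = sync_one(policy_type)
        except Exception as exc:  # record the failure and continue with other types
            policy_types[policy_type] = {"status": "failed", "error": str(exc)}
            continue
        if count is None:
            policy_types[policy_type] = {"status": "skipped"}
        else:
            policy_types[policy_type] = {"status": "succeeded", "item_count": count}
    return {"inventory": {"coverage": {"policy_types": policy_types}}}


def fake_sync(policy_type):
    # Stand-in for a real sync call, used only to exercise the sketch.
    if policy_type == "compliancePolicies":
        raise RuntimeError("graph timeout")
    return 3


payload = build_coverage_payload(["deviceConfigurations", "compliancePolicies"], fake_sync)
```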
### Step 4 — Baseline compare coverage guard + outcome semantics
Goal: implement FR-116v1-07 and align to Ops-UX.
Changes:
- In baseline compare job/service:
- Resolve the latest inventory sync run for the tenant.
- Compute `covered_policy_types` from sync run context.
- Compute `uncovered_policy_types = effective_scope.policy_types - covered_policy_types`.
- Skip emission of *all* finding types for uncovered policy types.
- Record coverage details into the compare run `context` for auditability.
- If uncovered types exist, set compare outcome to `partially_succeeded` via `OperationRunService` and set `summary_counts.errors_recorded = count(uncovered_policy_types)`.
- If effective scope expands to zero types, complete as `partially_succeeded` and set `summary_counts.errors_recorded = 1` so the warning remains visible under numeric-only summary counts.
- If there is no completed inventory sync run (or coverage proof is missing/unreadable), treat coverage as unproven for all effective-scope types (fail-safe): emit zero findings and complete as `partially_succeeded`.
Tests:
- Add a new Pest test in `tests/Feature/Baselines` asserting:
- uncovered types cause partial outcome
- uncovered types produce zero findings (even if snapshot/current data would otherwise create missing/unexpected/different)
- covered types still produce findings
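The guard's decision logic can be sketched in isolation. The Python below is illustrative only: `GuardResult` and `apply_coverage_guard` are hypothetical names, `"succeeded"` is an assumed label for the fully successful outcome, and counting every effective-scope type as uncovered when no coverage proof exists is an assumption about how `errors_recorded` would be set in that case.

```python
from dataclasses import dataclass


@dataclass
class GuardResult:
    comparable_types: list   # types for which findings may be emitted
    uncovered_types: list    # types for which no findings are emitted
    outcome: str
    errors_recorded: int


def apply_coverage_guard(effective_policy_types, covered_policy_types):
    """Decide which types may be compared and what the run outcome should be.

    `covered_policy_types` is None when there is no completed sync run or the
    coverage payload is missing/unreadable (fail-safe: nothing is covered).
    """
    scope = sorted(set(effective_policy_types))
    if not scope:
        # Zero effective types: still surface a warning through a numeric count.
        return GuardResult([], [], "partially_succeeded", 1)

    covered = set(covered_policy_types or ())
    comparable = [t for t in scope if t in covered]
    uncovered = [t for t in scope if t not in covered]

    if uncovered:
        return GuardResult(comparable, uncovered, "partially_succeeded", len(uncovered))
    return GuardResult(comparable, [], "succeeded", 0)


result = apply_coverage_guard(
    ["deviceConfigurations", "compliancePolicies"],
    {"deviceConfigurations"},
)
```

Keeping the decision as a pure function of two sets makes the three test assertions above straightforward to express without touching the database.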
### Step 5 — Snapshot-scoped stable finding identity
Goal: implement FR-116v1-09 and FR-116v1-10.
Changes:
- Replace hash-evidence-based `fingerprint` generation in baseline compare with a stable recurrence key:
- Inputs: `tenant_id`, `baseline_snapshot_id`, `policy_type`, `subject_external_id`, `change_type`
- Persist:
- `findings.recurrence_key = <computed>`
- `findings.fingerprint = <same computed>`
- Keep `scope_key = baseline_profile:{baselineProfileId}`.
- Ensure retry idempotency: do not increment lifecycle counters more than once per run identity.
Tests:
- Update `tests/Feature/Baselines/BaselineCompareFindingsTest.php`:
- Ensure fingerprint no longer depends on baseline/current hash.
- Assert stable identity across re-runs with changed evidence hashes.
- Add coverage for “recapture uses new snapshot id → new finding identity”.
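Retry idempotency can be illustrated with an in-memory upsert. This Python sketch is an assumption-laden model, not the persistence code: the dict-based store, the `upsert_finding` name, and the `last_seen_run_id` marker used to detect a retry of the same run are all hypothetical.

```python
def upsert_finding(store, tenant_id, key, run_id, evidence, now):
    """Create or update a finding keyed by (tenant_id, fingerprint).

    Lifecycle counters advance at most once per run identity.
    """
    finding = store.get((tenant_id, key))
    if finding is None:
        finding = {
            "tenant_id": tenant_id,
            "fingerprint": key,
            "recurrence_key": key,
            "times_seen": 1,
            "first_seen_at": now,
            "last_seen_at": now,
            "last_seen_run_id": run_id,
            "evidence": evidence,
        }
        store[(tenant_id, key)] = finding
        return finding

    if finding["last_seen_run_id"] == run_id:
        # Retry of the same run: refresh evidence, leave counters untouched.
        finding["evidence"] = evidence
        return finding

    finding["times_seen"] += 1
    finding["last_seen_at"] = now
    finding["last_seen_run_id"] = run_id
    finding["evidence"] = evidence
    return finding


store = {}
upsert_finding(store, 1, "key-a", run_id=100, evidence={"current_hash": "x"}, now="2026-03-01T10:00:00Z")
upsert_finding(store, 1, "key-a", run_id=100, evidence={"current_hash": "y"}, now="2026-03-01T10:01:00Z")
upsert_finding(store, 1, "key-a", run_id=101, evidence={"current_hash": "z"}, now="2026-03-02T10:00:00Z")
```

Note that the evidence changes on every call while the identity does not, which is the behavior the updated findings test is meant to assert.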
### Step 6 — Auto-close + stats compatibility
Goal: preserve existing operability expectations and keep UI stable.
Changes:
- Ensure `BaselineAutoCloseService` still resolves stale findings after a fully successful compare, even though identities now include snapshot id.
- Confirm `BaselineCompareStats` remains correct for grouping by `scope_key = baseline_profile:{id}`.
Tests:
- Update/keep `tests/Feature/Baselines/BaselineOperabilityAutoCloseTest.php` passing.
- Update `tests/Feature/Baselines/BaselineCompareStatsTest.php` only if scope semantics change.
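To show why profile-scoped grouping keeps auto-close workable with snapshot-scoped identities, here is a small Python model. It is not a description of `BaselineAutoCloseService`; the function name, the list-of-dicts storage, and the `"succeeded"` outcome label are assumptions.

```python
def auto_close_stale(findings, scope_key, seen_keys, outcome, now):
    """Resolve open findings in `scope_key` that the compare run did not see.

    Only a fully successful compare is allowed to close anything.
    """
    if outcome != "succeeded":
        return []
    closed = []
    for finding in findings:
        if (
            finding["scope_key"] == scope_key
            and finding["resolved_at"] is None
            and finding["recurrence_key"] not in seen_keys
        ):
            finding["resolved_at"] = now
            closed.append(finding["recurrence_key"])
    return closed


findings = [
    # Finding from an older snapshot of the same profile.
    {"scope_key": "baseline_profile:7", "recurrence_key": "old-snapshot-key", "resolved_at": None},
    # Finding seen again by the current run.
    {"scope_key": "baseline_profile:7", "recurrence_key": "new-snapshot-key", "resolved_at": None},
    # Finding that belongs to another profile.
    {"scope_key": "baseline_profile:8", "recurrence_key": "other-key", "resolved_at": None},
]
closed = auto_close_stale(findings, "baseline_profile:7", {"new-snapshot-key"}, "succeeded", "2026-03-02T10:00:00Z")
```

In this model, findings keyed to a previous snapshot are never "seen" by a compare against the new snapshot, so they fall out as stale within the same profile scope.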
### Step 7 — Ops UX + auditability
Goal: implement FR-116v1-03 and FR-116v1-11.
Changes:
- Ensure both capture and compare runs write:
- `effective_scope.*` in run context
- coverage summary and uncovered lists when partial
- numeric summary counts using canonical keys only
- per-change-type finding counts in `operation_runs.context.findings.counts_by_change_type`
Tests:
- Add a regression test that asserts `summary_counts` contains only allowed keys and numeric values (where a helper exists).
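The regression check could be shaped like the following Python sketch. `ALLOWED_KEYS` is a placeholder standing in for whatever `OperationSummaryKeys` actually exposes (only `errors_recorded` is named in this plan), and `invalid_summary_entries` is a hypothetical helper.

```python
# Placeholder for the canonical key list; only `errors_recorded` is taken from the plan.
ALLOWED_KEYS = frozenset({"errors_recorded"})


def invalid_summary_entries(summary_counts, allowed=ALLOWED_KEYS):
    """Return a list of problems; an empty list means the map is acceptable."""
    problems = []
    for key, value in summary_counts.items():
        if key not in allowed:
            problems.append(f"{key}: key not allowed")
        elif isinstance(value, bool) or not isinstance(value, (int, float)):
            # bool is excluded explicitly because it is a subclass of int in Python.
            problems.append(f"{key}: value is not numeric")
    return problems


print(invalid_summary_entries({"errors_recorded": 2}))        # []
print(invalid_summary_entries({"uncovered_types": ["a"]}))    # ['uncovered_types: key not allowed']
```

Detailed lists such as the uncovered types belong in the run `context`, so the second call is exactly the kind of entry the regression test should reject.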
## Post-design Constitution Re-check
Expected: PASS (no changes introduce new Graph endpoints or bypass services; OperationRun lifecycle + 3-surface feedback remain intact; RBAC deny-as-not-found semantics preserved).


@ -0,0 +1,57 @@
# Quickstart — Spec 116 Baseline Drift Engine
This quickstart is for developers validating the behavior locally.
## Prerequisites
- Sail up: `vendor/bin/sail up -d`
- Install deps (if needed): `vendor/bin/sail composer install`
- Run migrations: `vendor/bin/sail artisan migrate`
## Workflow
### 1) Run an inventory sync (establish coverage)
- Trigger inventory sync for a tenant using the existing UI/command for inventory sync.
- Verify the latest inventory sync `operation_runs.context` contains a coverage payload:
- `inventory.coverage.policy_types.{type}.status = succeeded|failed|skipped`
- (optionally) `inventory.coverage.foundation_types.{type}.status = ...`
Expected:
- If some types fail or are not processed, the coverage payload reflects that.
### 2) Capture a baseline snapshot
- Use the Baseline Profile UI to capture a snapshot.
Expected:
- Snapshot items store `baseline_hash` computed from the Inventory Meta Contract.
- Snapshot identity/dedupe follows existing snapshot identity rules, but content hashes come from the explicit contract.
### 3) Compare baseline to tenant
- Trigger “Compare now” from the Baseline Compare landing page.
Expected:
- Compare uses latest successful baseline snapshot by default (or explicit snapshot selection if provided).
- Compare uses the latest inventory sync run coverage:
- For uncovered policy types, **no findings are emitted**.
- OperationRun outcome becomes “completed with warnings” (partial) when uncovered types exist.
- `summary_counts.errors_recorded = count(uncovered_types)`.
- Edge case: if effective scope expands to zero types, outcome is still partial (warnings) and `summary_counts.errors_recorded = 1`.
- Findings identity:
- stable `recurrence_key` uses `baseline_snapshot_id` and does **not** include baseline/current hashes.
- `fingerprint == recurrence_key`.
- `scope_key` remains profile-scoped (`baseline_profile:{id}`).
### 4) Validate UI counts
- Verify baseline compare stats remain grouped by the profile scope (`scope_key = baseline_profile:{id}`), consistent with research.md Decision #2.
- Validate that re-capturing (new snapshot) creates a new set of findings due to snapshot-scoped identity (recurrence key includes `baseline_snapshot_id`).
## Minimal smoke test checklist
- Compare with full coverage: produces correct findings; outcome success.
- Compare with partial coverage: produces findings only for covered types; outcome partial; uncovered types listed in context.
- Re-run compare with no changes: no new findings; `times_seen` increments.
- Re-capture snapshot and compare: findings identity changes (snapshot-scoped).


@ -0,0 +1,104 @@
# Phase 0 — Research (Baseline Drift Engine)
This document resolves the open design/implementation questions needed to produce a concrete implementation plan for Spec 116, grounded in the current codebase.
## Repo Reality Check (What already exists)
- Baseline domain tables exist: `baseline_profiles`, `baseline_snapshots`, `baseline_snapshot_items`, `baseline_tenant_assignments`.
- Baseline ops exist:
- Capture: `App\Services\Baselines\BaselineCaptureService` → `App\Jobs\CaptureBaselineSnapshotJob`.
- Compare: `App\Services\Baselines\BaselineCompareService` → `App\Jobs\CompareBaselineToTenantJob`.
- Findings lifecycle primitives exist (times_seen/first_seen/last_seen) and recurrence support exists (`findings.recurrence_key`).
- An existing recurrence-based drift generator exists: `App\Services\Drift\DriftFindingGenerator` (uses `recurrence_key` and also sets `fingerprint = recurrence_key` to satisfy the unique constraint).
- Inventory sync is OperationRun-based and stamps `inventory_items.last_seen_operation_run_id`.
## Decisions
### 1) Finding identity for baseline compare
**Decision:** Baseline compare findings MUST use a stable recurrence key derived from:
- `tenant_id`
- `baseline_snapshot_id` (not baseline profile id)
- `policy_type`
- `subject_external_id`
- `change_type`
This recurrence key is stored in `findings.recurrence_key` and ALSO used as `findings.fingerprint` (to keep the existing unique constraint `unique(tenant_id, fingerprint)` effective).
**Rationale:**
- Matches Spec 116 (identity tied to `baseline_snapshot_id` and independent of evidence hashes).
- Aligns with existing, proven pattern in `DriftFindingGenerator` (recurrence_key-based upsert; fingerprint reused).
**Alternatives considered:**
- Keep `DriftHasher::fingerprint(...)` with baseline/current hashes included → rejected because it changes identity when evidence changes (violates FR-116v1-09).
- Add a new unique DB constraint on `(tenant_id, recurrence_key)` → possible later hardening; not required initially because fingerprint uniqueness already enforces dedupe when `fingerprint = recurrence_key`.
### 2) Scope key for baseline compare findings
**Decision:** Keep findings grouped by baseline profile using `scope_key = baseline_profile:{baselineProfileId}`.
**Rationale:**
- Spec 116 requires snapshot-scoped *identity* (via `baseline_snapshot_id` in the recurrence key), but does not require snapshot-scoped grouping.
- The repository already has UI widgets/stats and auto-close behavior keyed to `baseline_profile:{id}`; keeping scope_key stable minimizes churn and preserves existing semantics.
- Re-captures still create new finding identities because the recurrence key includes `baseline_snapshot_id`.
**Alternatives considered:**
- Snapshot-scoped `scope_key = baseline_snapshot:{id}` → rejected for v1 because it would require larger refactors to stats, widgets, and auto-close queries, without being mandated by the spec.
### 3) Coverage guard (prevent false missing policies)
**Decision:** Coverage MUST be derived from the latest completed `inventory_sync` OperationRun for the tenant:
- Record per-policy-type processing outcomes into that run's context (coverage payload).
- Baseline compare MUST compute `uncovered_policy_types = effective_scope - covered_policy_types`.
- Baseline compare MUST emit **no findings of any kind** for uncovered policy types.
- The compare OperationRun outcome should be `partially_succeeded` when uncovered types exist ("completed with warnings" in Ops UX), and summary counts should include `errors_recorded = count(uncovered_policy_types)`.
**Rationale:**
- Spec FR-116v1-07 and SC-116-03.
- Current compare logic uses `inventory_items.last_seen_operation_run_id` filter only; without an explicit coverage list, a missing type looks identical to a truly empty tenant.
**Alternatives considered:**
- Infer coverage purely from "were there any inventory items for this policy type in the last sync run" → rejected because a legitimately empty type would be indistinguishable from "not synced".
### 4) Inventory meta contract hashing (v1 fidelity=meta)
**Decision:** Introduce an explicit "Inventory Meta Contract" builder used by BOTH capture and compare in v1.
- Inputs: `policy_type`, `external_id`, and a whitelist of stable signals from inventory/meta (etag, last modified, scope tags, assignment target count, version marker when available).
- Output: a normalized associative array, hashed deterministically.
**Rationale:**
- Spec FR-116v1-04: hashing must be based on a stable contract, not arbitrary meta.
- Current `BaselineSnapshotIdentity::hashItemContent()` hashes the entire `meta_jsonb` (including keys like `etag` which may be noisy and keys that may expand over time).
**Alternatives considered:**
- Keep current hashing of `meta_jsonb` → rejected because it is not an explicit contract and may drift as we add inventory metadata.
### 5) Baseline scope + foundations
**Decision:** Extend baseline scope JSON to include:
- `policy_types: []` (empty means default "all supported policy types excluding foundations")
- `foundation_types: []` (empty means default "none")
Foundations list must be derived from the same canonical foundation list used by inventory sync selection logic.
**Rationale:**
- Spec FR-116v1-01.
- Current `BaselineScope` only supports `policy_types` and treats empty as "all" (including foundations) which conflicts with the spec default.
### 6) v2 architecture strategy (content fidelity)
**Decision:** v2 is implemented as an extension of the same pipeline via a provider precedence chain:
`PolicyVersion (if available) → Inventory content (if available) → Meta contract fallback (degraded)`
The baseline compare engine stores dimension flags on the same finding (no additional finding identities).
**Rationale:**
- Spec FR-116v2-01 and FR-116v2-05.
- There is already a content-normalization + hashing stack in `DriftFindingGenerator` (policy snapshot / assignments / scope tags) which can inform the content fidelity provider.
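The precedence chain can be pictured as an ordered list of providers where the first one that yields a state wins. The Python below is a conceptual sketch only: the provider names, the tuple layout, and the returned dictionary shape are assumptions, and the lambdas stand in for real lookups.

```python
def resolve_policy_state(subject, providers):
    """Walk the providers in precedence order and return the first available state.

    Each provider is a (name, fidelity, fn) tuple; fn returns a hash or None.
    """
    for name, fidelity, fn in providers:
        state_hash = fn(subject)
        if state_hash is not None:
            return {"source": name, "fidelity": fidelity, "hash": state_hash}
    return None


# Order mirrors: PolicyVersion -> Inventory content -> Meta contract fallback.
providers = [
    ("policy_version", "content", lambda subject: None),      # not available here
    ("inventory_content", "content", lambda subject: None),   # not available here
    ("meta_contract", "meta", lambda subject: "meta-hash"),   # degraded fallback
]

subject = ("tenant-1", "deviceConfigurations", "abc-123")
state = resolve_policy_state(subject, providers)
```

Recording the resolved fidelity next to the hash is what would let a run report when it fell back to the degraded meta path.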
## Notes / Risks
- Existing baseline compare findings are currently keyed by `fingerprint` that includes baseline/current hashes and uses `scope_key = baseline_profile:{id}`. The v1 migration should plan for “old findings become stale” behavior; do not attempt silent in-place identity rewriting without an explicit migration/backfill plan.
- Coverage persistence must remain numeric-only in `summary_counts` (per Ops-UX). Detailed coverage lists belong in `operation_runs.context`.


@ -0,0 +1,226 @@
# Feature Specification: Baseline Drift Engine (Final Architecture)
**Feature Branch**: `116-baseline-drift-engine`
**Created**: 2026-03-01
**Status**: Draft
**Input**: User description: "Spec 116 — Baseline Drift Engine (Final Architecture)"
## Spec Scope Fields *(mandatory)*
- **Scope**: workspace (baseline definition + capture) + tenant (baseline compare monitoring)
- **Primary Routes**:
- Workspace (admin): Baseline Profiles (create/edit scope, capture baseline)
- Tenant-context (admin): Baseline Compare runs (compare now, run detail) and Drift Findings landing
- **Data Ownership**:
- Workspace-owned: Baseline profiles and baseline snapshots
- Tenant-scoped (within a workspace): Operation runs for baseline capture/compare; drift findings produced by compare
- **RBAC**:
- Workspace (Baselines):
- `workspace_baselines.view`: view baseline profiles + snapshots
- `workspace_baselines.manage`: create/edit/archive baseline profiles, start capture runs
- Tenant (Compare):
- `tenant.sync`: start baseline compare runs
- `tenant_findings.view`: view drift findings
- Tenant access is required for tenant-context surfaces, in addition to workspace membership
For canonical-view specs: not applicable (this is not a canonical-view feature).
## Clarifications
### Session 2026-03-01
- Q: Should finding identity be stable across baseline re-captures, or tied to a specific baseline snapshot? → A: Tie finding identity to `baseline_snapshot_id` (stable within a snapshot; re-capture creates new finding identities).
- Q: In v2, should drift dimensions be stored as flags on a single finding, or as separate findings per dimension? → A: Use one finding with dimension flags (no separate findings per dimension).
- Q: When running a compare, which baseline snapshot should be used by default? → A: Default to latest successful baseline snapshot for the baseline profile; allow explicitly selecting a snapshot.
- Q: When coverage is missing for a policy type, should compare emit any findings for that type? → A: Skip all finding emission for uncovered types (no `missing_policy`, no `unexpected_policy`, no `different_version`).
## Outcomes
- **O-1 One engine**: There is exactly one baseline drift compare engine; no parallel legacy compare/hash paths.
- **O-2 Stable findings (recurrence)**: The same underlying drift maps to the same finding identity across retries and across runs, with lifecycle counters.
- **O-3 Auditability & operator UX**: Each compare run records scope, coverage, and fidelity; partial coverage produces warnings (not misleading “missing policy” noise).
- **O-4 No legacy logic after v2**: After the v2 extension, there are no “meta compare here / diff there” special cases; all drift flows through the same pipeline.
## Definitions
- **Policy subject**: A compare object uniquely identified by `(tenant_id, policy_type, external_id)`.
- **Policy state**: A normalized representation of a policy subject, containing a deterministic hash, fidelity, and observation metadata.
- **Fidelity**:
  - **meta**: drift signal based on a stable “inventory meta contract” (signal-based fields)
  - **content**: drift signal based on canonicalized policy content (semantic)
- **Effective scope**: The expanded set of policy types processed by a run.
- **Coverage**: Which policy types are confirmed to be present/updated in the current state at the time of compare.
## Assumptions
- Baseline drift is sold as “signal-based drift detection” in v1 (meta fidelity), and later upgraded to deep drift (content fidelity) without changing the compare engine semantics.
- The system already has a tenant-scoped inventory sync mechanism capable of recording per-run coverage of which policy types were synced.
- Foundations are treated as opt-in policy types; they are excluded unless explicitly selected.
## User Scenarios & Testing *(mandatory)*
### User Story 1 - Capture and compare a baseline with stable findings (Priority: P1)
As a workspace admin, I want to define a baseline scope, capture a baseline snapshot, and compare a tenant against that baseline, so I can reliably detect and track drift over time.
**Why this priority**: This is the core product slice that makes baseline drift sellable: consistent capture, consistent compare, and stable findings.
**Independent Test**: Can be tested by creating a baseline profile with a defined scope, capturing a snapshot, running compare twice, and verifying stable finding identity and lifecycle counters.
**Acceptance Scenarios**:
1. **Given** a baseline profile with scope “all policy types (excluding foundations)”, **When** I capture a baseline snapshot, **Then** the snapshot contains only in-scope policy subjects and each snapshot item records its hash and fidelity.
2. **Given** a captured baseline snapshot and a tenant current state, **When** I run compare twice with the same inputs, **Then** the same drift maps to the same finding identity and lifecycle counters increment at most once per run.
---
### User Story 2 - Coverage warnings prevent misleading missing-policy findings (Priority: P1)
As an operator, I want the compare run to warn when current-state coverage is partial, so that missing policies are not falsely reported when the system simply lacks data.
**Why this priority**: Trust depends on avoiding false negatives/positives; “missing policy” findings caused by a partial sync are unacceptable noise.
**Independent Test**: Can be tested by running compare with an effective scope where some policy types are intentionally marked as not synced, verifying warning outcome and suppression behavior.
**Acceptance Scenarios**:
1. **Given** a compare run where some policy types in effective scope were not synced, **When** compare is executed, **Then** the run completes with warnings and produces no findings at all for those missing-coverage types.
2. **Given** a compare run where coverage is complete, **When** a baseline policy subject is missing in current state for a covered type, **Then** a missing-policy finding is produced.
---
### User Story 3 - Operators can understand scope, coverage, and fidelity in the UI (Priority: P2)
As an operator, I want drift screens to clearly show what was compared (scope), how complete the data was (coverage), and how “deep” the drift signal is (fidelity), so I can interpret findings correctly.
**Why this priority**: Drift findings are only actionable when the operator understands context and limitations.
**Independent Test**: Can be tested by executing a compare run with and without coverage warnings, verifying that run detail and drift landing surfaces render scope counts, coverage badge, and fidelity indicators.
**Acceptance Scenarios**:
1. **Given** a compare run with full coverage, **When** I open run detail, **Then** I see the compared scope and a coverage status of OK.
2. **Given** a compare run with partial coverage, **When** I open the drift landing and run detail, **Then** I see a warning banner and can see which types were missing coverage.
### Edge Cases
- Compare is retried after a transient failure: findings are not duplicated; lifecycle increments happen at most once per run identity.
- Baseline capture is executed with empty scope lists (interpreted as default semantics): an empty policy types list means “all supported types excluding foundations”; an empty foundations list means “none”.
- Effective scope expands to zero types (e.g., no supported types): run completes with an explicit warning and produces no findings.
- Policy subjects appear/disappear between inventory sync and compare: handled according to coverage rules; does not create missing-policy noise for uncovered types.
- Two different policy subjects accidentally share an external identifier across types: identity is still unambiguous because `policy_type` is part of the subject key.
## Requirements *(mandatory)*
This feature introduces/extends long-running compare work and uses `OperationRun` for capture and compare runs.
It must comply with:
- **Run observability**: Every capture/compare run must have a visible run identity, scope context, coverage context, and outcome.
- **Safety**: Compare must never claim missing policies for policy types where current-state coverage is not proven.
- **Tenant isolation**: All stored states and findings are tenant-scoped; cross-tenant access must be deny-as-not-found.
### Operational UX Contract (Ops-UX)
- Capture and compare run lifecycle transitions are service-owned (not UI-owned).
- Run summaries provide numeric-only counters using ONLY keys from `app/Support/OpsUx/OperationSummaryKeys.php`.
- Coverage warnings MUST be represented using an existing canonical numeric key (default: `errors_recorded`).
- Warning semantics mapping (canonical):
  - Any “completed with warnings” case MUST be represented as `OperationRun.outcome = partially_succeeded`.
  - `summary_counts.errors_recorded` MUST be a numeric indicator of warning magnitude.
    - Default: number of uncovered policy types in effective scope.
    - Edge case (effective scope expands to zero types): `summary_counts.errors_recorded = 1` so the warning remains visible under the numeric-only summary_counts contract.
- Scheduled/system-initiated runs (if any) must not generate user terminal DB notifications; audit is handled via monitoring surfaces.
- Regression guard tests are added/updated to enforce correct run outcome semantics and summary counter rules.
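
The warning mapping above can be sketched as a small pure function. This is a minimal illustration, not the project's implementation: the function name `summarizeCompareWarnings` is hypothetical, and the `succeeded` outcome value is an assumption (the spec only names `partially_succeeded`).

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: derives the run outcome and the warning counter
 * from the effective scope and the uncovered policy types.
 *
 * @param list<string> $effectiveTypes policy types in effective scope
 * @param list<string> $uncoveredTypes policy types without proven coverage
 * @return array{outcome: string, errors_recorded: int}
 */
function summarizeCompareWarnings(array $effectiveTypes, array $uncoveredTypes): array
{
    // Edge case: an empty effective scope must still surface a visible warning.
    if ($effectiveTypes === []) {
        return ['outcome' => 'partially_succeeded', 'errors_recorded' => 1];
    }

    // Default: count only uncovered types that are actually in effective scope.
    $warnings = count(array_intersect($effectiveTypes, $uncoveredTypes));

    return [
        // 'succeeded' is an assumed name for the warning-free outcome.
        'outcome' => $warnings > 0 ? 'partially_succeeded' : 'succeeded',
        'errors_recorded' => $warnings,
    ];
}
```

Keeping the counter numeric in every branch is what lets the zero-types edge case stay visible without introducing a new summary key.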
### Authorization Contract (RBAC-UX)
- Workspace membership + capability gates:
  - `workspace_baselines.view` is required to view baseline profiles and snapshots.
  - `workspace_baselines.manage` is required to create/edit/archive baseline profiles and start capture runs.
  - `tenant.sync` is required to start compare runs.
  - `tenant_findings.view` is required to view drift findings.
- 404 vs 403 semantics:
  - Non-member or not entitled to workspace/tenant scope → 404 (deny-as-not-found)
  - Member but missing capability → 403
- Destructive-like actions (e.g., archiving a baseline profile) require an explicit confirmation step.
- At least one positive and one negative authorization test exist for each mutation surface.
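
The 404 vs 403 ordering above can be illustrated with a framework-free sketch. The function name `resolveAccessStatus` and its parameters are hypothetical; in the application this decision would presumably be made by the existing policy/gate layer rather than a standalone function.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical decision helper for the RBAC-UX contract.
 *
 * @param bool         $isEntitledToScope   member of the workspace and entitled to the tenant (if tenant-context)
 * @param list<string> $grantedCapabilities capabilities held by the actor
 * @param string       $requiredCapability  e.g. 'tenant.sync'
 */
function resolveAccessStatus(bool $isEntitledToScope, array $grantedCapabilities, string $requiredCapability): int
{
    // Scope entitlement is checked first so non-members cannot probe for existence.
    if (! $isEntitledToScope) {
        return 404;
    }

    // Members without the capability get an explicit "forbidden".
    if (! in_array($requiredCapability, $grantedCapabilities, true)) {
        return 403;
    }

    return 200;
}
```

Checking entitlement before capability is the important part: reversing the order would leak the existence of a workspace or tenant to non-members through a 403.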
### Functional Requirements
#### v1 — Meta-fidelity baseline compare (sellable)
- **FR-116v1-01 Baseline profile scope**: Baseline profiles MUST store a scope object with `policy_types` and `foundation_types` lists.
  - Default semantics: `policy_types = []` means all supported policy types excluding foundations; `foundation_types = []` means no foundations.
  - Foundations MUST only be included when explicitly selected.
- **FR-116v1-02 UI scope picker**: The UI MUST provide multi-select controls for Policy Types and Foundations and communicate the default semantics (empty selection = default behavior).
- **FR-116v1-03 Effective scope recorded on runs**: Capture and compare runs MUST record expanded effective scope in run context:
  - `effective_scope.policy_types[]`, `effective_scope.foundation_types[]`, `effective_scope.all_types[]`, and a boolean `effective_scope.foundations_included`.
- **FR-116v1-04 Inventory meta contract**: The system MUST define and persist a stable “inventory meta contract” (signal-based fields) for drift hashing.
  - Minimum required signals: type identifier, version marker (when available), last modified time (when available), scope tags (when available), and assignment target count (when available).
  - Drift hashing for v1 MUST be based only on this contract (not arbitrary meta fields).
- **FR-116v1-05 Provide current-state policy states (meta fidelity)**: For all policy subjects in effective scope, the system MUST produce a normalized policy state for compare, including:
  - subject key (policy type + external id), deterministic hash, fidelity=`meta`, source indicator, and observed timestamp.
- **FR-116v1-06 Baseline capture stores states (not raw)**: Baseline capture MUST store per-subject snapshot items that include the subject identity and the captured hash + fidelity + source.
  - Baseline snapshots MUST NOT contain out-of-scope items.
- **FR-116v1-06a Compare snapshot selection**: Baseline compare MUST, by default, use the latest successful baseline snapshot of the selected baseline profile.
  - The UI MAY allow selecting a specific snapshot explicitly for historical comparisons.
- **FR-116v1-07 Coverage guard**: Compare MUST check current-state coverage recorded by the most recent inventory sync run.
  - If effective scope contains policy types not present in coverage, the compare run MUST complete with warnings.
  - For any uncovered policy type, the compare MUST NOT emit findings of any kind for that type (no `missing_policy`, no `unexpected_policy`, no `different_version`).
  - Drift findings for types with proven coverage may still be produced.
  - If there is no completed inventory sync run (or coverage proof is missing/unreadable), coverage MUST be treated as unproven for all types and the compare MUST produce zero findings (fail-safe) and complete with warnings.
- **FR-116v1-08 Drift rules**: Compare MUST produce drift results per policy subject:
  - Baseline-only → `missing_policy` (only when coverage is proven for the subject's type)
  - Current-only → `unexpected_policy`
  - Both present and hashes differ → `different_version` (with fidelity=`meta`)
- **FR-116v1-09 Stable finding identity**: Findings MUST have a stable identity key derived from: tenant, baseline snapshot, policy type, external id, and change type.
  - Hashes are evidence fields and may update without changing identity.
  - Finding identity MUST be tied to a specific baseline snapshot (re-capture creates a new baseline snapshot and therefore new finding identities).
- **FR-116v1-10 Finding lifecycle + retry idempotency**: Findings MUST record first seen, last seen, and times seen.
  - For a given run identity, lifecycle counters MUST not increment more than once.
- **FR-116v1-11 Auditability**: Each capture and compare run MUST write an audit trail including effective scope counts, coverage warning summary (if any), and finding counts per change type.
  - Audit trail storage (canonical):
    - Aggregations that do not fit `summary_counts` MUST be stored in `operation_runs.context` (not new summary keys).
    - Compare MUST store per-change-type counts in run context under `findings.counts_by_change_type` (e.g., keys: `missing_policy`, `unexpected_policy`, `different_version`).
- **FR-116v1-12 Drift UI context**: Compare run detail and drift landing MUST surface scope, coverage status, and fidelity (meta-based drift) and show a warning banner when coverage warnings were present.
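
As an illustration of FR-116v1-04, the sketch below hashes only a whitelisted set of signals in a fixed order. The field names in `META_CONTRACT_FIELDS`, the function names, and the choice of SHA-256 over JSON are all assumptions made for the example; the spec only fixes which kinds of signals belong to the contract.

```php
<?php

declare(strict_types=1);

// Assumed field names for the minimum required signals; the real contract may differ.
const META_CONTRACT_FIELDS = ['type', 'version', 'last_modified', 'scope_tags', 'assignment_target_count'];

/**
 * Reduces arbitrary inventory meta to the whitelisted contract, in a fixed key order.
 *
 * @param array<string, mixed> $meta
 * @return array<string, mixed>
 */
function buildMetaContract(array $meta): array
{
    $contract = [];

    foreach (META_CONTRACT_FIELDS as $field) {
        $value = $meta[$field] ?? null; // "when available": absent signals become null

        if ($field === 'scope_tags' && is_array($value)) {
            // Tag order carries no meaning, so sort it to keep the hash deterministic.
            $value = array_values(array_unique($value));
            sort($value, SORT_STRING);
        }

        $contract[$field] = $value;
    }

    return $contract;
}

/** Hashes the contract only, never the full meta payload. */
function hashMetaContract(array $meta): string
{
    return hash('sha256', json_encode(buildMetaContract($meta), JSON_THROW_ON_ERROR));
}
```

Because fields outside the whitelist never reach the hash, changes to unrelated meta (for example a display name) do not register as `different_version` drift.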
#### v2 — Content-fidelity extension (deep drift, same engine)
**Deferred / out of scope for this delivery**: The v2 requirements below are intentionally not covered by `specs/116-baseline-drift-engine/tasks.md` and will be implemented in a follow-up spec/milestone.
- **FR-116v2-01 Provider precedence**: Current state MUST be sourced with a precedence chain per policy type: “policy version (if available) → inventory content (if available) → meta fallback (explicitly marked degraded)”.
- **FR-116v2-02 Content hash availability**: The inventory system MUST persist a content hash and capture timestamp for hydrated policy content.
- **FR-116v2-03 Quota-aware hydration**: Content hydration MUST be throttling-safe and resumable, with explicit per-run caps and concurrency limits, and must record hydration coverage in run context.
- **FR-116v2-04 Content normalization rules**: The system MUST define canonicalization rules per policy type, including volatile-field removal and (where needed) redaction hooks.
- **FR-116v2-05 Drift dimensions (optional but final)**: The compare output MAY include dimension flags (content, assignments, scope tags) without changing finding identity.
  - If dimension flags are present, they MUST be stored on the same finding record as evidence/flags; the system MUST NOT create separate findings per dimension.
  - `change_type` semantics remain compatible with v1 (dimensions refine the “different_version” class rather than multiplying identities).
- **FR-116v2-06 Capture/compare use the same pipeline**: Capture and compare MUST use the same policy state pipeline and hashing semantics; v2 must not introduce special-case compare paths.
- **FR-116v2-07 Coverage/fidelity guard**: If content hydration is incomplete for some types, compare MAY still run but must clearly indicate degraded fidelity and must follow registry-defined behavior for those types.
- **FR-116v2-08 No-legacy guarantee**: After v2 cutover, legacy compare/hash helpers are removed and CI guards prevent re-introduction.
## UI Action Matrix *(mandatory when Filament is changed)*
| Surface | Location | Header Actions | Inspect Affordance (List/Table) | Row Actions (max 2 visible) | Bulk Actions (grouped) | Empty-State CTA(s) | View Header Actions | Create/Edit Save+Cancel | Audit log? | Notes / Exemptions |
|---|---|---|---|---|---|---|---|---|---|---|
| Baseline Profiles | Workspace admin | Create Baseline Profile | View action / record inspection (per Action Surface Contract) | Edit, Archive (confirmed) | None | “Create Baseline Profile” | Capture Baseline (compare is tenant-context) | Save, Cancel | Yes | Archive requires confirmation; capture starts OperationRuns and is audited |
| Baseline Capture Run Detail | Workspace admin | None | Linked from runs list | None | None | None | None | N/A | Yes | Shows effective scope + fidelity + counts + warnings |
| Baseline Compare Run Detail | Tenant-context admin | Run Compare (if shown), Re-run Compare (if allowed) | Linked from runs list | None | None | None | None | N/A | Yes | Shows coverage badge and warning banner; uncovered types emit no findings |
| Drift Findings Landing | Tenant-context admin | None | Table filter by change type | View (optional), Acknowledge/Resolve (if workflow exists) | None | None | None | N/A | Yes | Surfaces fidelity + coverage context; no destructive actions required for v1 |
### Key Entities *(include if feature involves data)*
- **Baseline profile**: Defines scope (policy types + opt-in foundations) and is the parent for baseline snapshots.
- **Baseline snapshot item**: Stores per-policy-subject baseline state evidence (hash, fidelity, source, observed timestamp).
- **Compare run**: A recorded operation that compares a tenant current state to a baseline snapshot, including effective scope and coverage warnings.
- **Finding**: A stable, recurring drift finding with lifecycle fields (first seen, last seen, times seen) and evidence (baseline/current hashes, fidelity).
## Success Criteria *(mandatory)*
### Measurable Outcomes
- **SC-116-01 One engine**: All baseline compare and capture runs use exactly one drift pipeline; no alternative compare paths exist in production code.
- **SC-116-02 Stable recurrence**: For a fixed baseline snapshot + tenant + policy subject + change type, repeated compares (including retries) produce at most one finding identity, and lifecycle counters increment at most once per run.
- **SC-116-03 Coverage safety**: When coverage is partial for any effective-scope type, the compare run is visibly marked as “completed with warnings” and produces zero findings for those uncovered types.
- **SC-116-04 Operator clarity**: On the compare run detail screen, operators can see effective scope counts, coverage status, and fidelity within one page load, with a clear warning banner when applicable.
- **SC-116-05 Performance guard (v1)**: Compare runs complete without per-item external hydration calls; runtime scales with number of in-scope subjects via chunking.

@ -0,0 +1,233 @@
---
description: "Executable task breakdown for Spec 116 implementation"
---
# Tasks: 116 — Baseline Drift Engine (Final Architecture)
**Input**: Design documents in `specs/116-baseline-drift-engine/` (`plan.md`, `spec.md`, `research.md`, `data-model.md`, `quickstart.md`, `specs/116-baseline-drift-engine/contracts/openapi.yaml`)
**Tests**: REQUIRED (Pest) for all runtime behavior changes.
---
## Phase 1: Setup (Shared Infrastructure)
**Purpose**: Ensure local dev + feature artifacts are ready.
- [ ] T001 Re-run Speckit prerequisites check via `.specify/scripts/bash/check-prerequisites.sh` (references `specs/116-baseline-drift-engine/plan.md`)
- [ ] T002 Ensure Sail + migrations are up for local validation (references `docker-compose.yml` and `database/migrations/`)
- [ ] T003 [P] Re-validate Spec 116 UI Action Matrix remains accurate after planned UI tweaks in `specs/116-baseline-drift-engine/spec.md`
- [ ] T004 [P] Confirm supported policy types + foundations config sources match UI selectors (`config/tenantpilot.php` and `app/Support/Inventory/InventoryPolicyTypeMeta.php`)
---
## Phase 2: Foundational (Blocking Prerequisites)
**Purpose**: Shared primitives that all user stories depend on (scope semantics, data defaults, and UI inputs).
**Independent Test**: Baseline Profile can be created with the new scope shape, and scope defaults expand deterministically.
- [ ] T005 Update baseline scope schema + default semantics (policy_types excludes foundations by default; foundation_types defaults to none) in `app/Support/Baselines/BaselineScope.php`
- [ ] T006 [P] Update BaselineProfile default scope shape to include `foundation_types` in `database/factories/BaselineProfileFactory.php`
- [ ] T007 [P] Ensure BaselineProfile scope casting/normalization supports `foundation_types` safely in `app/Models/BaselineProfile.php`
- [ ] T008 [P] Create focused tests for scope expansion defaults (empty policy_types => supported excluding foundations; empty foundation_types => none) in `tests/Unit/Baselines/BaselineScopeTest.php`
- [ ] T009 Update BaselineProfile Create/Edit form layout to satisfy UX-001 (Main/Aside 3-column layout; no naked inputs) while preserving existing fields in `app/Filament/Resources/BaselineProfileResource.php`
- [ ] T010 Update BaselineProfile scope picker UI to include Foundations multi-select + corrected helper text semantics in `app/Filament/Resources/BaselineProfileResource.php`
- [ ] T011 Update BaselineProfile infolist to display selected foundations (and default “None”) in `app/Filament/Resources/BaselineProfileResource.php`
- [ ] T012 Run focused verification for foundational scope/UI changes with `vendor/bin/sail artisan test --compact tests/Unit/Baselines/BaselineScopeTest.php` (references `tests/Unit/Baselines/BaselineScopeTest.php`)
**Checkpoint**: Scope semantics + scope UI are correct and test-covered.
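
A minimal sketch of the scope expansion that T005 and T008 describe, assuming the supported type lists are passed in by the caller. The function name `expandBaselineScope` is hypothetical, and silently dropping unsupported selections is an assumption rather than specified behavior.

```php
<?php

declare(strict_types=1);

/**
 * Expands a stored scope into the effective scope recorded on a run.
 *
 * @param array{policy_types?: list<string>, foundation_types?: list<string>} $scope
 * @param list<string> $supportedPolicyTypes     supported policy types (foundations excluded)
 * @param list<string> $supportedFoundationTypes supported foundation types
 * @return array{policy_types: list<string>, foundation_types: list<string>, all_types: list<string>, foundations_included: bool}
 */
function expandBaselineScope(array $scope, array $supportedPolicyTypes, array $supportedFoundationTypes): array
{
    $selectedPolicyTypes = $scope['policy_types'] ?? [];
    $selectedFoundationTypes = $scope['foundation_types'] ?? [];

    // Empty policy_types => all supported policy types, foundations excluded.
    $policyTypes = $selectedPolicyTypes === []
        ? $supportedPolicyTypes
        : array_intersect($supportedPolicyTypes, $selectedPolicyTypes);

    // Empty foundation_types => none; foundations are opt-in only.
    $foundationTypes = array_intersect($supportedFoundationTypes, $selectedFoundationTypes);

    // Deduplicate and sort so the expansion is deterministic.
    $normalize = static function (array $types): array {
        $types = array_values(array_unique($types));
        sort($types, SORT_STRING);

        return $types;
    };

    $policyTypes = $normalize($policyTypes);
    $foundationTypes = $normalize($foundationTypes);

    return [
        'policy_types' => $policyTypes,
        'foundation_types' => $foundationTypes,
        'all_types' => $normalize(array_merge($policyTypes, $foundationTypes)),
        'foundations_included' => $foundationTypes !== [],
    ];
}
```

The two empty-list cases are deliberately asymmetric, which is the behavior the unit tests in T008 are meant to pin down.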
---
## Phase 3: User Story 1 — Capture and compare a baseline with stable findings (Priority: P1) 🎯
**Goal**: Define the v1 meta-fidelity hash contract, use it for capture/compare, and give baseline-compare findings stable, snapshot-scoped identities.
**Independent Test**: Capture a snapshot, run compare twice with the same inputs, and verify finding identity stability + lifecycle idempotency.
### Tests for User Story 1 (write first)
- [ ] T013 [P] [US1] Update capture tests for effective_scope recording + contract-based hashing in `tests/Feature/Baselines/BaselineCaptureTest.php`
- [ ] T014 [P] [US1] Update compare findings tests to assert recurrence-key-based identity (no hashes in fingerprint) + lifecycle idempotency per run + `run.context.findings.counts_by_change_type` is present and accurate in `tests/Feature/Baselines/BaselineCompareFindingsTest.php`
- [ ] T015 [P] [US1] Add/adjust preconditions tests for default baseline snapshot selection (latest successful snapshot) in `tests/Feature/Baselines/BaselineComparePreconditionsTest.php`
- [ ] T016 [P] [US1] Add test that re-capturing (new snapshot id) produces new finding identities (snapshot-scoped) in `tests/Feature/Baselines/BaselineCompareFindingsTest.php`
- [ ] T017 [P] [US1] Add negative auth test ensuring compare start surface is gated by capability (`tenant.sync` / `Capabilities::TENANT_SYNC`) in `tests/Feature/Filament/BaselineCompareLandingStartSurfaceTest.php`
### Implementation for User Story 1
- [ ] T018 [US1] Create + implement Inventory Meta Contract builder (normalized whitelist inputs; deterministic ordering) in `app/Services/Baselines/InventoryMetaContract.php`
- [ ] T019 [US1] Update snapshot hashing to hash ONLY the meta contract output (not entire meta_jsonb) in `app/Services/Baselines/BaselineSnapshotIdentity.php`
- [ ] T020 [US1] Update baseline capture job to compute/store baseline_hash via InventoryMetaContract + record fidelity/meta observation evidence in `app/Jobs/CaptureBaselineSnapshotJob.php`
- [ ] T021 [US1] Ensure capture OperationRun context records `effective_scope.*` (policy_types, foundation_types, all_types, foundations_included) in `app/Services/Baselines/BaselineCaptureService.php`
- [ ] T022 [US1] Update baseline compare job to compute current_hash via InventoryMetaContract consistently with capture in `app/Jobs/CompareBaselineToTenantJob.php`
- [ ] T023 [US1] Switch baseline-compare finding identity to recurrence key derived from (tenant, baseline_snapshot_id, policy_type, external_id, change_type) and set `fingerprint == recurrence_key` in `app/Jobs/CompareBaselineToTenantJob.php`
- [ ] T024 [US1] Enforce per-run idempotency by using `findings.current_operation_run_id` (and/or evidence) so `times_seen` increments at most once per run identity in `app/Jobs/CompareBaselineToTenantJob.php`
- [ ] T025 [US1] Ensure compare run context includes `baseline_profile_id`, `baseline_snapshot_id`, and `findings.counts_by_change_type` for stats/widgets/auditability in `app/Services/Baselines/BaselineCompareService.php`
**Checkpoint**: US1 tests pass: `vendor/bin/sail artisan test --compact tests/Feature/Baselines/BaselineCaptureTest.php tests/Feature/Baselines/BaselineCompareFindingsTest.php tests/Feature/Baselines/BaselineComparePreconditionsTest.php`.
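
The identity and idempotency rules from T023 and T024 can be sketched without any persistence layer. Both function names are hypothetical, findings are modeled as plain arrays, and the `first_seen_at` / `last_seen_at` keys are assumed names; `times_seen` and `current_operation_run_id` follow the task descriptions above.

```php
<?php

declare(strict_types=1);

/** Recurrence key: identity inputs only, no hashes (hashes are evidence). */
function findingRecurrenceKey(int $tenantId, int $baselineSnapshotId, string $policyType, string $externalId, string $changeType): string
{
    // JSON-encoding the tuple avoids delimiter collisions between the parts.
    $parts = [$tenantId, $baselineSnapshotId, $policyType, $externalId, $changeType];

    return hash('sha256', json_encode($parts, JSON_THROW_ON_ERROR));
}

/**
 * Applies one observation to a finding; counters move at most once per run.
 *
 * @param array<string, mixed>|null $finding  existing finding, or null if first seen
 * @param array<string, mixed>      $evidence e.g. baseline/current hashes
 * @return array<string, mixed>
 */
function observeFinding(?array $finding, string $recurrenceKey, int $runId, string $observedAt, array $evidence): array
{
    if ($finding === null) {
        return [
            'fingerprint' => $recurrenceKey,
            'first_seen_at' => $observedAt,
            'last_seen_at' => $observedAt,
            'times_seen' => 1,
            'current_operation_run_id' => $runId,
            'evidence' => $evidence,
        ];
    }

    // Evidence may always be refreshed; it is not part of the identity.
    $finding['evidence'] = $evidence;

    // Retry of the same run: leave lifecycle counters untouched.
    if ($finding['current_operation_run_id'] === $runId) {
        return $finding;
    }

    $finding['last_seen_at'] = $observedAt;
    $finding['times_seen']++;
    $finding['current_operation_run_id'] = $runId;

    return $finding;
}
```

Since `baseline_snapshot_id` is part of the key, a re-capture naturally yields new identities, which is what T016 asserts.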
---
## Phase 4: User Story 2 — Coverage warnings prevent misleading missing-policy findings (Priority: P1)
**Goal**: Persist per-type coverage from inventory sync and enforce the coverage guard in baseline compare (uncovered types emit zero findings; compare outcome is partial with warnings).
**Independent Test**: Run compare with missing coverage for some in-scope types; verify partial outcome + zero findings for uncovered types.
### Tests for User Story 2 (write first)
- [ ] T026 [P] [US2] Extend inventory sync tests to assert per-type coverage payload is written to OperationRun context in `tests/Feature/Inventory/InventorySyncStartSurfaceTest.php`
- [ ] T027 [P] [US2] Create coverage-guard regression test: (a) uncovered types => no findings of any kind; (b) no completed inventory sync run / missing coverage payload => fail-safe zero findings; (c) effective scope expands to zero types => warnings + zero findings; outcome partially_succeeded; covered types still emit findings in `tests/Feature/Baselines/BaselineCompareCoverageGuardTest.php`
### Implementation for User Story 2
- [ ] T028 [US2] Persist inventory sync coverage payload into latest inventory sync run context (`inventory.coverage.policy_types` + `inventory.coverage.foundation_types`) in `app/Services/Inventory/InventorySyncService.php`
- [ ] T029 [P] [US2] Create a small coverage parser/helper to normalize context payload for downstream consumers in `app/Support/Inventory/InventoryCoverage.php`
- [ ] T030 [US2] Update baseline compare to read latest inventory sync run coverage, compute uncovered types, skip emission for uncovered types, and write coverage details into compare run context in `app/Jobs/CompareBaselineToTenantJob.php`
- [ ] T031 [US2] Treat missing coverage proof (no completed inventory sync run, or unreadable/missing coverage payload) as uncovered-for-all-types (fail-safe): emit zero findings and mark outcome partially_succeeded (via OperationRunService), setting numeric summary_counts (including errors_recorded) using canonical keys only in `app/Jobs/CompareBaselineToTenantJob.php`
**Note (canonical warning magnitude)**: For (c) “effective scope expands to zero types”, the compare MUST still surface a warning and therefore MUST set `summary_counts.errors_recorded = 1` (even though uncovered-types count is 0), to keep the warning visible under the numeric-only summary_counts contract.
**Checkpoint**: US2 tests pass: `vendor/bin/sail artisan test --compact tests/Feature/Inventory/InventorySyncStartSurfaceTest.php tests/Feature/Baselines/BaselineCompareCoverageGuardTest.php`.
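
The coverage guard and drift rules for US2 can be sketched together as one pure function. The name `compareWithCoverageGuard` and the array shapes are hypothetical; `null` stands in for "no completed inventory sync run or unreadable coverage payload".

```php
<?php

declare(strict_types=1);

/**
 * @param list<array{policy_type: string, external_id: string, hash: string}> $baselineItems
 * @param list<array{policy_type: string, external_id: string, hash: string}> $currentItems
 * @param list<string>      $effectiveTypes
 * @param list<string>|null $coveredTypes   null => coverage proof missing (fail-safe)
 * @return array{findings: list<array<string, string>>, uncovered_types: list<string>}
 */
function compareWithCoverageGuard(array $baselineItems, array $currentItems, array $effectiveTypes, ?array $coveredTypes): array
{
    // Missing proof means every effective-scope type counts as uncovered.
    $uncoveredTypes = array_values(array_diff($effectiveTypes, $coveredTypes ?? []));
    $comparableTypes = array_values(array_diff($effectiveTypes, $uncoveredTypes));

    // Index by subject key (policy type + external id), dropping uncovered types.
    $indexBySubject = static function (array $items) use ($comparableTypes): array {
        $indexed = [];
        foreach ($items as $item) {
            if (in_array($item['policy_type'], $comparableTypes, true)) {
                $key = json_encode([$item['policy_type'], $item['external_id']], JSON_THROW_ON_ERROR);
                $indexed[$key] = $item;
            }
        }

        return $indexed;
    };

    $baseline = $indexBySubject($baselineItems);
    $current = $indexBySubject($currentItems);

    $finding = static fn (array $item, string $changeType): array => [
        'policy_type' => $item['policy_type'],
        'external_id' => $item['external_id'],
        'change_type' => $changeType,
    ];

    $findings = [];
    foreach ($baseline as $key => $item) {
        if (! isset($current[$key])) {
            $findings[] = $finding($item, 'missing_policy');
        } elseif ($current[$key]['hash'] !== $item['hash']) {
            $findings[] = $finding($item, 'different_version');
        }
    }
    foreach ($current as $key => $item) {
        if (! isset($baseline[$key])) {
            $findings[] = $finding($item, 'unexpected_policy');
        }
    }

    return ['findings' => $findings, 'uncovered_types' => $uncoveredTypes];
}
```

Filtering both sides before the comparison means uncovered types cannot produce any change type, so no per-rule coverage checks are needed.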
---
## Phase 5: User Story 3 — Operators can understand scope, coverage, and fidelity in the UI (Priority: P2)
**Goal**: Surface effective scope, coverage status, and fidelity (meta) in Baseline Compare landing + drift findings surfaces.
**Independent Test**: Execute a compare with and without coverage warnings; verify UI surfaces show badge/banner + scope/coverage/fidelity context.
### Tests for User Story 3 (write first)
- [ ] T032 [P] [US3] Update Baseline Compare landing tests to cover warning/coverage state rendering inputs (stats DTO fields) in `tests/Feature/Filament/BaselineCompareLandingStartSurfaceTest.php`
- [ ] T033 [P] [US3] Update drift landing comparison-info tests to include coverage/fidelity context when source is baseline compare in `tests/Feature/Drift/DriftLandingShowsComparisonInfoTest.php`
### Implementation for User Story 3
- [ ] T034 [US3] Extend BaselineCompareStats DTO to include coverage status + uncovered types summary + fidelity indicator sourced from latest compare run context in `app/Support/Baselines/BaselineCompareStats.php`
- [ ] T035 [US3] Wire new stats fields into the BaselineCompareLanding Livewire page state in `app/Filament/Pages/BaselineCompareLanding.php`
- [ ] T036 [US3] Render coverage badge + warning banner + fidelity label on the landing view in `resources/views/filament/pages/baseline-compare-landing.blade.php`
- [ ] T037 [US3] Add a findings-list banner when latest baseline compare run had uncovered types (linking to the run) in `app/Filament/Resources/FindingResource/Pages/ListFindings.php`
- [ ] T038 [US3] Ensure run detail already shows context; if needed, add baseline compare “Coverage” summary entry for readability in `app/Filament/Resources/OperationRunResource.php`
**Checkpoint**: US3 tests pass: `vendor/bin/sail artisan test --compact tests/Feature/Filament/BaselineCompareLandingStartSurfaceTest.php tests/Feature/Drift/DriftLandingShowsComparisonInfoTest.php`.
---
## Phase 6: Polish & Cross-Cutting Concerns
**Purpose**: Preserve operability semantics (auto-close, stats), Ops-UX compliance, and fast regression feedback.
- [ ] T039 Confirm baseline compare stats remain profile-grouped via `scope_key = baseline_profile:{id}` after identity change in `app/Support/Baselines/BaselineCompareStats.php`
- [ ] T040 Ensure baseline auto-close behavior still works with snapshot-scoped identities (no stale open findings after successful compare) in `app/Services/Baselines/BaselineAutoCloseService.php`
- [ ] T041 [P] Update/verify auto-close regression test remains valid after identity change in `tests/Feature/Baselines/BaselineOperabilityAutoCloseTest.php`
- [ ] T042 [P] Add/extend guard test asserting OperationRun summary_counts are numeric-only and keys are limited to `OperationSummaryKeys::all()` for baseline capture/compare runs in `tests/Feature/Baselines/BaselineCompareFindingsTest.php`
- [ ] T043 Run baseline-focused test pack for Spec 116: `vendor/bin/sail artisan test --compact tests/Feature/Baselines/` (references `tests/Feature/Baselines/`)
- [ ] T044 Run Pint formatter on changed files: `vendor/bin/sail bin pint --dirty --format agent` (references `app/` and `tests/`)
- [ ] T045 Validate developer quickstart still matches real behavior (update if needed) in `specs/116-baseline-drift-engine/quickstart.md`
- [ ] T046 [P] Create Baseline Profile archive action tests (confirmation required + RBAC 403/404 semantics + success path) in `tests/Feature/Baselines/BaselineProfileArchiveActionTest.php`
- [ ] T047 [P] Ensure archive action is declared in Action Surface slots and remains “More” row action only (max 2 visible row actions) in `tests/Feature/Guards/ActionSurfaceContractTest.php`
---
## Dependencies & Execution Order
### User Story Dependency Graph
```mermaid
graph TD
P1[Phase 1: Setup] --> P2[Phase 2: Foundational]
P2 --> US1[US1: Stable capture/compare + findings]
US1 --> US2[US2: Coverage guard]
US2 --> US3[US3: UI context]
US2 --> P6[Phase 6: Polish]
US3 --> P6
```
### Phase Dependencies
- **Setup (Phase 1)** → blocks nothing, but should be done first.
- **Foundational (Phase 2)** → BLOCKS all user stories.
- **US1 / US2 / US3** → US1 and US2 can start after Foundational; US3 additionally depends on US2 (coverage context). In practice, doing US1 before US2 reduces merge conflicts in `app/Jobs/CompareBaselineToTenantJob.php`.
- **Polish (Phase 6)** → after US1/US2 at minimum.
### User Story Dependencies
- **US1 (P1)**: Depends on Phase 2.
- **US2 (P1)**: Depends on Phase 2; touches compare + inventory sync. Strongly recommended after US1 to keep compare changes coherent.
- **US3 (P2)**: Depends on US2 (needs coverage context) and Phase 2.
---
## Parallel Example: User Story 1
```bash
# Tests can be updated in parallel:
Task: "Update capture tests" (tests/Feature/Baselines/BaselineCaptureTest.php)
Task: "Update compare findings tests" (tests/Feature/Baselines/BaselineCompareFindingsTest.php)
Task: "Update compare preconditions tests" (tests/Feature/Baselines/BaselineComparePreconditionsTest.php)
# Implementation can be split with care (different files):
Task: "Implement InventoryMetaContract" (app/Services/Baselines/InventoryMetaContract.php)
Task: "Update BaselineProfileResource scope UI" (app/Filament/Resources/BaselineProfileResource.php)
```
---
## Parallel Example: User Story 2
```bash
# Tests can be updated in parallel:
Task: "Assert inventory sync coverage is written" (tests/Feature/Inventory/InventorySyncStartSurfaceTest.php)
Task: "Add coverage-guard regression test" (tests/Feature/Baselines/BaselineCompareCoverageGuardTest.php)
# Implementation can be split (different files):
Task: "Write coverage payload to sync run context" (app/Services/Inventory/InventorySyncService.php)
Task: "Implement coverage parser helper" (app/Support/Inventory/InventoryCoverage.php)
```
---
## Parallel Example: User Story 3
```bash
# Tests can be updated in parallel:
Task: "Update landing test for coverage/fidelity state" (tests/Feature/Filament/BaselineCompareLandingStartSurfaceTest.php)
Task: "Update drift landing comparison-info test" (tests/Feature/Drift/DriftLandingShowsComparisonInfoTest.php)
# Implementation can be split (different files):
Task: "Extend stats DTO + wiring" (app/Support/Baselines/BaselineCompareStats.php)
Task: "Render landing banner/badges" (resources/views/filament/pages/baseline-compare-landing.blade.php)
Task: "Add findings list banner" (app/Filament/Resources/FindingResource/Pages/ListFindings.php)
```
---
## Implementation Strategy
### Suggested MVP Scope
- **MVP**: US1 (stable capture/compare + stable findings).
- **Trust-critical follow-up**: US2 (coverage guard) is also P1 in the spec and should typically ship immediately after MVP.
### Incremental Delivery
1. Phase 1 + Phase 2
2. US1 → validate stable identities + contract hashing
3. US2 → validate coverage guard + partial outcome semantics
4. US3 → validate operator clarity (badges/banners)
5. Phase 6 → ensure operability + guards + formatting
---
## Notes
- [P] tasks = parallelizable (different files, minimal dependency)
- All tasks include explicit file targets for fast handoff
- Destructive actions already require confirmation (Filament actions use `->requiresConfirmation()`); keep that invariant when editing UI surfaces
- Spec 116 includes a v2 section for future work; v2 requirements are explicitly deferred and are not covered by this tasks list