TenantAtlas/specs/118-baseline-drift-engine/plan.md
ahmido 92704a2f7e Spec 118: Resumable baseline evidence capture + snapshot UX (#143)
Implements Spec 118 baseline drift engine improvements:

- Resumable, budget-aware evidence capture for baseline capture/compare runs (resume token + UI action)
- “Why no findings?” reason-code driven explanations and richer run context panels
- Baseline Snapshot resource (list/detail) with fidelity visibility
- Retention command + schedule for pruning baseline-purpose PolicyVersions
- i18n strings for Baseline Compare landing

Verification:
- `vendor/bin/sail bin pint --dirty --format agent`
- `vendor/bin/sail artisan test --compact --filter=Baseline` (159 passed)

Note:
- `docs/audits/redaction-audit-2026-03-04.md` left untracked (not part of PR).

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #143
2026-03-04 22:34:13 +00:00

# Implementation Plan: Golden Master Deep Drift v2 (Full Content Capture)
**Branch**: `118-baseline-drift-engine` | **Date**: 2026-03-03 | **Spec**: /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/118-baseline-drift-engine/spec.md
**Input**: Feature specification from `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/118-baseline-drift-engine/spec.md`
## Summary
Enable reliable, settings-level drift detection (“deep drift”) for Golden Master baselines by making baseline capture and baseline compare self-sufficient:
- For baseline profiles configured for full-content capture, both capture and compare automatically capture the required policy content evidence on demand (quota-aware, resumable), rather than relying on opportunistic evidence.
- Drift comparison uses the existing canonical fingerprinting pipeline and evidence provider chain (content-first, explicit degraded fallback), with “no legacy” enforced via code paths and automated guards.
- Operations are observable and explainable: each run records effective scope, coverage proof, fidelity breakdown, evidence capture stats, evidence gaps, and “why no findings” reason codes.
- Security and governance constraints are enforced: captured policy evidence is redacted before persistence/fingerprinting, audit events are emitted for capture/compare/resume mutations, baseline-purpose evidence is pruned per retention policy, and full-content mode is gated by a short-lived rollout flag.
- Admin UX exposes single-click actions (“Capture baseline (full content)”, “Compare now (full content)”, and “Resume capture” when applicable), surfaces evidence gaps clearly, and provides baseline snapshot fidelity visibility (content-complete vs gaps).
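
The quota-aware, resumable capture described in the bullets above could be sketched roughly as follows. This is a minimal illustration in plain PHP, not the project's implementation: the function name `captureWithBudget`, the `$fetch` callback, and the token format (a base64-encoded JSON offset) are all assumptions made for the example.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch: capture subjects until the per-run budget is spent,
 * then hand back an opaque resume token so a later run can continue.
 *
 * @param list<string> $subjectIds In-scope subjects, in a stable order.
 * @param callable(string): array $fetch Assumed callback that captures one subject.
 * @return array{captured: array<string, array>, resume_token: ?string}
 */
function captureWithBudget(array $subjectIds, callable $fetch, int $maxItems, ?string $resumeToken = null): array
{
    // Decode the opaque token into an offset; no (or an unreadable) token means "start at 0".
    $offset = 0;
    if ($resumeToken !== null) {
        $decoded = json_decode((string) base64_decode($resumeToken, true), true);
        $offset = is_array($decoded) ? max(0, (int) ($decoded['offset'] ?? 0)) : 0;
    }

    // Capture at most $maxItems subjects in this run.
    $captured = [];
    $batch = array_slice($subjectIds, $offset, $maxItems);
    foreach ($batch as $id) {
        $captured[$id] = $fetch($id);
    }

    // Emit a token only when work remains; null signals the capture is complete.
    $next = $offset + count($batch);
    $token = $next < count($subjectIds)
        ? base64_encode(json_encode(['offset' => $next], JSON_THROW_ON_ERROR))
        : null;

    return ['captured' => $captured, 'resume_token' => $token];
}
```

In this sketch a first call with a budget of 2 over three subjects returns two captured items plus a token, and a second call with that token captures the remaining subject and returns `null`. A real implementation would presumably also need to cope with the subject list changing between runs, which an offset-based token does not handle.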
## Technical Context
**Language/Version**: PHP 8.4.15
**Primary Dependencies**: Laravel 12.52, Filament 5.2, Livewire 4.1, Microsoft Graph integration via `GraphClientInterface`
**Storage**: PostgreSQL (JSONB-heavy for evidence/snapshots)
**Testing**: Pest 4.3 (PHPUnit 12.5)
**Target Platform**: Containerized web app (Local: Sail; Staging/Production: Dokploy)
**Project Type**: Web application (Laravel monolith + Filament admin panel)
**Performance Goals**: Capture/compare runs handle 200–500 in-scope subjects per run under throttling constraints, without blocking UI; evidence capture is bounded and resumable.
**Constraints**: All long-running + remote work is async + observable via `OperationRun`; rate limits (429/503) must back off safely; no secrets/PII persisted in evidence or logs; tenant/workspace isolation is strict.
**Scale/Scope**: Multi-workspace, multi-tenant; per tenant potentially hundreds to thousands of policies; baselines may be assigned to multiple tenants in a workspace.
**Initial budget defaults (v1, adjustable via config)**:
- `TENANTPILOT_BASELINE_EVIDENCE_MAX_ITEMS_PER_RUN=200`
- `TENANTPILOT_BASELINE_EVIDENCE_MAX_CONCURRENCY=5`
- `TENANTPILOT_BASELINE_EVIDENCE_MAX_RETRIES=3`
- `TENANTPILOT_BASELINE_EVIDENCE_RETENTION_DAYS=90`
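
One way these defaults might be wired into `config/tenantpilot.php` is sketched below. Only the environment variable names and default values come from this plan; the `baseline_evidence` key and its sub-keys are assumptions for illustration.

```php
<?php

// config/tenantpilot.php (hypothetical excerpt; array keys are assumed, not confirmed)
return [
    'baseline_evidence' => [
        // Upper bound on subjects captured in a single run before a resume token is issued.
        'max_items_per_run' => (int) env('TENANTPILOT_BASELINE_EVIDENCE_MAX_ITEMS_PER_RUN', 200),
        // Parallel capture requests allowed per run.
        'max_concurrency' => (int) env('TENANTPILOT_BASELINE_EVIDENCE_MAX_CONCURRENCY', 5),
        // Retries per subject before it is recorded as an evidence gap.
        'max_retries' => (int) env('TENANTPILOT_BASELINE_EVIDENCE_MAX_RETRIES', 3),
        // Age after which baseline-purpose evidence becomes eligible for pruning.
        'retention_days' => (int) env('TENANTPILOT_BASELINE_EVIDENCE_RETENTION_DAYS', 90),
    ],
];
```

Under that assumed layout, code would read a value with `config('tenantpilot.baseline_evidence.max_items_per_run')`.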
## Constitution Check
*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*
- PASS — Inventory-first: Inventory remains the subject index (“last observed”), while content evidence is captured explicitly as immutable policy versions for comparison.
- PASS — Read/write separation: this feature adds/extends read-only capture/compare operations (no restore); any destructive UI actions remain confirmed + audited.
- PASS — Graph contract path: evidence capture uses existing Graph client abstractions and contract registry (`/Users/ahmeddarrazi/Documents/projects/TenantAtlas/config/graph_contracts.php`); no direct/ad-hoc endpoints in feature code.
- PASS — Deterministic capabilities: capability gating continues through the canonical capability resolvers and enforcement helpers (no role-string checks).
- PASS — RBAC-UX: workspace membership + capability gates enforced server-side; non-member access is deny-as-not-found; member missing capability is forbidden.
- PASS — Workspace & tenant isolation: baseline profiles/snapshots are workspace-owned; compare runs/findings/evidence remain tenant-scoped; canonical Monitoring pages remain DB-only at render time.
- PASS — Ops observability: baseline capture/compare are `OperationRun`-backed; start surfaces enqueue-only; no remote work at render time.
- PASS — Ops-UX 3-surface feedback + lifecycle + summary counts: enqueue toast uses the canonical presenter; progress shown only in global widget + run detail; completion emits exactly one terminal DB notification to initiator; status/outcome transitions remain service-owned; summary counts stay numeric-only using canonical keys.
- PASS — Automation & throttling: evidence capture respects 429/503 backoff + jitter (client + phase-level budget handling) and supports resumption via an opaque token stored in run context.
- PASS — BADGE-001: any new/changed badges use existing badge catalog mapping (no ad-hoc).
- PASS — Filament action surface + UX-001: actions are declared, capability-gated, and confirmed where destructive-like; tables maintain an inspect affordance; view uses infolists; empty states have 1 CTA.
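
The 429/503 backoff-with-jitter gate above can be illustrated with a small delay calculator. This is a sketch under stated assumptions rather than the Graph client's actual behavior: the helper names, the 500 ms base, the 30 s cap, and the choice of "full jitter" are all hypothetical.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: delay before the next retry after a throttled response.
 * Honors a Retry-After value (in seconds) when one is supplied; otherwise uses
 * capped exponential backoff with full jitter.
 */
function retryDelayMs(int $attempt, ?int $retryAfterSeconds = null, int $baseMs = 500, int $capMs = 30_000): int
{
    if ($retryAfterSeconds !== null && $retryAfterSeconds > 0) {
        return $retryAfterSeconds * 1000;
    }

    // attempt 1 -> base, attempt 2 -> 2x base, attempt 3 -> 4x base, ... capped at $capMs.
    $ceiling = (int) min($capMs, $baseMs * (2 ** max(0, $attempt - 1)));

    // Full jitter: pick uniformly in [0, ceiling] so parallel workers do not retry in lockstep.
    return random_int(0, $ceiling);
}

/** Status codes this plan treats as throttling signals. */
function isThrottled(int $status): bool
{
    return in_array($status, [429, 503], true);
}
```

A caller would check `isThrottled($status)`, sleep for `retryDelayMs($attempt, $retryAfter)` milliseconds, and give up once the configured retry limit is reached, at which point the subject could be recorded as an evidence gap.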
## Project Structure
### Documentation (this feature)
```text
/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/118-baseline-drift-engine/
├── spec.md
├── plan.md
├── research.md
├── data-model.md
├── quickstart.md
├── contracts/
└── tasks.md
```
### Source Code (repository root)
```text
/Users/ahmeddarrazi/Documents/projects/TenantAtlas/
app/
├── Filament/
│ ├── Pages/BaselineCompareLanding.php
│ ├── Resources/BaselineProfileResource.php
│ └── Resources/BaselineProfileResource/RelationManagers/BaselineTenantAssignmentsRelationManager.php
├── Jobs/
│ ├── CaptureBaselineSnapshotJob.php
│ └── CompareBaselineToTenantJob.php
├── Models/
│ ├── BaselineProfile.php
│ ├── BaselineSnapshot.php
│ ├── BaselineSnapshotItem.php
│ ├── BaselineTenantAssignment.php
│ ├── Policy.php
│ ├── PolicyVersion.php
│ ├── InventoryItem.php
│ ├── OperationRun.php
│ └── Finding.php
├── Services/
│ ├── Baselines/
│ │ ├── BaselineCaptureService.php
│ │ ├── BaselineCompareService.php
│ │ ├── CurrentStateHashResolver.php
│ │ └── Evidence/
│ ├── Intune/PolicyCaptureOrchestrator.php
│ └── OperationRunService.php
├── Support/
│ ├── Baselines/
│ ├── OpsUx/
│ └── OperationRunType.php
config/
├── graph_contracts.php
└── tenantpilot.php
database/
└── migrations/
tests/
└── Feature/
```
**Structure Decision**: Laravel monolith. Baseline drift orchestration lives in `app/Services/Baselines` + `app/Jobs`, UI in `app/Filament`, and evidence capture reuses `app/Services/Intune` capture orchestration.
Tasks are defined in `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/118-baseline-drift-engine/tasks.md`.
## Complexity Tracking
No constitution violations are required for Spec 118 planning. (Table intentionally omitted.)
## Phase 0 — Research (output: research.md)
Goals:
- Confirm precise extension points for adding full-content evidence capture to existing baseline capture/compare jobs.
- Decide the purpose-tagging and idempotency strategy for baseline evidence captured as `PolicyVersion`.
- Confirm Monitoring run detail requirements for `context.target_scope` and baseline-specific context sections.
Deliverable: `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/118-baseline-drift-engine/research.md`
## Phase 1 — Design (output: data-model.md + contracts/* + quickstart.md)
Deliverables:
- Data model changes + JSON context shapes: `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/118-baseline-drift-engine/data-model.md`
- Route surface contract reference: `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/118-baseline-drift-engine/contracts/openapi.yaml`
- Developer quickstart: `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/118-baseline-drift-engine/quickstart.md`
Post-design constitution re-check: PASS (see decisions in research + data model docs; Ops-UX and RBAC constraints preserved).
## Phase 2 — Implementation Planning (high-level)
1) Add migrations:
- `baseline_profiles.capture_mode`
- `baseline_snapshot_items.subject_key`
- `policy_versions.capture_purpose`, `operation_run_id`, `baseline_profile_id` + indexes
2) Implement quota-aware, resumable baseline evidence capture phase:
- reuse existing capture orchestration (policy payload + assignments + scope tags)
- emit capture stats + resume token in `OperationRun.context`
3) Integrate the capture phase into:
- baseline capture job (before snapshot build)
- baseline compare job (refresh phase before drift evaluation)
4) Update drift matching to use a cross-tenant subject key (`policy_type + subject_key`), where `subject_key` is the normalized display name. Record ambiguous or missing matches as evidence gaps rather than findings.
5) Update Ops-UX context:
- ensure `context.target_scope` exists for baseline capture/compare runs
- add “why no findings” reason codes
6) Update UI action surfaces:
- Baseline profile: capture mode + “Capture baseline (full content)” + tenant-targeted “Compare now (full content)”
- Operation run detail: evidence capture panel + “Resume capture” when token exists
7) Add focused Pest tests:
- full-content capture creates content-fidelity snapshot items (or warnings + gaps)
- compare detects settings drift with content evidence
- throttling/resume semantics and “no silent zeros” reason codes
8) Add governance hardening:
- enforce rollout gate across UI/services/jobs for full-content mode
- redact secrets/PII from captured evidence before persistence/fingerprinting
- emit audit events for capture/compare/resume operations
- prune baseline-purpose evidence per retention policy (scheduled)
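
The cross-tenant matching in step 4 could look roughly like the sketch below. The plan only states that `subject_key` is the normalized display name, so the normalization rules used here (trim, collapse whitespace, lowercase), the function names, the `|` separator, and the gap reason strings are assumptions for illustration. It relies on the `mbstring` extension.

```php
<?php

declare(strict_types=1);

/** Assumed normalization: trim, collapse internal whitespace, lowercase. */
function subjectKey(string $displayName): string
{
    $collapsed = preg_replace('/\s+/u', ' ', trim($displayName)) ?? '';

    return mb_strtolower($collapsed);
}

/**
 * Match baseline subjects to tenant subjects by policy_type + subject_key.
 * Exactly one tenant candidate is a match; zero or several become evidence gaps.
 *
 * @param list<array{policy_type: string, display_name: string}> $baselineItems
 * @param list<array{policy_type: string, display_name: string}> $tenantItems
 * @return array{matched: list<string>, gaps: array<string, string>}
 */
function matchSubjects(array $baselineItems, array $tenantItems): array
{
    // Count tenant subjects per composite key to detect ambiguity.
    $counts = [];
    foreach ($tenantItems as $item) {
        $key = $item['policy_type'] . '|' . subjectKey($item['display_name']);
        $counts[$key] = ($counts[$key] ?? 0) + 1;
    }

    $matched = [];
    $gaps = [];
    foreach ($baselineItems as $item) {
        $key = $item['policy_type'] . '|' . subjectKey($item['display_name']);
        $candidates = $counts[$key] ?? 0;

        if ($candidates === 1) {
            $matched[] = $key;
        } else {
            // Hypothetical reason strings; gaps are reported instead of findings.
            $gaps[$key] = $candidates === 0 ? 'missing_match' : 'ambiguous_match';
        }
    }

    return ['matched' => $matched, 'gaps' => $gaps];
}
```

Keeping gaps separate from matches means a run with unmatched subjects can still explain why it produced no findings, instead of reporting a silent zero. The sketch does not handle duplicate keys on the baseline side, which a real implementation would likely need to treat as ambiguous as well.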