ahmido 92704a2f7e Spec 118: Resumable baseline evidence capture + snapshot UX (#143)

Implements Spec 118 baseline drift engine improvements:

- Resumable, budget-aware evidence capture for baseline capture/compare runs (resume token + UI action)
- “Why no findings?” reason-code driven explanations and richer run context panels
- Baseline Snapshot resource (list/detail) with fidelity visibility
- Retention command + schedule for pruning baseline-purpose PolicyVersions
- i18n strings for Baseline Compare landing

Verification:
- `vendor/bin/sail bin pint --dirty --format agent`
- `vendor/bin/sail artisan test --compact --filter=Baseline` (159 passed)

Note:
- `docs/audits/redaction-audit-2026-03-04.md` left untracked (not part of PR).

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #143

2026-03-04 22:34:13 +00:00


Implementation Plan: Golden Master Deep Drift v2 (Full Content Capture)

Branch: 118-baseline-drift-engine | Date: 2026-03-03 | Spec: /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/118-baseline-drift-engine/spec.md
Input: Feature specification from /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/118-baseline-drift-engine/spec.md

Summary

Enable reliable, settings-level drift detection (“deep drift”) for Golden Master baselines by making baseline capture and baseline compare self-sufficient:

- For baseline profiles configured for full-content capture, both capture and compare automatically capture the required policy content evidence on demand (quota-aware, resumable), rather than relying on opportunistic evidence.
- Drift comparison uses the existing canonical fingerprinting pipeline and evidence provider chain (content-first, explicit degraded fallback), with “no legacy” enforced via code paths and automated guards.
- Operations are observable and explainable: each run records effective scope, coverage proof, fidelity breakdown, evidence capture stats, evidence gaps, and “why no findings” reason codes.
- Security and governance constraints are enforced: captured policy evidence is redacted before persistence/fingerprinting, audit events are emitted for capture/compare/resume mutations, baseline-purpose evidence is pruned per retention policy, and full-content mode is gated by a short-lived rollout flag.
- Admin UX exposes single-click actions (“Capture baseline (full content)”, “Compare now (full content)”, and “Resume capture” when applicable), surfaces evidence gaps clearly, and provides baseline snapshot fidelity visibility (content-complete vs gaps).
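The "content-first, explicit degraded fallback" provider chain mentioned above can be pictured roughly as follows. This is a minimal sketch, not the project's actual evidence provider code: the provider names (`policy_version`, `inventory_meta`), the fidelity labels (`content`, `meta`), and the `resolveEvidence` function are all hypothetical.

```php
<?php
declare(strict_types=1);

/** Result of resolving evidence for one subject (hypothetical shape). */
final class ResolvedEvidence
{
    public function __construct(
        public readonly string $hash,
        public readonly string $fidelity, // assumed labels: 'content' or 'meta'
        public readonly string $source,   // name of the provider that answered
    ) {}
}

/**
 * Try providers in declaration order; the first one that returns a hash wins.
 * Because each provider declares its own fidelity, a fallback is always
 * visible in the result rather than silently substituted.
 *
 * @param array<string, array{fidelity: string, resolve: callable(string): ?string}> $providers
 */
function resolveEvidence(string $subjectKey, array $providers): ?ResolvedEvidence
{
    foreach ($providers as $name => $provider) {
        $hash = ($provider['resolve'])($subjectKey);
        if ($hash !== null) {
            return new ResolvedEvidence($hash, $provider['fidelity'], $name);
        }
    }

    return null; // no evidence at all: the caller records an evidence gap
}

// Hypothetical providers: full content first, metadata-only as degraded fallback.
$providers = [
    'policy_version' => [
        'fidelity' => 'content',
        'resolve' => fn (string $key): ?string => $key === 'a' ? 'hash-content-a' : null,
    ],
    'inventory_meta' => [
        'fidelity' => 'meta',
        'resolve' => fn (string $key): ?string => $key === 'c' ? null : 'hash-meta-' . $key,
    ],
];
```

Returning `null` instead of a guessed hash keeps the "evidence gap" case distinct from both content-fidelity and degraded results.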

Technical Context

Language/Version: PHP 8.4.15
Primary Dependencies: Laravel 12.52, Filament 5.2, Livewire 4.1, Microsoft Graph integration via GraphClientInterface
Storage: PostgreSQL (JSONB-heavy for evidence/snapshots)
Testing: Pest 4.3 (PHPUnit 12.5)
Target Platform: Containerized web app (Local: Sail; Staging/Production: Dokploy)
Project Type: Web application (Laravel monolith + Filament admin panel)
Performance Goals: Capture/compare runs handle 200–500 in-scope subjects per run under throttling constraints without blocking the UI; evidence capture is bounded and resumable.
Constraints: All long-running and remote work is asynchronous and observable via OperationRun; throttled responses (HTTP 429/503) must trigger safe backoff; no secrets/PII are persisted in evidence or logs; tenant/workspace isolation is strict.
Scale/Scope: Multi-workspace, multi-tenant; per tenant potentially hundreds–thousands of policies; baselines may be assigned to multiple tenants in a workspace.
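To illustrate the "back off safely" constraint for 429/503 responses, here is a small sketch of exponential backoff with full jitter that honours a `Retry-After` value when one is present. The function names, the response array shape, and the 500 ms base / 30 s cap are assumptions for illustration; the real Graph client may handle this differently.

```php
<?php
declare(strict_types=1);

/** True for the two throttling statuses named in the constraints. */
function isThrottled(int $status): bool
{
    return $status === 429 || $status === 503;
}

/**
 * Delay before the next attempt, in milliseconds.
 * Prefers the server's Retry-After; otherwise picks a random delay in
 * [0, min(cap, base * 2^attempt)] ("full jitter").
 */
function backoffDelayMs(int $attempt, ?int $retryAfterSeconds = null, int $baseMs = 500, int $capMs = 30_000): int
{
    if ($retryAfterSeconds !== null) {
        return $retryAfterSeconds * 1000;
    }

    $ceiling = (int) min($capMs, $baseMs * (2 ** max(0, $attempt)));

    return random_int(0, $ceiling);
}

/**
 * Call $request until it is not throttled or retries are exhausted.
 *
 * @param callable(): array{status: int, retry_after?: int} $request
 * @param callable(int): void $sleepMs injected so callers (and tests) control waiting
 * @return array{status: int, attempts: int}
 */
function sendWithBackoff(callable $request, int $maxRetries, callable $sleepMs): array
{
    $attempt = 0;

    while (true) {
        $response = $request();

        if (!isThrottled($response['status']) || $attempt >= $maxRetries) {
            return ['status' => $response['status'], 'attempts' => $attempt + 1];
        }

        $sleepMs(backoffDelayMs($attempt, $response['retry_after'] ?? null));
        $attempt++;
    }
}
```

Injecting the sleep callable keeps the retry loop testable without real waiting; in a queued job the same decision could instead release the job back onto the queue with a delay.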

Initial budget defaults (v1, adjustable via config):

TENANTPILOT_BASELINE_EVIDENCE_MAX_ITEMS_PER_RUN=200
TENANTPILOT_BASELINE_EVIDENCE_MAX_CONCURRENCY=5
TENANTPILOT_BASELINE_EVIDENCE_MAX_RETRIES=3
TENANTPILOT_BASELINE_EVIDENCE_RETENTION_DAYS=90
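One way to turn these environment values into a typed budget object is sketched below. The `EvidenceBudget` class and its property names are hypothetical; only the variable names and default values come from the list above. In the application itself these values would presumably be read through the Laravel config layer rather than from a raw array.

```php
<?php
declare(strict_types=1);

/** Immutable budget for one evidence-capture run (hypothetical class). */
final class EvidenceBudget
{
    public function __construct(
        public readonly int $maxItemsPerRun,
        public readonly int $maxConcurrency,
        public readonly int $maxRetries,
        public readonly int $retentionDays,
    ) {}

    /** @param array<string, string> $env raw environment values */
    public static function fromEnv(array $env): self
    {
        // Use the default when a value is unset or not a positive integer.
        $int = static function (string $key, int $default) use ($env): int {
            $parsed = filter_var(
                $env[$key] ?? null,
                FILTER_VALIDATE_INT,
                ['options' => ['min_range' => 1]]
            );

            return $parsed === false ? $default : $parsed;
        };

        return new self(
            $int('TENANTPILOT_BASELINE_EVIDENCE_MAX_ITEMS_PER_RUN', 200),
            $int('TENANTPILOT_BASELINE_EVIDENCE_MAX_CONCURRENCY', 5),
            $int('TENANTPILOT_BASELINE_EVIDENCE_MAX_RETRIES', 3),
            $int('TENANTPILOT_BASELINE_EVIDENCE_RETENTION_DAYS', 90),
        );
    }
}
```

Falling back to the default on invalid input is a design choice of this sketch; failing fast at boot would be an equally reasonable alternative.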

Constitution Check

GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.

PASS — Inventory-first: Inventory remains the subject index (“last observed”), while content evidence is captured explicitly as immutable policy versions for comparison.
PASS — Read/write separation: this feature adds/extends read-only capture/compare operations (no restore); any destructive UI actions remain confirmed + audited.
PASS — Graph contract path: evidence capture uses existing Graph client abstractions and contract registry (/Users/ahmeddarrazi/Documents/projects/TenantAtlas/config/graph_contracts.php); no direct/adhoc endpoints in feature code.
PASS — Deterministic capabilities: capability gating continues through the canonical capability resolvers and enforcement helpers (no role-string checks).
PASS — RBAC-UX: workspace membership + capability gates enforced server-side; non-member access is deny-as-not-found; member missing capability is forbidden.
PASS — Workspace & tenant isolation: baseline profiles/snapshots are workspace-owned; compare runs/findings/evidence remain tenant-scoped; canonical Monitoring pages remain DB-only at render time.
PASS — Ops observability: baseline capture/compare are OperationRun-backed; start surfaces enqueue-only; no remote work at render time.
PASS — Ops-UX 3-surface feedback + lifecycle + summary counts: enqueue toast uses the canonical presenter; progress shown only in global widget + run detail; completion emits exactly one terminal DB notification to initiator; status/outcome transitions remain service-owned; summary counts stay numeric-only using canonical keys.
PASS — Automation & throttling: evidence capture respects 429/503 backoff + jitter (client + phase-level budget handling) and supports resumption via an opaque token stored in run context.
PASS — BADGE-001: any new/changed badges use existing badge catalog mapping (no ad-hoc).
PASS — Filament action surface + UX-001: actions are declared, capability-gated, and confirmed where destructive-like; tables maintain an inspect affordance; view uses infolists; empty states have 1 CTA.
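The RBAC-UX rule above (non-members are denied as not-found, members without the capability are forbidden) reduces to a small decision function. The sketch below assumes the conventional mapping to HTTP 404 and 403; the function name and the capability string used in the usage note are hypothetical and do not reflect the project's capability resolvers.

```php
<?php
declare(strict_types=1);

/**
 * Map workspace membership and capability to an HTTP status.
 * Non-members get 404 so the resource's existence is not disclosed;
 * members lacking the capability get 403.
 *
 * @param list<string> $grantedCapabilities
 */
function accessStatus(bool $isWorkspaceMember, array $grantedCapabilities, string $requiredCapability): int
{
    if (!$isWorkspaceMember) {
        return 404;
    }

    if (!in_array($requiredCapability, $grantedCapabilities, true)) {
        return 403;
    }

    return 200;
}
```

For example, `accessStatus(false, ['baseline.compare'], 'baseline.compare')` yields 404 even though the capability list would otherwise match, because membership is checked first.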

Project Structure

Documentation (this feature)

/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/118-baseline-drift-engine/
├── spec.md
├── plan.md
├── research.md
├── data-model.md
├── quickstart.md
├── contracts/
└── tasks.md

Source Code (repository root)

/Users/ahmeddarrazi/Documents/projects/TenantAtlas/
app/
├── Filament/
│   ├── Pages/BaselineCompareLanding.php
│   ├── Resources/BaselineProfileResource.php
│   └── Resources/BaselineProfileResource/RelationManagers/BaselineTenantAssignmentsRelationManager.php
├── Jobs/
│   ├── CaptureBaselineSnapshotJob.php
│   └── CompareBaselineToTenantJob.php
├── Models/
│   ├── BaselineProfile.php
│   ├── BaselineSnapshot.php
│   ├── BaselineSnapshotItem.php
│   ├── BaselineTenantAssignment.php
│   ├── Policy.php
│   ├── PolicyVersion.php
│   ├── InventoryItem.php
│   ├── OperationRun.php
│   └── Finding.php
├── Services/
│   ├── Baselines/
│   │   ├── BaselineCaptureService.php
│   │   ├── BaselineCompareService.php
│   │   ├── CurrentStateHashResolver.php
│   │   └── Evidence/
│   ├── Intune/PolicyCaptureOrchestrator.php
│   └── OperationRunService.php
├── Support/
│   ├── Baselines/
│   ├── OpsUx/
│   └── OperationRunType.php
config/
├── graph_contracts.php
└── tenantpilot.php
database/
└── migrations/
tests/
└── Feature/

Structure Decision: Laravel monolith. Baseline drift orchestration lives in app/Services/Baselines + app/Jobs, UI in app/Filament, and evidence capture reuses app/Services/Intune capture orchestration.

Tasks are defined in /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/118-baseline-drift-engine/tasks.md.

Complexity Tracking

Spec 118 planning introduces no constitution violations, so there is nothing to justify here. (Table intentionally omitted.)

Phase 0 — Research (output: research.md)

Goals:

- Confirm precise extension points for adding full-content evidence capture to existing baseline capture/compare jobs.
- Decide the purpose-tagging and idempotency strategy for baseline evidence captured as PolicyVersion.
- Confirm Monitoring run detail requirements for context.target_scope and baseline-specific context sections.

Deliverable: /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/118-baseline-drift-engine/research.md

Phase 1 — Design (output: data-model.md + contracts/* + quickstart.md)

Deliverables:

- Data model changes + JSON context shapes: /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/118-baseline-drift-engine/data-model.md
- Route surface contract reference: /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/118-baseline-drift-engine/contracts/openapi.yaml
- Developer quickstart: /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/118-baseline-drift-engine/quickstart.md

Post-design constitution re-check: PASS (see decisions in research + data model docs; Ops-UX and RBAC constraints preserved).

Phase 2 — Implementation Planning (high-level)

Add migrations:
- baseline_profiles.capture_mode
- baseline_snapshot_items.subject_key
- policy_versions.capture_purpose, policy_versions.operation_run_id, policy_versions.baseline_profile_id (plus indexes)
Implement quota-aware, resumable baseline evidence capture phase:
- reuse existing capture orchestration (policy payload + assignments + scope tags)
- emit capture stats + resume token in OperationRun.context
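The budget-aware, resumable capture phase described above might look roughly like this. It is a sketch under stated assumptions: the token format (base64-encoded JSON holding an offset), the stats keys, and every function name are hypothetical, and the real phase would delegate each capture to the existing orchestration rather than to a bare callable.

```php
<?php
declare(strict_types=1);

/** Encode the next position as an opaque token (assumed format). */
function encodeResumeToken(int $offset): string
{
    return base64_encode(json_encode(['v' => 1, 'offset' => $offset], JSON_THROW_ON_ERROR));
}

/** Decode and validate a token produced by encodeResumeToken(). */
function decodeResumeToken(string $token): int
{
    $decoded = base64_decode($token, true);
    if ($decoded === false) {
        throw new InvalidArgumentException('Malformed resume token.');
    }

    $data = json_decode($decoded, true, 512, JSON_THROW_ON_ERROR);
    if (!is_array($data) || ($data['v'] ?? null) !== 1 || !is_int($data['offset'] ?? null) || $data['offset'] < 0) {
        throw new InvalidArgumentException('Unsupported resume token.');
    }

    return $data['offset'];
}

/**
 * Capture at most $maxItems subjects, starting where the token points.
 *
 * @param list<string> $subjectKeys stable, ordered list of in-scope subjects
 * @param callable(string): bool $capture returns false when capture failed
 * @return array{stats: array{requested: int, succeeded: int, failed: int, pending: int}, resume_token: ?string}
 */
function runCapturePhase(array $subjectKeys, int $maxItems, callable $capture, ?string $resumeToken = null): array
{
    $offset = $resumeToken === null ? 0 : decodeResumeToken($resumeToken);
    $batch = array_slice($subjectKeys, $offset, $maxItems);

    $succeeded = 0;
    $failed = 0;
    foreach ($batch as $key) {
        if ($capture($key)) {
            $succeeded++;
        } else {
            $failed++;
        }
    }

    $next = $offset + count($batch);
    $pending = count($subjectKeys) - $next;

    return [
        'stats' => [
            'requested' => count($batch),
            'succeeded' => $succeeded,
            'failed' => $failed,
            'pending' => $pending,
        ],
        // A token is only emitted when work remains, so "token exists" can drive the UI action.
        'resume_token' => $pending > 0 ? encodeResumeToken($next) : null,
    ];
}
```

Note that this sketch does not retry failed items on resume; it only continues with subjects that were never attempted. It also relies on the subject list keeping a stable order between runs.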
Integrate the capture phase into:
- baseline capture job (before snapshot build)
- baseline compare job (refresh phase before drift evaluation)
Update drift matching to use a cross-tenant subject key (policy_type + subject_key), where subject_key is the normalized display name, and record ambiguous or missing matches as evidence gaps (no finding).
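A sketch of this matching rule follows. The normalization steps (trim, collapse whitespace, lowercase) and the gap labels `missing_match` and `ambiguous_match` are assumptions; the plan only states that the subject key is the normalized display name and that unmatched or ambiguous subjects become evidence gaps.

```php
<?php
declare(strict_types=1);

/** Normalize a display name into a subject key (assumed rules). */
function subjectKey(string $displayName): string
{
    $collapsed = preg_replace('/\s+/u', ' ', trim($displayName)) ?? '';

    // Prefer Unicode-aware lowercasing when mbstring is available.
    return function_exists('mb_strtolower') ? mb_strtolower($collapsed) : strtolower($collapsed);
}

/**
 * Match baseline items to tenant policies on policy_type + subject_key.
 *
 * @param list<array{policy_type: string, display_name: string}> $baselineItems
 * @param list<array{policy_type: string, display_name: string}> $tenantPolicies
 * @return array{matched: list<string>, gaps: array<string, string>}
 */
function matchSubjects(array $baselineItems, array $tenantPolicies): array
{
    // Count tenant policies per composite key to detect ambiguity.
    $counts = [];
    foreach ($tenantPolicies as $policy) {
        $id = $policy['policy_type'] . '|' . subjectKey($policy['display_name']);
        $counts[$id] = ($counts[$id] ?? 0) + 1;
    }

    $matched = [];
    $gaps = [];
    foreach ($baselineItems as $item) {
        $id = $item['policy_type'] . '|' . subjectKey($item['display_name']);
        $count = $counts[$id] ?? 0;

        if ($count === 1) {
            $matched[] = $id;
        } elseif ($count === 0) {
            $gaps[$id] = 'missing_match';   // recorded as a gap, not a finding
        } else {
            $gaps[$id] = 'ambiguous_match'; // several tenant policies share the key
        }
    }

    return ['matched' => $matched, 'gaps' => $gaps];
}
```

The sketch only checks for ambiguity on the tenant side; duplicate display names within the baseline itself would need the same treatment.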
Update Ops-UX context:
- ensure context.target_scope exists for baseline capture/compare runs
- add “why no findings” reason codes
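To make the "why no findings" idea concrete, the sketch below derives a single reason code from run counters so that a zero-findings run is never left unexplained. The counter keys and all reason-code strings are hypothetical placeholders, not the codes defined by the spec.

```php
<?php
declare(strict_types=1);

/**
 * Explain a run that produced no findings; returns null when findings exist.
 *
 * @param array{subjects_in_scope: int, subjects_compared: int, evidence_gaps: int, findings: int} $run
 */
function whyNoFindings(array $run): ?string
{
    if ($run['findings'] > 0) {
        return null; // nothing to explain
    }

    if ($run['subjects_in_scope'] === 0) {
        return 'no_subjects_in_scope';
    }

    if ($run['subjects_compared'] === 0) {
        return 'evidence_capture_incomplete';
    }

    if ($run['evidence_gaps'] > 0) {
        // Some subjects were compared cleanly, others could not be evaluated.
        return 'no_drift_detected_with_gaps';
    }

    return 'no_drift_detected';
}
```

Ordering the checks from "nothing to compare" to "compared and clean" ensures the most fundamental cause is reported first.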
Update UI action surfaces:
- Baseline profile: capture mode + “Capture baseline (full content)” + tenant-targeted “Compare now (full content)”
- Operation run detail: evidence capture panel + “Resume capture” when token exists
Add focused Pest tests:
- full-content capture creates content-fidelity snapshot items (or warnings + gaps)
- compare detects settings drift with content evidence
- throttling/resume semantics and “no silent zeros” reason codes
Add governance hardening:
- enforce rollout gate across UI/services/jobs for full-content mode
- redact secrets/PII from captured evidence before persistence/fingerprinting
- emit audit events for capture/compare/resume operations
- prune baseline-purpose evidence per retention policy (scheduled)
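The "redact before persistence/fingerprinting" step can be sketched as below. This is not the project's canonical fingerprinting pipeline: the sensitive key fragments, the `[REDACTED]` marker, and the SHA-256 over key-sorted JSON are assumptions chosen to show the ordering (redact first, then canonicalize, then hash).

```php
<?php
declare(strict_types=1);

const REDACTED = '[REDACTED]';

/** @param list<string> $fragments lowercase fragments that mark a key as sensitive */
function isSensitiveKey(string $key, array $fragments): bool
{
    $lower = strtolower($key);
    foreach ($fragments as $fragment) {
        if (str_contains($lower, $fragment)) {
            return true;
        }
    }

    return false;
}

/** Replace values under sensitive keys, recursing into nested arrays. */
function redact(array $payload, array $fragments = ['password', 'secret', 'token']): array
{
    $out = [];
    foreach ($payload as $key => $value) {
        if (is_string($key) && isSensitiveKey($key, $fragments)) {
            $out[$key] = REDACTED;
        } elseif (is_array($value)) {
            $out[$key] = redact($value, $fragments);
        } else {
            $out[$key] = $value;
        }
    }

    return $out;
}

/** Sort associative keys recursively so key order does not affect the hash. */
function canonicalize(array $data): array
{
    foreach ($data as $key => $value) {
        if (is_array($value)) {
            $data[$key] = canonicalize($value);
        }
    }

    if (!array_is_list($data)) {
        ksort($data);
    }

    return $data;
}

/** Fingerprint of the redacted, canonicalized payload. */
function fingerprint(array $payload): string
{
    $json = json_encode(canonicalize(redact($payload)), JSON_THROW_ON_ERROR | JSON_UNESCAPED_SLASHES);

    return hash('sha256', $json);
}
```

A consequence of redacting before hashing is that a change to a secret value alone does not alter the fingerprint and therefore cannot surface as drift in this sketch.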
