# Implementation Plan: Golden Master Deep Drift v2 (Full Content Capture)

**Branch**: `118-baseline-drift-engine` | **Date**: 2026-03-03 | **Spec**: /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/118-baseline-drift-engine/spec.md
**Input**: Feature specification from `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/118-baseline-drift-engine/spec.md`

## Summary

Enable reliable, settings-level drift detection (“deep drift”) for Golden Master baselines by making baseline capture and baseline compare self-sufficient:

- For baseline profiles configured for full-content capture, both capture and compare automatically capture the required policy content evidence on demand (quota-aware, resumable), rather than relying on opportunistic evidence.
- Drift comparison uses the existing canonical fingerprinting pipeline and evidence provider chain (content-first, explicit degraded fallback), with “no legacy” enforced via code paths and automated guards.
- Operations are observable and explainable: each run records effective scope, coverage proof, fidelity breakdown, evidence capture stats, evidence gaps, and “why no findings” reason codes.
- Security and governance constraints are enforced: captured policy evidence is redacted before persistence/fingerprinting, audit events are emitted for capture/compare/resume mutations, baseline-purpose evidence is pruned per retention policy, and full-content mode is gated by a short-lived rollout flag.
- Admin UX exposes single-click actions (“Capture baseline (full content)”, “Compare now (full content)”, and “Resume capture” when applicable), surfaces evidence gaps clearly, and provides baseline snapshot fidelity visibility (content-complete vs gaps).
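
The "redacted before persistence/fingerprinting" constraint above can be illustrated with a small, language-neutral sketch in Python (the project itself is PHP/Laravel). Everything here is an assumption made for illustration: the sensitive-key list, the `[REDACTED]` placeholder, and the `redact` / `fingerprint` helpers are hypothetical and do not describe the existing canonical fingerprinting pipeline.

```python
import hashlib
import json

# Hypothetical key fragments treated as sensitive; the real rules are not
# specified in this plan.
SENSITIVE_KEY_PARTS = ("password", "secret", "token")
REDACTED = "[REDACTED]"


def _is_sensitive(key):
    lowered = str(key).lower()
    return any(part in lowered for part in SENSITIVE_KEY_PARTS)


def redact(value):
    """Return a copy of the payload with sensitive values replaced."""
    if isinstance(value, dict):
        return {
            key: REDACTED if _is_sensitive(key) else redact(inner)
            for key, inner in value.items()
        }
    if isinstance(value, list):
        return [redact(inner) for inner in value]
    return value


def fingerprint(payload):
    """Hash the redacted payload in a canonical (sorted-key) JSON form."""
    canonical = json.dumps(redact(payload), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Redacting before hashing means a rotated secret does not register as settings drift in this sketch, while any change to a non-sensitive setting still changes the fingerprint. Whether that trade-off is the intended one is a design decision for the spec, not something this example settles.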

## Technical Context

**Language/Version**: PHP 8.4.15
**Primary Dependencies**: Laravel 12.52, Filament 5.2, Livewire 4.1, Microsoft Graph integration via `GraphClientInterface`
**Storage**: PostgreSQL (JSONB-heavy for evidence/snapshots)
**Testing**: Pest 4.3 (PHPUnit 12.5)
**Target Platform**: Containerized web app (Local: Sail; Staging/Production: Dokploy)
**Project Type**: Web application (Laravel monolith + Filament admin panel)
**Performance Goals**: Capture/compare runs handle 200–500 in-scope subjects per run under throttling constraints, without blocking UI; evidence capture is bounded and resumable.
**Constraints**: All long-running + remote work is async + observable via `OperationRun`; rate limits (429/503) must back off safely; no secrets/PII persisted in evidence or logs; tenant/workspace isolation is strict.
**Scale/Scope**: Multi-workspace, multi-tenant; per tenant potentially hundreds–thousands of policies; baselines may be assigned to multiple tenants in a workspace.

**Initial budget defaults (v1, adjustable via config)**:

- `TENANTPILOT_BASELINE_EVIDENCE_MAX_ITEMS_PER_RUN=200`
- `TENANTPILOT_BASELINE_EVIDENCE_MAX_CONCURRENCY=5`
- `TENANTPILOT_BASELINE_EVIDENCE_MAX_RETRIES=3`
- `TENANTPILOT_BASELINE_EVIDENCE_RETENTION_DAYS=90`

## Constitution Check

*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*

- PASS — Inventory-first: Inventory remains the subject index (“last observed”), while content evidence is captured explicitly as immutable policy versions for comparison.
- PASS — Read/write separation: this feature adds/extends read-only capture/compare operations (no restore); any destructive UI actions remain confirmed + audited.
- PASS — Graph contract path: evidence capture uses existing Graph client abstractions and contract registry (`/Users/ahmeddarrazi/Documents/projects/TenantAtlas/config/graph_contracts.php`); no direct/ad-hoc endpoints in feature code.
- PASS — Deterministic capabilities: capability gating continues through the canonical capability resolvers and enforcement helpers (no role-string checks).
- PASS — RBAC-UX: workspace membership + capability gates enforced server-side; non-member access is deny-as-not-found; member missing capability is forbidden.
- PASS — Workspace & tenant isolation: baseline profiles/snapshots are workspace-owned; compare runs/findings/evidence remain tenant-scoped; canonical Monitoring pages remain DB-only at render time.
- PASS — Ops observability: baseline capture/compare are `OperationRun`-backed; start surfaces enqueue-only; no remote work at render time.
- PASS — Ops-UX 3-surface feedback + lifecycle + summary counts: enqueue toast uses the canonical presenter; progress shown only in global widget + run detail; completion emits exactly one terminal DB notification to initiator; status/outcome transitions remain service-owned; summary counts stay numeric-only using canonical keys.
- PASS — Automation & throttling: evidence capture respects 429/503 backoff + jitter (client + phase-level budget handling) and supports resumption via an opaque token stored in run context.
- PASS — BADGE-001: any new/changed badges use existing badge catalog mapping (no ad-hoc).
- PASS — Filament action surface + UX-001: actions are declared, capability-gated, and confirmed where destructive-like; tables maintain an inspect affordance; view uses infolists; empty states have 1 CTA.
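
The "Automation & throttling" item above (429/503 backoff with jitter, a per-run item budget, and resumption via an opaque token) can be sketched as follows. This is a minimal illustration in Python rather than the project's PHP, and every name in it (`capture_batch`, `backoff_delay`, the token encoding, the `fetch` callback returning an HTTP status) is a hypothetical stand-in, not the existing capture orchestration. The defaults mirror the budget values listed under Technical Context.

```python
import base64
import json
import random
from dataclasses import dataclass, field
from typing import Optional

THROTTLE_STATUSES = {429, 503}


@dataclass
class CaptureResult:
    captured: list = field(default_factory=list)
    failed: list = field(default_factory=list)
    resume_token: Optional[str] = None


def encode_token(offset):
    # Assumption: the token only needs to carry the next offset.
    raw = json.dumps({"offset": offset}).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_token(token):
    if not token:
        return 0
    return json.loads(base64.urlsafe_b64decode(token.encode("ascii")))["offset"]


def backoff_delay(attempt, base=1.0, cap=30.0, rng=random.random):
    # "Full jitter": a random delay in [0, min(cap, base * 2^attempt)).
    return rng() * min(cap, base * (2 ** attempt))


def capture_batch(subjects, fetch, max_items=200, max_retries=3,
                  resume_token=None, sleep=lambda seconds: None):
    """Capture up to max_items subjects, retrying throttled calls."""
    result = CaptureResult()
    start = decode_token(resume_token)
    end = min(len(subjects), start + max_items)
    for subject in subjects[start:end]:
        status = None
        for attempt in range(max_retries + 1):
            status = fetch(subject)
            if status not in THROTTLE_STATUSES:
                break
            if attempt < max_retries:
                sleep(backoff_delay(attempt))
        (result.captured if status == 200 else result.failed).append(subject)
    if end < len(subjects):
        # Budget exhausted: hand back a token so a later run can continue.
        result.resume_token = encode_token(end)
    return result
```

A caller would persist `resume_token` in the run context and pass it back on "Resume capture"; a run that returns `None` for the token has covered every in-scope subject. The `sleep` parameter is injected so tests can run without waiting.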

## Project Structure

### Documentation (this feature)

```text
/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/118-baseline-drift-engine/
├── spec.md
├── plan.md
├── research.md
├── data-model.md
├── quickstart.md
├── contracts/
└── tasks.md
```

### Source Code (repository root)

```text
/Users/ahmeddarrazi/Documents/projects/TenantAtlas/
app/
├── Filament/
│   ├── Pages/BaselineCompareLanding.php
│   ├── Resources/BaselineProfileResource.php
│   └── Resources/BaselineProfileResource/RelationManagers/BaselineTenantAssignmentsRelationManager.php
├── Jobs/
│   ├── CaptureBaselineSnapshotJob.php
│   └── CompareBaselineToTenantJob.php
├── Models/
│   ├── BaselineProfile.php
│   ├── BaselineSnapshot.php
│   ├── BaselineSnapshotItem.php
│   ├── BaselineTenantAssignment.php
│   ├── Policy.php
│   ├── PolicyVersion.php
│   ├── InventoryItem.php
│   ├── OperationRun.php
│   └── Finding.php
├── Services/
│   ├── Baselines/
│   │   ├── BaselineCaptureService.php
│   │   ├── BaselineCompareService.php
│   │   ├── CurrentStateHashResolver.php
│   │   └── Evidence/
│   ├── Intune/PolicyCaptureOrchestrator.php
│   └── OperationRunService.php
├── Support/
│   ├── Baselines/
│   ├── OpsUx/
│   └── OperationRunType.php
config/
├── graph_contracts.php
└── tenantpilot.php
database/
└── migrations/
tests/
└── Feature/
```

**Structure Decision**: Laravel monolith. Baseline drift orchestration lives in `app/Services/Baselines` + `app/Jobs`, UI in `app/Filament`, and evidence capture reuses `app/Services/Intune` capture orchestration.

Tasks are defined in `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/118-baseline-drift-engine/tasks.md`.

## Complexity Tracking

No constitution violations are required for Spec 118 planning. (Table intentionally omitted.)

## Phase 0 — Research (output: research.md)

Goals:

- Confirm precise extension points for adding full-content evidence capture to existing baseline capture/compare jobs.
- Decide the purpose-tagging and idempotency strategy for baseline evidence captured as `PolicyVersion`.
- Confirm Monitoring run detail requirements for `context.target_scope` and baseline-specific context sections.

Deliverable: `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/118-baseline-drift-engine/research.md`

## Phase 1 — Design (output: data-model.md + contracts/* + quickstart.md)

Deliverables:

- Data model changes + JSON context shapes: `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/118-baseline-drift-engine/data-model.md`
- Route surface contract reference: `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/118-baseline-drift-engine/contracts/openapi.yaml`
- Developer quickstart: `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/118-baseline-drift-engine/quickstart.md`

Post-design constitution re-check: PASS (see decisions in research + data model docs; Ops-UX and RBAC constraints preserved).

## Phase 2 — Implementation Planning (high-level)

1) Add migrations:
   - `baseline_profiles.capture_mode`
   - `baseline_snapshot_items.subject_key`
   - `policy_versions.capture_purpose`, `operation_run_id`, `baseline_profile_id` + indexes
2) Implement quota-aware, resumable baseline evidence capture phase:
   - reuse existing capture orchestration (policy payload + assignments + scope tags)
   - emit capture stats + resume token in `OperationRun.context`
3) Integrate the capture phase into:
   - baseline capture job (before snapshot build)
   - baseline compare job (refresh phase before drift evaluation)
4) Update drift matching to use cross-tenant subject key (`policy_type + subject_key`) where `subject_key` is the normalized display name, and record ambiguous/missing match as evidence gaps (no finding).
5) Update Ops-UX context:
   - ensure `context.target_scope` exists for baseline capture/compare runs
   - add “why no findings” reason codes
6) Update UI action surfaces:
   - Baseline profile: capture mode + “Capture baseline (full content)” + tenant-targeted “Compare now (full content)”
   - Operation run detail: evidence capture panel + “Resume capture” when token exists
7) Add focused Pest tests:
   - full-content capture creates content-fidelity snapshot items (or warnings + gaps)
   - compare detects settings drift with content evidence
   - throttling/resume semantics and “no silent zeros” reason codes
8) Add governance hardening:
   - enforce rollout gate across UI/services/jobs for full-content mode
   - redact secrets/PII from captured evidence before persistence/fingerprinting
   - emit audit events for capture/compare/resume operations
   - prune baseline-purpose evidence per retention policy (scheduled)
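
Step 4 (matching on `policy_type + subject_key`, with ambiguous or missing matches recorded as evidence gaps rather than findings) can be sketched as below. This is a hedged illustration in Python, not the project's PHP implementation: the normalization rule (trim, collapse whitespace, case-fold), the item shape (`policy_type`, `display_name`, `hash`), and the reason strings are all assumptions chosen for the example.

```python
import re
from collections import defaultdict


def normalize_subject_key(display_name):
    # Assumed normalization: collapse whitespace, trim, case-fold.
    return re.sub(r"\s+", " ", display_name).strip().casefold()


def _index(items):
    grouped = defaultdict(list)
    for item in items:
        key = (item["policy_type"], normalize_subject_key(item["display_name"]))
        grouped[key].append(item)
    return grouped


def match_subjects(baseline_items, tenant_items):
    """Return (findings, gaps) for two lists of fingerprinted items."""
    baseline, tenant = _index(baseline_items), _index(tenant_items)
    findings, gaps = [], []
    for key in sorted(set(baseline) | set(tenant)):
        b, t = baseline.get(key, []), tenant.get(key, [])
        if len(b) > 1 or len(t) > 1:
            # Two subjects share one key: comparing either would be a guess.
            gaps.append((key, "ambiguous_match"))
        elif not b or not t:
            gaps.append((key, "missing_match"))
        elif b[0]["hash"] != t[0]["hash"]:
            findings.append((key, "different"))
    return findings, gaps
```

In this sketch a run that yields no findings but a non-empty `gaps` list has an explicit explanation available, which is the kind of input the “why no findings” reason codes in step 5 would need.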