TenantAtlas/specs/118-baseline-drift-engine/tasks.md
ahmido 92704a2f7e Spec 118: Resumable baseline evidence capture + snapshot UX (#143)
Implements Spec 118 baseline drift engine improvements:

- Resumable, budget-aware evidence capture for baseline capture/compare runs (resume token + UI action)
- “Why no findings?” reason-code driven explanations and richer run context panels
- Baseline Snapshot resource (list/detail) with fidelity visibility
- Retention command + schedule for pruning baseline-purpose PolicyVersions
- i18n strings for Baseline Compare landing

Verification:
- `vendor/bin/sail bin pint --dirty --format agent`
- `vendor/bin/sail artisan test --compact --filter=Baseline` (159 passed)

Note:
- `docs/audits/redaction-audit-2026-03-04.md` left untracked (not part of PR).

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #143
2026-03-04 22:34:13 +00:00

---
description: "Task list for Spec 118 implementation"
---
# Tasks: Golden Master Deep Drift v2 (Full Content Capture)
**Input**: Design documents from `/specs/118-baseline-drift-engine/`

**Tests**: REQUIRED (Pest) — this feature changes runtime behavior.

**Terminology**: `subject_key` = Spec 118 `normalized display_name` (trim + collapse internal whitespace + lowercase).

**Data isolation (SCOPE-001)**: Workspace-owned `baseline_snapshot_items` MUST NOT persist tenant identifiers (no tenant IDs, no tenant external IDs, no operation run IDs, no policy version IDs) — only cross-tenant keys + non-tenant metadata.
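
The `subject_key` rule above, together with the workspace-safe subject id that T006 derives as `sha256(policy_type|subject_key)`, can be sketched as follows. This is an illustrative Python sketch only (the project itself is PHP). The function names are hypothetical, and reading `|` as a literal separator byte is an assumption.

```python
import hashlib
import re


def subject_key(display_name: str) -> str:
    """Trim, collapse internal whitespace runs to a single space, lowercase."""
    return re.sub(r"\s+", " ", display_name.strip()).lower()


def workspace_safe_subject_id(policy_type: str, display_name: str) -> str:
    """Hex SHA-256 over "policy_type|subject_key" (literal "|" is an assumption)."""
    material = f"{policy_type}|{subject_key(display_name)}"
    return hashlib.sha256(material.encode("utf-8")).hexdigest()
```

Because the id is built only from the policy type and the normalized name, two tenants holding the same policy under differently spaced or cased names map to the same id, and no tenant identifier enters the hash.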
## Phase 1: Setup (Shared Infrastructure)
**Purpose**: Establish a safe baseline and introduce feature-level configuration scaffolding.
- [X] T001 Capture current baseline behavior by running existing suites in `tests/Feature/Baselines/BaselineCaptureTest.php`, `tests/Feature/Baselines/BaselineCompareFindingsTest.php`, `tests/Feature/Filament/BaselineProfileCaptureStartSurfaceTest.php`, and `tests/Feature/Filament/BaselineCompareLandingStartSurfaceTest.php`
- [X] T002 [P] Add Spec 118 rollout + budget env vars to `.env.example` (e.g. `TENANTPILOT_BASELINE_FULL_CONTENT_CAPTURE_ENABLED`, `TENANTPILOT_BASELINE_EVIDENCE_MAX_ITEMS_PER_RUN=200`, `TENANTPILOT_BASELINE_EVIDENCE_MAX_CONCURRENCY=5`, `TENANTPILOT_BASELINE_EVIDENCE_MAX_RETRIES=3`, `TENANTPILOT_BASELINE_EVIDENCE_RETENTION_DAYS=90`)
- [X] T003 [P] Add config surface for Spec 118 rollout + budgets in `config/tenantpilot.php` (new `baselines.full_content_capture.*` keys sourced from env)
---
## Phase 2: Foundational (Blocking Prerequisites)
**Purpose**: Shared primitives required by ALL user stories.
**⚠️ CRITICAL**: No user story work should begin until this phase is complete.
- [X] T004 Add baseline capture mode enum in `app/Support/Baselines/BaselineCaptureMode.php` (values: `meta_only`, `opportunistic`, `full_content`)
- [X] T005 [P] Add policy version capture purpose enum in `app/Support/Baselines/PolicyVersionCapturePurpose.php` (values: `backup`, `baseline_capture`, `baseline_compare`)
- [X] T006 [P] Add subject-key helper in `app/Support/Baselines/BaselineSubjectKey.php` (normalize display name + derive workspace-safe subject id as `sha256(policy_type|subject_key)` for `baseline_snapshot_items.subject_external_id`)
- [X] T007 [P] Add baseline compare “why no findings” reason codes in `app/Support/Baselines/BaselineCompareReasonCode.php` (e.g. `no_subjects_in_scope`, `coverage_unproven`, `evidence_capture_incomplete`, `rollout_disabled`, `no_drift_detected`)
- [X] T008 [P] Add full-content rollout gate helper in `app/Support/Baselines/BaselineFullContentRolloutGate.php` (reads `config('tenantpilot.baselines.full_content_capture.enabled')`, provides an `assertEnabled()` used by services + jobs)
- [X] T009 [P] Add resume token contract in `app/Support/Baselines/BaselineEvidenceResumeToken.php` (versioned encode/decode; stored as opaque string in `operation_runs.context.*.resume_token`)
- [X] T010 [P] Add policy snapshot redactor in `app/Services/Intune/PolicySnapshotRedactor.php` (remove secrets/PII from payload/assignments/scope tags before persistence + hashing)
- [X] T011 [P] Add redaction coverage test in `tests/Feature/Intune/PolicySnapshotRedactionTest.php` (assert stored `PolicyVersion.snapshot` is redacted and content hash uses redacted content)
- [X] T012 Add migration for `baseline_profiles.capture_mode` in `database/migrations/2026_03_03_100001_add_capture_mode_to_baseline_profiles_table.php`
- [X] T013 [P] Add migration for `baseline_snapshot_items.subject_key` + index in `database/migrations/2026_03_03_100002_add_subject_key_to_baseline_snapshot_items_table.php`
- [X] T014 [P] Add migration for `policy_versions.capture_purpose`, `policy_versions.operation_run_id`, `policy_versions.baseline_profile_id` + indexes in `database/migrations/2026_03_03_100003_add_baseline_purpose_to_policy_versions_table.php`
- [X] T015 Update `app/Models/BaselineProfile.php` to store/cast `capture_mode` via `BaselineCaptureMode` and include it in `$fillable` (default: `opportunistic`)
- [X] T016 [P] Update factory defaults/states for capture mode in `database/factories/BaselineProfileFactory.php`
- [X] T017 [P] Update `database/factories/BaselineSnapshotItemFactory.php` to set `subject_key` derived from `meta_jsonb.display_name` via `BaselineSubjectKey` and set `subject_external_id` using the workspace-safe subject id (no tenant external IDs)
- [X] T018 Update `app/Models/PolicyVersion.php` to cast `capture_purpose` and define relationships to `OperationRun` + `BaselineProfile` (new nullable FKs)
- [X] T019 [P] Update `database/factories/PolicyVersionFactory.php` to default `capture_purpose` to `backup`
- [X] T020 Update `app/Services/Intune/VersionService.php` to apply `PolicySnapshotRedactor` before persistence/hashing and persist `capture_purpose`, `operation_run_id`, and `baseline_profile_id` when capturing versions (including via `captureFromGraph()`)
- [X] T021 Update `app/Services/Intune/PolicyCaptureOrchestrator.php` to pass baseline-purpose attribution into `PolicyVersion` creation/reuse/backfill and ensure snapshot dedupe uses redacted payloads (no secrets/PII in stored snapshots)
- [X] T022 Update content hashing to include settings + assignments + scope tags in `app/Services/Baselines/Evidence/ContentEvidenceProvider.php` (use `SettingsNormalizer`, hash normalized `assignments`, and hash normalized scope-tag IDs via `ScopeTagsNormalizer`)
- [X] T023 Ensure content evidence provenance includes `policy_version_id`, `operation_run_id`, and `capture_purpose` in `app/Services/Baselines/Evidence/ContentEvidenceProvider.php` (tenant-scoped only; snapshot items must strip tenant identifiers)
- [X] T024 Implement quota-aware baseline evidence capture phase scaffold in `app/Services/Baselines/BaselineContentCapturePhase.php` (inputs: tenant + subjects + purpose + budgets incl. concurrency + optional resume token; outputs: stats + gaps + optional resume token)
- [X] T025 Update run start context to include `target_scope` + `capture_mode` and enforce rollout gate for `full_content` in `app/Services/Baselines/BaselineCaptureService.php` (reject start if disabled)
- [X] T026 [P] Update run start context to include `target_scope` + `capture_mode` and enforce rollout gate for `full_content` in `app/Services/Baselines/BaselineCompareService.php` (reject start if disabled)
- [X] T027 Add capture mode field + badge to Filament baseline profile CRUD in `app/Filament/Resources/BaselineProfileResource.php` (hide/disable `full_content` option when rollout flag is disabled)
**Checkpoint**: DB + enums + capture phase scaffolding are in place; user stories can be implemented and tested independently.
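
The versioned, opaque resume token from T009 could look roughly like the sketch below. It is written in Python for illustration (the project itself is PHP) and is not the actual `BaselineEvidenceResumeToken` contract. The envelope fields `v` and `state`, the base64url encoding, and both function names are assumptions. Sorting keys during serialization makes the encoding deterministic, so equal states always yield equal tokens.

```python
import base64
import json

TOKEN_VERSION = 1  # hypothetical; bump when the state layout changes


def encode_resume_token(state: dict) -> str:
    """Serialize state into a versioned envelope and return an opaque string."""
    payload = json.dumps({"v": TOKEN_VERSION, "state": state},
                         sort_keys=True, separators=(",", ":"))
    return base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")


def decode_resume_token(token: str) -> dict:
    """Return the stored state, rejecting malformed or unknown-version tokens."""
    try:
        envelope = json.loads(base64.urlsafe_b64decode(token.encode("ascii")))
    except ValueError as exc:  # covers bad base64, bad UTF-8, and bad JSON
        raise ValueError("malformed resume token") from exc
    if (not isinstance(envelope, dict)
            or envelope.get("v") != TOKEN_VERSION
            or "state" not in envelope):
        raise ValueError("unsupported resume token version")
    return envelope["state"]
```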
---
## Phase 3: User Story 1 — Capture a full-content baseline without per-policy steps (Priority: P1) 🎯 MVP
**Goal**: Capture a baseline snapshot that uses full-content evidence by default (with explicit gaps + warnings if capture is incomplete).
**Independent Test**: Create a baseline profile configured for full-content capture, run “Capture baseline (full content)”, and validate the snapshot items have content-fidelity evidence (or explicit gaps) and the run context records capture stats.
### Tests (write first)
- [X] T028 [P] [US1] Add baseline full-content on-demand evidence test in `tests/Feature/BaselineDriftEngine/CaptureBaselineFullContentOnDemandTest.php` (no PolicyVersion exists → capture creates one with `capture_purpose=baseline_capture` and snapshot item fidelity is `content`)
- [X] T029 [P] [US1] Update meta-fallback test to assert opportunistic mode degrades to meta when evidence is missing in `tests/Feature/BaselineDriftEngine/CaptureBaselineMetaFallbackTest.php`
- [X] T030 [P] [US1] Update capture start surface expectations for full-content labeling + rollout gating in `tests/Feature/Filament/BaselineProfileCaptureStartSurfaceTest.php`
- [X] T031 [P] [US1] Add snapshot item isolation test in `tests/Feature/BaselineDriftEngine/BaselineSnapshotNoTenantIdentifiersTest.php` (assert `baseline_snapshot_items` do not store tenant external IDs and `meta_jsonb` omits tenant identifiers like `meta_contract.subject_external_id` and `evidence.observed_operation_run_id`)
- [X] T032 [P] [US1] Add audit event coverage for baseline capture start/completion in `tests/Feature/BaselineDriftEngine/BaselineCaptureAuditEventsTest.php` (assert action metadata includes purpose, scope counts, and gap/warning summary)
### Implementation
- [X] T033 [US1] Update baseline capture action labeling + modal copy + rollout gate messaging in `app/Filament/Resources/BaselineProfileResource/Pages/ViewBaselineProfile.php` (show “Capture baseline (full content)” when `capture_mode=full_content`)
- [X] T034 [US1] Integrate `BaselineContentCapturePhase` into baseline capture in `app/Jobs/CaptureBaselineSnapshotJob.php` (purpose `baseline_capture`, budgeted, record `context.baseline_capture.evidence_capture`, `context.baseline_capture.gaps`, `context.baseline_capture.resume_token`, and add job-level rollout gate guard)
- [X] T035 [US1] Persist `subject_key` and workspace-safe `subject_external_id` (derived via `BaselineSubjectKey`) when building snapshot items, and sanitize `meta_jsonb` to exclude tenant identifiers in `app/Jobs/CaptureBaselineSnapshotJob.php`
- [X] T036 [US1] Update baseline snapshot identity hashing to use `policy_type + subject_key + baseline_hash` in `app/Services/Baselines/BaselineSnapshotIdentity.php` (dedupe must not depend on tenant-specific external IDs)
- [X] T037 [US1] Ensure capture run `status`/`outcome` transitions go through `OperationRunService` and mark warnings (`OperationRunOutcome::PartiallySucceeded`) when any subject falls back to meta or is skipped in `app/Jobs/CaptureBaselineSnapshotJob.php`
- [X] T038 [US1] Expand capture audit events to include purpose, scope counts, evidence capture stats, and gap/warning summary in `app/Jobs/CaptureBaselineSnapshotJob.php`
- [X] T039 [US1] Add snapshot fidelity + gaps counts into `baseline_snapshots.summary_jsonb` for snapshot list/detail UX in `app/Jobs/CaptureBaselineSnapshotJob.php`
**Parallel execution example (US1)**:
- Developer A: T028, T034, T036
- Developer B: T030, T033, T035, T038
**Checkpoint**: A baseline snapshot can be captured in full-content mode without per-policy steps, and runs are explainable when gaps exist.
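
T036 requires snapshot identity to depend only on `policy_type + subject_key + baseline_hash`. A minimal Python sketch of that idea follows (illustrative only; the project itself is PHP). The item layout, the sort order, and the `|` and newline separators are assumptions, not the behavior of `BaselineSnapshotIdentity`. Any other field on an item, such as a tenant-specific external id, is ignored, so it cannot influence dedupe.

```python
import hashlib


def snapshot_identity(items: list[dict]) -> str:
    """Hash (policy_type, subject_key, baseline_hash) triples in sorted order."""
    triples = sorted(
        (item["policy_type"], item["subject_key"], item["baseline_hash"])
        for item in items
    )
    digest = hashlib.sha256()
    for triple in triples:
        # Only the three cross-tenant fields contribute to the identity.
        digest.update("|".join(triple).encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()
```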
---
## Phase 4: User Story 2 — Compare now with full content and get explainable drift (Priority: P1)
**Goal**: Compare baseline vs current using content-first evidence refresh, cross-tenant subject matching, and explainable run context.
**Independent Test**: Capture a full-content baseline, simulate a settings-only change for a subject, run “Compare now (full content)”, and assert a “different version” finding exists with content provenance.
### Tests (write first)
- [X] T040 [P] [US2] Add cross-tenant match test (policy_type + `subject_key`) in `tests/Feature/Baselines/BaselineCompareCrossTenantMatchTest.php`
- [X] T041 [P] [US2] Add ambiguous match suppression test in `tests/Feature/Baselines/BaselineCompareAmbiguousMatchGapTest.php` (duplicate `subject_key` values → evidence gap; no finding)
- [X] T042 [P] [US2] Add coverage proof guard test in `tests/Feature/Baselines/BaselineCompareCoverageProofGuardTest.php` (uncovered types suppress `missing_policy` outcomes; run completes with warnings + records context)
- [X] T043 [P] [US2] Add stable recurrence identity test in `tests/Feature/Baselines/BaselineCompareFindingRecurrenceKeyTest.php` (recurrence key independent of hashes; retries don't duplicate; lifecycle fields update)
- [X] T044 [P] [US2] Update compare start surface expectations for full-content labeling + rollout gating in `tests/Feature/Filament/BaselineCompareLandingStartSurfaceTest.php`
- [X] T045 [P] [US2] Add baseline profile “Compare now (full content)” start-surface test in `tests/Feature/Filament/BaselineProfileCompareStartSurfaceTest.php`
- [X] T046 [P] [US2] Add audit event coverage for baseline compare start/completion in `tests/Feature/Baselines/BaselineCompareAuditEventsTest.php` (purpose, scope counts, gaps/warnings summary)
### Implementation
- [X] T047 [US2] Add “Compare now (full content)” header action to baseline profile view in `app/Filament/Resources/BaselineProfileResource/Pages/ViewBaselineProfile.php` (select target tenant; require `tenant.sync`; enforce rollout gate server-side)
- [X] T048 [US2] Integrate `BaselineContentCapturePhase` refresh into compare in `app/Jobs/CompareBaselineToTenantJob.php` (purpose `baseline_compare`, budgeted, record `context.baseline_compare.evidence_capture`, `context.baseline_compare.evidence_gaps`, `context.baseline_compare.resume_token`, and add job-level rollout gate guard)
- [X] T049 [US2] Switch compare matching to `policy_type + subject_key` in `app/Jobs/CompareBaselineToTenantJob.php` (load baseline items by `subject_key`; compute current `subject_key` from inventory display name; detect missing/empty/duplicate keys on either side; record gap reasons; suppress drift evaluation for those keys)
- [X] T050 [US2] Enforce coverage proof guard behavior in `app/Jobs/CompareBaselineToTenantJob.php` (suppress `missing_policy` for uncovered/unproven types; record warning + `BaselineCompareReasonCode` when suppression affects outcomes)
- [X] T051 [US2] Update finding recurrence identity to be stable and independent of hashes in `app/Jobs/CompareBaselineToTenantJob.php` (recurrence key uses tenant_id + baseline_profile_id + policy_type + subject_key + change_type; retries must not duplicate findings)
- [X] T052 [US2] Ensure findings carry `subject_key` + `display_name` fallbacks in `evidence_jsonb` and update subject display name fallback logic in `app/Filament/Resources/FindingResource.php` (COALESCE inventory display name with evidence display name)
- [X] T053 [US2] Ensure compare run context contains scope totals, processed counts, coverage proof status, fidelity breakdown, evidence capture stats, and top gap reasons in `app/Jobs/CompareBaselineToTenantJob.php`
- [X] T054 [US2] Update baseline compare landing to label “Compare now (full content)” when applicable in `app/Filament/Pages/BaselineCompareLanding.php` and `resources/views/filament/pages/baseline-compare-landing.blade.php`
- [X] T055 [US2] Extend stats DTO to surface fidelity + evidence gap summary from run context in `app/Support/Baselines/BaselineCompareStats.php`
- [X] T056 [US2] Add evidence capture + gaps panels for baseline capture/compare runs in Monitoring detail in `app/Filament/Resources/OperationRunResource.php`
- [X] T057 [US2] Expand compare audit events to include purpose, scope counts, evidence capture stats, and gaps/warnings summary in `app/Jobs/CompareBaselineToTenantJob.php`
**Parallel execution example (US2)**:
- Developer A: T040, T048, T050, T056
- Developer B: T045, T047, T054, T055, T052
**Checkpoint**: Compare runs refresh evidence when needed, generate findings reliably, and provide explainable context even with coverage warnings or gaps.
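
The matching rules from T049 and T050 (match on `policy_type + subject_key`, treat missing or duplicate keys as evidence gaps, suppress `missing_policy` for types without coverage proof) can be sketched in Python as below. This is illustrative only (the project itself is PHP) and is not the logic of `CompareBaselineToTenantJob`. The item layout, the gap reason strings, and the `unexpected_policy` change type are assumptions; `missing_policy` and the "different version" finding come from the tasks above.

```python
from collections import defaultdict


def index_by_key(items):
    """Split items into uniquely keyed entries and per-key gap reasons."""
    groups = defaultdict(list)
    for item in items:
        groups[(item["policy_type"], item.get("subject_key") or "")].append(item)
    unique, gaps = {}, {}
    for key, members in groups.items():
        if key[1] == "":
            gaps[key] = "missing_subject_key"  # hypothetical gap reason
        elif len(members) > 1:
            gaps[key] = "ambiguous_match"  # hypothetical gap reason
        else:
            unique[key] = members[0]
    return unique, gaps


def compare(baseline_items, current_items, covered_types):
    """Return (findings, gaps, suppressed) for one compare pass."""
    base, base_gaps = index_by_key(baseline_items)
    cur, cur_gaps = index_by_key(current_items)
    gaps = {**base_gaps, **cur_gaps}
    findings, suppressed = [], []
    for key in sorted(set(base) | set(cur)):
        if key in gaps:
            continue  # a gap on either side suppresses drift evaluation
        if key not in cur:
            if key[0] in covered_types:
                findings.append((key, "missing_policy"))
            else:
                suppressed.append(key)  # coverage unproven: no finding
        elif key not in base:
            findings.append((key, "unexpected_policy"))  # hypothetical type
        elif base[key]["hash"] != cur[key]["hash"]:
            findings.append((key, "different_version"))
    return findings, gaps, suppressed
```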
---
## Phase 5: User Story 3 — Throttling-safe, resumable evidence capture (Priority: P1)
**Goal**: Evidence capture respects quotas, records a resume token, and resumes deterministically without duplicating work.
**Independent Test**: Simulate throttling/budget exhaustion, verify run records a resume token, then resume and complete without re-capturing already-captured subjects.
### Tests (write first)
- [X] T058 [P] [US3] Add “budget exhaustion produces resume token” test in `tests/Feature/Baselines/BaselineCompareResumeTokenTest.php`
- [X] T059 [P] [US3] Add “resume is idempotent” test in `tests/Feature/Baselines/BaselineCompareResumeIdempotencyTest.php`
- [X] T060 [P] [US3] Add resume token contract test in `tests/Feature/Baselines/BaselineEvidenceResumeTokenContractTest.php` (token is opaque; decode yields deterministic resume state)
- [X] T061 [P] [US3] Add run-detail resume action test in `tests/Feature/Filament/OperationRunResumeCaptureActionTest.php`
- [X] T062 [P] [US3] Add audit event coverage for resume capture in `tests/Feature/Baselines/BaselineResumeCaptureAuditEventsTest.php`
### Implementation
- [X] T063 [US3] Implement budgets (items-per-run + concurrency + retries) + retry/backoff/jitter + throttling gap reasons + resume cursor handling in `app/Services/Baselines/BaselineContentCapturePhase.php` (use `BaselineEvidenceResumeToken` encode/decode)
- [X] T064 [US3] Add resume starter service in `app/Services/Baselines/BaselineEvidenceCaptureResumeService.php` (start follow-up `baseline_capture`/`baseline_compare` runs from a prior run + resume token; enforce RBAC; write audit events)
- [X] T065 [US3] Add “Resume capture” header action for eligible runs in `app/Filament/Pages/Operations/TenantlessOperationRunViewer.php` (requires confirmation; uses Ops-UX queued toast + canonical view-run link)
- [X] T066 [US3] Wire resume token consumption + re-emission into `app/Jobs/CaptureBaselineSnapshotJob.php` (baseline capture) and `app/Jobs/CompareBaselineToTenantJob.php` (baseline compare)
**Parallel execution example (US3)**:
- Developer A: T058, T063, T066
- Developer B: T061, T064, T065
**Checkpoint**: Operators can safely complete large scopes via resumable capture without manual per-policy capture.
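
The budget, retry, and resume-cursor behavior described in T063 can be sketched in Python as follows (illustrative only; the project itself is PHP, and this is not `BaselineContentCapturePhase`). The assumptions: subjects are processed in sorted order so an integer offset works as a cursor, `TimeoutError` stands in for a throttling response, the gap reason `throttled` is hypothetical, and the default `sleep` is a no-op so the sketch runs instantly. Concurrency is left out for brevity.

```python
import random


def backoff_delay(attempt, base=0.5, cap=30.0, rng=random.random):
    """Exponential backoff with full jitter: uniform in [0, min(cap, base * 2**attempt))."""
    return rng() * min(cap, base * (2 ** attempt))


def capture_with_budget(subjects, fetch, max_items, max_retries,
                        resume_offset=0, sleep=lambda seconds: None):
    """Capture at most max_items subjects, starting at resume_offset.

    Returns (captured, gaps, next_offset); next_offset is None once the
    whole scope has been processed.
    """
    ordered = sorted(subjects)  # deterministic order makes the cursor meaningful
    end = min(len(ordered), resume_offset + max_items)
    captured, gaps = {}, {}
    for subject in ordered[resume_offset:end]:
        for attempt in range(max_retries + 1):
            try:
                captured[subject] = fetch(subject)
                break
            except TimeoutError:
                if attempt == max_retries:
                    gaps[subject] = "throttled"  # hypothetical gap reason
                else:
                    sleep(backoff_delay(attempt))
    next_offset = end if end < len(ordered) else None
    return captured, gaps, next_offset
```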
---
## Phase 6: User Story 4 — “Why no findings?” is always clear (Priority: P2)
**Goal**: Zero findings never looks like a silent failure; compare run detail clearly explains the outcome.
**Independent Test**: Run compare with zero subjects (or with suppressed findings due to coverage/gaps) and verify a clear explanation sourced from run context is displayed.
### Tests (write first)
- [X] T067 [P] [US4] Add reason-code coverage test for zero-subject / zero-findings / suppressed-by-coverage outcomes in `tests/Feature/Baselines/BaselineCompareWhyNoFindingsReasonCodeTest.php`
- [X] T068 [P] [US4] Add UI assertion test for “why no findings” messaging in `tests/Feature/Filament/BaselineCompareLandingWhyNoFindingsTest.php`
### Implementation
- [X] T069 [US4] Populate `context.baseline_compare.reason_code` for all 0-subject / 0-findings outcomes in `app/Jobs/CompareBaselineToTenantJob.php` (use `BaselineCompareReasonCode`, including `coverage_unproven`/`rollout_disabled` where applicable)
- [X] T070 [US4] Render reason-code explanation + evidence context in Monitoring run detail in `app/Filament/Resources/OperationRunResource.php`
- [X] T071 [US4] Replace “All clear” copy with reason-aware messaging on baseline compare landing in `resources/views/filament/pages/baseline-compare-landing.blade.php` (source reason code from `BaselineCompareStats`)
- [X] T072 [US4] Propagate reason code + human message from run context in `app/Support/Baselines/BaselineCompareStats.php`
**Parallel execution example (US4)**:
- Developer A: T067, T069
- Developer B: T068, T071, T072
**Checkpoint**: Every compare run with “0 findings” has a clear, user-visible explanation and supporting evidence context.
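
One way to pick a reason code for a zero-findings run, as T069 requires, is a fixed precedence over facts already recorded in the run context. The Python sketch below is illustrative only (the project itself is PHP). The reason code strings come from T007, but the function name, its parameters, and the precedence order are assumptions rather than the behavior of `CompareBaselineToTenantJob`.

```python
from typing import Optional


def why_no_findings(*, findings: int, subjects: int, rollout_enabled: bool,
                    coverage_proven: bool, gaps: int) -> Optional[str]:
    """Return a reason code for a zero-findings run, or None if findings exist."""
    if findings > 0:
        return None
    if not rollout_enabled:
        return "rollout_disabled"
    if subjects == 0:
        return "no_subjects_in_scope"
    if not coverage_proven:
        return "coverage_unproven"
    if gaps > 0:
        return "evidence_capture_incomplete"
    return "no_drift_detected"
```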
---
## Phase 7: Polish & Cross-Cutting Concerns
**Purpose**: Guardrails, visibility, and validation across all stories.
- [X] T073 [P] Add Spec 118 no-legacy regression guard(s) in `tests/Feature/Guards/Spec118NoLegacyBaselineDriftGuardTest.php` (assert capture/compare do not implement hashing outside the provider/hasher pipeline and do not reference deprecated helpers)
- [X] T074 Update PolicyVersion listing to hide baseline-purpose evidence by default (unless the actor has `tenant.sync` or `tenant_findings.view`) in `app/Filament/Resources/PolicyVersionResource.php`
- [X] T075 [P] Add visibility/authorization coverage for baseline-purpose PolicyVersions in `tests/Feature/Filament/PolicyVersionBaselineEvidenceVisibilityTest.php` (assert baseline-purpose rows are hidden for `tenant.view`-only actors)
- [X] T076 Implement baseline-purpose PolicyVersion retention enforcement in `app/Console/Commands/PruneBaselineEvidencePolicyVersionsCommand.php` and schedule it in `routes/console.php` (prune `baseline_capture`/`baseline_compare` older than configured retention; do not prune `backup`) + tests in `tests/Feature/Retention/PruneBaselineEvidencePolicyVersionsTest.php` and `tests/Feature/Scheduling/PruneBaselineEvidencePolicyVersionsScheduleTest.php`
- [X] T077 Add Baseline Snapshot list/detail surfaces with fidelity visibility in `app/Filament/Resources/BaselineSnapshotResource.php`, `app/Filament/Resources/BaselineSnapshotResource/Pages/ListBaselineSnapshots.php`, and `app/Filament/Resources/BaselineSnapshotResource/Pages/ViewBaselineSnapshot.php` (badge + counts by fidelity; “captured with gaps” state) + tests in `tests/Feature/Filament/BaselineSnapshotFidelityVisibilityTest.php`
- [X] T078 Run formatting on changed files using `vendor/bin/sail bin pint --dirty --format agent` (touchpoints include `app/Jobs/CaptureBaselineSnapshotJob.php`, `app/Jobs/CompareBaselineToTenantJob.php`, `app/Services/Baselines/BaselineContentCapturePhase.php`)
- [X] T079 Run the targeted test suite listed in `specs/118-baseline-drift-engine/quickstart.md` and update that file if any step is inaccurate
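
The retention rule in T076 (prune `baseline_capture` and `baseline_compare` versions older than the configured retention, never `backup`) reduces to a small predicate. The Python sketch below is illustrative only (the project itself is PHP) and is not the prune command. The function name and parameters are hypothetical, and the 90-day default mirrors the `TENANTPILOT_BASELINE_EVIDENCE_RETENTION_DAYS=90` example from T002.

```python
from datetime import datetime, timedelta, timezone

BASELINE_PURPOSES = {"baseline_capture", "baseline_compare"}


def should_prune(purpose: str, captured_at: datetime, now: datetime,
                 retention_days: int = 90) -> bool:
    """True only for baseline-purpose versions captured before the cutoff."""
    cutoff = now - timedelta(days=retention_days)
    return purpose in BASELINE_PURPOSES and captured_at < cutoff
```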
---
## Dependencies & Execution Order
### Story completion order
- Phase 1 (Setup) → Phase 2 (Foundational) → user stories.
- User stories after Phase 2:
  - **US1 (P1)** is the MVP capture capability and should be implemented first end-to-end.
  - **US2 (P1)** depends on US1 for end-to-end validation (a baseline snapshot must exist), but implementation can proceed in parallel after Phase 2.
  - **US3 (P1)** depends on the capture phase being integrated in US1/US2.
  - **US4 (P2)** depends on US2's run-context fields.
### Dependency graph
```mermaid
graph TD
P1["Phase 1: Setup"] --> P2["Phase 2: Foundational"]
P2 --> US1["US1: Capture baseline (full content)"]
P2 --> US2["US2: Compare now (full content)"]
US1 --> US2
US2 --> US3["US3: Resumable capture"]
US2 --> US4["US4: Why no findings"]
US3 --> POLISH["Phase 7: Polish"]
US4 --> POLISH
```
## Implementation Strategy (MVP-first)
1) Ship **US1** with a strict run-context contract and explicit gap reporting (no silent success).
2) Add **US2** compare refresh + cross-tenant matching with explainability.
3) Harden with **US3** resumability and throttle-safe behavior.
4) Complete operator trust with **US4** reason-code UX.
5) Enforce “no legacy” and visibility constraints in **Polish**.