# Feature Specification: Drift MVP
**Feature Branch**: `feat/044-drift-mvp`
**Created**: 2026-01-07
**Status**: Draft
## Purpose
Detect and report drift between expected and observed states using inventory and run metadata.
This MVP focuses on reporting and triage, not automatic remediation.
## Clarifications
### Session 2026-01-12
- Q: How should Drift pick the baseline run for a given tenant + scope? → A: Baseline = previous successful inventory run for the same scope; compare against the latest successful run.
- Q: Should Drift findings be persisted or computed on demand? → A: Persist findings in DB per comparison (baseline_run_id + current_run_id), including a deterministic fingerprint for stable identity + triage.
- Q: How is the fingerprint (stable ID) for a drift finding defined? → A: `sha256(tenant_id + scope_key + subject_type + subject_external_id + change_type + baseline_hash + current_hash)` (normalized; excludes volatile fields).
- Q: Which inventory entities/types are in scope for Drift MVP? → A: Policies + Assignments.
- Q: When should drift findings be generated? → A: On demand when opening Drift: if findings for (baseline, current, scope) don't exist yet, dispatch an async job to generate them.
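
The on-demand behaviour from the last clarification can be sketched as a small gate that either reports existing findings or enqueues a generation job. This is a minimal in-memory sketch, not the project's implementation: `DriftGate`, `ensure_findings`, `mark_done`, the `"ready"`/`"pending"` return values, and the de-duplication of in-flight jobs are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DriftGate:
    """Hypothetical helper: decides whether opening Drift must enqueue a job."""

    generated: set = field(default_factory=set)  # comparisons with persisted findings
    pending: set = field(default_factory=set)    # comparisons with a job in flight
    queue: list = field(default_factory=list)    # stand-in for a real async job queue

    def ensure_findings(self, baseline_run_id: str, current_run_id: str, scope_key: str) -> str:
        key = (baseline_run_id, current_run_id, scope_key)
        if key in self.generated:
            return "ready"
        if key not in self.pending:
            # Assumption: enqueue at most one job per comparison.
            self.pending.add(key)
            self.queue.append(key)
        return "pending"

    def mark_done(self, key: tuple) -> None:
        """Called by the (hypothetical) job once findings are persisted."""
        self.pending.discard(key)
        self.generated.add(key)
```

Opening Drift twice before the job finishes enqueues only one job in this sketch; the UI would show a pending state until `mark_done` has run.
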
## Pinned Decisions (MVP defaults)
- Drift is implemented as a generator that writes persisted Finding rows (not only an in-memory/on-demand diff).
- Baseline selection: baseline = previous successful inventory run for the same scope_key; comparison = latest successful inventory run for the same scope_key.
- Scope is first-class via `scope_key` and must be deterministic to support future pinned baselines and compare workflows.
- Fingerprints are deterministic and stable for triage/audit workflows.
- Drift MVP only uses `finding_type=drift` and `status` in {`new`, `acknowledged`}.
- Default severity: `medium` (until a rule engine exists).
- UI must not perform render-time Graph calls. Graph access (if any) is limited to background sync/jobs.
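
The baseline selection rule above amounts to taking the two most recent successful runs for a `scope_key`. The following is a rough sketch under stated assumptions: the `InventoryRun` shape, the `"success"` status value, and ordering by `finished_at` are hypothetical and not taken from the inventory specs.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class InventoryRun:
    # Hypothetical run record; field names are assumptions.
    id: str
    scope_key: str
    status: str
    finished_at: datetime


def select_runs(runs: list, scope_key: str) -> Optional[tuple]:
    """Return (baseline, current) for a scope, or None if fewer than two successful runs exist."""
    successful = sorted(
        (r for r in runs if r.scope_key == scope_key and r.status == "success"),
        key=lambda r: r.finished_at,
    )
    if len(successful) < 2:
        return None
    # Baseline = previous successful run; current = latest successful run.
    return successful[-2], successful[-1]
```

Failed runs are skipped entirely, so a failed sync between two successful ones does not shift the baseline.
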
## Key Entities / Generic Findings (Future-proof)
### Finding (generic)
We want Drift MVP to remain MVP-sized, while making it easy to add future generators (Security Suite Audits, Cross-tenant Compare) without inventing a new model.
Rationale:
- Drift = delta engine over runs.
- Audit = rule engine over inventory.
- Both write Findings with the same semantics: deterministic fingerprint + triage + minimized evidence.
Fields:
- `finding_type` (enum): `drift` (MVP), later `audit`, `compare`
- `tenant_id`
- `scope_key` (string): deterministic scope identifier (see Scope Definition / FR1)
- `baseline_run_id` (nullable; e.g. unset for audit/compare findings)
- `current_run_id` (nullable; e.g. unset for audit findings)
- `fingerprint` (string): deterministic; unique per tenant+scope+subject+change
- `subject_type` (string): e.g. policy type (or other inventory entity type)
- `subject_external_id` (string): Graph external id
- `severity` (enum): `low` / `medium` / `high` (MVP default: `medium`)
- `status` (enum): `new` / `acknowledged` (later: `snoozed` / `assigned` / `commented`)
- `acknowledged_at` (nullable)
- `acknowledged_by_user_id` (nullable)
- `evidence_jsonb` (jsonb): sanitized, small, secrets-free (no raw payload dumps)
- Optional/nullable for later (prepared; out of MVP): `rule_id`, `control_id`, `expected_value`, `source`
MVP implementation scope: only `finding_type=drift`, statuses `new/acknowledged`, and no rule engine.
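
To make the field list concrete, the generic Finding could be modelled roughly as follows. This is an illustrative sketch only: the Python types (string IDs, `dict` for `evidence_jsonb`), the class names, and the `acknowledge` helper are assumptions, and the real model would be a database table.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import Any, Optional


class FindingType(str, Enum):
    DRIFT = "drift"      # MVP
    AUDIT = "audit"      # later
    COMPARE = "compare"  # later


class Severity(str, Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


class Status(str, Enum):
    NEW = "new"
    ACKNOWLEDGED = "acknowledged"


@dataclass
class Finding:
    tenant_id: str
    scope_key: str
    fingerprint: str
    subject_type: str
    subject_external_id: str
    finding_type: FindingType = FindingType.DRIFT
    baseline_run_id: Optional[str] = None
    current_run_id: Optional[str] = None
    severity: Severity = Severity.MEDIUM  # MVP default
    status: Status = Status.NEW
    acknowledged_at: Optional[datetime] = None
    acknowledged_by_user_id: Optional[str] = None
    evidence_jsonb: dict = field(default_factory=dict)
    # Prepared for later generators; unused in the MVP.
    rule_id: Optional[str] = None
    control_id: Optional[str] = None
    expected_value: Optional[Any] = None
    source: Optional[str] = None

    def acknowledge(self, user_id: str, at: datetime) -> None:
        """Record triage; the finding is kept, never deleted."""
        self.status = Status.ACKNOWLEDGED
        self.acknowledged_by_user_id = user_id
        self.acknowledged_at = at
```
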
## User Scenarios & Testing
### Scenario 1: View drift summary
- Given inventory sync has run at least twice
- When the admin opens Drift
- Then they see a summary of changes since the last baseline
### Scenario 2: Drill into a drift finding
- Given a drift finding exists
- When the admin opens the finding
- Then they see what changed, when, and which run observed it
### Scenario 3: Acknowledge/triage
- Given a drift finding exists
- When the admin marks it acknowledged
- Then it is hidden from “new” lists but remains auditable
## Functional Requirements
- FR1: Baseline + scope
  - Define `scope_key` as a deterministic string derived from the Inventory Selection.
    - Example: `scope_key = sha256(normalized selection payload)`.
    - Must remain stable across equivalent selections (normalization), and allow future pinned baselines / compare baselines.
  - Baseline run (MVP) = previous successful inventory run for the same `scope_key`.
  - Comparison run (MVP) = latest successful inventory run for the same `scope_key`.
- FR2: Finding generation (Drift MVP)
  - Findings are persisted per (`baseline_run_id`, `current_run_id`, `scope_key`).
  - Findings cover adds, removals, and metadata changes for supported entities (Policies + Assignments).
  - Findings are deterministic: same baseline/current + `scope_key` ⇒ same set of fingerprints.
- FR2a: Fingerprint definition (MVP)
  - Fingerprint = `sha256(tenant_id + scope_key + subject_type + subject_external_id + change_type + baseline_hash + current_hash)`.
  - `baseline_hash` / `current_hash` are hashes over normalized, sanitized comparison data (exclude volatile fields like timestamps).
  - Goal: stable identity for triage + audit compatibility.
- FR2b: Drift MVP scope includes Policies and their Assignments.
  - Assignment drift includes target changes (e.g., groupId) and intent changes.
- FR3: Provide Drift UI with summary and details.
- FR4: Triage (MVP)
  - Admin can acknowledge a finding; record `acknowledged_by_user_id` + `acknowledged_at`.
  - Findings are never deleted in the MVP.
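
FR1, FR2, and FR2a can be illustrated together with a short sketch of `scope_key` derivation, content hashing, fingerprinting, and a simple add/remove/modify diff. Several details here are assumptions rather than spec: treating lists in the selection as unordered, the example volatile field names, the `\x1f` separator between fingerprint parts (the spec writes `+` without naming a delimiter), the empty hash for an absent side, and the `change_type` values `added` / `removed` / `modified`.

```python
import hashlib
import json
from typing import Optional

# Assumption: example volatile fields; the spec only says "timestamps".
VOLATILE_FIELDS = {"createdDateTime", "lastModifiedDateTime"}
SEP = "\x1f"  # assumption: unambiguous separator between fingerprint parts


def _sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def _canonical(value) -> str:
    # Sorted keys and fixed separators give identical JSON text for equal data.
    return json.dumps(value, sort_keys=True, separators=(",", ":"))


def _normalize_selection(value):
    # Assumption: lists in a selection are unordered, so sort them.
    if isinstance(value, dict):
        return {k: _normalize_selection(v) for k, v in value.items()}
    if isinstance(value, (list, tuple, set)):
        return sorted((_normalize_selection(v) for v in value), key=_canonical)
    return value


def compute_scope_key(selection: dict) -> str:
    return _sha256(_canonical(_normalize_selection(selection)))


def content_hash(snapshot: Optional[dict]) -> str:
    if snapshot is None:
        return ""  # assumption: absent side (add/remove) hashes to empty
    stable = {k: v for k, v in snapshot.items() if k not in VOLATILE_FIELDS}
    return _sha256(_canonical(stable))


def fingerprint(tenant_id, scope_key, subject_type, subject_external_id,
                change_type, baseline_hash, current_hash) -> str:
    parts = [tenant_id, scope_key, subject_type, subject_external_id,
             change_type, baseline_hash, current_hash]
    return _sha256(SEP.join(str(p) for p in parts))


def diff_subjects(baseline: dict, current: dict):
    """Yield (external_id, change_type, baseline_hash, current_hash) per change."""
    for ext_id in sorted(set(baseline) | set(current)):
        b = content_hash(baseline.get(ext_id))
        c = content_hash(current.get(ext_id))
        if ext_id not in baseline:
            change = "added"
        elif ext_id not in current:
            change = "removed"
        elif b != c:
            change = "modified"
        else:
            continue  # unchanged once volatile fields are excluded
        yield ext_id, change, b, c
```

Because every input is canonicalized before hashing, re-running the diff over the same two snapshots yields the same fingerprints, which is what makes acknowledgement state attachable to a finding across regenerations.
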
## Non-Functional Requirements
- NFR1: Drift generation must be deterministic for the same baseline run, current run, and scope.
- NFR2: Drift must remain tenant-scoped and safe to display.
- NFR3: Evidence minimization
  - `evidence_jsonb` must be sanitized (no tokens/secrets) and kept small.
  - MVP drift evidence should include only:
    - `change_type`
    - changed_fields / metadata summary (counts, field list)
    - run refs (baseline_run_id/current_run_id, timestamps)
  - No raw payload dumps.
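
One way to satisfy NFR3 is to record only the names of changed fields, never their values. The sketch below assumes hypothetical key names (`changed_field_count`, `baseline_run_at`, `current_run_at`), an assumed set of volatile fields, and ISO-8601 strings for timestamps; none of these are fixed by this spec.

```python
from typing import Optional

# Assumption: same example volatile fields as used for hashing.
VOLATILE = {"createdDateTime", "lastModifiedDateTime"}
_MISSING = object()  # distinguishes "key absent" from "value is None"


def build_evidence(change_type: str,
                   baseline: Optional[dict],
                   current: Optional[dict],
                   baseline_run_id: str,
                   current_run_id: str,
                   baseline_run_at: str,
                   current_run_at: str) -> dict:
    """Build minimized evidence: field names and run refs only, no values."""
    before = baseline or {}
    after = current or {}
    changed = sorted(
        key for key in set(before) | set(after)
        if key not in VOLATILE
        and before.get(key, _MISSING) != after.get(key, _MISSING)
    )
    return {
        "change_type": change_type,
        "changed_fields": changed,
        "changed_field_count": len(changed),
        "baseline_run_id": baseline_run_id,
        "current_run_id": current_run_id,
        "baseline_run_at": baseline_run_at,
        "current_run_at": current_run_at,
    }
```

Since field values are never copied into the result, a secret stored in a policy setting cannot leak through the evidence even if sanitization elsewhere misses it.
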
## Dependencies / Name Resolution
- Drift/Audit UI should resolve labels via Inventory + Foundations (047) + Groups Cache (051) where applicable.
- No render-time Graph calls (Graph only in background sync/jobs, never in UI render).
## Success Criteria
- SC1: Admins can identify drift across supported types (Policies + Assignments) in under 3 minutes.
- SC2: Drift results are consistent across repeated generation for the same baseline and current run.
## Out of Scope
- Automatic revert/promotion.
- Rule engine (planned later for Audit); the data model is already prepared for it via `rule_id` / `control_id` / `expected_value`.
## Future Work (non-MVP)
- Security Suite Audits: add rule-based generators that write Findings (no new Finding model).
- Cross-tenant Compare: may write Findings (`finding_type=compare`) or emit a compatible format that can be stored as Findings.
## Related Specs
- Program: `specs/039-inventory-program/spec.md`
- Core: `specs/040-inventory-core/spec.md`
- Compare: `specs/043-cross-tenant-compare-and-promotion/spec.md`