# Phase 0 Output: Research (044)

## Decisions

### 1) `scope_key` reuse

- Decision: Use the existing Inventory selection hash as `scope_key`.
  - Concretely: `scope_key = InventorySyncRun.selection_hash`.
- Rationale:
  - Inventory already normalizes + hashes selection payload deterministically (via `InventorySelectionHasher`).
  - It is already used for concurrency/deduping inventory runs, so it’s the right stable scope identifier.
- Alternatives considered:
  - Compute a second hash (duplicate of `selection_hash`) → adds drift without benefit.
  - Store the raw selection payload as the primary key → not stable without strict normalization.

### 2) Baseline selection (MVP)

- Decision: Baseline run = previous successful inventory sync run for the same `scope_key`; comparison run = latest successful inventory sync run for the same `scope_key`.
- Rationale:
  - Matches “run at least twice” scenario.
  - Deterministic and explainable.
- Alternatives considered:
  - User-pinned baselines → valuable, but deferred (design must allow later via `scope_key`).

### 3) Persisted generic Findings

- Decision: Persist Findings in a generic `findings` table.
- Rationale:
  - Enables stable triage (`acknowledged`) without recomputation drift.
  - Reusable pipeline for Drift now, Audit/Compare later.
- Alternatives considered:
  - Compute-on-demand and store only acknowledgements by fingerprint → harder operationally and can surprise users when diff rules evolve.

### 4) Generation trigger (MVP)

- Decision: On opening Drift, if findings for (tenant, `scope_key`, `baseline_run_id`, `current_run_id`) do not exist, dispatch an async job to generate them.
- Rationale:
  - Avoids long request times.
  - Avoids scheduled complexity in MVP.
- Alternatives considered:
  - Generate after every inventory run → may be expensive; can be added later.
  - Nightly schedule → hides immediacy and complicates operations.

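
Decisions 2 and 4 together can be sketched as follows. This is a minimal illustration in Python, not the project's implementation: the `Run` record, its `status` value `"success"`, the `finished_at` ordering field, and the helper names `select_run_pair` and `needs_generation` are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class Run:
    # Hypothetical stand-in for an inventory sync run record.
    id: int
    scope_key: str
    status: str            # assumed values: "success", "failed", ...
    finished_at: datetime  # assumed ordering field


def select_run_pair(runs: List[Run], scope_key: str) -> Optional[Tuple[Run, Run]]:
    """Return (baseline, current) for a scope, or None if it has fewer
    than two successful runs (the "run at least twice" precondition)."""
    successful = sorted(
        (r for r in runs if r.scope_key == scope_key and r.status == "success"),
        key=lambda r: r.finished_at,
        reverse=True,  # newest first
    )
    if len(successful) < 2:
        return None
    current, baseline = successful[0], successful[1]
    return baseline, current


def needs_generation(
    existing_keys: Set[Tuple[int, str, int, int]],
    tenant_id: int,
    scope_key: str,
    pair: Tuple[Run, Run],
) -> bool:
    """True when no findings exist yet for this (tenant, scope, run pair),
    i.e. when the async generation job should be dispatched."""
    baseline, current = pair
    return (tenant_id, scope_key, baseline.id, current.id) not in existing_keys
```

Failed runs are skipped rather than treated as a baseline, so the pair stays deterministic for a given run history.
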
### 5) Fingerprint and state hashing

- Decision: Use a deterministic fingerprint that changes when the underlying state changes.
  - Fingerprint = `sha256(tenant_id + scope_key + subject_type + subject_external_id + change_type + baseline_hash + current_hash)`.
  - `baseline_hash`/`current_hash` are computed over normalized, sanitized comparison data (exclude volatile fields like timestamps).
- Rationale:
  - Stable identity for triage and audit.
  - Supports future generators (audit/compare) using same semantics.
- Alternatives considered:
  - Fingerprint without baseline/current hash → cannot distinguish changed vs unchanged findings.

### 6) Evidence minimization

- Decision: Store small, sanitized `evidence_jsonb` with an allowlist shape; no raw payload dumps.
- Rationale:
  - Aligns with data minimization + safe logging.
  - Avoids storing secrets/tokens.

### 7) Name resolution and Graph safety

- Decision: UI resolves human-readable labels using DB-backed Inventory + Foundations (047) + Groups Cache (051). No render-time Graph calls.
- Rationale:
  - Works offline / when tokens are broken.
  - Keeps UI safe and predictable.

## Notes / Follow-ups for Phase 1

- Define the `findings` table indexes carefully for tenant-scoped filtering (status, type, scope_key, run_ids).
- Consider using existing observable run patterns (BulkOperationRun + AuditLogger) for drift generation jobs.
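
As a follow-up aid for Phase 1, the fingerprint and state hashing from decision 5 might look roughly like this. It is a sketch under stated assumptions: the volatile field names, the canonical-JSON normalization, and the use of a separator between parts are illustrative choices, not something the decision above prescribes (it writes the parts as plain concatenation).

```python
import hashlib
import json

# Assumed examples of volatile fields; the real exclusion list is not
# defined in this document.
VOLATILE_FIELDS = {"lastModifiedDateTime", "createdDateTime"}

# Assumption: a separator avoids ambiguous concatenations such as
# ("ab", "c") vs ("a", "bc").
SEPARATOR = "\x1f"


def state_hash(state: dict) -> str:
    """Hash comparison data after dropping top-level volatile fields and
    serializing to canonical JSON (sorted keys, compact separators)."""
    cleaned = {k: v for k, v in state.items() if k not in VOLATILE_FIELDS}
    canonical = json.dumps(cleaned, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def fingerprint(
    tenant_id,
    scope_key: str,
    subject_type: str,
    subject_external_id: str,
    change_type: str,
    baseline_hash: str,
    current_hash: str,
) -> str:
    """Deterministic finding identity; changes whenever either state hash changes."""
    parts = [
        str(tenant_id),
        scope_key,
        subject_type,
        subject_external_id,
        change_type,
        baseline_hash,
        current_hash,
    ]
    return hashlib.sha256(SEPARATOR.join(parts).encode("utf-8")).hexdigest()
```

With this shape, two snapshots that differ only in a timestamp produce the same state hash, so they do not yield a new fingerprint, while any change to the compared fields does.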