Phase 0 Output: Research (044)
Decisions
1) scope_key reuse
- Decision: Use the existing Inventory selection hash as scope_key.
- Concretely: scope_key = InventorySyncRun.selection_hash.
- Rationale:
- Inventory already normalizes + hashes selection payload deterministically (via InventorySelectionHasher).
- It is already used for concurrency/deduping inventory runs, so it’s the right stable scope identifier.
- Alternatives considered:
- Compute a second hash (duplicate of selection_hash) → adds drift without benefit.
- Store the raw selection payload as the primary key → not stable without strict normalization.
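To illustrate why a normalized hash is a stable scope identifier where the raw payload is not, here is a minimal sketch of deterministic normalization plus hashing. It is not the actual InventorySelectionHasher; the function names, the payload fields (policy_types, include_foundations), and the normalization rules (lists of scalars treated as unordered sets) are assumptions made for illustration only.

```python
import hashlib
import json


def normalize_selection(payload: dict) -> dict:
    """Return a canonical form of a selection payload (hypothetical rules)."""
    normalized = {}
    for key, value in payload.items():
        if isinstance(value, dict):
            normalized[key] = normalize_selection(value)
        elif isinstance(value, (list, tuple, set)):
            # Assumption: collections hold scalars and their order is irrelevant,
            # so duplicates are dropped and the items are sorted.
            normalized[key] = sorted(set(value), key=str)
        else:
            normalized[key] = value
    return normalized


def selection_hash(payload: dict) -> str:
    """Hash the canonical JSON encoding of the normalized payload."""
    canonical = json.dumps(
        normalize_selection(payload), sort_keys=True, separators=(",", ":")
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two payloads that differ only in key order, item order, and duplicates.
a = {"policy_types": ["b", "a"], "include_foundations": True}
b = {"include_foundations": True, "policy_types": ["a", "b", "a"]}
print(selection_hash(a) == selection_hash(b))
```

Under these assumed rules, equivalent selections map to the same scope_key, which is the property the raw payload cannot guarantee without strict normalization.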
2) Baseline selection (MVP)
- Decision: Baseline run = previous successful inventory sync run for the same scope_key; comparison run = latest successful inventory sync run for the same scope_key.
- Rationale:
- Matches “run at least twice” scenario.
- Deterministic and explainable.
- Alternatives considered:
- User-pinned baselines → valuable, but deferred (design must allow later via scope_key).
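The selection rule can be sketched in a few lines. This is an in-memory illustration, not the project's query: the Run type, its field names, and the "success" status value are hypothetical stand-ins for whatever InventorySyncRun actually stores.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Run:
    id: int
    scope_key: str
    status: str       # hypothetical values: "success", "failed"
    finished_at: int  # any sortable timestamp


def pick_baseline_and_current(
    runs: list[Run], scope_key: str
) -> Optional[tuple[Run, Run]]:
    """Return (baseline, current) for a scope, or None if fewer than two
    successful runs exist for that scope_key."""
    successful = sorted(
        (r for r in runs if r.scope_key == scope_key and r.status == "success"),
        key=lambda r: (r.finished_at, r.id),  # id breaks timestamp ties
    )
    if len(successful) < 2:
        return None
    return successful[-2], successful[-1]
```

Returning None when there are fewer than two successful runs corresponds to the "run at least twice" scenario: until then there is nothing to compare.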
3) Persisted generic Findings
- Decision: Persist Findings in a generic findings table.
- Rationale:
- Enables stable triage (acknowledged) without recomputation drift.
- Reusable pipeline for Drift now, Audit/Compare later.
- Alternatives considered:
- Compute-on-demand and store only acknowledgements by fingerprint → harder operationally and can surprise users when diff rules evolve.
4) Generation trigger (MVP)
- Decision: On opening Drift, if findings for (tenant, scope_key, baseline_run_id, current_run_id) do not exist, dispatch an async job to generate them.
- Rationale:
- Avoids long request times.
- Avoids scheduled complexity in MVP.
- Alternatives considered:
- Generate after every inventory run → may be expensive; can be added later.
- Nightly schedule → hides immediacy and complicates operations.
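A rough sketch of the on-open trigger follows. The FindingsKey type, the ensure_findings function, and the callables passed in are hypothetical. The pending set that guards against dispatching the same job twice is an addition of this sketch, not something the decision above specifies.

```python
from typing import Callable, NamedTuple


class FindingsKey(NamedTuple):
    tenant_id: str
    scope_key: str
    baseline_run_id: int
    current_run_id: int


def ensure_findings(
    key: FindingsKey,
    findings_exist: Callable[[FindingsKey], bool],
    pending: set,
    dispatch: Callable[[FindingsKey], None],
) -> str:
    """Called when Drift is opened; never generates findings inline."""
    if findings_exist(key):
        return "ready"
    if key in pending:
        # Assumption: a job for this key is already queued or running.
        return "pending"
    pending.add(key)
    dispatch(key)  # hand off to the async worker; the request returns at once
    return "dispatched"
```

The request path only checks for existence and enqueues work, which is what keeps request times short.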
5) Fingerprint and state hashing
- Decision: Use a deterministic fingerprint that changes when the underlying state changes.
- Fingerprint = sha256(tenant_id + scope_key + subject_type + subject_external_id + change_type + baseline_hash + current_hash).
- baseline_hash/current_hash are computed over normalized, sanitized comparison data (exclude volatile fields like timestamps).
- Rationale:
- Stable identity for triage and audit.
- Supports future generators (audit/compare) using same semantics.
- Alternatives considered:
- Fingerprint without baseline/current hash → cannot distinguish changed vs unchanged findings.
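A minimal sketch of the fingerprint and state hashes, under stated assumptions: the volatile field names are hypothetical examples, only top-level fields are filtered, and all inputs are strings. The formula above writes the inputs as a plain concatenation; this sketch joins them with a "|" delimiter so that adjacent fields cannot blur into each other, which is a choice of the sketch rather than part of the decision.

```python
import hashlib
import json

# Hypothetical examples of volatile fields to exclude from state hashes.
VOLATILE_FIELDS = frozenset({"createdDateTime", "lastModifiedDateTime"})


def state_hash(state: dict) -> str:
    """Hash comparison data after dropping volatile top-level fields."""
    stable = {k: v for k, v in state.items() if k not in VOLATILE_FIELDS}
    canonical = json.dumps(stable, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def fingerprint(
    tenant_id: str,
    scope_key: str,
    subject_type: str,
    subject_external_id: str,
    change_type: str,
    baseline_hash: str,
    current_hash: str,
) -> str:
    """Deterministic identity of a finding; changes when either state changes."""
    parts = [
        tenant_id,
        scope_key,
        subject_type,
        subject_external_id,
        change_type,
        baseline_hash,
        current_hash,
    ]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()
```

With this shape, a change that touches only a timestamp leaves the fingerprint untouched, while any change to the compared state yields a new one, so an acknowledgement stays attached to exactly the state it was given for.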
6) Evidence minimization
- Decision: Store small, sanitized evidence_jsonb with an allowlist shape; no raw payload dumps.
- Rationale:
- Aligns with data minimization + safe logging.
- Avoids storing secrets/tokens.
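An allowlist shape can be enforced by copying only named keys, as in the sketch below. The key names in EVIDENCE_ALLOWLIST and the build_evidence function are hypothetical; the real evidence shape is not defined in this document.

```python
# Hypothetical allowlist; the actual evidence_jsonb shape is still to be defined.
EVIDENCE_ALLOWLIST = (
    "change_type",
    "changed_fields",
    "baseline_run_id",
    "current_run_id",
)


def build_evidence(raw: dict) -> dict:
    """Copy only allowlisted keys; anything else is dropped, never stored."""
    return {key: raw[key] for key in EVIDENCE_ALLOWLIST if key in raw}
```

Because unknown keys are dropped by default, a new field in an upstream payload cannot leak into stored evidence unless it is explicitly added to the allowlist.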
7) Name resolution and Graph safety
- Decision: UI resolves human-readable labels using DB-backed Inventory + Foundations (047) + Groups Cache (051). No render-time Graph calls.
- Rationale:
- Works offline / when tokens are broken.
- Keeps UI safe and predictable.
Notes / Follow-ups for Phase 1
- Define the findings table indexes carefully for tenant-scoped filtering (status, type, scope_key, run_ids).
- Consider using existing observable run patterns (BulkOperationRun + AuditLogger) for drift generation jobs.