# Phase 0 Output: Research (044)

## Decisions

### 1) `scope_key` reuse

- Decision: Use the existing Inventory selection hash as `scope_key`.
  - Concretely: `scope_key = InventorySyncRun.selection_hash`.
- Rationale:
  - Inventory already normalizes and hashes the selection payload deterministically (via `InventorySelectionHasher`).
  - The hash is already used for concurrency control and deduplication of inventory runs, so it is the right stable scope identifier.
- Alternatives considered:
  - Compute a second hash (a duplicate of `selection_hash`) → risks the two hashes drifting apart, with no benefit.
  - Store the raw selection payload as the primary key → not stable without strict normalization.

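
To illustrate why a normalized hash makes a stable scope identifier, here is a minimal Python sketch of deterministic selection hashing. It is not the actual `InventorySelectionHasher`; the function name, the payload shape, and the assumption that list order is irrelevant are all hypothetical.

```python
import hashlib
import json


def selection_hash(selection: dict) -> str:
    """Hash a selection payload so that equivalent selections share one key."""

    def normalize(value):
        if isinstance(value, dict):
            # Sort keys so that key order never affects the hash.
            return {k: normalize(v) for k, v in sorted(value.items())}
        if isinstance(value, list):
            # Assumption: list order carries no meaning in a selection.
            items = [normalize(v) for v in value]
            return sorted(items, key=lambda v: json.dumps(v, sort_keys=True))
        return value

    canonical = json.dumps(
        normalize(selection), sort_keys=True, separators=(",", ":")
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two payloads that differ only in key order or list order hash to the same value, which is the property that lets one hash double as `scope_key`.
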
### 2) Baseline selection (MVP)

- Decision: Baseline run = the previous successful inventory sync run for the same `scope_key`; comparison run = the latest successful inventory sync run for the same `scope_key`.
- Rationale:
  - Matches the “run at least twice” scenario.
  - Deterministic and explainable.
- Alternatives considered:
  - User-pinned baselines → valuable, but deferred (the design must allow adding them later via `scope_key`).

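
The selection rule above can be sketched as follows. This is an illustrative Python model, not the project's query: the `Run` fields, the `"success"` status value, and ordering by `finished_at` are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Run:
    id: int
    scope_key: str
    status: str       # hypothetical values: "success", "failed"
    finished_at: int  # any sortable timestamp


def pick_runs(runs: list[Run], scope_key: str) -> Optional[tuple[Run, Run]]:
    """Return (baseline, current) for a scope, or None if fewer than two
    successful runs exist."""
    successful = sorted(
        (r for r in runs if r.scope_key == scope_key and r.status == "success"),
        key=lambda r: r.finished_at,
    )
    if len(successful) < 2:
        return None  # the "run at least twice" precondition is not met
    return successful[-2], successful[-1]
```

Failed runs and runs for other scopes are skipped, so the pair is fully determined by the run history of a single `scope_key`.
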
### 3) Persisted generic Findings

- Decision: Persist Findings in a generic `findings` table.
- Rationale:
  - Enables stable triage (`acknowledged`) without recomputation drift.
  - Reusable pipeline for Drift now, Audit/Compare later.
- Alternatives considered:
  - Compute-on-demand and store only acknowledgements by fingerprint → harder operationally and can surprise users when diff rules evolve.

### 4) Generation trigger (MVP)

- Decision: On opening Drift, if findings for (tenant, `scope_key`, `baseline_run_id`, `current_run_id`) do not exist, dispatch an async job to generate them.
- Rationale:
  - Avoids long request times.
  - Avoids scheduling complexity in the MVP.
- Alternatives considered:
  - Generate after every inventory run → may be expensive; can be added later.
  - Nightly schedule → findings are not available immediately, and it complicates operations.

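
A minimal sketch of the "generate on first open" check, in Python with in-memory sets standing in for the database and the queue. The function name, the `pending` guard against duplicate dispatches, and the returned status strings are assumptions made for illustration; the decision above only requires the existence check and the async dispatch.

```python
from typing import Callable

# (tenant_id, scope_key, baseline_run_id, current_run_id)
FindingsKey = tuple[str, str, int, int]


def ensure_findings(
    key: FindingsKey,
    existing: set[FindingsKey],
    pending: set[FindingsKey],
    dispatch: Callable[[FindingsKey], None],
) -> str:
    """Return "ready" if findings exist, otherwise queue generation once."""
    if key in existing:
        return "ready"
    if key not in pending:
        # Hypothetical guard: avoid queuing the same job on every page load.
        pending.add(key)
        dispatch(key)
    return "pending"
```

The request returns immediately in both cases; the page can show a "generating" state until the job has written its findings.
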
### 5) Fingerprint and state hashing

- Decision: Use a deterministic fingerprint that changes when the underlying state changes.
  - Fingerprint = `sha256(tenant_id + scope_key + subject_type + subject_external_id + change_type + baseline_hash + current_hash)`.
  - `baseline_hash`/`current_hash` are computed over normalized, sanitized comparison data (exclude volatile fields like timestamps).
- Rationale:
  - Stable identity for triage and audit.
  - Supports future generators (audit/compare) using same semantics.
- Alternatives considered:
  - Fingerprint without baseline/current hash → cannot distinguish changed vs unchanged findings.

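
The fingerprint formula can be sketched in Python as below. Two details are assumptions rather than part of the decision: the volatile field names are hypothetical examples, and the parts are joined with a `|` separator instead of plain concatenation, so that adjacent values such as `("ab", "c")` and `("a", "bc")` cannot collide. The sketch also filters volatile fields at the top level only.

```python
import hashlib
import json

# Hypothetical examples of volatile fields excluded from comparison.
VOLATILE_FIELDS = {"fetched_at", "updated_at"}


def state_hash(state: dict) -> str:
    """Hash normalized comparison data, ignoring top-level volatile fields."""
    stable = {k: v for k, v in state.items() if k not in VOLATILE_FIELDS}
    canonical = json.dumps(stable, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def fingerprint(
    tenant_id: str,
    scope_key: str,
    subject_type: str,
    subject_external_id: str,
    change_type: str,
    baseline_hash: str,
    current_hash: str,
) -> str:
    """Deterministic identity of one finding."""
    parts = [
        tenant_id,
        scope_key,
        subject_type,
        subject_external_id,
        change_type,
        baseline_hash,
        current_hash,
    ]
    # Assumption: a separator keeps field boundaries unambiguous.
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()
```

A re-sync that changes only a timestamp leaves `current_hash`, and therefore the fingerprint, unchanged, so an acknowledgement stays attached to the same finding. A real change to the subject produces a new fingerprint.
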
### 6) Evidence minimization

- Decision: Store small, sanitized `evidence_jsonb` with an allowlist shape; no raw payload dumps.
- Rationale:
  - Aligns with data minimization + safe logging.
  - Avoids storing secrets/tokens.

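
As an illustration of an allowlist shape, the following Python sketch copies only named keys into the evidence document. The key names are hypothetical; the actual shape of `evidence_jsonb` is still to be defined.

```python
# Hypothetical allowlist; anything not named here is never persisted.
EVIDENCE_ALLOWLIST = ("change_type", "summary", "changed_keys")


def build_evidence(raw: dict) -> dict:
    """Keep only allowlisted keys from a raw comparison result."""
    return {key: raw[key] for key in EVIDENCE_ALLOWLIST if key in raw}
```

Because the filter is an allowlist rather than a blocklist, a new field in the upstream payload (including a token or secret) is dropped by default until someone adds it deliberately.
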
### 7) Name resolution and Graph safety

- Decision: UI resolves human-readable labels using DB-backed Inventory + Foundations (047) + Groups Cache (051). No render-time Graph calls.
- Rationale:
  - Works offline / when tokens are broken.
  - Keeps UI safe and predictable.

## Notes / Follow-ups for Phase 1

- Define the `findings` table indexes carefully for tenant-scoped filtering (status, type, `scope_key`, run IDs).
- Consider using existing observable run patterns (`BulkOperationRun` + `AuditLogger`) for drift generation jobs.