Research: Finding Risk Acceptance Lifecycle
Decision 1: Use a dedicated tenant-owned exception aggregate instead of overloading Finding.closed_reason
Decision: Introduce a dedicated FindingException aggregate as the tenant-owned governance record for accepted risk, rather than continuing to encode risk acceptance purely as Finding.status = risk_accepted plus closed_reason.
Rationale: The existing finding model already exposes risk_accepted as a terminal status, but the handover and roadmap explicitly identify the absence of a formal exception entity as the product gap. A dedicated aggregate lets the system track request, approval, rejection, renewal, revocation, expiry, accountable owner, and linked evidence without distorting the meaning of generic finding workflow fields.
Alternatives considered:
- Reuse Finding.closed_reason and audit metadata only: rejected because it cannot represent a durable approval lifecycle or one current valid exception per finding.
- Create a generic cross-domain waiver engine immediately: rejected because the spec is intentionally bounded to finding-specific exceptions in the first rollout.
Decision 2: Preserve history via append-only decision records under one exception root
Decision: Model exception history as one root FindingException record with append-only FindingExceptionDecision child records for request, approval, rejection, renewal, and revocation decisions.
Rationale: The product needs both a stable current-state record for efficient tenant and canonical queries and a durable history that survives renewals and revocations. A root aggregate with child decisions avoids rewriting old decisions, keeps list queries fast, and aligns with the repo's audit-first lifecycle design.
Alternatives considered:
- Create a new top-level exception row for every renewal: rejected because current-state lookup and canonical queue filtering become noisier and require additional dedupe logic.
- Store all history only in AuditLog: rejected because lifecycle state and validity queries would depend on replaying historical events instead of reading domain state.
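As an illustration only, the root-plus-children shape could look like the following minimal Python sketch. The field names (actor_id, decided_at, reason, current_state), the state labels, and the STATE_AFTER mapping are assumptions made for this example and are not taken from the repo; only FindingException, FindingExceptionDecision, and the five decision types come from the decision above.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum


class DecisionType(Enum):
    REQUESTED = "requested"
    APPROVED = "approved"
    REJECTED = "rejected"
    RENEWED = "renewed"
    REVOKED = "revoked"


@dataclass(frozen=True)
class FindingExceptionDecision:
    """Append-only child record: frozen, so it cannot be edited after creation."""
    decision_type: DecisionType
    actor_id: str          # hypothetical field name
    decided_at: datetime   # hypothetical field name
    reason: str = ""       # hypothetical field name


# Hypothetical mapping from the latest decision to the root's current state.
STATE_AFTER = {
    DecisionType.REQUESTED: "pending",
    DecisionType.APPROVED: "active",
    DecisionType.REJECTED: "rejected",
    DecisionType.RENEWED: "active",
    DecisionType.REVOKED: "revoked",
}


@dataclass
class FindingException:
    """Root record: holds current state for fast queries, plus the history."""
    finding_id: str
    tenant_id: str
    current_state: str = "pending"
    decisions: list = field(default_factory=list)

    def record(self, decision: FindingExceptionDecision) -> None:
        # History only grows; earlier decisions are never rewritten.
        self.decisions.append(decision)
        self.current_state = STATE_AFTER[decision.decision_type]

    def history(self) -> tuple:
        # Return an immutable view so callers cannot reorder or drop entries.
        return tuple(self.decisions)
```

In this sketch a renewal appends a new child and moves the root back to an active state, so list queries read current_state directly while the full trail stays available through history().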
Decision 3: Block self-approval by default in v1
Decision: Normal workflow blocks the requester from approving their own exception request. No self-approval override is included in the first slice.
Rationale: The spec requires approval-separation rules, and a default no-self-approval rule is the clearest governance baseline. It reduces ambiguity, simplifies policy design, and matches the product's least-privilege posture.
Alternatives considered:
- Allow self-approval for owners or managers: rejected because it weakens the governance signal and creates policy ambiguity in the first rollout.
- Introduce a special override capability immediately: rejected because it expands RBAC and exception policy complexity before the core workflow is proven.
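The v1 rule is small enough to express as a single guard. The sketch below is illustrative only: the function name, the exception class, and the use of plain string user ids are assumptions, not the repo's actual policy API.

```python
class SelfApprovalError(Exception):
    """Raised when the requester tries to approve their own request."""


def assert_can_approve(requester_id: str, approver_id: str) -> None:
    # v1 has no override capability, so the check is unconditional.
    if requester_id == approver_id:
        raise SelfApprovalError(
            "requester cannot approve their own exception request"
        )
```

Calling assert_can_approve("alice", "bob") returns silently, while assert_can_approve("alice", "alice") raises SelfApprovalError. Keeping the check unconditional mirrors the decision to ship no override in the first slice.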
Decision 4: Keep normal exception decisions outside OperationRun
Decision: Request, approval, rejection, renewal, and revocation remain synchronous DB-backed mutations without a dedicated OperationRun in v1.
Rationale: These actions are local governance decisions, expected to complete quickly, and do not perform remote work. The constitution allows DB-only security-relevant actions to skip OperationRun as long as they remain auditable. Using OperationRun here would add operational surface area without adding observability value.
Alternatives considered:
- Use OperationRun for every exception decision: rejected because it violates the repo's preference to avoid long-running infrastructure for fast DB-only mutations.
- Add a scheduled reminder/expiry job in the first slice: rejected because the first release can satisfy reminder semantics through explicit expiring-state UI and canonical queue visibility.
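One way to keep a synchronous mutation auditable without an OperationRun is to write the decision and its audit entry in the same database transaction. The sketch below uses Python's built-in sqlite3 module purely for illustration; the table names, column names, and the action string format are hypothetical and do not describe the repo's schema.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    # In-memory database with two hypothetical tables.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE finding_exception_decisions (
            id INTEGER PRIMARY KEY,
            exception_id INTEGER NOT NULL,
            decision_type TEXT NOT NULL,
            actor_id TEXT NOT NULL
        );
        CREATE TABLE audit_log (
            id INTEGER PRIMARY KEY,
            action TEXT NOT NULL,
            actor_id TEXT NOT NULL,
            subject_id INTEGER NOT NULL
        );
        """
    )
    return conn


def record_decision(conn, exception_id: int, decision_type: str, actor_id: str) -> int:
    # "with conn" commits if the block succeeds and rolls back if it raises,
    # so the decision row and the audit row are written together or not at all.
    with conn:
        cur = conn.execute(
            "INSERT INTO finding_exception_decisions "
            "(exception_id, decision_type, actor_id) VALUES (?, ?, ?)",
            (exception_id, decision_type, actor_id),
        )
        conn.execute(
            "INSERT INTO audit_log (action, actor_id, subject_id) VALUES (?, ?, ?)",
            (f"finding_exception.{decision_type}", actor_id, exception_id),
        )
        return cur.lastrowid
```

The mutation completes within the request and leaves an audit row behind, which is the property the rationale relies on when it skips OperationRun.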
Decision 5: Link supporting evidence through structured references, not copied payloads
Decision: Exception records store structured evidence references such as source_type, source_id, source_fingerprint, and a small summary snapshot, following the evidence snapshot item pattern instead of embedding raw evidence payloads.
Rationale: The repo already uses fingerprinted and summarized evidence references in EvidenceSnapshotItem and review-pack generation. Reusing that pattern keeps exception history intelligible even when live artifacts change, while preserving data minimization.
Alternatives considered:
- Store raw evidence JSON directly on the exception: rejected because it increases payload size and risks leaking data better handled by the evidence domain.
- Store only foreign keys to live evidence records: rejected because history becomes opaque if referenced artifacts are later expired or superseded.
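A structured reference of this kind might be built as in the sketch below. The field names source_type, source_id, and source_fingerprint come from the decision above; everything else is an assumption for illustration, in particular the use of SHA-256 over canonical JSON as the fingerprint and the summary_keys allow-list, neither of which is confirmed to match EvidenceSnapshotItem.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceReference:
    source_type: str
    source_id: str
    source_fingerprint: str
    summary: dict  # small snapshot only, never the raw payload


def fingerprint(payload: dict) -> str:
    # Assumption: hash a canonical JSON form so key order does not matter.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_reference(source_type: str, source_id: str,
                    payload: dict, summary_keys: list) -> EvidenceReference:
    # Copy only allow-listed keys into the summary (data minimization).
    summary = {k: payload[k] for k in summary_keys if k in payload}
    return EvidenceReference(source_type, source_id, fingerprint(payload), summary)
```

Because the fingerprint is computed when the reference is created, a later change to the live artifact can be detected by recomputing and comparing, while the summary keeps the historical record readable even if the artifact has expired.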
Decision 6: Risk governance validity is derived from exception state, not from finding status alone
Decision: A finding counts as currently valid accepted risk only when it is linked to an active, unexpired, unrevoked exception. The finding's risk_accepted status alone is insufficient.
Rationale: This closes the core audit gap identified in the handover and allows evidence and reporting consumers to distinguish governed accepted risk from stale or unsupported states. The existing FindingWorkflowService remains the single mutation path for the finding status, but validity becomes a cross-record rule.
Alternatives considered:
- Treat Finding.status = risk_accepted as sufficient forever: rejected because it preserves the current governance gap.
- Automatically revert finding status when an exception expires: rejected because it mutates the finding lifecycle as a side effect and obscures historical operator intent.
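The cross-record rule can be read as a pure function over the finding status and its linked exception. The following sketch is an assumption-laden illustration: ExceptionState, its fields, the "active" status label, and the choice to require an expiry timestamp are all hypothetical and not taken from the repo.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ExceptionState:
    status: str                             # e.g. "pending", "active", "rejected", "revoked"
    expires_at: datetime                    # assumption: every exception carries an expiry
    revoked_at: Optional[datetime] = None


def is_valid_accepted_risk(finding_status: str,
                           exception: Optional[ExceptionState],
                           now: datetime) -> bool:
    # The finding status alone is never sufficient.
    if finding_status != "risk_accepted" or exception is None:
        return False
    # The linked exception must be active and not revoked.
    if exception.status != "active" or exception.revoked_at is not None:
        return False
    # It must also still be inside its validity window.
    return exception.expires_at > now
```

Note that the function only reads state. When an exception expires, the finding keeps its risk_accepted status and simply stops counting as valid accepted risk, which matches the decision not to revert finding status as a side effect.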