TenantAtlas/specs/120-secret-redaction-integrity/data-model.md
2026-03-07 17:41:55 +01:00

# Data Model — Secret Redaction Hardening & Snapshot Data Integrity (Spec 120)
This spec extends existing persistence and introduces no new base tables.
## Entities
### 1) PolicyVersion (existing: `App\Models\PolicyVersion`)
Tenant-owned immutable policy evidence.
#### New / changed fields
- `workspace_id` (existing required scope field for newer rows; used for workspace-scoped fingerprint derivation)
- `snapshot` (JSON/array): protected snapshot payload with non-secret values preserved and secret values replaced by `[REDACTED]`
- `assignments` (JSON/array|null): protected assignment payload under the same contract
- `scope_tags` (JSON/array|null): protected scope-tag payload under the same contract
- `secret_fingerprints` (new JSON/array):
  - shape:
    - `snapshot`: object keyed by RFC 6901 JSON Pointer
    - `assignments`: object keyed by RFC 6901 JSON Pointer
    - `scope_tags`: object keyed by RFC 6901 JSON Pointer
  - values: lowercase HMAC-SHA256 hex digests
- `redaction_version` (new integer contract marker for compliant writes):
  - `1` = protected under the Spec 120 classifier contract
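
The `secret_fingerprints` shape can be illustrated with a minimal Python sketch. It is not the project's implementation (the application itself is PHP). The helper names `derive_workspace_key` and `fingerprint`, the demo master key, and the choice to derive a per-workspace key by HMAC-ing the `workspace_id` are all assumptions; the spec only states that fingerprints are workspace-scoped, keyed by JSON Pointer, and stored as lowercase HMAC-SHA256 hex digests.

```python
import hashlib
import hmac


def derive_workspace_key(master_key: bytes, workspace_id: int) -> bytes:
    """Hypothetical derivation of a per-workspace HMAC key (an assumption)."""
    label = f"workspace:{workspace_id}".encode("utf-8")
    return hmac.new(master_key, label, hashlib.sha256).digest()


def fingerprint(workspace_key: bytes, secret_value: str) -> str:
    """Return the lowercase HMAC-SHA256 hex digest of a secret value."""
    # hexdigest() always yields lowercase hexadecimal characters.
    return hmac.new(
        workspace_key, secret_value.encode("utf-8"), hashlib.sha256
    ).hexdigest()


# Example record: one protected path in `snapshot`, none in the other buckets.
key = derive_workspace_key(b"demo-master-key", 42)
secret_fingerprints = {
    "snapshot": {"/settings/0/password": fingerprint(key, "hunter2")},
    "assignments": {},
    "scope_tags": {},
}
```

Under this assumption, the same secret stored in two different workspaces produces two different digests, so fingerprints cannot be correlated across workspaces.
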
#### Relationships
- Belongs to `Tenant`
- Belongs to `Policy`
- Belongs to `OperationRun` (nullable)
- Belongs to `BaselineProfile` (nullable)
#### Validation / invariants
- New writes must set `redaction_version = 1`.
- If a protected value is persisted as `[REDACTED]`, a matching digest entry must exist in `secret_fingerprints` for the same source bucket + JSON Pointer.
- If `redaction_version = 1`, `secret_fingerprints` may be empty only when no protected fields were classified.
- Version identity for dedupe must consider both the protected payload and `secret_fingerprints` so secret-only changes create a new version.
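
The `[REDACTED]`-to-digest invariant above could be checked along these lines. This is a hedged Python sketch, not the project's validator: `redacted_pointers` and `missing_fingerprints` are hypothetical names, and the walk assumes payloads are plain JSON-like dicts, lists, and scalars.

```python
from typing import Any, Iterator

REDACTED = "[REDACTED]"


def escape_token(token: str) -> str:
    """Escape one reference token per RFC 6901 ("~" -> "~0", then "/" -> "~1")."""
    return token.replace("~", "~0").replace("/", "~1")


def redacted_pointers(node: Any, prefix: str = "") -> Iterator[str]:
    """Yield the JSON Pointer of every value equal to the redaction marker."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from redacted_pointers(value, f"{prefix}/{escape_token(str(key))}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from redacted_pointers(value, f"{prefix}/{index}")
    elif node == REDACTED:
        yield prefix


def missing_fingerprints(payloads: dict, fingerprints: dict) -> list:
    """Return (bucket, pointer) pairs that are redacted but have no digest."""
    missing = []
    for bucket, payload in payloads.items():
        known = fingerprints.get(bucket, {})
        for pointer in redacted_pointers(payload):
            if pointer not in known:
                missing.append((bucket, pointer))
    return missing
```

An empty result means every redacted value in `snapshot`, `assignments`, and `scope_tags` has a digest under the same bucket and pointer; a nullable bucket passed as `None` simply contributes nothing.
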
### 2) ProtectedSnapshotResult (new transient service DTO)
The canonical output of the new protection pipeline before persistence.
#### Fields
- `snapshot` (array)
- `assignments` (array|null)
- `scope_tags` (array|null)
- `secret_fingerprints` (array{snapshot: array<string, string>, assignments: array<string, string>, scope_tags: array<string, string>})
- `redaction_version` (int)
- `protected_paths_count` (int)
#### Validation / invariants
- Must be deterministic for the same input payload, workspace, and classifier version.
- Must preserve original object/list shape.
- Must never include raw secret values in any field.
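
A rough Python sketch of a pipeline that satisfies these invariants follows. It is illustrative only: the `protect` function, the `PROTECTED_KEYS` set (standing in for the real classifier), and the restriction to string-valued secrets are assumptions, not part of the spec.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Any, Optional

REDACTED = "[REDACTED]"
PROTECTED_KEYS = {"password", "clientSecret"}  # hypothetical classifier stand-in


@dataclass(frozen=True)
class ProtectedSnapshotResult:
    snapshot: Any
    assignments: Optional[Any]
    scope_tags: Optional[Any]
    secret_fingerprints: dict
    redaction_version: int
    protected_paths_count: int


def _protect(node: Any, pointer: str, key: bytes, found: dict) -> Any:
    """Copy `node`, replacing protected string values and recording digests."""
    if isinstance(node, dict):
        out = {}
        for name, value in node.items():
            child = f"{pointer}/{name.replace('~', '~0').replace('/', '~1')}"
            if name in PROTECTED_KEYS and isinstance(value, str):
                digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256)
                found[child] = digest.hexdigest()
                out[name] = REDACTED  # scalar replaced by scalar: shape is kept
            else:
                out[name] = _protect(value, child, key, found)
        return out
    if isinstance(node, list):
        return [_protect(v, f"{pointer}/{i}", key, found) for i, v in enumerate(node)]
    return node


def protect(snapshot, assignments, scope_tags, workspace_key: bytes):
    """Build a ProtectedSnapshotResult without mutating the inputs."""
    protected, fingerprints = {}, {}
    buckets = (("snapshot", snapshot), ("assignments", assignments),
               ("scope_tags", scope_tags))
    for bucket, payload in buckets:
        found: dict = {}
        protected[bucket] = (
            None if payload is None else _protect(payload, "", workspace_key, found)
        )
        fingerprints[bucket] = found
    count = sum(len(paths) for paths in fingerprints.values())
    return ProtectedSnapshotResult(
        protected["snapshot"], protected["assignments"], protected["scope_tags"],
        fingerprints, 1, count,
    )
```

Because the function is pure (no clock, randomness, or mutation), calling it twice with the same payload and key returns equal results, which is what the determinism invariant asks for.
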
### 3) SecretClassificationRule (new application-level value object)
Non-persisted classifier rule consumed by snapshot, audit, verification, and ops sanitizers.
#### Fields
- `source_bucket` (`snapshot|assignments|scope_tags|audit|verification|ops_failure`)
- `json_pointer` (string|null)
- `field_name` (string)
- `decision` (`protected|visible`)
- `reason` (`exact_key|exact_path|message_pattern|default_visible`)
#### Validation / invariants
- Exact-path rules take precedence over exact-key rules.
- Unknown fields default to visible unless a protected rule matches.
- Message-level sanitizers may protect by exact token pattern, but must not match broadly enough to redact harmless phrases.
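
The precedence and default rules above can be sketched as a small resolver. This Python sketch is an assumption-laden illustration: the `classify` function and its linear rule scan are hypothetical, and `message_pattern` rules are deliberately left out because the spec does not describe their matching logic.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class SecretClassificationRule:
    source_bucket: str
    json_pointer: Optional[str]
    field_name: str
    decision: str  # "protected" or "visible"
    reason: str    # "exact_key", "exact_path", "message_pattern", "default_visible"


def classify(
    rules: Iterable[SecretClassificationRule],
    source_bucket: str,
    json_pointer: str,
    field_name: str,
) -> SecretClassificationRule:
    """Resolve one field: exact-path rules win, then exact-key, else visible."""
    candidates = [r for r in rules if r.source_bucket == source_bucket]
    for rule in candidates:
        if rule.reason == "exact_path" and rule.json_pointer == json_pointer:
            return rule
    for rule in candidates:
        if rule.reason == "exact_key" and rule.field_name == field_name:
            return rule
    # No rule matched: unknown fields stay visible.
    return SecretClassificationRule(
        source_bucket, json_pointer, field_name, "visible", "default_visible"
    )
```

With one exact-key rule protecting `password` and one exact-path rule marking `/settings/0/password` as visible, the path rule decides for that pointer while every other `password` field stays protected.
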
## Derived / Computed Values
- `protection_digest` (implementation detail): composite hash of protected payload + `secret_fingerprints`, used for version dedupe.
- `protected_change_detected` (derived): true when compare/drift sees a fingerprint difference for the same protected path even though the visible payload remains `[REDACTED]`.
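
Both derived values can be sketched in a few lines of Python. The canonical-JSON encoding and the use of SHA-256 for `protection_digest` are assumptions (the spec calls it an implementation detail and names no algorithm), and the function signatures are hypothetical.

```python
import hashlib
import json

BUCKETS = ("snapshot", "assignments", "scope_tags")


def protection_digest(protected_payload: dict, secret_fingerprints: dict) -> str:
    """Hash the protected payload together with its fingerprints for dedupe."""
    canonical = json.dumps(
        {"payload": protected_payload, "secret_fingerprints": secret_fingerprints},
        sort_keys=True,          # key order must not affect the digest
        separators=(",", ":"),   # no whitespace variations
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def protected_change_detected(old_fingerprints: dict, new_fingerprints: dict) -> bool:
    """True when a path present in both versions has a different digest."""
    for bucket in BUCKETS:
        old = old_fingerprints.get(bucket, {})
        new = new_fingerprints.get(bucket, {})
        for pointer in old.keys() & new.keys():
            if old[pointer] != new[pointer]:
                return True
    return False
```

Two versions whose visible payloads are identical (both show `[REDACTED]`) but whose fingerprints differ get different `protection_digest` values, so a secret-only change is not deduplicated away.
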