## Summary

Implements Spec 284 for provider-neutral artifact source taxonomy.

- add shared artifact source descriptor, resolver, taxonomy, and provider-detail support
- update findings, evidence snapshots, stored reports, inventory items, and tenant review surfaces to disclose descriptor-first artifact summaries
- add bounded Pest unit, feature, guard, and browser coverage for the taxonomy slice
- include the completed Spec 284 package artifacts under `specs/284-provider-neutral-artifact-source-taxonomy/`

## Notes

- branch: `284-provider-neutral-artifact-source-taxonomy`
- commit: `bf8d59e0`
- this PR was created as part of the requested commit/push/PR flow against `platform-dev`

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #343

# Data Model: Provider-neutral Artifact Source Taxonomy

## Existing persisted truth reused

### Finding

Existing persisted finding fields already provide the raw inputs for a provider-neutral descriptor:

- `workspace_id`
- `managed_environment_id`
- `finding_type`
- optional `source`
- `title`
- `status`
- `severity`
- `evidence_jsonb`

`finding_type` and `source` remain persisted provider or artifact detail. `284` adds a shared descriptor over them rather than replacing them as raw evidence.

### EvidenceSnapshotItem

Existing evidence snapshot item fields already provide the current evidence-source seam:

- `workspace_id`
- `managed_environment_id`
- `dimension_key`
- `state`
- `required`
- `source_kind`
- `source_record_type`
- `source_record_id`
- `source_fingerprint`
- `measured_at`
- `freshness_at`
- `summary_payload`
- `sort_order`

`284` extends this seam by adding or deriving a provider-neutral descriptor so `source_record_type` stops acting as the only top-level source identity.

### StoredReport

Existing stored-report truth already includes:

- `workspace_id`
- `managed_environment_id`
- `report_type`
- `payload`
- `fingerprint`
- `previous_fingerprint`

Current report producers already write provider-owned fields such as `provider_key` into payload. `284` lifts the shared lineage fields into the common descriptor without deleting provider-owned detail.

### InventoryItem

Existing inventory truth already includes:

- `workspace_id`
- `managed_environment_id`
- `policy_type`
- `external_id`
- `platform`
- `display_name`
- `meta_jsonb`
- `last_seen_at`
- `last_seen_operation_run_id`

`policy_type` remains provider-owned or legacy artifact detail after `284`; it no longer stands alone as the platform's only artifact type label.

## Pinned initial descriptor inventories

### `source_family`

| Value | Meaning |
|---|---|
| `finding` | artifact lineage originates from a finding or finding-derived summary |
| `stored_report` | artifact lineage originates from a stored report |
| `evidence_snapshot` | artifact lineage is summarized inside an evidence snapshot item or evidence snapshot view model |
| `inventory` | artifact lineage originates from inventory capture or inventory projection |
| `operation_run` | artifact lineage originates from operation-run rollup evidence |

### `source_kind`

| Value | Meaning |
|---|---|
| `model_summary` | summary derived directly from one or more model records |
| `stored_report` | summary or artifact read directly from stored-report persistence |
| `operation_rollup` | summary derived from operation-run history |
| `inventory_projection` | summary derived from inventory read models |

### `source_target_kind`

| Value | Meaning |
|---|---|
| `managed_environment` | artifact summarizes environment-wide state |
| `governed_subject` | artifact describes one governed subject or provider object under the environment |
| `provider_connection` | artifact primarily describes provider-connection state |
| `operation_run` | artifact primarily describes one operation run |

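
The three pinned inventories above can be sketched as closed enumerations. This is an illustrative Python sketch only: the project itself appears to be PHP (Pest, Filament), and the class names and the `is_pinned` helper are hypothetical, not part of the spec. Only the string values are taken from the tables.

```python
from enum import Enum


class SourceFamily(str, Enum):
    """Pinned `source_family` values."""
    FINDING = "finding"
    STORED_REPORT = "stored_report"
    EVIDENCE_SNAPSHOT = "evidence_snapshot"
    INVENTORY = "inventory"
    OPERATION_RUN = "operation_run"


class SourceKind(str, Enum):
    """Pinned `source_kind` values."""
    MODEL_SUMMARY = "model_summary"
    STORED_REPORT = "stored_report"
    OPERATION_ROLLUP = "operation_rollup"
    INVENTORY_PROJECTION = "inventory_projection"


class SourceTargetKind(str, Enum):
    """Pinned `source_target_kind` values."""
    MANAGED_ENVIRONMENT = "managed_environment"
    GOVERNED_SUBJECT = "governed_subject"
    PROVIDER_CONNECTION = "provider_connection"
    OPERATION_RUN = "operation_run"


def is_pinned(enum_cls, value):
    """Hypothetical helper: True when `value` is a pinned member of `enum_cls`."""
    return value in {member.value for member in enum_cls}
```

Note that `stored_report` and `operation_run` each appear in more than one inventory, so a value is only meaningful together with the inventory it belongs to.
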
## New derived contracts

### ArtifactSourceDescriptor

Represents the provider-neutral lineage envelope for a finding, evidence summary, stored report, inventory item, or touched review summary.

| Field | Type | Notes |
|---|---|---|
| `source_family` | string | One of the pinned values above |
| `source_kind` | string | One of the pinned values above |
| `workspace_id` | integer | Derived workspace scope anchor for the artifact |
| `tenant_id` | integer | Derived tenant scope anchor for the artifact |
| `provider_key` | string | Provider-neutral contract field; current repo truth emits `microsoft` only |
| `provider_connection_id` | integer or null | Nullable because historical artifacts may not know the connection |
| `managed_environment_id` | integer | Required managed-environment anchor inside the derived workspace and tenant scope |
| `source_target_kind` | string | One of the pinned values above |
| `source_target_identifier` | string or null | Optional stable target identifier such as governed-subject key, record id, or run id |
| `detector_key` | string or null | Standardized field for detector or signal identity; no closed catalog in `284` v1 |
| `control_key` | string or null | Existing canonical-control key when available |
| `package_run_id` | integer or null | Optional future package hook only; remains null in current runtime |

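
A minimal sketch of the descriptor as an immutable value object, assuming the pinned inventories are enforced at construction time. The field names and nullability follow the table; the Python rendering, the `ValueError` behavior, and the constant names are assumptions for illustration, since the spec does not say where validation happens.

```python
from dataclasses import dataclass
from typing import Optional

# Pinned inventories copied from the tables in this document.
SOURCE_FAMILIES = {"finding", "stored_report", "evidence_snapshot", "inventory", "operation_run"}
SOURCE_KINDS = {"model_summary", "stored_report", "operation_rollup", "inventory_projection"}
SOURCE_TARGET_KINDS = {"managed_environment", "governed_subject", "provider_connection", "operation_run"}


@dataclass(frozen=True)
class ArtifactSourceDescriptor:
    # Required lineage and scope anchors.
    source_family: str
    source_kind: str
    workspace_id: int
    tenant_id: int
    provider_key: str
    managed_environment_id: int
    source_target_kind: str
    # Nullable fields default to None rather than a synthetic value.
    provider_connection_id: Optional[int] = None
    source_target_identifier: Optional[str] = None
    detector_key: Optional[str] = None
    control_key: Optional[str] = None
    package_run_id: Optional[int] = None  # future hook; stays None for now

    def __post_init__(self):
        # Assumption: reject values outside the pinned inventories early.
        if self.source_family not in SOURCE_FAMILIES:
            raise ValueError(f"unknown source_family: {self.source_family}")
        if self.source_kind not in SOURCE_KINDS:
            raise ValueError(f"unknown source_kind: {self.source_kind}")
        if self.source_target_kind not in SOURCE_TARGET_KINDS:
            raise ValueError(f"unknown source_target_kind: {self.source_target_kind}")
```

Because the descriptor is derived rather than persisted, a frozen value object fits: it can be rebuilt from the existing rows on every read without any new table.
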
### InventoryTypeDescriptor

Represents the inventory-specific type split.

| Field | Type | Notes |
|---|---|---|
| `canonical_type` | string | Platform-owned type used for top-level summary |
| `provider_object_type` | string | Raw provider object type such as the existing `policy_type` value |
| `provider_display_type` | string | Human-readable provider label for operators |
| `legacy_policy_type` | string or null | Optional carry-forward for old readers or diagnostics |

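
The type split can be sketched as follows. The function `describe_inventory_type`, its parameters, and the placeholder values are hypothetical; in particular, falling back to the raw `policy_type` when no label is available is an assumption of this sketch, not something the spec states.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InventoryTypeDescriptor:
    canonical_type: str                       # platform-owned, used for headlines
    provider_object_type: str                 # raw provider type, e.g. policy_type
    provider_display_type: str                # operator-facing provider label
    legacy_policy_type: Optional[str] = None  # carry-forward for old readers


def describe_inventory_type(policy_type, canonical_type,
                            display_label=None, metadata_label=None):
    """Build the split from a stored `policy_type` (hypothetical helper).

    `canonical_type` is supplied by the caller; how it is looked up is
    outside this sketch.
    """
    # Prefer a distinct provider label, then any metadata label.
    # Assumption: the raw type is the last-resort display value.
    display = display_label or metadata_label or policy_type
    return InventoryTypeDescriptor(
        canonical_type=canonical_type,
        provider_object_type=policy_type,
        provider_display_type=display,
        legacy_policy_type=policy_type,
    )
```

The raw value is kept twice on purpose: once as `provider_object_type` for provider detail and once as `legacy_policy_type` for older readers, while `canonical_type` alone feeds the top-level summary.
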
### ArtifactProviderDetail

Nested provider-owned evidence that stays below the shared descriptor.

| Field | Type | Notes |
|---|---|---|
| `legacy_finding_type` | string or null | Existing `finding_type` where relevant |
| `legacy_report_type` | string or null | Existing `report_type` where relevant |
| `legacy_policy_type` | string or null | Existing inventory or drift `policy_type` where relevant |
| `provider_object_type` | string or null | Raw provider object type |
| `provider_display_type` | string or null | Provider-owned display label |
| `detector_detail` | string or null | Provider-facing detector or signal detail |

### ArtifactSourceViewModel

Shared summary contract used by touched Filament pages and presenters.

| Field | Type | Notes |
|---|---|---|
| `headline` | string | Canonical operator-facing summary |
| `source_descriptor` | `ArtifactSourceDescriptor` | Shared lineage envelope |
| `provider_detail` | `ArtifactProviderDetail` | Nested provider-owned detail |
| `control_summary` | array or null | Derived control label, key, and status when existing resolver provides it |
| `freshness` | array or null | Existing freshness or timing metadata |

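
To show how provider detail nests below the shared descriptor, here is a small sketch of `ArtifactProviderDetail` and `ArtifactSourceViewModel`. To keep the block self-contained, the descriptor is represented by a plain mapping as a stand-in; the `to_array` method name and all sample values are hypothetical.

```python
from dataclasses import asdict, dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class ArtifactProviderDetail:
    # Every field is nullable: only the ones relevant to the artifact are set.
    legacy_finding_type: Optional[str] = None
    legacy_report_type: Optional[str] = None
    legacy_policy_type: Optional[str] = None
    provider_object_type: Optional[str] = None
    provider_display_type: Optional[str] = None
    detector_detail: Optional[str] = None


@dataclass(frozen=True)
class ArtifactSourceViewModel:
    headline: str
    source_descriptor: Mapping[str, Any]  # stand-in for ArtifactSourceDescriptor
    provider_detail: ArtifactProviderDetail
    control_summary: Optional[Mapping[str, Any]] = None
    freshness: Optional[Mapping[str, Any]] = None

    def to_array(self):
        """Flatten to a plain dict for a presenter (hypothetical method)."""
        return {
            "headline": self.headline,
            "source_descriptor": dict(self.source_descriptor),
            "provider_detail": asdict(self.provider_detail),
            "control_summary": None if self.control_summary is None else dict(self.control_summary),
            "freshness": None if self.freshness is None else dict(self.freshness),
        }
```

The headline and descriptor sit at the top level, while legacy and provider-native types are reachable only under `provider_detail`, which matches the descriptor-first disclosure the contracts describe.
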
## Relationships

- One managed environment can own many findings, evidence snapshot items, stored reports, and inventory items.
- One finding or stored report can contribute one `ArtifactSourceDescriptor` per surfaced summary.
- One evidence snapshot can contain many `ArtifactSourceDescriptor` values, one per item.
- One inventory item can expose exactly one `InventoryTypeDescriptor` and one `ArtifactSourceDescriptor`.
- One tenant-review section can summarize zero or more underlying artifacts but should surface one canonical source summary per summarized item.

## Legacy-read normalization rules

- If a finding has `source = null`, derive `source_family` and `source_target_kind` from `finding_type` plus any qualifying evidence fields.
- If a drift finding only exposes `policy_type`, derive `canonical_type` from `InventoryPolicyTypeMeta` or adjacent subject metadata, keep the raw value as `provider_object_type` or `legacy_policy_type`, and never promote it back to the top-level headline.
- If a stored report payload already includes `provider_key`, reuse it; otherwise default the descriptor to the current provider for the producing service.
- If an evidence summary has no single `source_record_id`, keep `source_target_identifier` nullable and prefer `managed_environment` or `governed_subject` targeting instead of inventing synthetic ids.
- If inventory has no distinct provider display label, fall back to the best available metadata label while keeping `provider_object_type` separate from `canonical_type`.
- If canonical-control resolution returns no control, `control_key` remains null rather than forcing a fake mapping.

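
Three of the rules above (provider key fallback, nullable target identifier, and null control key) can be sketched as small pure functions. All function names, parameter names, and the `"key"` entry of the resolved control are assumptions made for illustration; the real resolver may be shaped quite differently.

```python
from typing import Any, Mapping, Optional, Tuple


def resolve_provider_key(payload: Mapping[str, Any], producer_default: str) -> str:
    """Reuse `provider_key` from a stored-report payload, else the producer's provider."""
    key = payload.get("provider_key")
    if isinstance(key, str) and key != "":
        return key
    return producer_default


def resolve_target(source_record_id: Optional[int],
                   governed_subject_key: Optional[str],
                   record_target_kind: str) -> Tuple[str, Optional[str]]:
    """Return (source_target_kind, source_target_identifier).

    Assumption: the caller knows which target kind a concrete record maps to.
    """
    if source_record_id is not None:
        return record_target_kind, str(source_record_id)
    if governed_subject_key:
        return "governed_subject", governed_subject_key
    # No single record and no subject: environment-wide, identifier stays null.
    return "managed_environment", None


def resolve_control_key(resolved_control: Optional[Mapping[str, Any]]) -> Optional[str]:
    """Keep `control_key` null when canonical-control resolution finds nothing."""
    if not resolved_control:
        return None
    return resolved_control.get("key")  # "key" is an assumed field name
```

For example, `resolve_target(None, None, "operation_run")` yields `("managed_environment", None)`: no synthetic id is invented when the evidence summary has no single source record.
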
## Explicit non-goals for data modeling

- no `artifact_sources` table
- no persisted package-run ledger
- no detector registry table or config catalog
- no control-catalog expansion
- no full rewrite of provider-native fields out of existing tables