# Research: Provider-neutral Artifact Source Taxonomy

## Decision 1: Use one shared descriptor over existing artifact truth, not a new artifact table

- **Decision**: represent provider-neutral artifact lineage through one shared descriptor carried by existing finding, evidence, stored-report, inventory, and review-summary seams.
- **Why**: the repo already stores the underlying truth in `Finding`, `EvidenceSnapshotItem`, `StoredReport`, and `InventoryItem`. A new artifact-source table would duplicate that truth and create lifecycle or ownership questions that the current release does not need.
- **Alternatives considered**:
  - new `artifact_sources` table: rejected because it adds persistence and drift risk with no current-release operator value
  - page-local aliasing only: rejected because it would preserve conflicting summaries across findings, evidence, reports, inventory, and review sections

## Decision 2: Pin exact inventories for `source_family`, `source_kind`, and `source_target_kind`

- **Decision**: keep the initial inventories exact and small.
- **Pinned `source_family` set**:
  - `finding`
  - `stored_report`
  - `evidence_snapshot`
  - `inventory`
  - `operation_run`
- **Pinned `source_kind` set**:
  - `model_summary`
  - `stored_report`
  - `operation_rollup`
  - `inventory_projection`
- **Pinned `source_target_kind` set**:
  - `managed_environment`
  - `governed_subject`
  - `provider_connection`
  - `operation_run`
- **Why**: the repo memory and readiness rules require exact inventories when a package introduces a bounded semantic family. Keeping the set explicit prevents later prep or implementation drift.
- **Alternatives considered**:
  - open-ended family strings with only prose guidance: rejected because readiness analysis can flag vague inventories as premature
  - predeclaring package-output or multi-provider families now: rejected because those values are future-facing and not required by current repo truth

## Decision 3: Standardize `detector_key` and `control_key` placement without creating new registries

- **Decision**: `284` standardizes where `detector_key` and `control_key` live in the shared descriptor and touched view models, but it does not introduce a closed detector catalog or a broader control-catalog expansion.
- **Why**: the repo already has working canonical-control resolution. The real problem is inconsistent placement and summary wording, not the absence of a second registry.
- **Alternatives considered**:
  - detector catalog or detector registry: rejected because it is future-facing and wider than current repo truth
  - control-catalog expansion in the same slice: rejected because `284` is about artifact-source semantics, not broader control governance

## Decision 4: Keep provider-native fields as nested detail

- **Decision**: `finding_type`, `report_type`, raw `policy_type`, provider object types, report domains, and Graph-facing detector detail remain provider-owned nested evidence.
- **Why**: the current release is still Microsoft-first in runtime. The goal is to stop using provider-native fields as top-level platform truth, not to erase them.
- **Alternatives considered**:
  - full generic rewrite of provider detail: rejected because it would over-abstract current repo truth
  - leaving provider-native fields as top-level summary nouns: rejected because that preserves the current artifact interpretation drift

## Decision 5: Inventory type separation should live beside existing inventory metadata helpers

- **Decision**: keep `canonical_type`, `provider_object_type`, and `provider_display_type` close to `InventoryPolicyTypeMeta` and the inventory read model rather than creating a new cross-product taxonomy engine.
- **Why**: `InventoryPolicyTypeMeta` is already the narrowest place where inventory type meaning is derived and displayed.
- **Alternatives considered**:
  - new global type registry for every artifact family: rejected because it is broader than the current inventory-only problem
  - leaving inventory on raw `policy_type`: rejected because it would keep one of the explicit 284 acceptance gaps alive

## Decision 6: Legacy rows should normalize on read, not through backfill

- **Decision**: preserve the candidate's no-backfill rule and normalize legacy artifacts on read or during future writes only.
- **Why**: the repo is still pre-production, but `284` does not need a backfill program to deliver operator and contributor value. Read-time normalization is enough for current artifact families.
- **Alternatives considered**:
  - historical backfill migration: rejected because it adds risk and operational work without increasing the core value of the slice
  - leaving legacy rows unreadable until rewritten: rejected because acceptance requires current Microsoft outputs to remain valid as Microsoft provider sources

## Decision 7: Support or AI alignment stays bounded and package runtime remains deferred

- **Decision**: if `SupportDiagnosticBundleBuilder`, `AiUseCaseCatalog`, or adjacent `source_family` consumers are touched, align them to the pinned source-family nouns only.
  Keep `package_run_id` optional and nullable; do not create package-execution runtime.
- **Why**: the candidate explicitly says later package execution should be able to build on the descriptor, but `284` must not implement package runtime now.
- **Alternatives considered**:
  - package-output or package-run implementation in the same slice: rejected because it is adjacent future work
  - ignoring existing `source_family` consumers entirely: rejected because they can become a second naming drift if touched later without the 284 vocabulary

## Implementation prerequisites present in current repo truth

- Spec `281` provider-neutral provider-connection scope is already present in repo runtime.
- Spec `282` workspace-first artifact surfaces are already present in repo runtime.
- Spec `283` provider capability registry is already present in repo runtime.

Because those inherited prerequisites are already present on the current branch, the remaining blocker is narrower: runtime work for `284` stays `prerequisite-blocked` until SCOPE-001 ownership compliance for the touched tenant-owned artifact tables is satisfied or explicitly excepted.

## Explicit non-goals carried into design

- no new artifact table or ledger
- no provider framework
- no detector registry
- no full control-catalog expansion
- no package runtime or package-output surfaces
- no historical backfill
- no workspace-first RBAC redesign
- no copy or localization neutralization
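
## Illustrative sketch: descriptor shape (non-normative)

The descriptor described in Decisions 1, 2, 4, 6, and 7 can be sketched roughly as follows. This is a minimal illustration in Python, not the repo's implementation language or actual code. The enum members mirror the pinned inventories from Decision 2; everything else is an assumption made for illustration: the class name `ArtifactSourceDescriptor`, the helper `describe_legacy_finding`, the `provider_detail` field name, the choice of `model_summary` and `governed_subject` as the kinds for a finding, and the sample row values.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Mapping, Optional


class SourceFamily(str, Enum):
    # Pinned set from Decision 2; no package-output or multi-provider values.
    FINDING = "finding"
    STORED_REPORT = "stored_report"
    EVIDENCE_SNAPSHOT = "evidence_snapshot"
    INVENTORY = "inventory"
    OPERATION_RUN = "operation_run"


class SourceKind(str, Enum):
    MODEL_SUMMARY = "model_summary"
    STORED_REPORT = "stored_report"
    OPERATION_ROLLUP = "operation_rollup"
    INVENTORY_PROJECTION = "inventory_projection"


class SourceTargetKind(str, Enum):
    MANAGED_ENVIRONMENT = "managed_environment"
    GOVERNED_SUBJECT = "governed_subject"
    PROVIDER_CONNECTION = "provider_connection"
    OPERATION_RUN = "operation_run"


@dataclass(frozen=True)
class ArtifactSourceDescriptor:
    """Hypothetical shared descriptor carried by existing seams (Decision 1)."""

    source_family: SourceFamily
    source_kind: SourceKind
    source_target_kind: SourceTargetKind
    detector_key: Optional[str] = None    # placement only, no registry (Decision 3)
    control_key: Optional[str] = None
    package_run_id: Optional[str] = None  # optional and nullable (Decision 7)
    # Provider-native fields stay nested, not top-level (Decision 4).
    provider_detail: Mapping[str, Any] = field(default_factory=dict)


# Assumed subset of provider-native keys, for illustration only.
PROVIDER_NATIVE_KEYS = ("finding_type", "report_type", "policy_type")


def describe_legacy_finding(row: Mapping[str, Any]) -> ArtifactSourceDescriptor:
    """Normalize a legacy finding row on read, without rewriting it (Decision 6)."""
    provider_detail = {k: row[k] for k in PROVIDER_NATIVE_KEYS if k in row}
    return ArtifactSourceDescriptor(
        source_family=SourceFamily.FINDING,
        source_kind=SourceKind.MODEL_SUMMARY,                  # assumed mapping
        source_target_kind=SourceTargetKind.GOVERNED_SUBJECT,  # assumed mapping
        detector_key=row.get("detector_key"),
        control_key=row.get("control_key"),
        package_run_id=row.get("package_run_id"),
        provider_detail=provider_detail,
    )


# Hypothetical legacy row; the values are made up for the example.
legacy_row = {
    "finding_type": "drift",
    "policy_type": "deviceConfiguration",
    "control_key": "example.control",
}
descriptor = describe_legacy_finding(legacy_row)
print(descriptor.source_family.value)  # finding
```

Modeling the inventories as closed enums means an unpinned value such as `SourceFamily("package_output")` raises `ValueError` instead of silently widening the vocabulary, which is one way to keep the set exact. The stored row is never modified, so no backfill is implied.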