# Spec Candidate 381 - Baseline Matching Pipeline & Canonicalization v1

## Candidate Status

Candidate for implementation after Spec 380.

This candidate changes baseline compare matching so TenantPilot resolves subjects through identity, canonicalization, and bindings before falling back to display-name matching.

## Depends On

- Spec 380 - Provider Resource Identity & Binding Foundation v1

## Spec Candidate Check

- **Problem**: Baseline compare currently loads baseline/current subjects mainly by normalized `policy_type|subject_key`, so built-ins, virtual assignment targets, foundation objects, duplicate names, and restored/test/copied resources can be misclassified.
- **Today's failure**: Operators get false ambiguity or false missing states, and display-name fallback can look more authoritative than it is.
- **User-visible improvement**: Compare results become more trustworthy because exact identity, canonical provider defaults, active bindings, and safe fingerprints are attempted before display names.
- **Smallest enterprise-capable version**: Add one matching pipeline seam, canonicalizer registry, foundation coverage registry, active-binding lookup, fake-provider contract tests, and bounded Microsoft/Intune adapter behavior behind the provider seam.
- **Explicit non-goals**: No manual resolution UI, no evidence/review readiness remapping, no restore integration, no customer-facing copy changes, no broad historical migration, and no generic provider framework beyond the concrete canonicalization need.
- **Permanent complexity imported**: Matching pipeline service, canonicalizer registry, coverage registry, typed matching result, provider descriptor use, adapter contract tests, and baseline compare integration tests.
- **Why now**: Spec 380 would create durable binding truth, but compare remains unsafe until the matching order actually consumes it before display-name fallback.
- **Why not local**: Local patches in `IntuneCompareStrategy` would keep provider-specific labels in core and would not provide a reusable identity/canonicalization path for evidence and review follow-up.
- **Approval class**: Core Enterprise.
- **Red flags triggered**: New meta-infrastructure, foundation/canonical terminology, and multi-step pipeline. The defense is that matching order directly changes operator trust and customer-readiness blockers; the v1 is bounded to existing compare flows and one fake-provider contract.
- **Score**: Benefit: 2 | Urgency: 2 | Scope: 1 | Complexity: 1 | Product proximity: 2 | Reuse: 2 | **Total: 10/12**
- **Decision**: Approve after Spec 380, with Microsoft behavior kept behind adapter seams.

## Proportionality Review

1. **Current operator problem**: Operators cannot trust whether compare blockers reflect real tenant-owned duplicates or expected provider defaults.
2. **Why existing structure is insufficient**: Existing compare code keys by `policy_type|subject_key`, and current reason codes do not express canonical provider defaults or active bindings first.
3. **Narrowest correct implementation**: Insert one matching pipeline before existing compare strategy behavior; preserve legacy strategies where possible.
4. **Ownership cost**: Baseline compare owners maintain pipeline ordering, registry entries, adapter contracts, and fallback semantics.
5. **Rejected alternative**: Hardcoding Microsoft labels in core was rejected because it deepens provider coupling and still leaves display-name-like truth in shared code.
6. **Current-release truth or future prep**: Current-release trust issue; fake provider tests prove the seam without broad multi-provider productization.

## Problem

Baseline compare currently loads baseline/current subjects mainly by normalized `policy_type|subject_key`.
This causes false ambiguity and false missing states when:

- Microsoft/default/provider built-ins exist,
- virtual assignment targets appear as labels,
- foundation objects are not policy-backed,
- tenant-owned resources have duplicate names,
- restored/test/copied resources share display names.

The matching process needs a provider-agnostic pipeline.

## Goal

Introduce a subject matching pipeline that resolves baseline subjects using, in order:

1. active binding,
2. canonical built-in/virtual target recognition,
3. provider object identity,
4. stable external identity,
5. safe fingerprint,
6. unique descriptor match,
7. display-name fallback,
8. unresolved ambiguity,
9. missing/unsupported/limitation classification.

## Scope

### In Scope

- Add `SubjectMatchingPipeline`.
- Add `BuiltInCanonicalizerRegistry`.
- Add `FoundationCoverageRegistry`.
- Integrate active binding lookup from `provider_resource_bindings`.
- Integrate provider resource descriptors from inventory/policy versions.
- Add provider-adapter seam for canonicalization.
- Update baseline compare flow to call the matching pipeline before existing compare strategy.
- Preserve compatibility with existing compare strategies.
- Add fake-provider contract tests.
- Add Microsoft/Intune adapter implementation only behind provider adapter seam, not in core.

### Out of Scope

- Full UI for manual resolution.
- Evidence/review readiness remapping.
- Generic workflow engine.
- Full restore integration.
- Broad historical migration of previous compare results.
- Customer-facing output changes.

## Matching Priority

The matching pipeline must evaluate in this order:

```text
1. Existing active binding
2. Provider built-in / virtual canonical key
3. Exact provider object identity
4. Stable provider-specific external identity
5. Unique fingerprint / payload identity where safe
6. Unique provider resource descriptor match
7. Unique normalized display-name fallback
8. Unresolved ambiguity
9. Missing resource/evidence/unsupported coverage
```

Display-name fallback must be explicitly marked as fallback and should never silently produce high-trust identity if stronger identity is available.

## Built-In Canonicalization

Core baseline logic must not hardcode provider names or Microsoft labels.

Provider adapters may register canonicalizers.

Example Microsoft/Intune canonicalization behind adapter seam:

```text
All users
All devices
Default role scope tag
Known provider-default assignment targets
Known provider-default foundation resources
```

These must resolve by provider discriminator/type/canonical key, not display name.

## Foundation Coverage Registry

The registry must classify resource classes as:

```text
fully_comparable
inventory_only
canonical_only
unsupported
excluded_by_profile
requires_manual_binding
```

Foundation objects must not be forced into policy-backed comparison.

Examples:

```text
roleScopeTag default -> canonical built-in/default if provider identifies it
roleScopeTag tenant-owned -> foundation resource by provider object ID
assignmentFilter tenant-owned -> foundation inventory/comparable depending on capability
notificationMessageTemplate -> foundation/config object depending on capability
```

## Integration Points

Expected areas to inspect/modify:

- `BaselineCompareService`
- `CompareBaselineToTenantJob`
- `SubjectResolver`
- `ResolutionOutcome`
- `IntuneCompareStrategy`
- `CompareStrategyRegistry`
- `InventoryPolicyTypeMeta`
- `BaselineSupportCapabilityGuard`
- `GovernanceSubjectTaxonomyRegistry`
- provider gateway / provider adapter seams
- Graph contract registry integration where applicable

## Result Contract

The matching pipeline should return a typed result, for example:

```text
resolved_exact_identity
resolved_active_binding
resolved_canonical_builtin
resolved_canonical_virtual_target
resolved_unique_fallback
unresolved_ambiguous_match
missing_provider_resource
missing_local_evidence
unsupported_resource_class
foundation_inventory_only
excluded_non_governed
accepted_limitation
```

Spec 382 will formalize result semantics, but Spec 381 must produce enough structure for that follow-up.

## Acceptance Criteria

- Baseline compare uses matching pipeline before display-name fallback.
- Built-ins/virtual targets can be resolved by provider canonicalizer.
- Tenant-owned duplicate names remain unresolved unless an active binding exists.
- Foundation inventory-only resources no longer produce false policy-backed matching attempts.
- Existing compare strategies still receive matched baseline/current resources where possible.
- No core class hardcodes Microsoft display names.
- Fake provider can register canonical built-ins and resolve them.
- YPTW2-style cases are representable:
  - `All users` / `All devices` canonicalized,
  - `default roleScopeTag` canonicalized or foundation-classified,
  - tenant-owned duplicate Settings Catalog policies remain ambiguous until binding,
  - assignment filters and notification templates are classified by capability.

## Required Tests

- Built-in canonical object resolves without ambiguity.
- Virtual assignment target resolves without display-name matching.
- Tenant-owned duplicate display names remain unresolved.
- Active manual binding resolves duplicate candidate.
- Display-name fallback is only used after identity/canonical/binding attempts fail.
- Foundation inventory-only object returns inventory-only limitation, not `foundation_not_policy_backed`.
- Unsupported resource class returns unsupported result.
- Fake provider canonicalization contract test.
- Microsoft/Intune adapter does not leak display-name logic into core.

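
The behavior these tests target can be illustrated with a small, language-neutral sketch. It is written in Python purely for brevity and is not the platform's implementation. Every name in it (`Resource`, `MatchResult`, `CanonicalizerRegistry`, `match_subject`, `fake_canonicalizer`, the `all_principals` discriminator) is hypothetical. Only steps 1, 2, 3, and 7 of the matching priority are modeled; steps 4 to 6 are omitted, and the binding map is assumed to be a plain subject-key to object-ID lookup.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Resource:
    """Hypothetical descriptor for a baseline subject or a tenant candidate."""
    provider: str
    resource_type: str
    display_name: str
    object_id: Optional[str] = None
    discriminator: Optional[str] = None  # provider-supplied marker, not a label
    subject_key: Optional[str] = None    # only set on baseline subjects


@dataclass(frozen=True)
class MatchResult:
    outcome: str
    candidate: Optional[Resource] = None
    is_fallback: bool = False


Canonicalizer = Callable[[Resource], Optional[str]]


class CanonicalizerRegistry:
    """Adapters register canonicalizers; core code holds no provider labels."""

    def __init__(self) -> None:
        self._by_provider: Dict[str, Canonicalizer] = {}

    def register(self, provider: str, canonicalizer: Canonicalizer) -> None:
        self._by_provider[provider] = canonicalizer

    def canonical_key(self, resource: Resource) -> Optional[str]:
        canonicalizer = self._by_provider.get(resource.provider)
        return canonicalizer(resource) if canonicalizer else None


def _normalize(name: str) -> str:
    return " ".join(name.split()).casefold()


def match_subject(subject: Resource, candidates: List[Resource],
                  bindings: Dict[str, str],
                  registry: CanonicalizerRegistry) -> MatchResult:
    # 1. An active binding wins over every other signal.
    bound_id = bindings.get(subject.subject_key) if subject.subject_key else None
    if bound_id is not None:
        for candidate in candidates:
            if candidate.object_id == bound_id:
                return MatchResult("resolved_active_binding", candidate)
    # 2. Canonical built-in / virtual target, keyed by the provider canonicalizer.
    key = registry.canonical_key(subject)
    if key is not None:
        hits = [c for c in candidates if registry.canonical_key(c) == key]
        if len(hits) == 1:
            return MatchResult("resolved_canonical_builtin", hits[0])
    # 3. Exact provider object identity.
    if subject.object_id is not None:
        for candidate in candidates:
            if candidate.object_id == subject.object_id:
                return MatchResult("resolved_exact_identity", candidate)
    # 7. Display-name fallback, always flagged as fallback.
    wanted = (subject.resource_type, _normalize(subject.display_name))
    hits = [c for c in candidates
            if (c.resource_type, _normalize(c.display_name)) == wanted]
    if len(hits) == 1:
        return MatchResult("resolved_unique_fallback", hits[0], is_fallback=True)
    if len(hits) > 1:
        return MatchResult("unresolved_ambiguous_match")
    return MatchResult("missing_provider_resource")


def fake_canonicalizer(resource: Resource) -> Optional[str]:
    """Fake-provider canonicalizer: decides by discriminator, never by name."""
    if resource.discriminator == "all_principals":
        return "fake:assignment_target:all_principals"
    return None


registry = CanonicalizerRegistry()
registry.register("fake", fake_canonicalizer)
subject = Resource("fake", "policy", "Baseline Policy",
                   subject_key="policy|baseline policy")
duplicates = [
    Resource("fake", "policy", "Baseline Policy", object_id="a"),
    Resource("fake", "policy", "Baseline Policy", object_id="b"),
]
print(match_subject(subject, duplicates, {}, registry).outcome)
# unresolved_ambiguous_match
print(match_subject(subject, duplicates,
                    {"policy|baseline policy": "b"}, registry).outcome)
# resolved_active_binding
```

In this sketch a binding that points at an object no longer present simply falls through to the later steps; whether the real pipeline should instead report a missing resource is a decision the sketch does not settle.
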
## Validation Commands

```bash
cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Baselines/BaselineCompareAmbiguousMatchGapTest.php tests/Feature/Baselines/BaselineCompareGapClassificationTest.php
```

```bash
cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Unit/Support/Baselines/SubjectResolverTest.php
```

Add new tests for the matching pipeline and canonicalizer registry.

## Risks

- Accidentally making display-name fallback look authoritative.
- Hiding real duplicate tenant resources through over-aggressive canonicalization.
- Hardcoding Microsoft-specific behavior into core.
- Breaking existing compare strategy expectations.

## Recommendation

Implement this second. This candidate fixes the core matching failure mode while still avoiding UI and evidence/review changes.