TenantAtlas/spec-candidates/381-baseline-matching-pipeline-canonicalization.md

Spec Candidate 381 - Baseline Matching Pipeline & Canonicalization v1

Candidate Status

Candidate for implementation after Spec 380.

This candidate changes baseline compare matching so TenantPilot resolves subjects through identity, canonicalization, and bindings before falling back to display-name matching.

Depends On

  • Spec 380 - Provider Resource Identity & Binding Foundation v1

Spec Candidate Check

  • Problem: Baseline compare currently loads baseline/current subjects mainly by normalized policy_type|subject_key, so built-ins, virtual assignment targets, foundation objects, duplicate names, and restored/test/copied resources can be misclassified.
  • Today's failure: Operators get false ambiguity or false missing states, and display-name fallback can look more authoritative than it is.
  • User-visible improvement: Compare results become more trustworthy because exact identity, canonical provider defaults, active bindings, and safe fingerprints are attempted before display names.
  • Smallest enterprise-capable version: Add one matching pipeline seam, canonicalizer registry, foundation coverage registry, active-binding lookup, fake-provider contract tests, and bounded Microsoft/Intune adapter behavior behind the provider seam.
  • Explicit non-goals: No manual resolution UI, no evidence/review readiness remapping, no restore integration, no customer-facing copy changes, no broad historical migration, and no generic provider framework beyond the concrete canonicalization need.
  • Permanent complexity imported: Matching pipeline service, canonicalizer registry, coverage registry, typed matching result, provider descriptor use, adapter contract tests, and baseline compare integration tests.
  • Why now: Spec 380 would create durable binding truth, but compare remains unsafe until the matching order actually consumes it before display-name fallback.
  • Why not local: Local patches in IntuneCompareStrategy would keep provider-specific labels in core and would not provide a reusable identity/canonicalization path for evidence and review follow-up.
  • Approval class: Core Enterprise.
  • Red flags triggered: New meta-infrastructure, foundation/canonical terminology, and multi-step pipeline. The defense is that matching order changes operator trust and customer-readiness blockers directly; the v1 is bounded to existing compare flows and one fake-provider contract.
  • Score: Benefit: 2 | Urgency: 2 | Scope: 1 | Complexity: 1 | Product proximity: 2 | Reusability: 2 | Total: 10/12
  • Decision: approve after Spec 380, with Microsoft behavior kept behind adapter seams.

Proportionality Review

  1. Current operator problem: Operators cannot trust whether compare blockers reflect real tenant-owned duplicates or expected provider defaults.
  2. Why existing structure is insufficient: Existing compare code keys subjects by policy_type|subject_key, and current reason codes cannot express that a canonical provider default or an active binding was checked before the display name.
  3. Narrowest correct implementation: Insert one matching pipeline before existing compare strategy behavior; preserve legacy strategies where possible.
  4. Ownership cost: Baseline compare owners maintain pipeline ordering, registry entries, adapter contracts, and fallback semantics.
  5. Rejected alternative: Hardcoding Microsoft labels in core was rejected because it deepens provider coupling and still leaves display-name-like truth in shared code.
  6. Current-release truth or future prep: Current-release trust issue; fake provider tests prove the seam without broad multi-provider productization.

Problem

Baseline compare currently loads baseline/current subjects mainly by normalized policy_type|subject_key. This causes false ambiguity and false missing states when:

  • Microsoft/default/provider built-ins exist,
  • virtual assignment targets appear as labels,
  • foundation objects are not policy-backed,
  • tenant-owned resources have duplicate names,
  • restored/test/copied resources share display names.

The matching process needs a provider-agnostic pipeline.
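
The collision can be illustrated with a minimal sketch. It assumes the subject key is derived from a normalized display name, which the text above implies but does not spell out; `legacy_key`, the field names, and the sample resources are hypothetical and not taken from the codebase.

```python
from collections import defaultdict


def legacy_key(policy_type: str, display_name: str) -> str:
    """Hypothetical stand-in for the normalized policy_type|subject_key key."""
    normalized = " ".join(display_name.lower().split())
    return f"{policy_type}|{normalized}"


# Two distinct tenant resources (different provider object IDs) sharing a name.
current = [
    {"object_id": "a1", "policy_type": "settingsCatalog", "display_name": "Windows Baseline"},
    {"object_id": "b2", "policy_type": "settingsCatalog", "display_name": "windows  baseline"},
]

by_key = defaultdict(list)
for resource in current:
    key = legacy_key(resource["policy_type"], resource["display_name"])
    by_key[key].append(resource["object_id"])

# Keys that map to more than one provider object cannot be matched safely.
collisions = {key: ids for key, ids in by_key.items() if len(ids) > 1}
```

Here `collisions` holds one key with both object IDs: the key alone cannot tell a copied or restored resource from the original.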

Goal

Introduce a subject matching pipeline that resolves baseline subjects using:

  1. active binding,
  2. canonical built-in/virtual target recognition,
  3. provider object identity,
  4. stable external identity,
  5. safe fingerprint,
  6. unique descriptor match,
  7. display-name fallback,
  8. unresolved ambiguity,
  9. missing/unsupported/limitation classification.

Scope

In Scope

  • Add SubjectMatchingPipeline.
  • Add BuiltInCanonicalizerRegistry.
  • Add FoundationCoverageRegistry.
  • Integrate active binding lookup from provider_resource_bindings.
  • Integrate provider resource descriptors from inventory/policy versions.
  • Add provider-adapter seam for canonicalization.
  • Update baseline compare flow to call the matching pipeline before existing compare strategy.
  • Preserve compatibility with existing compare strategies.
  • Add fake-provider contract tests.
  • Add Microsoft/Intune adapter implementation only behind provider adapter seam, not in core.

Out of Scope

  • Full UI for manual resolution.
  • Evidence/review readiness remapping.
  • Generic workflow engine.
  • Full restore integration.
  • Broad historical migration of previous compare results.
  • Customer-facing output changes.

Matching Priority

The matching pipeline must evaluate in this order:

1. Existing active binding
2. Provider built-in / virtual canonical key
3. Exact provider object identity
4. Stable provider-specific external identity
5. Unique fingerprint / payload identity where safe
6. Unique provider resource descriptor match
7. Unique normalized display-name fallback
8. Unresolved ambiguity
9. Missing resource/evidence/unsupported coverage

Display-name fallback must be explicitly marked as fallback and should never silently produce high-trust identity if stronger identity is available.
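
The ordering rule could be sketched as a chain of stages tried in sequence, with the first hit winning. This is an illustrative reduction, not the planned implementation: only three of the nine steps are shown, and `Match`, `resolve`, the stage functions, and the dictionary fields are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Match:
    outcome: str
    resource_id: Optional[str] = None
    is_fallback: bool = False  # set only by the display-name stage


def by_active_binding(subject: dict, candidates: Sequence[dict]) -> Optional[Match]:
    bound = subject.get("bound_resource_id")
    if bound is not None and any(c["id"] == bound for c in candidates):
        return Match("resolved_active_binding", bound)
    return None


def by_object_identity(subject: dict, candidates: Sequence[dict]) -> Optional[Match]:
    wanted = subject.get("provider_object_id")
    if wanted is not None and any(c["id"] == wanted for c in candidates):
        return Match("resolved_exact_identity", wanted)
    return None


def same_name(subject: dict, candidates: Sequence[dict]) -> list:
    return [c for c in candidates if c["name"].casefold() == subject["name"].casefold()]


def by_display_name(subject: dict, candidates: Sequence[dict]) -> Optional[Match]:
    hits = same_name(subject, candidates)
    if len(hits) == 1:  # only a unique name may match, and it is flagged as fallback
        return Match("resolved_unique_fallback", hits[0]["id"], is_fallback=True)
    return None


# Order matters: stronger identity sources come first.
STAGES = (by_active_binding, by_object_identity, by_display_name)


def resolve(subject: dict, candidates: Sequence[dict]) -> Match:
    for stage in STAGES:
        match = stage(subject, candidates)
        if match is not None:
            return match
    if len(same_name(subject, candidates)) > 1:
        return Match("unresolved_ambiguous_match")
    return Match("missing_provider_resource")
```

With two candidates named "Policy A", `resolve({"name": "Policy A"}, ...)` yields `unresolved_ambiguous_match`, while adding `bound_resource_id` to the subject resolves it through the binding stage without ever consulting the name.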

Built-In Canonicalization

Core baseline logic must not hardcode provider names or Microsoft labels.

Provider adapters may register canonicalizers.

Example Microsoft/Intune canonicalization behind adapter seam:

All users
All devices
Default role scope tag
Known provider-default assignment targets
Known provider-default foundation resources

These must resolve by provider discriminator/type/canonical key, not display name.
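
One possible shape for the registry is sketched below. The class name comes from the scope list; the method names, the descriptor fields, the `fake` provider, and the canonical key format are assumptions made for illustration only.

```python
from typing import Callable, Dict, List, Optional

# A canonicalizer maps a provider descriptor to a canonical key, or None.
Canonicalizer = Callable[[dict], Optional[str]]


class BuiltInCanonicalizerRegistry:
    """Core-side registry; it knows no provider names or labels itself."""

    def __init__(self) -> None:
        self._by_provider: Dict[str, List[Canonicalizer]] = {}

    def register(self, provider: str, canonicalizer: Canonicalizer) -> None:
        self._by_provider.setdefault(provider, []).append(canonicalizer)

    def canonical_key(self, provider: str, descriptor: dict) -> Optional[str]:
        for canonicalizer in self._by_provider.get(provider, []):
            key = canonicalizer(descriptor)
            if key is not None:
                return key
        return None


# Adapter-side code for a fake provider: it inspects the discriminator,
# never the display name.
def fake_all_users(descriptor: dict) -> Optional[str]:
    if descriptor.get("discriminator") == "fake.allUsersTarget":
        return "fake:assignment_target:all_users"
    return None


registry = BuiltInCanonicalizerRegistry()
registry.register("fake", fake_all_users)
```

Because the lookup ignores `display_name`, a localized or renamed label on the same discriminator still resolves to the same canonical key, and a tenant-owned object that merely shares the label does not.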

Foundation Coverage Registry

The registry must classify resource classes as:

fully_comparable
inventory_only
canonical_only
unsupported
excluded_by_profile
requires_manual_binding

Foundation objects must not be forced into policy-backed comparison.

Examples:

roleScopeTag default            -> canonical built-in/default if provider identifies it
roleScopeTag tenant-owned       -> foundation resource by provider object ID
assignmentFilter tenant-owned   -> foundation inventory/comparable depending on capability
notificationMessageTemplate     -> foundation/config object depending on capability
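
A minimal sketch of such a registry follows. The six classification values are the ones listed above; the `classify` and `allows_policy_compare` methods, the default of `unsupported` for unknown classes, and the sample entries are assumptions, not decisions recorded in this candidate.

```python
from enum import Enum


class Coverage(Enum):
    FULLY_COMPARABLE = "fully_comparable"
    INVENTORY_ONLY = "inventory_only"
    CANONICAL_ONLY = "canonical_only"
    UNSUPPORTED = "unsupported"
    EXCLUDED_BY_PROFILE = "excluded_by_profile"
    REQUIRES_MANUAL_BINDING = "requires_manual_binding"


class FoundationCoverageRegistry:
    def __init__(self, entries: dict) -> None:
        self._entries = dict(entries)

    def classify(self, resource_class: str) -> Coverage:
        # Assumption: unknown resource classes are treated as unsupported.
        return self._entries.get(resource_class, Coverage.UNSUPPORTED)

    def allows_policy_compare(self, resource_class: str) -> bool:
        # Only fully comparable classes enter policy-backed comparison.
        return self.classify(resource_class) is Coverage.FULLY_COMPARABLE


# Illustrative entries only; real classifications depend on provider capability.
coverage = FoundationCoverageRegistry({
    "settingsCatalogPolicy": Coverage.FULLY_COMPARABLE,
    "assignmentFilter": Coverage.INVENTORY_ONLY,
})
```

A caller would check `coverage.allows_policy_compare("assignmentFilter")` before attempting a policy-backed match, so inventory-only foundation objects are reported as a limitation rather than as a failed match.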

Integration Points

Expected areas to inspect/modify:

  • BaselineCompareService
  • CompareBaselineToTenantJob
  • SubjectResolver
  • ResolutionOutcome
  • IntuneCompareStrategy
  • CompareStrategyRegistry
  • InventoryPolicyTypeMeta
  • BaselineSupportCapabilityGuard
  • GovernanceSubjectTaxonomyRegistry
  • provider gateway / provider adapter seams
  • Graph contract registry integration where applicable

Result Contract

The matching pipeline should return a typed result, for example:

resolved_exact_identity
resolved_active_binding
resolved_canonical_builtin
resolved_canonical_virtual_target
resolved_unique_fallback
unresolved_ambiguous_match
missing_provider_resource
missing_local_evidence
unsupported_resource_class
foundation_inventory_only
excluded_non_governed
accepted_limitation

Spec 382 will formalize result semantics, but Spec 381 must produce enough structure for that follow-up.
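
As a sketch of what "enough structure" might mean, the outcomes above can be carried in an immutable value object. The outcome strings are the ones listed; `MatchingResult`, its `matched_ref` field, and the two helper properties are hypothetical, and Spec 382 may define the semantics differently.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(Enum):
    RESOLVED_EXACT_IDENTITY = "resolved_exact_identity"
    RESOLVED_ACTIVE_BINDING = "resolved_active_binding"
    RESOLVED_CANONICAL_BUILTIN = "resolved_canonical_builtin"
    RESOLVED_CANONICAL_VIRTUAL_TARGET = "resolved_canonical_virtual_target"
    RESOLVED_UNIQUE_FALLBACK = "resolved_unique_fallback"
    UNRESOLVED_AMBIGUOUS_MATCH = "unresolved_ambiguous_match"
    MISSING_PROVIDER_RESOURCE = "missing_provider_resource"
    MISSING_LOCAL_EVIDENCE = "missing_local_evidence"
    UNSUPPORTED_RESOURCE_CLASS = "unsupported_resource_class"
    FOUNDATION_INVENTORY_ONLY = "foundation_inventory_only"
    EXCLUDED_NON_GOVERNED = "excluded_non_governed"
    ACCEPTED_LIMITATION = "accepted_limitation"


@dataclass(frozen=True)
class MatchingResult:
    outcome: Outcome
    # Provider object ID or canonical key; required for resolved outcomes.
    matched_ref: Optional[str] = None

    def __post_init__(self) -> None:
        if self.is_resolved and self.matched_ref is None:
            raise ValueError(f"{self.outcome.value} requires matched_ref")

    @property
    def is_resolved(self) -> bool:
        return self.outcome.value.startswith("resolved_")

    @property
    def is_fallback(self) -> bool:
        # Lets callers treat display-name matches as lower trust.
        return self.outcome is Outcome.RESOLVED_UNIQUE_FALLBACK
```

Keeping `is_fallback` on the result itself means downstream consumers do not have to re-derive trust from the outcome string.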

Acceptance Criteria

  • Baseline compare uses matching pipeline before display-name fallback.
  • Built-ins/virtual targets can be resolved by provider canonicalizer.
  • Tenant-owned duplicate names remain unresolved unless an active binding exists.
  • Foundation inventory-only resources no longer produce false policy-backed matching attempts.
  • Existing compare strategies still receive matched baseline/current resources where possible.
  • No core class hardcodes Microsoft display names.
  • Fake provider can register canonical built-ins and resolve them.
  • YPTW2-style cases are representable:
    • All users / All devices canonicalized,
    • default roleScopeTag canonicalized or foundation-classified,
    • tenant-owned duplicate Settings Catalog policies remain ambiguous until binding,
    • assignment filters and notification templates are classified by capability.

Required Tests

  • Built-in canonical object resolves without ambiguity.
  • Virtual assignment target resolves without display-name matching.
  • Tenant-owned duplicate display names remain unresolved.
  • Active manual binding resolves duplicate candidate.
  • Display-name fallback is only used after identity/canonical/binding attempts fail.
  • Foundation inventory-only object returns inventory-only limitation, not foundation_not_policy_backed.
  • Unsupported resource class returns unsupported result.
  • Fake provider canonicalization contract test.
  • Microsoft/Intune adapter does not leak display-name logic into core.
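
The ordering test ("display-name fallback is only used after identity/canonical/binding attempts fail") could be approached with recording stubs, sketched here in Python for brevity. The real tests would live in the project's own test suite; `run_stages` and `recording` are hypothetical helpers, not existing code.

```python
def run_stages(stages, subject):
    """Return the first non-None stage result, trying stages in order."""
    for stage in stages:
        result = stage(subject)
        if result is not None:
            return result
    return None


def recording(name, result, calls):
    """Build a stub stage that logs its invocation and returns a fixed result."""
    def stage(subject):
        calls.append(name)
        return result
    return stage


def test_fallback_skipped_when_identity_resolves():
    calls = []
    stages = [
        recording("binding", None, calls),
        recording("identity", "resolved_exact_identity", calls),
        recording("display_name", "resolved_unique_fallback", calls),
    ]
    assert run_stages(stages, {"name": "Policy A"}) == "resolved_exact_identity"
    assert calls == ["binding", "identity"]  # the fallback stage never ran


def test_fallback_runs_only_after_stronger_stages_fail():
    calls = []
    stages = [
        recording("binding", None, calls),
        recording("identity", None, calls),
        recording("display_name", "resolved_unique_fallback", calls),
    ]
    assert run_stages(stages, {"name": "Policy A"}) == "resolved_unique_fallback"
    assert calls == ["binding", "identity", "display_name"]
```

Asserting on the call log, not just the returned outcome, is what distinguishes "fallback was not needed" from "fallback happened to agree".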

Validation Commands

cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Baselines/BaselineCompareAmbiguousMatchGapTest.php tests/Feature/Baselines/BaselineCompareGapClassificationTest.php
cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Unit/Support/Baselines/SubjectResolverTest.php

Add new tests for the matching pipeline and canonicalizer registry.

Risks

  • Accidentally making display-name fallback look authoritative.
  • Hiding real duplicate tenant resources through over-aggressive canonicalization.
  • Hardcoding Microsoft-specific behavior into core.
  • Breaking existing compare strategy expectations.

Recommendation

Implement this second.

This candidate fixes the core matching failure mode while still avoiding UI and evidence/review changes.