ahmido 788efee1c2 feat(baselines): implement baseline matching canonicalization (#453)

Replaced legacy tenant and environment bindings in the BaselineDriftEngine with the new ProviderResourceIdentity framework as defined in Spec 382. This ensures cross-environment compatibility and deterministic baseline matching.

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #453

2026-06-15 22:48:48 +00:00

Implementation Plan: Spec 382 - Baseline Matching Pipeline and Canonicalization v1

Branch: 382-baseline-matching-canonicalization | Date: 2026-06-15 | Spec: spec.md
Input: Feature specification from /specs/382-baseline-matching-canonicalization/spec.md

Summary

Add a deterministic baseline subject matching layer that consumes Spec 381 provider resource identities and active managed-environment-scoped bindings before existing baseline compare item keying and payload drift comparison. The runtime slice removes legacy subject-key and display-name matching, preserves real ambiguity, classifies built-ins/virtual targets/foundations through provider-neutral seams, and avoids UI, evidence/review readiness, result taxonomy rewrite, production provider registries, and new persisted entities.
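
The matching priority in this summary can be sketched as follows. This is a minimal illustration, not the planned implementation: the function name `matchSubject`, the array shapes, and the outcome strings are hypothetical. Only the ordering comes from the plan: an active binding scoped to the managed environment first, then exact provider resource identity, with ambiguity preserved and no display-name fallback.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch of binding-first subject matching (array shapes are assumptions).
 *
 * @param array{canonical_subject_key: string, provider_key: string, resource_type: string, resource_id: ?string} $baseline
 * @param list<array{environment_id: int, canonical_subject_key: string, resource_id: string, active: bool}> $bindings
 * @param list<array{provider_key: string, resource_type: string, resource_id: string, display_name: string}> $candidates
 * @return array{outcome: string, resource_id: ?string, via: ?string}
 */
function matchSubject(array $baseline, int $environmentId, array $bindings, array $candidates): array
{
    // 1. Active bindings scoped to this managed environment take priority.
    $bound = array_values(array_filter(
        $bindings,
        static fn (array $b): bool => $b['active']
            && $b['environment_id'] === $environmentId
            && $b['canonical_subject_key'] === $baseline['canonical_subject_key'],
    ));
    if (count($bound) === 1) {
        return ['outcome' => 'resolved', 'resource_id' => $bound[0]['resource_id'], 'via' => 'binding'];
    }
    if (count($bound) > 1) {
        // Assumption: conflicting active bindings are reported, never silently resolved.
        return ['outcome' => 'ambiguous', 'resource_id' => null, 'via' => 'binding'];
    }

    // 2. No binding and no provider resource id: nothing deterministic to match on.
    if ($baseline['resource_id'] === null) {
        return ['outcome' => 'unresolved_identity', 'resource_id' => null, 'via' => null];
    }

    // 3. Exact provider resource identity; display_name is deliberately never compared.
    $exact = array_values(array_filter(
        $candidates,
        static fn (array $c): bool => $c['provider_key'] === $baseline['provider_key']
            && $c['resource_type'] === $baseline['resource_type']
            && $c['resource_id'] === $baseline['resource_id'],
    ));

    return match (count($exact)) {
        0 => ['outcome' => 'missing', 'resource_id' => null, 'via' => null],
        1 => ['outcome' => 'resolved', 'resource_id' => $exact[0]['resource_id'], 'via' => 'identity'],
        default => ['outcome' => 'ambiguous', 'resource_id' => null, 'via' => 'identity'],
    };
}
```

In this sketch two candidates that share a display name never compete with each other, because only the provider resource identity or a binding can select a candidate.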

Technical Context

Language/Version: PHP 8.4.15
Primary Dependencies: Laravel 12.52, Filament 5.2.1, Livewire 4.1.4, Pest 4.3.1, PostgreSQL 16 through Sail/Dokploy
Storage: Existing PostgreSQL tables only; no new persisted entity approved. Existing provider_resource_bindings is consumed and the old legacy_subject_key column is dropped.
Testing: Pest unit and feature tests; PostgreSQL lane only if implementation changes indexes, constraints, migrations, or PostgreSQL-specific queries.
Validation Lanes: fast-feedback, confidence; conditional pgsql.
Target Platform: Laravel monolith in apps/platform.
Project Type: Web admin application, backend runtime change only.
Performance Goals: Deterministic matching should remain in-process and use already persisted descriptors/bindings; no Graph, provider gateway, provider runtime client, or UI-render remote calls.
Constraints: Workspace/environment scoped reads, binding-first priority, no customer-facing UI or report presentation changes, no historical payload readers.
Scale/Scope: Existing baseline compare workload and current provider/resource identity foundation.

Existing Repository Surfaces Likely Affected

apps/platform/app/Jobs/CompareBaselineToTenantJob.php
apps/platform/app/Services/Baselines/BaselineCompareService.php only if start/context plumbing requires a narrow adjustment
apps/platform/app/Support/Baselines/BaselineSubjectKey.php
apps/platform/app/Support/Baselines/SubjectResolver.php
apps/platform/app/Support/Baselines/ResolutionOutcome.php
apps/platform/app/Support/Baselines/ResolutionOutcomeRecord.php
apps/platform/app/Support/Baselines/Compare/CompareState.php
apps/platform/app/Support/Baselines/Compare/CompareStrategyRegistry.php
apps/platform/app/Support/Baselines/Compare/IntuneCompareStrategy.php
apps/platform/app/Support/Inventory/InventoryPolicyTypeMeta.php
apps/platform/app/Support/Baselines/BaselineSupportCapabilityGuard.php
apps/platform/app/Support/Resources/ResourceIdentity.php
apps/platform/app/Support/Resources/ProviderResourceDescriptor.php
apps/platform/app/Models/ProviderResourceBinding.php
apps/platform/app/Services/Resources/ProviderResourceBindingService.php
apps/platform/database/factories/ProviderResourceBindingFactory.php
apps/platform/tests/Unit/Support/Baselines/*
apps/platform/tests/Unit/Support/Resources/*
apps/platform/tests/Feature/Baselines/*
apps/platform/tests/Feature/ProviderResources/*
apps/platform/tests/Feature/Evidence/BaselineDriftPostureSourceTest.php

UI / Surface Guardrail Plan

Guardrail scope: no operator-facing surface change.
Affected routes/pages/actions/states/navigation/panel/provider surfaces: N/A.
No-impact class, if applicable: backend runtime matching only.
Native vs custom classification summary: N/A.
Shared-family relevance: baseline identity and compare runtime, not UI.
State layers in scope: backend compare result proof only; no page/detail/query state.
Audience modes in scope: N/A.
Decision/diagnostic/raw hierarchy plan: N/A for UI; matching proof metadata must be sanitized and internal.
Raw/support gating plan: N/A.
One-primary-action / duplicate-truth control: N/A.
Handling modes by drift class or surface: result/gap semantics are deferred to Spec 383.
Repository-signal treatment: report-only if UI files are touched accidentally; implementation must stop and update spec if UI becomes necessary.
Special surface test profiles: N/A.
Required tests or manual browser validation: no browser validation. Targeted unit/feature tests only.
Exception path and spread control: none.
Active feature PR close-out entry: Baseline Matching Pipeline / Provider Identity Consumption.
UI/Productization coverage decision: No UI surface impact.
Coverage artifacts to update: none.
No-impact rationale: Existing surfaces continue rendering existing compare/operation channels. V1 changes backend matching, not reachable product surfaces.
Navigation / Filament provider-panel handling: unchanged.
Screenshot or page-report need: no.

Shared Pattern & System Fit

Cross-cutting feature marker: yes.
Systems touched: baseline subject identity, provider resource identity, provider resource bindings, compare job, compare strategy input, OperationRun proof context.
Shared abstractions reused: ResourceIdentity, ProviderResourceDescriptor, BaselineSubjectKey, ProviderResourceBinding, existing compare strategy registry, existing OperationRun lifecycle.
New abstraction introduced? why?: yes, a narrow SubjectMatchingPipeline, baseline subject descriptor, matching outcome, provider-owned canonicalization seam, and foundation coverage resolver. They replace scattered old identity assumptions before compare strategies run.
Why the existing abstraction was sufficient or insufficient: Existing compare strategies remain sufficient for payload comparison. Existing SubjectResolver is insufficient as the primary identity resolver because it does not consume active bindings before heuristics.
Bounded deviation / spread control: The matching pipeline is baseline-compare-owned, not a general provider workflow engine. Evidence/review/report consumption is follow-up scope. A production canonicalizer registry/interface is not planned for V1 and must be justified in spec/plan before introduction.

OperationRun UX Impact

Touches OperationRun start/completion/link UX?: no.
Central contract reused: existing baseline compare operation lifecycle.
Delegated UX behaviors: N/A.
Surface-owned behavior kept local: N/A.
Queued DB-notification policy: N/A.
Terminal notification path: existing lifecycle only.
Exception path: none.

Implementation may store sanitized matching proof in existing compare operation context or result payloads. It must not introduce new OperationRun types, new start UX, new notifications, or status/outcome transitions outside the existing service-owned path.
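
One way to keep matching proof sanitized before it reaches the operation context is an allowlist. The sketch below is illustrative only: the function name `sanitizeMatchingProof` and every key name in the allowlist are assumptions, since the plan requires sanitized, internal proof but does not define its fields.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical allowlist sanitizer for matching proof metadata.
 * Key names are illustrative; they are not taken from the spec.
 *
 * @param array<string, mixed> $proof
 * @return array<string, scalar|null>
 */
function sanitizeMatchingProof(array $proof): array
{
    $allowed = ['outcome', 'reason', 'matched_via', 'provider_key', 'resource_type', 'candidate_count'];

    $clean = [];
    foreach ($allowed as $key) {
        // Keep only allowlisted keys with flat values; nested payloads are dropped.
        if (array_key_exists($key, $proof) && (is_scalar($proof[$key]) || $proof[$key] === null)) {
            $clean[$key] = $proof[$key];
        }
    }

    return $clean;
}
```

An allowlist fails closed: a field added to the proof later stays out of stored context until someone deliberately permits it.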

Provider Boundary & Portability Fit

Shared provider/platform boundary touched?: yes.
Provider-owned seams: built-in/default/virtual target canonicalization through a direct provider-owned seam/service; provider resource type and discriminator interpretation; Microsoft-specific signals if a minimal Microsoft adapter is implemented.
Platform-core seams: matching priority, descriptor shape, binding lookup, canonical subject key validation, outcome proof structure.
Neutral platform terms / contracts preserved: provider, provider key, managed environment, governed subject, baseline subject descriptor, provider resource descriptor, matching outcome, foundation coverage.
Retained provider-specific semantics and why: provider resource type/id/discriminator remain necessary identity fields.
Bounded extraction or follow-up path: document-in-feature for contained provider-specific canonicalization; follow-up-spec for broad Microsoft built-in catalog or customer-facing result semantics.
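
The split between provider-owned and platform-core seams could look like the sketch below. It assumes a fake provider and a direct class rather than a registry or interface, in line with the V1 constraint. The class `FakeProviderCanonicalizer`, the function `classifyTarget`, and all discriminator and kind strings are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical provider-owned seam: only this class knows which discriminators
 * denote built-in or virtual targets for the (fake) provider.
 */
final class FakeProviderCanonicalizer
{
    /** @var array<string, array{kind: string, canonical_id: string}> */
    private const KNOWN_TARGETS = [
        'assignment_target|all_users' => ['kind' => 'virtual_target', 'canonical_id' => 'virtual:all-users'],
        'policy|default_baseline' => ['kind' => 'built_in', 'canonical_id' => 'builtin:default-baseline'],
    ];

    /** @return array{kind: string, canonical_id: string}|null null when the target is not built-in or virtual */
    public function canonicalize(string $resourceType, string $discriminator): ?array
    {
        return self::KNOWN_TARGETS[$resourceType . '|' . $discriminator] ?? null;
    }
}

/**
 * Platform-core side: provider-neutral. It sees only the returned kind,
 * never the provider's own labels or discriminator semantics.
 */
function classifyTarget(FakeProviderCanonicalizer $seam, string $resourceType, string $discriminator): string
{
    $canonical = $seam->canonicalize($resourceType, $discriminator);

    return $canonical === null ? 'regular' : $canonical['kind'];
}
```

The core function type-hints the concrete class on purpose in this sketch; introducing an interface would be the broader abstraction that the plan defers.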

Constitution Check

Inventory-first: compare consumes persisted inventory/snapshot/provider descriptors as last observed truth; Microsoft remains external truth.
Read/write separation: V1 does not add write actions. Existing compare operation remains queued/observable. Binding mutations remain Spec 381 service behavior.
Graph contract path: no new Graph calls; no Graph/provider runtime calls during UI render or matching.
Deterministic capabilities: no new capability family planned.
RBAC-UX: workspace/environment entitlement is enforced before binding/candidate reads; non-members receive 404 and members missing the capability receive 403 where relevant.
Workspace isolation: binding and descriptor reads are workspace scoped.
Tenant isolation: managed-environment scoped bindings must not affect other environments.
Run observability: existing baseline compare OperationRun remains canonical execution truth.
OperationRun start UX: unchanged.
Ops-UX lifecycle: no direct status/outcome transitions may be added.
Data minimization: matching proof metadata must be sanitized.
Test governance: unit and feature lanes are narrowest; no browser/heavy family.
Proportionality: new runtime abstractions are justified by activating Spec 381's implemented foundation and preventing false compare identity.
No premature abstraction: no generic provider workflow engine and no production canonicalizer registry/interface by default; fake provider proves the seam without a broad multi-provider framework.
Persisted truth: no new table/entity approved.
Behavioral state: any matching outcome/reason value must change matching behavior or proof, not just display.
UI semantics: no UI semantics.
Shared pattern first: existing compare and OperationRun paths are reused.
Provider boundary: platform-core matching stays provider-neutral; provider-specific canonicalization stays behind seam.
V1 explicitness / few layers: direct baseline compare matching layer only.
Spec discipline / bloat check: Specs 383-385 remain follow-up; this spec does not absorb UI/evidence/report scope.
Filament-native UI: no Filament changes.
UI/Productization coverage: checked no UI surface impact with rationale.

Test Governance Check

Test purpose / classification by changed surface: Unit for matching components; Feature for compare integration, binding lookup, isolation, and canonical-key rejection.
Affected validation lanes: fast-feedback, confidence; pgsql only if migrations/indexes/constraints change.
Why this lane mix is the narrowest sufficient proof: Matching is deterministic service behavior plus existing DB-backed compare workflow. No UI/browser proof is needed.
Narrowest proving command(s):
- cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Unit/Support/Baselines/Matching
- cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Baselines/BaselineCompareProviderResourceBindingCanonicalIdentityTest.php tests/Feature/Baselines/BaselineCompareAmbiguousMatchGapTest.php tests/Feature/Baselines/BaselineCompareGapClassificationTest.php
Fixture / helper / factory / seed / context cost risks: fake-provider fixtures must stay local; no global provider/workspace defaults.
Expensive defaults or shared helper growth introduced?: no.
Heavy-family additions, promotions, or visibility changes: none.
Surface-class relief / special coverage rule: N/A.
Closing validation and reviewer handoff: reviewers verify no UI impact, no new persistence, no core Microsoft literals, and no evidence/review behavior change.
Budget / baseline / trend follow-up: none expected.
Review-stop questions: lane fit, hidden provider fixture cost, binding query isolation, and bloat scope.
Escalation path: document-in-feature for contained provider-specific canonicalization; follow-up-spec for broader semantics/UI/evidence.
Active feature PR close-out entry: Baseline Matching Pipeline / Provider Identity Consumption.
Why no dedicated follow-up spec is needed: Matching activation is the smallest direct follow-up to Spec 381. Result semantics, UI, and evidence readiness already have follow-up spec candidates.

Project Structure

Documentation (this feature)

specs/382-baseline-matching-canonicalization/
├── checklists/
│   └── requirements.md
├── plan.md
├── spec.md
└── tasks.md

Source Code (repository root)

apps/platform/app/
├── Jobs/
│   └── CompareBaselineToTenantJob.php
├── Services/
│   ├── Baselines/
│   │   └── Matching/                  # new focused matching services if implementation keeps this namespace
│   └── Resources/
│       └── ProviderResourceBindingService.php
└── Support/
    ├── Baselines/
    │   ├── BaselineSubjectKey.php
    │   ├── SubjectResolver.php
    │   └── Matching/                  # new descriptor/outcome/value support if implementation keeps this namespace
    ├── Inventory/
    │   └── InventoryPolicyTypeMeta.php
    └── Resources/
        ├── ProviderResourceDescriptor.php
        └── ResourceIdentity.php

apps/platform/tests/
├── Unit/Support/Baselines/Matching/
├── Unit/Support/Resources/
├── Feature/Baselines/
├── Feature/ProviderResources/
└── Feature/Evidence/

Structure Decision: Use the existing Laravel monolith under apps/platform. Keep matching code in baseline-owned namespaces. Do not create a new package, module root, provider framework, or persistence layer.

Complexity Tracking

Violation	Why Needed	Simpler Alternative Rejected Because
New matching pipeline/resolver family	Existing compare paths still encounter old subject-key/display-name identity data and do not consume active bindings first	Patching each compare strategy would duplicate identity priority rules and keep provider logic scattered
Canonicalization seam	Built-ins/defaults/virtual targets need provider-owned knowledge without hardcoding provider labels in core	Hardcoding Microsoft labels in core violates provider boundary rules; a registry/interface is broader than V1 unless a current-release need is documented
Matching outcome object/reason family	Compare needs to distinguish resolved, ambiguous, missing, unsupported, limited, excluded, and unresolved-identity outcomes before drift comparison	Reusing overloaded states preserves the risk of false blockers and false greens
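
The outcome family in the last row could be modeled as a backed enum. This is a sketch under assumptions: the enum name `MatchingOutcome`, the string values, and the rule that only a resolved subject proceeds to payload comparison are illustrative and not confirmed by the plan.

```php
<?php
declare(strict_types=1);

/** Hypothetical outcome family; case names follow the list in the table above. */
enum MatchingOutcome: string
{
    case Resolved = 'resolved';
    case Ambiguous = 'ambiguous';
    case Missing = 'missing';
    case Unsupported = 'unsupported';
    case Limited = 'limited';
    case Excluded = 'excluded';
    case UnresolvedIdentity = 'unresolved_identity';

    /** Assumption: only a resolved subject has a single counterpart to diff against. */
    public function proceedsToPayloadComparison(): bool
    {
        return $this === self::Resolved;
    }
}

/**
 * Split subjects into those handed to the compare strategy and those kept as gaps.
 *
 * @param array<string, MatchingOutcome> $outcomes keyed by subject key
 * @return array{compare: list<string>, gaps: list<string>}
 */
function partitionOutcomes(array $outcomes): array
{
    $result = ['compare' => [], 'gaps' => []];
    foreach ($outcomes as $subjectKey => $outcome) {
        $result[$outcome->proceedsToPayloadComparison() ? 'compare' : 'gaps'][] = $subjectKey;
    }

    return $result;
}
```

Because each case changes which branch a subject takes, the enum stays behavioral rather than a display-only label.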

Proportionality Review

Current operator problem: Baseline compare can still produce false ambiguity, false blockers, or false confidence when display names or legacy subject keys are treated as identity.
Existing structure is insufficient because: Spec 381 persistence is passive until matching consumes it; existing compare strategies are drift comparers, not binding-first identity resolvers.
Narrowest correct implementation: One pre-compare matching layer, no new persistence, no UI, no evidence/review readiness, no result taxonomy rewrite beyond internal matching proof.
Ownership cost created: Focused matching services/value objects and tests; reviewer vigilance against turning this into a generic provider engine.
Alternative intentionally rejected: Leave bindings passive until a resolution UI exists. That would keep the core compare workflow unsafe and make UI decisions depend on stale identity behavior.
Release truth: Current-release runtime truth required immediately after Spec 381.

Domain And Data Model Implications

Existing provider_resource_bindings remains the durable binding source.
Matching outcomes are derived runtime/result truth, not new persisted domain records.
canonical_subject_key validation must prevent arbitrary legacy/display-name keys from masquerading as provider-resource canonical keys.
legacy_subject_key is removed from active code paths and dropped by the Spec 382 migration.
Baseline descriptor source precedence must be repo-real: use existing ProviderResourceDescriptor or binding identity fields when present; otherwise derive current-side descriptors from scoped InventoryItem fields (external_id, policy_type, display_name, meta_jsonb) and baseline-side descriptors from BaselineSnapshotItem fields (subject_key, subject_external_id, policy_type, meta_jsonb). Provider key/resource identity must come from existing provider connection, binding, descriptor, or explicit test fixture context, not platform-core display-label assumptions.
If implementation needs a new table, enum family beyond matching behavior, or durable artifact, stop and update spec/plan before code changes continue.
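
The canonical_subject_key rule above can be enforced by deriving the key from identity fields and rejecting anything that does not match the derivation. The sketch below assumes a made-up key format: the `provider-resource:` prefix, the SHA-256 derivation, and both function names are hypothetical, because this plan does not specify the real format.

```php
<?php
declare(strict_types=1);

/** Hypothetical derivation; the real canonical key format is not specified in this plan. */
function deriveCanonicalSubjectKey(string $providerKey, string $resourceType, string $resourceId): string
{
    return 'provider-resource:' . hash('sha256', implode("\n", [$providerKey, $resourceType, $resourceId]));
}

/**
 * A candidate key is canonical only if it equals the key derived from the
 * provider resource identity. Display names and legacy keys can never pass.
 */
function isCanonicalSubjectKey(
    string $candidateKey,
    string $providerKey,
    string $resourceType,
    string $resourceId,
): bool {
    return hash_equals(deriveCanonicalSubjectKey($providerKey, $resourceType, $resourceId), $candidateKey);
}
```

Deriving and comparing avoids pattern checks that a well-shaped but arbitrary override could satisfy.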

Implementation Phases

1. Confirm repo state and completed dependency guardrails.
2. Add tests for binding-first matching, duplicate-name ambiguity, canonical built-ins/virtual targets, foundation coverage, canonical-key rejection, and compare strategy preservation.
3. Implement or extend canonical key validation so arbitrary overrides are rejected.
4. Add baseline subject descriptor and matching outcome support.
5. Add the narrow matching pipeline with active binding lookup, canonicalization seam, exact identity, unresolved-identity, and missing/unsupported/limitation outcomes.
6. Integrate the pipeline into CompareBaselineToTenantJob before old policy_type|subject_key keying can collapse candidates, and before compare strategy invocation.
7. Replace the Spec 381 no-op binding-consumption test with binding-consumption and identity-required coverage.
8. Run targeted tests, Pint, and diff check.
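
The ordering constraint in the integration phase exists because a map keyed by policy_type|subject_key silently keeps one item when two candidates share a key. The sketch below contrasts that with grouping. The function names are hypothetical, and it assumes the legacy subject_key is derived from a display name, so duplicates are possible.

```php
<?php
declare(strict_types=1);

/**
 * Legacy-style keying: a later item overwrites an earlier one with the same key.
 *
 * @param list<array{policy_type: string, subject_key: string, external_id: string}> $items
 * @return array<string, string> external id per "policy_type|subject_key"
 */
function keyByLegacyKey(array $items): array
{
    $map = [];
    foreach ($items as $item) {
        $map[$item['policy_type'] . '|' . $item['subject_key']] = $item['external_id'];
    }

    return $map;
}

/**
 * Grouping keeps every candidate, so a matching step that runs first can
 * still see the duplicates and report them as ambiguous.
 *
 * @param list<array{policy_type: string, subject_key: string, external_id: string}> $items
 * @return array<string, list<string>> external ids per "policy_type|subject_key"
 */
function groupByLegacyKey(array $items): array
{
    $groups = [];
    foreach ($items as $item) {
        $groups[$item['policy_type'] . '|' . $item['subject_key']][] = $item['external_id'];
    }

    return $groups;
}
```

With two items that share a policy type and subject key, the first function returns a single entry and the second returns both external ids under one key, which is the information the matching layer needs before any collapse happens.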

Filament v5 Output Contract For Later Implementation Report

Livewire v4.0+ compliance: unchanged; no Livewire code is planned.
Provider registration location: unchanged; Laravel 12 panel providers remain in apps/platform/bootstrap/providers.php.
Global search: no resource is added or changed; no global search behavior is planned.
Destructive/high-impact actions: no Filament action is added. Existing compare start behavior remains governed by existing authorization/OperationRun rules.
Asset strategy: no Filament assets are registered; no Spec 382-specific filament:assets deployment concern beyond normal release process.
Testing plan: unit/feature tests cover matching components and compare integration; no Livewire/browser tests unless implementation unexpectedly touches UI, in which case spec/plan must be updated first.

Rollout And Deployment Considerations

No environment variables, queue names, scheduler entries, storage volumes, reverse proxy changes, or asset build changes are expected.
Spec 382 includes a migration to drop provider_resource_bindings.legacy_subject_key. No new persisted entity, index family, or storage surface is introduced.
Staging validation should run the targeted compare/matching test commands and normal formatting checks before production promotion.
Because TenantPilot is pre-production, no legacy identity mapper or historical OperationRun payload reader is required.
