Implemented deterministic Baseline Result Semantics (Spec 383), introducing CompareSubjectResult and CompareEvidenceResult. Replaced generic arrays with strict Data Transfer Objects for Baseline engine output. Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de> Reviewed-on: #454

# Implementation Plan: Spec 383 - Baseline Compare Result Semantics and Gap Classification v1

**Branch**: `383-baseline-result-semantics` | **Date**: 2026-06-16 | **Spec**: [spec.md](./spec.md)
**Input**: Feature specification from `/specs/383-baseline-result-semantics/spec.md`

## Summary

Replace overloaded baseline compare result/gap semantics with a provider-neutral outcome model over existing Spec 382 matching and compare strategy output. The plan adds a narrow classifier/mapper, rewrites legacy authoritative reason strings, stores structured subject outcome proof in existing OperationRun/compare payloads, updates existing status/detail grouping, and keeps resolution UI, Evidence/Review final readiness, customer-facing Review Pack wording, report/PDF runtime work, and compatibility readers out of scope.

## Technical Context

**Language/Version**: PHP 8.4.15
**Primary Dependencies**: Laravel 12.52, Filament 5.2.1, Livewire 4.1.4, Pest 4.3.1, PostgreSQL 16 through Sail/Dokploy
**Storage**: Existing OperationRun context/result payloads and existing compare structures only. No new persisted entity/table/artifact is approved.
**Testing**: Pest unit and feature tests; Filament/Livewire feature tests only for existing status rendering touched by the new grouping. Browser lane only if implementation changes layout/navigation/action behavior.
**Validation Lanes**: fast-feedback, confidence; conditional pgsql/browser if implementation triggers those scopes.
**Target Platform**: Laravel monolith in `apps/platform`.
**Project Type**: Web admin application, runtime/result-semantics change with limited existing-surface status presentation impact.
**Performance Goals**: Deterministic in-process classification over existing matching/compare results; no new remote work and no UI-render Graph/provider calls.
**Constraints**: Provider-neutral top-level semantics, no legacy result compatibility, no new UI workflow, no final evidence/review readiness mapping, no OperationRun lifecycle transition outside `OperationRunService`.
**Scale/Scope**: Existing baseline compare workflow and existing OperationRun/evidence-gap consumers.

## Existing Repository Surfaces Likely Affected

- `apps/platform/app/Support/Baselines/Matching/MatchingOutcome.php`
- `apps/platform/app/Services/Baselines/Matching/SubjectMatchingPipeline.php`
- `apps/platform/app/Services/Baselines/Matching/FoundationCoverageResolver.php`
- `apps/platform/app/Support/Baselines/BaselineCompareReasonCode.php`
- `apps/platform/app/Support/Baselines/BaselineCompareEvidenceGapDetails.php`
- `apps/platform/app/Support/Baselines/SubjectResolver.php`
- `apps/platform/app/Support/Baselines/ResolutionOutcome.php`
- `apps/platform/app/Support/Baselines/ResolutionOutcomeRecord.php`
- `apps/platform/app/Support/Baselines/Compare/CompareState.php`
- `apps/platform/app/Support/Baselines/Compare/CompareSubjectResult.php`
- `apps/platform/app/Support/Baselines/Compare/IntuneCompareStrategy.php`
- `apps/platform/app/Jobs/CompareBaselineToTenantJob.php`
- `apps/platform/app/Support/OpsUx/OperationSummaryKeys.php` only if new summary count keys are required
- `apps/platform/app/Services/Evidence/Sources/BaselineDriftPostureSource.php` only for regression-safe consumption of new run summary truth
- Existing baseline compare and OperationRun detail presentation tests under `apps/platform/tests/Feature/Filament/`
- Existing baseline compare, evidence, and review-pack regression tests under `apps/platform/tests/Feature/`

Likely new focused support namespace if implementation keeps the plan shape:

```text
apps/platform/app/Support/Baselines/CompareSemantics/
├── BaselineCompareOutcomeClassifier.php
├── BaselineCompareRunSummaryClassifier.php
├── CompareResultActionability.php
├── CompareResultCategory.php
├── CompareResultCoverageStatus.php
├── CompareResultIdentityStatus.php
├── CompareResultReadinessImpact.php
├── CompareResultReason.php
├── CompareResultTrustLevel.php
└── CompareSubjectOutcome.php
```
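
As a rough illustration of how these value families could fit together, the sketch below models two of the listed enums and the `CompareSubjectOutcome` value object. The case names, the `subjectKey` field, the boolean `actionable` flag (standing in for `CompareResultActionability`), and the `toPayload()` method are all assumptions made for illustration; the spec owns the final vocabulary.

```php
<?php

declare(strict_types=1);

// Hypothetical case names, taken from the outcome groups this plan mentions.
enum CompareResultCategory: string
{
    case NoDrift = 'no_drift';
    case Drift = 'drift';
    case MissingEvidence = 'missing_evidence';
    case MissingProviderResource = 'missing_provider_resource';
    case Unsupported = 'unsupported';
    case Limited = 'limited';
    case Blocked = 'blocked';
    case Excluded = 'excluded';
    case Failed = 'failed';
}

// Hypothetical trust levels; the plan names the dimension but not its values.
enum CompareResultTrustLevel: string
{
    case Trusted = 'trusted';
    case Limited = 'limited';
    case Untrusted = 'untrusted';
}

// Immutable per-subject outcome; field names are illustrative assumptions.
final readonly class CompareSubjectOutcome
{
    public function __construct(
        public string $subjectKey,
        public CompareResultCategory $category,
        public CompareResultTrustLevel $trustLevel,
        public bool $actionable,
    ) {}

    /** @return array{subject_key: string, category: string, trust_level: string, actionable: bool} */
    public function toPayload(): array
    {
        // Only scalar, provider-neutral values leave the object.
        return [
            'subject_key' => $this->subjectKey,
            'category' => $this->category->value,
            'trust_level' => $this->trustLevel->value,
            'actionable' => $this->actionable,
        ];
    }
}

$outcome = new CompareSubjectOutcome(
    'subject:example',
    CompareResultCategory::MissingEvidence,
    CompareResultTrustLevel::Untrusted,
    true,
);

echo json_encode($outcome->toPayload(), JSON_PRETTY_PRINT), PHP_EOL;
```

Backed enums keep the serialized payload stable while letting the classifier work with typed values instead of free-form strings.
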

If implementation can satisfy the spec by extending existing classes with less structure, prefer the narrower shape and update this plan before adding broader abstractions.

## UI / Surface Guardrail Plan

- **Guardrail scope**: existing status/evidence/detail presentation changes only.
- **Affected routes/pages/actions/states/navigation/panel/provider surfaces**: existing baseline compare and OperationRun detail contexts that render evidence gaps/status groups. No new route, navigation entry, action, modal, drawer, wizard, form, or panel provider.
- **No-impact class, if applicable**: N/A.
- **Native vs custom classification summary**: existing native/shared Filament/Livewire surfaces; no local design system.
- **Shared-family relevance**: status messaging, evidence-gap detail, badge/status labels.
- **State layers in scope**: backend payload state and existing detail/list grouping.
- **Audience modes in scope**: operator-MSP and support-platform. Customer/read-only output is out of scope until Spec 385.
- **Decision/diagnostic/raw hierarchy plan**: default-visible group/category/actionability/readiness first; matching proof/provider identifiers remain diagnostics/support detail.
- **Raw/support gating plan**: no new raw payload exposure. Keep existing diagnostics/support gating.
- **One-primary-action / duplicate-truth control**: no new actions. Use one canonical reason/category/actionability set for all rendered labels.
- **Handling modes by drift class or surface**: limitations, unsupported, missing evidence, missing provider, blockers, drift, no drift, excluded, and failed map to distinct groups.
- **Repository-signal treatment**: if implementation changes only labels/groups on existing surfaces, document in feature close-out. If route/layout/action hierarchy changes, update UI coverage artifacts before merge.
- **Special surface test profiles**: standard-native-filament relief for label/group changes; browser smoke only if layout/navigation/action behavior changes.
- **Required tests or manual smoke**: feature tests for existing Filament/Livewire status rendering when touched; no browser smoke by default.
- **Exception path and spread control**: none planned.
- **Active feature PR close-out entry**: Baseline Compare Result Semantics / Gap Classification.
- **UI/Productization coverage decision**: existing surface, no new route/page/archetype.
- **Coverage artifacts to update**: none during preparation. Implementation must update `docs/ui-ux-enterprise-audit/` only if actual rendered structure or route/archetype changes.
- **No-impact rationale**: N/A, because existing status presentation may change.
- **Navigation / Filament provider-panel handling**: unchanged; Laravel 12 panel providers remain in `apps/platform/bootstrap/providers.php`.
- **Screenshot or page-report need**: no unless implementation changes layout/navigation or customer-facing output.
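
The "distinct groups" and "one canonical set" guardrails above can be sketched as a single exhaustive mapping from category to rendered group label. The enum, function names, and label wording here are hypothetical; existing badge/status helpers in the codebase would own the real labels.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for the planned CompareResultCategory enum.
enum CompareResultCategory: string
{
    case NoDrift = 'no_drift';
    case Drift = 'drift';
    case MissingEvidence = 'missing_evidence';
    case MissingProviderResource = 'missing_provider_resource';
    case Unsupported = 'unsupported';
    case Limited = 'limited';
    case Blocked = 'blocked';
    case Excluded = 'excluded';
    case Failed = 'failed';
}

// One canonical label per category; `match` throws if a case is left unmapped.
function statusGroupLabel(CompareResultCategory $category): string
{
    return match ($category) {
        CompareResultCategory::NoDrift => 'No drift',
        CompareResultCategory::Drift => 'Drift',
        CompareResultCategory::MissingEvidence => 'Missing evidence',
        CompareResultCategory::MissingProviderResource => 'Missing provider resource',
        CompareResultCategory::Unsupported => 'Unsupported',
        CompareResultCategory::Limited => 'Limitation',
        CompareResultCategory::Blocked => 'Blocked',
        CompareResultCategory::Excluded => 'Excluded',
        CompareResultCategory::Failed => 'Failed',
    };
}

/**
 * Counts subjects per rendered group, in first-seen order.
 *
 * @param list<CompareResultCategory> $categories
 * @return array<string, int>
 */
function statusGroupCounts(array $categories): array
{
    $counts = [];

    foreach ($categories as $category) {
        $label = statusGroupLabel($category);
        $counts[$label] = ($counts[$label] ?? 0) + 1;
    }

    return $counts;
}

print_r(statusGroupCounts([
    CompareResultCategory::Drift,
    CompareResultCategory::Drift,
    CompareResultCategory::MissingEvidence,
]));
```

Because every surface would call the same mapping, a label cannot drift between the list and detail views.
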

## Shared Pattern & System Fit

- **Cross-cutting feature marker**: yes.
- **Systems touched**: baseline matching, compare strategies, OperationRun proof context, evidence-gap rendering, support diagnostics where they consume baseline compare context.
- **Shared abstractions reused**: Spec 382 `MatchingOutcome` and `SubjectMatchingPipeline`, existing compare strategy result objects, `OperationRunService`, `OperationSummaryKeys`, existing Filament/Livewire surfaces and badge/status helpers.
- **New abstraction introduced? why?**: yes, a narrow result semantics classifier/mapper is expected. It replaces overloaded result truth and gives future Specs 384/385 a stable input.
- **Why the existing abstraction was sufficient or insufficient**: Existing matching and compare abstractions identify subjects and payload differences, but they still express final result truth through old policy-shaped strings. Existing UI helpers can render mapped truth once the domain semantics are explicit.
- **Bounded deviation / spread control**: The classifier is baseline-compare-owned. It must not become a workflow engine, broad evidence readiness engine, customer report wording engine, or generic provider framework.

## OperationRun UX Impact

- **Touches OperationRun start/completion/link UX?**: no.
- **Central contract reused**: existing baseline compare operation lifecycle and Monitoring detail route/link behavior.
- **Delegated UX behaviors**: N/A.
- **Surface-owned behavior kept local**: N/A.
- **Queued DB-notification policy**: N/A.
- **Terminal notification path**: existing lifecycle only.
- **Exception path**: none.

Implementation changes baseline compare OperationRun context/proof and summary semantics. It must keep `OperationRun.status` and `OperationRun.outcome` transitions inside `OperationRunService`, and any new summary count keys must be added to `OperationSummaryKeys::all()` with tests.
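
A guard for the summary-key rule could look roughly like the sketch below. `OperationSummaryKeysStandIn` and its key list are hypothetical placeholders, since this plan does not show the contents of the real `OperationSummaryKeys` class; `unregisteredSummaryKeys()` is likewise an illustrative helper name.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for OperationSummaryKeys; the real key list is not shown in this plan.
final class OperationSummaryKeysStandIn
{
    /** @return list<string> */
    public static function all(): array
    {
        return ['total', 'processed', 'failed'];
    }
}

/**
 * Returns the summary count keys that are not registered in all().
 *
 * @param array<string, int> $summaryCounts
 * @return list<string>
 */
function unregisteredSummaryKeys(array $summaryCounts): array
{
    return array_values(
        array_diff(array_keys($summaryCounts), OperationSummaryKeysStandIn::all())
    );
}

// 'gaps_blocked' is an invented key used only to show the failure case.
$unknown = unregisteredSummaryKeys(['total' => 12, 'gaps_blocked' => 3]);

echo $unknown === []
    ? 'all summary keys registered'
    : 'unregistered keys: ' . implode(', ', $unknown), PHP_EOL;
```

A test built on this idea would fail as soon as the compare job emits a count key that was never added to the registry.
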

## Provider Boundary & Portability Fit

- **Shared provider/platform boundary touched?**: yes.
- **Provider-owned seams**: provider metadata/proof fields that feed Spec 382 matching and compare strategies.
- **Platform-core seams**: result dimensions, result reasons, categories, actionability, readiness impact, trust level, OperationRun proof payload contract, and operator-facing result vocabulary.
- **Neutral platform terms / contracts preserved**: provider resource, governed subject, identity, binding, canonicalization, comparison, coverage, limitation, drift, evidence, actionability, readiness impact, trust level.
- **Retained provider-specific semantics and why**: provider key/type/id/discriminator remain proof metadata. They are not top-level result categories.
- **Bounded extraction or follow-up path**: document-in-feature for any contained provider-specific proof metadata; follow-up-spec for resolution UI or evidence/review readiness integration.

## Constitution Check

- Inventory-first: result semantics consume last-observed inventory, snapshots, policy versions, Spec 382 descriptors, and existing compare strategy output. Microsoft remains external truth.
- Read/write separation: V1 adds no write action. Existing compare operation remains queued/observable.
- Graph contract path: no new Graph calls. No Graph/provider runtime call during UI render or classification.
- Deterministic capabilities: no new capability family planned.
- RBAC-UX: existing workspace/managed-environment access checks remain required before baseline compare results are visible.
- Workspace isolation: OperationRun and baseline/evidence reads remain workspace scoped.
- Tenant isolation: managed-environment scoped compare result proof must not leak across environments.
- Run observability: existing baseline compare `OperationRun` remains canonical execution truth.
- OperationRun start UX: unchanged.
- Ops-UX lifecycle: no direct status/outcome transitions may be added.
- Ops-UX summary counts: new keys require `OperationSummaryKeys::all()` update and tests; otherwise reuse existing keys.
- Data minimization: structured proof must be sanitized and exclude secrets/raw provider payloads.
- Test governance: unit and feature lanes are narrowest; browser/pgsql conditional only.
- Proportionality: new semantic family is justified because old reason strings are product truth and block future resolution/readiness work.
- No premature abstraction: only baseline compare semantics, not a generic workflow/evidence/report framework.
- Persisted truth: no new table/entity approved; structured payloads use existing OperationRun context/result paths.
- Behavioral state: every new value must change actionability, readiness, aggregation, trust, or operator interpretation.
- UI semantics: direct domain-to-existing-surface mapping; no new UI taxonomy framework.
- Shared pattern first: existing OperationRun and Filament/Livewire rendering paths are reused.
- Provider boundary: top-level compare semantics are provider-neutral.
- V1 explicitness / few layers: replace old strings rather than stack compatibility aliases.
- Spec discipline / bloat check: result semantics grouped in one coherent spec; resolution UI and evidence/review readiness remain follow-ups.
- Filament-native UI: no new Filament surface/action/layout. Existing native/shared surfaces only.
- UI/Productization coverage: existing status presentation changes are documented; no new route/page/archetype.
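
The data-minimization item above could be approached with an allow-list, as in this minimal sketch. The field names are assumptions derived from the proof metadata this plan mentions (provider key/type/id/discriminator), and `sanitizeProof()` is a hypothetical helper, not an existing function.

```php
<?php

declare(strict_types=1);

// Assumed allow-list, derived from the proof metadata named in this plan.
const ALLOWED_PROOF_FIELDS = ['provider_key', 'provider_type', 'provider_id', 'discriminator'];

/**
 * Keeps only allow-listed scalar proof fields.
 *
 * @param array<string, mixed> $rawProof
 * @return array<string, scalar>
 */
function sanitizeProof(array $rawProof): array
{
    // Drop every field that is not explicitly allowed (secrets, tokens, raw payloads).
    $allowed = array_intersect_key($rawProof, array_flip(ALLOWED_PROOF_FIELDS));

    // Keep scalars only, so a nested raw provider payload cannot pass under an allowed key.
    return array_filter($allowed, static fn (mixed $value): bool => is_scalar($value));
}

$proof = sanitizeProof([
    'provider_key' => 'example-provider',
    'provider_type' => 'example-type',
    'client_secret' => 'should-never-be-stored',
    'discriminator' => ['raw' => 'nested payload'],
]);

print_r($proof);
```

An allow-list fails closed: a new provider field stays out of the stored proof until someone adds it deliberately.
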

## Test Governance Check

- **Test purpose / classification by changed surface**: Unit for semantic values/classifiers; Feature for compare integration, OperationRun payloads, existing status presentation, evidence/review regressions.
- **Affected validation lanes**: fast-feedback, confidence; pgsql/browser conditional.
- **Why this lane mix is the narrowest sufficient proof**: The behavior is deterministic classification and existing DB-backed compare result context. UI browser proof is not needed unless layout/navigation changes.
- **Narrowest proving command(s)**:
  - `cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Unit/Support/Baselines/CompareSemantics`
  - `cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Baselines/BaselineCompareGapClassificationTest.php tests/Feature/Baselines/BaselineCompareAmbiguousMatchGapTest.php tests/Feature/Baselines/BaselineCompareProviderResourceBindingCanonicalIdentityTest.php tests/Feature/Baselines/BaselineCompareExecutionGuardTest.php tests/Feature/Baselines/BaselineCompareResumeTokenTest.php`
  - `cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Filament/BaselineCompareExplanationSurfaceTest.php tests/Feature/Filament/BaselineCompareLandingWhyNoFindingsTest.php tests/Feature/Filament/BaselineCompareSummaryConsistencyTest.php tests/Feature/Filament/BaselineCompareEvidenceGapTableTest.php`
  - `cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Evidence/BaselineDriftPostureSourceTest.php tests/Feature/ReviewPack/Spec347ReviewPackReadinessSemanticsTest.php tests/Feature/ReviewPack/Spec349ReviewPackResolutionGuidanceTest.php`
- **Fixture / helper / factory / seed / context cost risks**: reuse existing baseline compare and Spec 382 fixtures. No global provider/workspace defaults.
- **Expensive defaults or shared helper growth introduced?**: no.
- **Heavy-family additions, promotions, or visibility changes**: none planned.
- **Surface-class relief / special coverage rule**: standard-native-filament relief unless UI structure changes.
- **Closing validation and reviewer handoff**: reviewers verify no legacy reason compatibility, no provider-specific top-level semantics, no false no-drift, no Spec 384/385 scope, and no hidden browser/pgsql lane change.
- **Budget / baseline / trend follow-up**: none expected.
- **Review-stop questions**: taxonomy bloat, old string leftovers, summary count key ownership, fixture cost, and scope bleed into evidence/review/customer output.
- **Escalation path**: document-in-feature for contained existing-surface label changes; follow-up-spec for structural UI or evidence/readiness integration.
- **Active feature PR close-out entry**: Baseline Compare Result Semantics / Gap Classification.
- **Why no dedicated follow-up spec is needed**: This spec is the dedicated follow-up to Spec 382. Specs 384 and 385 remain separate for UI decisions and readiness integration.

## Project Structure

### Documentation (this feature)

```text
specs/383-baseline-result-semantics/
├── checklists/
│   └── requirements.md
├── plan.md
├── spec.md
└── tasks.md
```

### Source Code (repository root)

```text
apps/platform/app/
├── Jobs/
│   └── CompareBaselineToTenantJob.php
├── Services/
│   ├── Baselines/
│   │   └── Matching/
│   └── Evidence/
│       └── Sources/
└── Support/
    ├── Baselines/
    │   ├── Compare/
    │   ├── CompareSemantics/       # expected new narrow support namespace
    │   ├── Matching/
    │   ├── BaselineCompareEvidenceGapDetails.php
    │   ├── BaselineCompareReasonCode.php
    │   ├── ResolutionOutcome.php
    │   └── SubjectResolver.php
    └── OpsUx/
        └── OperationSummaryKeys.php

apps/platform/tests/
├── Unit/Support/Baselines/CompareSemantics/
├── Unit/Support/Baselines/Matching/
├── Feature/Baselines/
├── Feature/Filament/
├── Feature/Evidence/
└── Feature/ReviewPack/
```

**Structure Decision**: Use the existing Laravel monolith under `apps/platform`. Keep semantics code baseline-compare-owned. Do not create a new package, module root, route family, UI framework, or persistence layer.

## Complexity Tracking

| Violation | Why Needed | Simpler Alternative Rejected Because |
|---|---|---|
| New result reason/category/actionability/readiness family | Current reason strings mix identity, evidence, provider absence, limitations, unsupported scope, and failures | Renaming labels would preserve ambiguous product truth and leave future Specs 384/385 unsafe |
| New classifier/mapper | Spec 382 matching and existing compare strategies need one canonical mapping into final result semantics | Scattering mappings in `CompareBaselineToTenantJob`, `SubjectResolver`, and UI helpers would create duplicate truth |
| Structured OperationRun proof payload | Monitoring/support/evidence consumers need machine-readable result truth | Keeping flat `by_reason` strings forces every consumer to decode overloaded legacy labels |

## Proportionality Review

- **Current operator problem**: Operators cannot tell which compare outcomes are trusted, blocked, missing evidence, missing provider resource, unsupported, limited, excluded, or failed.
- **Existing structure is insufficient because**: Current runtime still uses old strings in `MatchingOutcome`, `SubjectResolver`, compare strategy diagnostics, OperationRun context, and tests.
- **Narrowest correct implementation**: One baseline compare semantics layer plus mapped structured payloads over existing matching/compare outputs.
- **Ownership cost created**: New value families and mapping tests; reviewer vigilance against compatibility aliases and UI/evidence/report scope creep.
- **Alternative intentionally rejected**: Keep old strings and add display labels. That would not remove false green/false red risk and would leave downstream readiness work ambiguous.
- **Release truth**: Current-release truth required after Spec 382.

## Domain And Data Model Implications

- `MatchingOutcome` remains upstream matching truth, but its reason codes must map to final compare semantics.
- `CompareSubjectResult` remains compare strategy output, but strategy gap reasons must map to final compare semantics.
- `BaselineCompareReasonCode` may be replaced, narrowed, or kept only as run-level summary codes if it no longer carries overloaded subject-level truth.
- `ResolutionOutcome` and `SubjectResolver` must not remain authoritative for new compare result semantics if their old values are policy-shaped.
- OperationRun baseline compare context may preserve the current rendering envelope only where existing surfaces still read it, but this is not legacy semantic compatibility: authoritative result truth must be structured under a new semantic payload path, and old reason aliases/readers remain prohibited.
- Existing local/dev rows need no compatibility reader. If implementation needs to purge/reset old local/dev payloads, document the operational step in close-out.
- No new table, migration, index, queue, scheduler, env var, or storage path is expected. If implementation needs any of these, update spec and plan before continuing.
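
To make the "reason codes must map to final compare semantics" requirement concrete, here is a minimal sketch of a fail-closed mapper. The upstream code strings and the `CompareResultReason` cases are invented for illustration; the real Spec 382 reason codes are not listed in this plan.

```php
<?php

declare(strict_types=1);

// Hypothetical final reasons; the spec defines the real vocabulary.
enum CompareResultReason: string
{
    case IdentityAmbiguous = 'identity_ambiguous';
    case ProviderResourceMissing = 'provider_resource_missing';
    case LocalEvidenceMissing = 'local_evidence_missing';
    case CoverageLimited = 'coverage_limited';
}

// Maps an upstream matching reason code (invented strings here) to a final reason.
function classifyMatchingReason(string $upstreamCode): CompareResultReason
{
    return match ($upstreamCode) {
        'ambiguous_match' => CompareResultReason::IdentityAmbiguous,
        'provider_resource_not_found' => CompareResultReason::ProviderResourceMissing,
        'evidence_not_captured' => CompareResultReason::LocalEvidenceMissing,
        'foundation_limited' => CompareResultReason::CoverageLimited,
        // Fail closed: an unmapped code is an error, never an alias or an implicit "no drift".
        default => throw new UnexpectedValueException("Unmapped matching reason: {$upstreamCode}"),
    };
}

echo classifyMatchingReason('ambiguous_match')->value, PHP_EOL;
```

Throwing on unknown codes keeps old strings from slipping through as compatibility aliases, which the plan prohibits.
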

## Implementation Phases

1. Confirm completed dependency guardrails for Specs 381 and 382, and confirm no changes to completed spec history.
2. Add unit tests for result reasons, categories, actionability, readiness impact, trust, clean-success rules, and run summary classification.
3. Add feature tests for baseline compare gap payloads, missing provider vs missing local evidence, foundation limitation mapping, active binding/matching outcome mapping, and old reason removal.
4. Add or update the narrow result semantics value family and classifier.
5. Map Spec 382 `MatchingOutcome` to final compare subject outcomes.
6. Map compare strategy states and diagnostics to final compare subject outcomes.
7. Update `CompareBaselineToTenantJob` to aggregate structured subject outcomes, gap subjects, category counts, actionability counts, readiness counts, and run summary decisions.
8. Update existing evidence-gap/detail/status label helpers and Filament/Livewire feature tests if rendered groups change.
9. Run evidence/review regression tests to prove no final readiness/customer output mapping is introduced.
10. Run targeted tests, Pint, and diff check; record close-out with Filament/Livewire/deploy impact.
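
The aggregation in phase 7 might be sketched as follows. The array shapes, the `summarizeOutcomes()` name, and the clean-success rule (at least one subject, and every subject is `no_drift`) are assumptions for illustration; the spec's clean-success rules take precedence.

```php
<?php

declare(strict_types=1);

/**
 * Aggregates per-subject outcomes into run-level counts.
 *
 * @param list<array{category: string, actionable: bool}> $outcomes
 * @return array{by_category: array<string, int>, actionable: int, clean_success: bool}
 */
function summarizeOutcomes(array $outcomes): array
{
    $byCategory = [];
    $actionable = 0;

    foreach ($outcomes as $outcome) {
        $category = $outcome['category'];
        $byCategory[$category] = ($byCategory[$category] ?? 0) + 1;

        if ($outcome['actionable']) {
            $actionable++;
        }
    }

    // Sort keys so the stored summary is deterministic regardless of subject order.
    ksort($byCategory);

    // Assumed rule: an empty run or any non-"no_drift" subject is never a clean success.
    $cleanSuccess = array_keys($byCategory) === ['no_drift'];

    return [
        'by_category' => $byCategory,
        'actionable' => $actionable,
        'clean_success' => $cleanSuccess,
    ];
}

print_r(summarizeOutcomes([
    ['category' => 'no_drift', 'actionable' => false],
    ['category' => 'missing_evidence', 'actionable' => true],
]));
```

Treating an empty outcome list as not clean is one way to avoid the false no-drift result that reviewers are asked to check for.
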

## Filament v5 Output Contract For Later Implementation Report

- Livewire v4.0+ compliance: unchanged unless implementation unexpectedly touches Livewire. Project currently uses Livewire 4.1.4.
- Provider registration location: unchanged. Laravel 12 panel providers remain in `apps/platform/bootstrap/providers.php`.
- Global search: no resource is added or changed; no global search behavior is planned.
- Destructive/high-impact actions: no Filament action is added and no destructive action is introduced. Existing compare start behavior keeps existing authorization/OperationRun rules.
- Asset strategy: no Filament assets are registered; no Spec 383-specific `filament:assets` deployment concern beyond normal release process.
- Testing plan: unit/feature tests cover semantics, compare integration, OperationRun payloads, existing status rendering, and evidence/review regressions. No browser test unless UI layout/navigation/action behavior changes.

## Rollout And Deployment Considerations

- No environment variables, queue names, scheduler entries, storage volumes, reverse proxy changes, route changes, panel provider changes, or asset build changes are expected.
- No schema migration is expected. Because TenantPilot is pre-production, old local/dev compare payloads may be invalidated/reset instead of read through a compatibility mapper.
- Staging validation should run targeted compare/semantics/evidence/review tests and normal formatting checks before production promotion.
- Rollback is code rollback plus clearing/regenerating local/dev compare OperationRun payloads if necessary; no persisted compatibility layer is planned.