TenantAtlas/specs/383-baseline-result-semantics/tasks.md

# Tasks: Spec 383 - Baseline Compare Result Semantics and Gap Classification v1

**Input**: Design documents from `/specs/383-baseline-result-semantics/`
**Prerequisites**: `spec.md`, `plan.md`, completed Specs 381 and 382 close-outs

**Tests**: Runtime behavior changes require Pest unit and feature tests before or alongside implementation. Browser tests are not required unless implementation changes rendered layout, navigation, actions, or JavaScript behavior.

## Test Governance Checklist

- [x] TGC001 Lane assignment is named and is the narrowest sufficient proof for the changed behavior.
- [x] TGC002 New or changed tests stay in the smallest honest family, and any heavy-governance or browser addition is explicit.
- [x] TGC003 Shared helpers, factories, seeds, fixtures, and context defaults stay cheap by default; any widening is isolated or documented.
- [x] TGC004 Planned validation commands cover the change without pulling in unrelated lane cost.
- [x] TGC005 The declared surface test profile or `standard-native-filament` relief is explicit.
- [x] TGC006 Any material budget, baseline, trend, or escalation note is recorded in the active spec or PR.

## Phase 1: Preparation And Guardrails

**Purpose**: Protect completed history, confirm repo truth, and keep the implementation bounded.

- [x] T001 Confirm `specs/381-provider-resource-identity-binding/implementation-close-out.md` and `specs/382-baseline-matching-canonicalization/implementation-close-out.md` exist and treat both as dependency context only.
- [x] T002 Confirm no code or artifact changes are made to completed specs `specs/381-provider-resource-identity-binding/`, `specs/382-baseline-matching-canonicalization/`, `specs/163-baseline-subject-resolution/`, `specs/336-baseline-compare-product-process-flow-alignment/`, `specs/347-review-pack-output-contract-readiness-semantics/`, `specs/350-operator-resolution-guidance-framework-v1/`, or `specs/380-management-report-pdf-staging-runtime-validation/`.
- [x] T003 Re-read `apps/platform/app/Support/Baselines/Matching/MatchingOutcome.php`, `apps/platform/app/Services/Baselines/Matching/SubjectMatchingPipeline.php`, and `apps/platform/app/Services/Baselines/Matching/FoundationCoverageResolver.php` before implementation.
- [x] T004 Re-read `apps/platform/app/Jobs/CompareBaselineToTenantJob.php`, `apps/platform/app/Support/Baselines/Compare/CompareSubjectResult.php`, `apps/platform/app/Support/Baselines/Compare/IntuneCompareStrategy.php`, and `apps/platform/app/Support/Baselines/Compare/CompareState.php` before implementation.
- [x] T005 Re-read `apps/platform/app/Support/Baselines/BaselineCompareReasonCode.php`, `apps/platform/app/Support/Baselines/BaselineCompareEvidenceGapDetails.php`, `apps/platform/app/Support/Baselines/SubjectResolver.php`, and `apps/platform/app/Support/Baselines/ResolutionOutcome.php` before implementation.
- [x] T006 Confirm no new route, navigation entry, destructive action, Filament panel provider, Livewire component, queue name, scheduler entry, env var, storage path, or persisted entity is needed; if any is needed, stop and update `specs/383-baseline-result-semantics/spec.md` and `plan.md`.

---

## Phase 2: Tests First - Core Semantics

**Purpose**: Lock the new source of truth before changing runtime code.

- [x] T007 [P] [US1] Add coverage in `apps/platform/tests/Unit/Support/Baselines/CompareSemantics/BaselineCompareOutcomeClassifierTest.php` for every V1 reason, category, actionability, readiness impact, and trust-level mapping.
- [x] T008 [P] [US1] Add coverage in `apps/platform/tests/Unit/Support/Baselines/CompareSemantics/BaselineCompareOutcomeClassifierTest.php` for clean success rules, drift, no drift, blocker, limitation, unsupported, missing, excluded, and failed outcomes.
- [x] T009 [P] [US1] Add scenario coverage in `apps/platform/tests/Unit/Support/Baselines/CompareSemantics/BaselineCompareOutcomeClassifierTest.php` for trusted no-drift, trusted drift, identity required, duplicate candidates, missing provider resource, missing local evidence, unsupported resource, inventory-only foundation, identity-only foundation, accepted limitation, excluded non-governed, low-trust not no-drift, and compare failure.
- [x] T010 [P] [US3] Add coverage in `apps/platform/tests/Unit/Support/Baselines/CompareSemantics/BaselineCompareOutcomeClassifierTest.php` for completed, completed-with-drift, partial/limited, blocked, and failed run aggregation.
- [x] T011 [P] [US1] Add coverage in `apps/platform/tests/Unit/Support/Baselines/CompareSemantics/BaselineCompareOutcomeClassifierTest.php` asserting old overloaded reason values are not authoritative enum/constant values in the new semantics model.

---

## Phase 3: Tests First - Matching And Compare Integration

**Purpose**: Prove Spec 382 matching outcomes map to final compare semantics without false green or false red output.

- [x] T012 [P] [US2] Update `apps/platform/tests/Unit/Support/Baselines/Matching/MatchingOutcomeTest.php` so matching reasons expected by Spec 383 are provider-neutral and no longer assert `ambiguous_match`, `unsupported_subject`, or `foundation_not_policy_backed` as final result truth.
- [x] T013 [P] [US2] Update `apps/platform/tests/Unit/Support/Baselines/Matching/SubjectMatchingPipelineTest.php` to assert active binding, canonical identity, duplicate candidates, missing local evidence, missing provider resource, unsupported, limited, excluded, and identity-required outcomes map through the classifier.
- [x] T014 [P] [US2] Update `apps/platform/tests/Unit/Services/Baselines/Matching/FoundationCoverageResolverTest.php` so inventory-only, identity-only, canonical-only, unsupported, and accepted limitation coverage expect the new provider-neutral reason names.
- [x] T015 [P] [US2] Update `apps/platform/tests/Feature/Baselines/BaselineCompareProviderResourceBindingCanonicalIdentityTest.php` to assert binding-resolved identity produces trusted comparison eligibility but not no-drift by itself.
- [x] T016 [P] [US2] Update `apps/platform/tests/Feature/Baselines/BaselineCompareAmbiguousMatchGapTest.php` to expect `unresolved_duplicate_candidates` or `unresolved_ambiguous_identity` instead of `ambiguous_match`.
- [x] T017 [P] [US1] Update `apps/platform/tests/Feature/Baselines/BaselineCompareGapClassificationTest.php` to assert missing local evidence, missing provider resource, and foundation limitation states are distinct.
- [x] T018 [P] [US3] Update `apps/platform/tests/Feature/Baselines/BaselineCompareExecutionGuardTest.php` so compare strategy exceptions map to compare-failed semantics without relying on `strategy_failed` as an authoritative subject reason.
- [x] T019 [P] [US3] Update `apps/platform/tests/Feature/Baselines/BaselineCompareResumeTokenTest.php` so resumed full-content gaps use new missing-local-evidence semantics instead of `policy_record_missing`.
- [x] T020 [P] [US1] Update `apps/platform/tests/Feature/Baselines/BaselineCompareGapClassificationTest.php` to prove stale or absent current provider descriptors do not by themselves emit `missing_provider_resource`, and that missing-provider semantics require current-provider absence proof or an active binding that marks the expected resource missing.
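
The rule in T020 can be read as a simple gate. The sketch below is a minimal illustration under stated assumptions: the function name and its parameters are hypothetical and do not come from the codebase; only the rule itself (a stale or absent descriptor is not sufficient on its own) comes from the task text.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of the repository): decides whether
 * `missing_provider_resource` may be emitted for a subject.
 */
function mayEmitMissingProviderResource(
    bool $descriptorAbsentOrStale,
    bool $currentProviderAbsenceProven,
    bool $activeBindingMarksMissing,
): bool {
    // $descriptorAbsentOrStale is deliberately ignored: per T020, a stale or
    // absent descriptor is not proof that the provider resource is missing.
    return $currentProviderAbsenceProven || $activeBindingMarksMissing;
}
```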

---

## Phase 4: Tests First - Existing Presentation And Downstream Regressions

**Purpose**: Keep existing surfaces and downstream consumers honest without implementing Spec 384 or 385.

- [x] T021 [P] [US4] Update `apps/platform/tests/Feature/Filament/BaselineCompareEvidenceGapTableTest.php` to assert existing evidence-gap rows render the new result groups and do not expose old reason strings as primary operator truth.
- [x] T022 [P] [US4] Update `apps/platform/tests/Feature/Filament/BaselineCompareExplanationSurfaceTest.php` to assert provider-neutral blocker/limitation/missing/failure explanations.
- [x] T023 [P] [US4] Update `apps/platform/tests/Feature/Filament/BaselineCompareLandingWhyNoFindingsTest.php` to prove no-drift explanation appears only when the new clean-success rules allow it.
- [x] T024 [P] [US4] Update `apps/platform/tests/Feature/Filament/BaselineCompareSummaryConsistencyTest.php` to assert result group totals stay consistent with OperationRun context counts.
- [x] T025 [P] [US3] Update `apps/platform/tests/Feature/Evidence/BaselineDriftPostureSourceTest.php` to prove evidence posture does not treat blockers/limitations as verified no drift before Spec 385.
- [x] T026 [P] [US3] Update `apps/platform/tests/Feature/ReviewPack/Spec347ReviewPackReadinessSemanticsTest.php` and `apps/platform/tests/Feature/ReviewPack/Spec349ReviewPackResolutionGuidanceTest.php` to prove customer-facing readiness/output wording is unchanged by Spec 383.

---

## Phase 5: Define The Narrow Result Semantics Model

**Purpose**: Add provider-neutral value families with direct behavioral consequences.

- [x] T027 [US1] Create `apps/platform/app/Support/Baselines/CompareSemantics/CompareResultIdentityStatus.php` with resolved, binding-resolved, canonicalization-resolved, unresolved, missing, and unsupported identity values.
- [x] T028 [US1] Create `apps/platform/app/Support/Baselines/CompareSemantics/CompareResultComparisonStatus.php` with not-compared, no-drift, drift-detected, compare-failed, and compare-not-supported values.
- [x] T029 [US1] Create `apps/platform/app/Support/Baselines/CompareSemantics/CompareResultCoverageStatus.php` with fully-verified, verified-with-limitations, inventory-only, identity-only, canonical-only, unsupported, missing-local-evidence, missing-provider-resource, excluded, and accepted-limitation values.
- [x] T030 [US1] Create `apps/platform/app/Support/Baselines/CompareSemantics/CompareResultActionability.php`, `CompareResultReadinessImpact.php`, `CompareResultTrustLevel.php`, and `CompareResultCategory.php` with the V1 values from `specs/383-baseline-result-semantics/spec.md`.
- [x] T031 [US1] Create `apps/platform/app/Support/Baselines/CompareSemantics/CompareResultReason.php` with provider-neutral reasons and mapping methods for category, actionability, readiness impact, and default trust.
- [x] T032 [US1] Create `apps/platform/app/Support/Baselines/CompareSemantics/CompareSubjectOutcome.php` as a derived result object with sanitized `toArray()` output for OperationRun/context use.
- [x] T033 [US1] Create `apps/platform/app/Support/Baselines/CompareSemantics/BaselineCompareOutcomeClassifier.php` to map matching outcomes plus compare strategy outputs into `CompareSubjectOutcome`.
- [x] T034 [US3] Create `apps/platform/app/Support/Baselines/CompareSemantics/BaselineCompareRunSummaryClassifier.php` to aggregate subject outcomes into run-level completed, partial, blocked, or failed decisions and count buckets.
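
As a rough illustration of the "reason with mapping methods" shape described in T031, the sketch below uses a PHP backed enum. The enum names `SketchReason` and `SketchCategory` are hypothetical, and the reason-to-category mapping is an assumption; only the reason strings themselves are taken from this task list. The authoritative V1 values live in `spec.md`.

```php
<?php

declare(strict_types=1);

// Hypothetical category names, loosely based on the group labels in T049.
enum SketchCategory: string
{
    case ActionRequired = 'action_required';
    case Unsupported = 'unsupported';
    case Limitation = 'limitation';
}

// Reason values are strings named in this task list; the mapping is assumed.
enum SketchReason: string
{
    case IdentityRequired = 'identity_required';
    case UnresolvedDuplicateCandidates = 'unresolved_duplicate_candidates';
    case UnsupportedResourceClass = 'unsupported_resource_class';
    case FoundationInventoryOnly = 'foundation_inventory_only';
    case FoundationIdentityOnly = 'foundation_identity_only';
    case FoundationCanonicalOnly = 'foundation_canonical_only';

    // Each reason maps to exactly one category, so consumers never
    // need to interpret raw reason strings themselves.
    public function category(): SketchCategory
    {
        return match ($this) {
            self::IdentityRequired,
            self::UnresolvedDuplicateCandidates => SketchCategory::ActionRequired,
            self::UnsupportedResourceClass => SketchCategory::Unsupported,
            self::FoundationInventoryOnly,
            self::FoundationIdentityOnly,
            self::FoundationCanonicalOnly => SketchCategory::Limitation,
        };
    }
}
```

With this shape, a legacy string such as `ambiguous_match` is simply not a valid enum value, which is one way the assertion in T011 could be expressed.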

---

## Phase 6: Replace Legacy Matching And Gap Reasons

**Purpose**: Stop old matching/gap strings from remaining authoritative.

- [x] T035 [US2] Update `apps/platform/app/Support/Baselines/Matching/MatchingOutcome.php` so factory methods use new provider-neutral reason names and keep old strings out of authoritative output.
- [x] T036 [US2] Update `apps/platform/app/Services/Baselines/Matching/FoundationCoverageResolver.php` so unsupported and foundation coverage returns `unsupported_resource_class`, `foundation_inventory_only`, `foundation_identity_only`, or `foundation_canonical_only` as appropriate.
- [x] T037 [US2] Update `apps/platform/app/Services/Baselines/Matching/SubjectMatchingPipeline.php` to map duplicate candidates to `unresolved_duplicate_candidates`, low-trust/identity gaps to `identity_required` or `unresolved_low_trust_match`, active binding resolution to `resolved_active_binding`, and canonical/provider identity to provider-neutral resolved reasons.
- [x] T038 [US1] Update `apps/platform/app/Support/Baselines/SubjectResolver.php`, `apps/platform/app/Support/Baselines/ResolutionOutcome.php`, and `apps/platform/app/Support/Baselines/ResolutionOutcomeRecord.php` so legacy policy-shaped reasons are no longer final compare result truth; retain only non-authoritative helper behavior if still needed by capture flows and document any boundary in code/tests.
- [x] T039 [US1] Update `apps/platform/app/Support/Baselines/BaselineCompareReasonCode.php` so run-level reasons are either provider-neutral summary reasons or delegated to the new run summary classifier.

---

## Phase 7: Compare Strategy And OperationRun Integration

**Purpose**: Make runtime compare output and proof payloads use the new truth.

- [x] T040 [US2] Update `apps/platform/app/Support/Baselines/Compare/CompareState.php` or its mapping layer so unsupported, incomplete, ambiguous, failed, drift, and no-drift states map to `CompareSubjectOutcome` without old reason strings.
- [x] T041 [US2] Update `apps/platform/app/Support/Baselines/Compare/CompareSubjectResult.php` to expose structured semantic payloads or enough diagnostics for `BaselineCompareOutcomeClassifier` without duplicating result truth.
- [x] T042 [US2] Update `apps/platform/app/Support/Baselines/Compare/IntuneCompareStrategy.php` so missing current evidence, unsupported subjects, ambiguous conditions, and compare failures emit provider-neutral diagnostics and keep drift/no-drift limited to trusted comparable subjects.
- [x] T043 [US2] Update `apps/platform/tests/Feature/Baselines/Support/FakeCompareStrategy.php` to emit provider-neutral diagnostics used by Spec 383 tests.
- [x] T044 [US3] Update `apps/platform/app/Jobs/CompareBaselineToTenantJob.php` to build structured `CompareSubjectOutcome` records from matching outcomes and strategy results before gap aggregation.
- [x] T045 [US3] Update `apps/platform/app/Jobs/CompareBaselineToTenantJob.php` so `baseline_compare.evidence_gaps` includes structured counts by reason, category, actionability, readiness impact, and subject outcome payloads.
- [x] T046 [US3] Update `apps/platform/app/Jobs/CompareBaselineToTenantJob.php` so run outcome and `summary_counts` derive from `BaselineCompareRunSummaryClassifier` and stay compatible with `OperationSummaryKeys::all()`.
- [x] T047 [US3] Update `apps/platform/app/Support/OpsUx/OperationSummaryKeys.php` and `apps/platform/tests/Feature/OpsUx/OperationSummaryKeysSpecTest.php` only if Spec 383 needs new count keys not representable by the existing canonical list.
- [x] T048 [US3] Add or update a focused test or guard assertion proving baseline compare aggregation does not mutate `OperationRun.status` or `OperationRun.outcome` outside `OperationRunService` while summary semantics change.
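
The run-level aggregation in T046 might look roughly like the following. This is a sketch only: `summarizeRun`, the category strings, and the precedence order (failed, then blocked, then partial, then drift) are all assumptions made for illustration and are not confirmed by the spec or by `BaselineCompareRunSummaryClassifier`.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical aggregation of per-subject categories into a run decision.
 *
 * @param list<string> $categories one (assumed) category string per subject outcome
 * @return array{decision: string, counts: array<string, int>}
 */
function summarizeRun(array $categories): array
{
    // Count buckets per category, e.g. ['verified' => 2, 'limitation' => 1].
    $counts = array_count_values($categories);

    // Assumed precedence: the most severe category present decides the run.
    $decision = match (true) {
        isset($counts['failed']) => 'failed',
        isset($counts['action_required']) => 'blocked',
        isset($counts['limitation']),
        isset($counts['unsupported']),
        isset($counts['missing_evidence']) => 'partial',
        isset($counts['drift_detected']) => 'completed_with_drift',
        // An empty list also lands here; the real clean-success rules
        // may treat that case differently.
        default => 'completed',
    };

    return ['decision' => $decision, 'counts' => $counts];
}
```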

---

## Phase 8: Existing Surface Labels And Downstream Consumers

**Purpose**: Render the new truth on existing surfaces without new UI workflows.

- [x] T049 [US4] Update `apps/platform/app/Support/Baselines/BaselineCompareEvidenceGapDetails.php` to render provider-neutral group labels for verified, drift detected, action required, missing evidence, missing provider resource, unsupported, limitations, excluded, and failed.
- [x] T050 [US4] Update `apps/platform/app/Support/Baselines/BaselineCompareExplanationRegistry.php` and `apps/platform/app/Support/ReasonTranslation/ReasonPresenter.php` so primary operator text no longer uses old reason strings; also search for OperationRun baseline-compare presentation helpers directly touched by the implementation and update any matches the same way.
- [x] T051 [US4] Confirm `apps/platform/app/Livewire/BaselineCompareEvidenceGapTable.php` uses existing data paths and does not add a new action, route, modal, drawer, or layout pattern.
- [x] T052 [US3] Update `apps/platform/app/Services/Evidence/Sources/BaselineDriftPostureSource.php` only if needed to avoid treating blocked/limited compare runs as complete before Spec 385.
- [x] T053 [US4] If implementation changes route/layout/action structure instead of labels/groups only, update the active spec/plan plus `docs/ui-ux-enterprise-audit/route-inventory.md`, `docs/ui-ux-enterprise-audit/design-coverage-matrix.md`, and `docs/ui-ux-enterprise-audit/page-reports/ui-015-baseline-compare.md` before continuing.

---

## Phase 9: Legacy Removal And Scope Guard

**Purpose**: Remove old authoritative truth and prevent accidental compatibility scope.

- [x] T054 [US1] Search `apps/platform/app` and `apps/platform/tests` for `ambiguous_match`, `policy_record_missing`, `foundation_not_policy_backed`, `missing_policy`, `missing_current`, `unsupported_subject`, `unsupported_subjects`, `coverage_unproven`, and `strategy_failed`; remove or convert compare-result usages to the new semantics.
- [x] T055 [US1] Keep a remaining old string only if it is outside baseline compare result truth or is explicitly transitional fixture input; document the boundary in the nearest test or close-out note.
- [x] T056 [US1] Confirm no legacy result-code mapper, old OperationRun context reader, dual old/new result reader, or compatibility alias is introduced.
- [x] T057 [US4] Confirm no Spec 384 resolution UI, manual bind/exclude/accept-limitation workflow, or operator decision screen is implemented.
- [x] T058 [US3] Confirm no Spec 385 Evidence Snapshot readiness final mapping, Review Pack publication blocker mapping, or customer-facing wording is implemented.
- [x] T059 [US4] Confirm no Management Report/PDF runtime or report wording work is included.
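
The legacy-string search in T054 could also be expressed as a small scanner, for example inside a guard test. The sketch below is illustrative: `LEGACY_REASONS` and `findLegacyReasons` are hypothetical names, and only the list of strings is taken from T054.

```php
<?php

declare(strict_types=1);

// Legacy reason strings listed in T054.
const LEGACY_REASONS = [
    'ambiguous_match',
    'policy_record_missing',
    'foundation_not_policy_backed',
    'missing_policy',
    'missing_current',
    'unsupported_subject',
    'unsupported_subjects',
    'coverage_unproven',
    'strategy_failed',
];

/**
 * Hypothetical scanner: reports which legacy strings occur in a source text.
 *
 * @return list<string> legacy reason strings found in $source
 */
function findLegacyReasons(string $source): array
{
    $found = [];

    foreach (LEGACY_REASONS as $reason) {
        // Word boundaries keep `unsupported_subject` from matching
        // inside `unsupported_subjects` (underscore is a word character).
        if (preg_match('/\b' . preg_quote($reason, '/') . '\b/', $source) === 1) {
            $found[] = $reason;
        }
    }

    return $found;
}
```

A match is only a candidate for removal; per T055, strings outside baseline compare result truth or in explicitly transitional fixtures may stay.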

---

## Phase 10: Validation And Close-Out

**Purpose**: Prove the implementation and document the exact operational impact.

- [x] T060 Run `cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Unit/Support/Baselines/CompareSemantics`.
- [x] T061 Run `cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Unit/Support/Baselines/Matching tests/Unit/Baselines/CompareStrategyRegistryTest.php`.
- [x] T062 Run `cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Baselines/BaselineCompareGapClassificationTest.php tests/Feature/Baselines/BaselineCompareAmbiguousMatchGapTest.php tests/Feature/Baselines/BaselineCompareProviderResourceBindingCanonicalIdentityTest.php tests/Feature/Baselines/BaselineCompareExecutionGuardTest.php tests/Feature/Baselines/BaselineCompareResumeTokenTest.php`.
- [x] T063 Run `cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Filament/BaselineCompareExplanationSurfaceTest.php tests/Feature/Filament/BaselineCompareLandingWhyNoFindingsTest.php tests/Feature/Filament/BaselineCompareSummaryConsistencyTest.php tests/Feature/Filament/BaselineCompareEvidenceGapTableTest.php`.
- [x] T064 Run `cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Evidence/BaselineDriftPostureSourceTest.php tests/Feature/ReviewPack/Spec347ReviewPackReadinessSemanticsTest.php tests/Feature/ReviewPack/Spec349ReviewPackResolutionGuidanceTest.php`.
- [x] T065 Run a PostgreSQL lane only if implementation adds migrations, JSONB indexes/query behavior, locks, or constraints.
- [x] T066 Run a browser smoke test only if implementation changes rendered layout, navigation, actions, or JavaScript behavior beyond labels/groups.
- [x] T067 Run `cd apps/platform && ./vendor/bin/sail bin pint --dirty --format agent`.
- [x] T068 Run `git diff --check`.
- [x] T069 Record `specs/383-baseline-result-semantics/implementation-close-out.md` with Livewire v4 compliance, provider registration location, global search status, destructive/high-impact action status, asset strategy, tests run, browser decision, and deployment impact.

## Dependencies

- Phase 1 must finish before runtime work.
- The tests in Phases 2-4 should be written before or alongside implementation changes.
- Phase 5 unblocks Phases 6 and 7.
- Phase 6 must complete before OperationRun/gap aggregation can be trusted.
- Phase 7 unblocks Phase 8.
- Phase 9 and Phase 10 validate the completed implementation.

## Parallel Opportunities

- T007-T011 can be drafted in parallel.
- T012-T020 can be drafted in parallel if each test file remains scoped.
- T021-T026 can be drafted in parallel with the core semantics tests.
- T027-T034 can be implemented in parallel after names are agreed, but mapping methods should converge before integration.
- T035-T039 and T040-T048 should be coordinated because they touch shared reason mappings.
- T060-T064 can be run independently after implementation, but close-out should cite the complete targeted set.

## Explicit Non-Goals

- Do not add new persisted entities/tables/artifacts without updating spec and plan first.
- Do not add new routes, navigation entries, Filament actions, modals, drawers, wizards, panel providers, or assets.
- Do not add operator resolution UI.
- Do not change final Evidence Snapshot readiness, Review Pack readiness, or customer-facing report/review wording.
- Do not add historical payload mappers, OperationRun context compatibility readers, or old reason aliases.
- Do not create a generic workflow engine, report engine, provider framework, badge framework, or evidence readiness framework.