# Implementation Plan: 163 — Baseline Subject Resolution and Evidence Gap Semantics Foundation

**Branch**: `163-baseline-subject-resolution` | **Date**: 2026-03-24 | **Spec**: `specs/163-baseline-subject-resolution/spec.md`
**Input**: Feature specification from `specs/163-baseline-subject-resolution/spec.md`

**Note**: This template is filled in by the `/speckit.plan` command. See `.specify/scripts/` for helper scripts.

## Summary

Introduce an explicit backend subject-resolution contract for baseline compare and baseline capture so the system can classify each subject before resolution, select the correct local model path, and persist precise operator-safe gap semantics instead of collapsing structural, operational, and transient causes into broad `policy_not_found` style states. The implementation will extend existing baseline scope, inventory policy-type metadata, compare and capture jobs, baseline evidence-gap detail parsing, and OperationRun context persistence rather than introducing a parallel execution stack, with a bounded runtime support guard that prevents baseline-supported types from entering compare or capture on a resolver path that cannot truthfully classify them.

## Technical Context

**Language/Version**: PHP 8.4  
**Primary Dependencies**: Laravel 12, Filament v5, Livewire v4  
**Storage**: PostgreSQL via existing application tables, especially `operation_runs.context` and baseline snapshot summary JSON  
**Testing**: Pest v4 on PHPUnit 12  
**Target Platform**: Dockerized Laravel web application running through Sail locally and Dokploy in deployment  
**Project Type**: Web application  
**Performance Goals**: Preserve DB-only render behavior for Monitoring and tenant review surfaces, add no render-time Graph calls, and keep evidence-gap interpretation deterministic and lightweight enough for existing run-detail and landing surfaces  
**Constraints**:
- No new render-time remote work and no bypass of `GraphClientInterface`
- No change to `OperationRun` lifecycle ownership, notification channels, or summary-count rules
- No new operator screen; existing surfaces must present richer semantics
- Existing development-only run payloads may be deleted or regenerated if that simplifies migration to the new structured contract
- Baseline-supported configuration must not overpromise runtime capability
**Scale/Scope**: Cross-cutting backend semantic work across baseline compare and capture pipelines, support-layer parsers and translators, OperationRun context contracts, tenant and canonical read surfaces, and focused Pest coverage for deterministic classification and development-safe contract cleanup

## Constitution Check

*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*

- Inventory-first: PASS — the design keeps inventory as last-observed truth and distinguishes inventory-backed evidence from policy-backed evidence rather than conflating them.
- Read/write separation: PASS — this feature changes classification and persisted run semantics inside existing compare and capture flows; it does not add new write or restore actions.
- Graph contract path: PASS — no new Graph contract or direct endpoint use is introduced; existing capture and sync services remain the only remote paths.
- Deterministic capabilities: PASS — subject-class derivation, resolution outcome mapping, and support-capability guards are explicitly designed to be deterministic and testable.
- RBAC-UX: PASS — existing `/admin` and tenant-context authorization boundaries remain unchanged; only read semantics improve.
- Workspace isolation: PASS — no new workspace leakage is introduced and canonical run-detail remains tenant-safe.
- RBAC confirmations: PASS — no new destructive actions are added.
- Global search: PASS — unaffected.
- Tenant isolation: PASS — all compare, capture, inventory, and run data remain tenant-bound and entitlement-checked.
- Run observability: PASS — compare and capture continue to use existing `OperationRun` types; this slice enriches context semantics only.
- Ops-UX 3-surface feedback: PASS — no new toast, progress, or terminal-notification channels are added.
- Ops-UX lifecycle: PASS — `OperationRun.status` and `OperationRun.outcome` remain service-owned; only context enrichment changes.
- Ops-UX summary counts: PASS — no non-numeric values are moved into `summary_counts`; richer semantics live in context and read models.
- Ops-UX guards: PASS — focused regression tests can protect classification determinism and development cleanup behavior without relaxing existing CI rules.
- Ops-UX system runs: PASS — unchanged.
- Automation: PASS — existing queue, retry, and backoff behavior stays intact; transient outcomes are classified more precisely, not re-executed differently.
- Data minimization: PASS — the new gap detail contract stores classification and stable identifiers, not raw policy payloads or secrets.
- Badge semantics (BADGE-001): PASS — if structural, operational, or transient labels surface as badges, they must route through centralized badge or presentation helpers rather than ad hoc maps.
- UI naming (UI-NAMING-001): PASS — the feature exists to replace implementation-first broad error prose with domain-first operator meaning.
- Operator surfaces (OPSURF-001): PASS — existing run detail and tenant review surfaces remain operator-first and diagnostics-secondary.
- Filament UI Action Surface Contract: PASS — action topology stays unchanged; this is a read-surface semantics upgrade.
- Filament UI UX-001 (Layout & IA): PASS — existing layouts remain, but sections become more semantically truthful. No exemption required.

## Project Structure

### Documentation (this feature)

```text
specs/163-baseline-subject-resolution/
├── plan.md
├── research.md
├── data-model.md
├── quickstart.md
├── contracts/
│   └── openapi.yaml
├── checklists/
│   └── requirements.md
└── tasks.md
```

### Source Code (repository root)

```text
app/
├── Filament/
│   ├── Pages/
│   │   └── BaselineCompareLanding.php
│   ├── Resources/
│   │   ├── OperationRunResource.php
│   │   └── BaselineSnapshotResource.php
│   └── Widgets/
├── Jobs/
│   ├── CompareBaselineToTenantJob.php
│   └── CaptureBaselineSnapshotJob.php
├── Services/
│   ├── Baselines/
│   │   ├── BaselineCompareService.php
│   │   ├── BaselineCaptureService.php
│   │   ├── BaselineContentCapturePhase.php
│   │   └── Evidence/
│   ├── Intune/
│   │   └── PolicySyncService.php
│   └── Inventory/
│       └── InventorySyncService.php
├── Support/
│   ├── Baselines/
│   ├── Inventory/
│   ├── OpsUx/
│   └── Ui/
├── Livewire/
└── Models/
config/
├── tenantpilot.php
└── graph_contracts.php
tests/
├── Feature/
│   ├── Baselines/
│   ├── Filament/
│   └── Monitoring/
└── Unit/
    └── Support/
```

**Structure Decision**: Web application. The work stays inside existing baseline jobs and services, support-layer value objects and presenters, current Filament surfaces, and focused Pest coverage. No new top-level architecture area is required.

## Complexity Tracking

No constitution violations are required for this feature.

## Phase 0 — Outline & Research (DONE)

Outputs:
- `specs/163-baseline-subject-resolution/research.md`

Key decisions captured:
- Introduce a first-class subject-resolution contract in the backend instead of solving the problem with UI-only relabeling.
- Persist both `subject_class` and `resolution_outcome` because they answer different operator questions.
- Keep foundation-backed subjects eligible only when the runtime can truthfully classify them through an inventory-backed or limited-capability path.
- Add a runtime consistency guard during scope or resolver preparation so `baseline_compare.supported` cannot silently overpromise structural capability.
- Preserve transient reasons such as throttling and capture failure as precise operational outcomes rather than absorbing them into structural taxonomy.
- Treat broad legacy gap shapes as development-only cleanup candidates rather than a compatibility requirement for the new runtime contract.

## Phase 1 — Design & Contracts (DONE)

Outputs:
- `specs/163-baseline-subject-resolution/data-model.md`
- `specs/163-baseline-subject-resolution/contracts/openapi.yaml`
- `specs/163-baseline-subject-resolution/quickstart.md`

Design highlights:
- The core semantic unit is a `SubjectDescriptor` that is classified before resolution and yields a deterministic `ResolutionOutcomeRecord`.
- `OperationRun.context` remains the canonical persisted contract for compare and capture evidence-gap semantics, but new runs store richer subject-level objects instead of reason plus raw string only.
- The runtime support guard sits before compare and capture execution so unsupported structural mismatches are blocked or reclassified before misleading `policy_not_found`-style outcomes are emitted.
- Existing detail and landing surfaces are updated for the new structured gap contract, and development fixtures or stale local run data are regenerated instead of driving a permanent compatibility layer.
- Compare and capture share the same root-cause vocabulary, but retain operation-specific outcome families where needed.

## Phase 1 — Agent Context Update (REQUIRED)

Run:
- `.specify/scripts/bash/update-agent-context.sh copilot`

## Constitution Check — Post-Design Re-evaluation

- PASS — the design remains inside existing compare and capture operations and does not add new remote-call paths or lifecycle mutations.
- PASS — inventory-first semantics are strengthened because inventory-backed subjects are no longer mislabeled as missing policy records.
- PASS — operator surfaces stay on existing pages and remain DB-only at render time.
- PASS — development cleanup is explicit and bounded; the new contract remains the only forward-looking runtime shape.
- PASS — no Action Surface or UX-001 exemptions are needed because action topology and layouts remain intact.

## Phase 2 — Implementation Plan

### Step 1 — Subject classification and runtime capability foundation

Goal: implement FR-001 through FR-003, FR-008, FR-015, and FR-016 by creating a deterministic subject-resolution foundation shared by compare and capture.

Changes:
- Introduce a dedicated subject-resolution support layer under `app/Support/Baselines/` that defines:
  - subject classes
  - resolution paths
  - resolution outcomes
  - operator action categories
  - structural versus operational versus transient classification
- Extend `InventoryPolicyTypeMeta` and related metadata accessors so baseline support can express whether a type is policy-backed, inventory-backed, foundation-backed, or limited.
- Add a runtime capability guard used by `BaselineScope`, `BaselineCompareService`, and `BaselineCaptureService` so types only enter compare or capture on a truthful path.
- Keep the guard deterministic and explicit in logs or run context when support is limited or excluded.

Tests:
- Add unit tests for subject-class derivation, resolution-path derivation, and runtime-capability guard behavior.
- Add golden-style tests covering supported, limited, and structurally invalid foundation types.

### Step 2 — Capture-path resolution and gap taxonomy upgrade

Goal: implement FR-004 through FR-010 on the capture side so structural resolver mismatches are no longer emitted as generic missing-policy cases.

Changes:
- Refactor `BaselineContentCapturePhase` so it resolves subjects through the new subject contract rather than assuming a policy lookup for all subjects.
- Replace broad `policy_not_found` capture gaps with precise structured outcomes such as:
  - policy record missing
  - inventory record missing
  - foundation-backed via inventory path
  - resolution type mismatch
  - unresolvable subject
- Preserve existing transient outcomes like `throttled`, `capture_failed`, and `budget_exhausted` unchanged except for richer structured metadata.
- Persist new structured gap-subject objects for new runs and remove any requirement to keep broad legacy reason shapes alive for future writes.

Tests:
- Add feature and unit coverage for capture-path classification across policy-backed, inventory-backed, foundation-backed, duplicate, invalid, and transient cases.
- Add deterministic replay coverage proving unchanged capture inputs produce unchanged outcomes.
- Add regressions proving structural foundation subjects no longer produce new generic `policy_not_found` gaps.

### Step 3 — Compare-path resolution and evidence-gap detail contract

Goal: implement FR-004 through FR-014 on the compare side by aligning current-evidence resolution, evidence-gap reasoning, and persisted run context with the new contract.

Changes:
- Refactor `CompareBaselineToTenantJob` so baseline item interpretation and current-state resolution produce explicit `resolution_outcome` records rather than only count buckets and raw subject keys.
- Add structured evidence-gap subject records under `baseline_compare.evidence_gaps.subjects` for new runs, including subject class, resolution path, resolution outcome, reason code, operator action category, and retryability or structural flags.
- Preserve already precise compare reasons such as `missing_current`, `ambiguous_match`, and role-definition-specific gap families while separating them from structural non-policy-backed outcomes.
- Ensure baseline compare reason translation remains aligned with the new detailed reason taxonomy instead of flattening distinct root causes.

Tests:
- Add feature tests for mixed compare runs containing structural, operational, transient, and successful subjects.
- Add deterministic compare tests proving identical inputs yield identical resolution outcomes.
- Add regressions for evidence-gap persistence shape and compare-surface rendering against the new structured contract.

### Step 4 — Development cleanup and operator-surface adoption

Goal: implement FR-011 through FR-014 and the User Story 3 acceptance scenarios by moving existing read surfaces to the new gap contract and treating stale development data as disposable.

Changes:
- Extend `BaselineCompareEvidenceGapDetails`, `BaselineCompareStats`, `OperationRunResource`, `BaselineCompareLanding`, and any related Livewire gap tables so they read the new structured gap subject records consistently.
- Add an explicit development cleanup mechanism for stale local run payloads, preferably a dedicated development-only Artisan command plus fixture regeneration steps, so old broad string-only gap subjects can be purged instead of preserved.
- Introduce operator-facing labels that answer root cause before action advice while keeping diagnostics secondary.
- Keep existing pages and sections, but expose structural versus operational versus transient semantics consistently across dense and detailed surfaces.
- Update snapshot and compare summary surfaces where old broad reason aggregations would otherwise misread the new taxonomy.

Tests:
- Add or update Filament feature tests for canonical run detail and tenant baseline compare landing against the new structured run shape.
- Add cleanup-oriented tests proving the development cleanup mechanism removes or invalidates stale broad-reason run payloads without extending production semantics.

### Step 5 — Focused validation pack and rollout safety

Goal: protect the foundation from semantic regressions and make follow-on fidelity work safe.

Changes:
- Add a focused regression pack spanning compare, capture, capability guard, and development-safe contract cleanup.
- Review every touched reason-label and badge usage to ensure structural, operational, and transient meanings remain centralized.
- Document the new backend contract shape in code-level PHPDoc and tests so follow-on specs can build on stable semantics.
- Keep rollout bounded to baseline compare and capture semantics without adding renderer-richness work from Spec 164.

Tests:
- Run the focused Pest pack in `quickstart.md`.
- Add one regression proving no render-time Graph calls occur on affected run-detail or landing surfaces.