## Summary

- add the structured subject-resolution foundation for baseline compare and baseline capture, including capability guards, subject descriptors, resolution outcomes, and operator action categories
- persist structured evidence-gap subject records and update compare/capture surfaces, landing projections, and cleanup tooling to use the new contract
- add Spec 163 artifacts and focused Pest coverage for classification, determinism, cleanup, and DB-only rendering

## Validation

- `vendor/bin/sail bin pint --dirty --format agent`
- `vendor/bin/sail artisan test --compact tests/Unit/Support/Baselines tests/Feature/Baselines tests/Feature/Filament/OperationRunEnterpriseDetailPageTest.php`

## Notes

- verified locally that a fresh post-restart baseline compare run now writes structured `baseline_compare.evidence_gaps.subjects` records instead of the legacy broad payload shape
- excluded the separate `docs/product/spec-candidates.md` worktree change from this branch commit and PR

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #193
Implementation Plan: 163 — Baseline Subject Resolution and Evidence Gap Semantics Foundation
Branch: 163-baseline-subject-resolution | Date: 2026-03-24 | Spec: specs/163-baseline-subject-resolution/spec.md
Input: Feature specification from specs/163-baseline-subject-resolution/spec.md
Note: This template is filled in by the /speckit.plan command. See .specify/scripts/ for helper scripts.
Summary
Introduce an explicit backend subject-resolution contract for baseline compare and baseline capture so the system can classify each subject before resolution, select the correct local model path, and persist precise operator-safe gap semantics instead of collapsing structural, operational, and transient causes into broad policy_not_found-style states. The implementation will extend existing baseline scope, inventory policy-type metadata, compare and capture jobs, baseline evidence-gap detail parsing, and OperationRun context persistence rather than introducing a parallel execution stack. A bounded runtime support guard prevents baseline-supported types from entering compare or capture on a resolver path that cannot truthfully classify them.
Technical Context
Language/Version: PHP 8.4
Primary Dependencies: Laravel 12, Filament v5, Livewire v4
Storage: PostgreSQL via existing application tables, especially operation_runs.context and baseline snapshot summary JSON
Testing: Pest v4 on PHPUnit 12
Target Platform: Dockerized Laravel web application running through Sail locally and Dokploy in deployment
Project Type: Web application
Performance Goals: Preserve DB-only render behavior for Monitoring and tenant review surfaces, add no render-time Graph calls, and keep evidence-gap interpretation deterministic and lightweight enough for existing run-detail and landing surfaces
Constraints:
- No new render-time remote work and no bypass of GraphClientInterface
- No change to OperationRun lifecycle ownership, notification channels, or summary-count rules
- No new operator screen; existing surfaces must present richer semantics
- Existing development-only run payloads may be deleted or regenerated if that simplifies migration to the new structured contract
- Baseline-supported configuration must not overpromise runtime capability
Scale/Scope: Cross-cutting backend semantic work across baseline compare and capture pipelines, support-layer parsers and translators, OperationRun context contracts, tenant and canonical read surfaces, and focused Pest coverage for deterministic classification and development-safe contract cleanup
Constitution Check
GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.
- Inventory-first: PASS — the design keeps inventory as last-observed truth and distinguishes inventory-backed evidence from policy-backed evidence rather than conflating them.
- Read/write separation: PASS — this feature changes classification and persisted run semantics inside existing compare and capture flows; it does not add new write or restore actions.
- Graph contract path: PASS — no new Graph contract or direct endpoint use is introduced; existing capture and sync services remain the only remote paths.
- Deterministic capabilities: PASS — subject-class derivation, resolution outcome mapping, and support-capability guards are explicitly designed to be deterministic and testable.
- RBAC-UX: PASS — existing /admin and tenant-context authorization boundaries remain unchanged; only read semantics improve.
- Workspace isolation: PASS — no new workspace leakage is introduced and canonical run-detail remains tenant-safe.
- RBAC confirmations: PASS — no new destructive actions are added.
- Global search: PASS — unaffected.
- Tenant isolation: PASS — all compare, capture, inventory, and run data remain tenant-bound and entitlement-checked.
- Run observability: PASS — compare and capture continue to use existing OperationRun types; this slice enriches context semantics only.
- Ops-UX 3-surface feedback: PASS — no new toast, progress, or terminal-notification channels are added.
- Ops-UX lifecycle: PASS — OperationRun.status and OperationRun.outcome remain service-owned; only context enrichment changes.
- Ops-UX summary counts: PASS — no non-numeric values are moved into summary_counts; richer semantics live in context and read models.
- Ops-UX guards: PASS — focused regression tests can protect classification determinism and development cleanup behavior without relaxing existing CI rules.
- Ops-UX system runs: PASS — unchanged.
- Automation: PASS — existing queue, retry, and backoff behavior stays intact; transient outcomes are classified more precisely, not re-executed differently.
- Data minimization: PASS — the new gap detail contract stores classification and stable identifiers, not raw policy payloads or secrets.
- Badge semantics (BADGE-001): PASS — if structural, operational, or transient labels surface as badges, they must route through centralized badge or presentation helpers rather than ad hoc maps.
- UI naming (UI-NAMING-001): PASS — the feature exists to replace implementation-first broad error prose with domain-first operator meaning.
- Operator surfaces (OPSURF-001): PASS — existing run detail and tenant review surfaces remain operator-first and diagnostics-secondary.
- Filament UI Action Surface Contract: PASS — action topology stays unchanged; this is a read-surface semantics upgrade.
- Filament UI UX-001 (Layout & IA): PASS — existing layouts remain, but sections become more semantically truthful. No exemption required.
Project Structure
Documentation (this feature)
specs/163-baseline-subject-resolution/
├── plan.md
├── research.md
├── data-model.md
├── quickstart.md
├── contracts/
│ └── openapi.yaml
├── checklists/
│ └── requirements.md
└── tasks.md
Source Code (repository root)
app/
├── Filament/
│ ├── Pages/
│ │ └── BaselineCompareLanding.php
│ ├── Resources/
│ │ ├── OperationRunResource.php
│ │ └── BaselineSnapshotResource.php
│ └── Widgets/
├── Jobs/
│ ├── CompareBaselineToTenantJob.php
│ └── CaptureBaselineSnapshotJob.php
├── Services/
│ ├── Baselines/
│ │ ├── BaselineCompareService.php
│ │ ├── BaselineCaptureService.php
│ │ ├── BaselineContentCapturePhase.php
│ │ └── Evidence/
│ ├── Intune/
│ │ └── PolicySyncService.php
│ └── Inventory/
│ └── InventorySyncService.php
├── Support/
│ ├── Baselines/
│ ├── Inventory/
│ ├── OpsUx/
│ └── Ui/
├── Livewire/
└── Models/
config/
├── tenantpilot.php
└── graph_contracts.php
tests/
├── Feature/
│ ├── Baselines/
│ ├── Filament/
│ └── Monitoring/
└── Unit/
└── Support/
Structure Decision: Web application. The work stays inside existing baseline jobs and services, support-layer value objects and presenters, current Filament surfaces, and focused Pest coverage. No new top-level architecture area is required.
Complexity Tracking
No constitution violations are required for this feature.
Phase 0 — Outline & Research (DONE)
Outputs:
- specs/163-baseline-subject-resolution/research.md
Key decisions captured:
- Introduce a first-class subject-resolution contract in the backend instead of solving the problem with UI-only relabeling.
- Persist both subject_class and resolution_outcome because they answer different operator questions.
- Keep foundation-backed subjects eligible only when the runtime can truthfully classify them through an inventory-backed or limited-capability path.
- Add a runtime consistency guard during scope or resolver preparation so baseline_compare.supported cannot silently overpromise structural capability.
- Preserve transient reasons such as throttling and capture failure as precise operational outcomes rather than absorbing them into structural taxonomy.
- Treat broad legacy gap shapes as development-only cleanup candidates rather than a compatibility requirement for the new runtime contract.
Phase 1 — Design & Contracts (DONE)
Outputs:
- specs/163-baseline-subject-resolution/data-model.md
- specs/163-baseline-subject-resolution/contracts/openapi.yaml
- specs/163-baseline-subject-resolution/quickstart.md
Design highlights:
- The core semantic unit is a SubjectDescriptor that is classified before resolution and yields a deterministic ResolutionOutcomeRecord.
- OperationRun.context remains the canonical persisted contract for compare and capture evidence-gap semantics, but new runs store richer subject-level objects instead of reason plus raw string only.
- The runtime support guard sits before compare and capture execution so unsupported structural mismatches are blocked or reclassified before misleading policy_not_found-style outcomes are emitted.
- Existing detail and landing surfaces are updated for the new structured gap contract, and development fixtures or stale local run data are regenerated instead of driving a permanent compatibility layer.
- Compare and capture share the same root-cause vocabulary, but retain operation-specific outcome families where needed.
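The classify-then-resolve flow described in the design highlights can be illustrated with a minimal sketch. Only the names SubjectDescriptor and ResolutionOutcomeRecord come from this plan; the enum cases, property names, outcome strings, and the `describeSubject()` and `resolveSubject()` helpers are hypothetical and exist only to show the ordering of the two steps.

```php
<?php

declare(strict_types=1);

// Hypothetical sketch: everything except the class names SubjectDescriptor
// and ResolutionOutcomeRecord is an illustrative assumption.

enum SubjectClass: string
{
    case PolicyBacked = 'policy_backed';
    case InventoryBacked = 'inventory_backed';
    case FoundationBacked = 'foundation_backed';
}

final class SubjectDescriptor
{
    public function __construct(
        public readonly string $policyType,
        public readonly string $subjectKey,
        public readonly SubjectClass $subjectClass,
    ) {}
}

final class ResolutionOutcomeRecord
{
    public function __construct(
        public readonly SubjectDescriptor $subject,
        public readonly string $resolutionOutcome,
    ) {}
}

/**
 * Step 1: classify from static type metadata only, before any lookup.
 *
 * @param array<string, SubjectClass> $typeMeta assumed map of policy type to subject class
 */
function describeSubject(string $policyType, string $subjectKey, array $typeMeta): ?SubjectDescriptor
{
    $class = $typeMeta[$policyType] ?? null;

    return $class === null ? null : new SubjectDescriptor($policyType, $subjectKey, $class);
}

/**
 * Step 2: resolve. The subject class decides which local record is consulted,
 * so an inventory-backed subject is never reported as a missing policy.
 */
function resolveSubject(SubjectDescriptor $subject, bool $policyExists, bool $inventoryExists): ResolutionOutcomeRecord
{
    $outcome = match ($subject->subjectClass) {
        SubjectClass::PolicyBacked => $policyExists ? 'resolved_policy' : 'policy_record_missing',
        SubjectClass::InventoryBacked,
        SubjectClass::FoundationBacked => $inventoryExists ? 'resolved_inventory' : 'inventory_record_missing',
    };

    return new ResolutionOutcomeRecord($subject, $outcome);
}
```

Because both helpers are pure functions of their arguments, the same descriptor and the same lookup results always produce the same record, which is the determinism property the plan asks for.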
Phase 1 — Agent Context Update (REQUIRED)
Run:
.specify/scripts/bash/update-agent-context.sh copilot
Constitution Check — Post-Design Re-evaluation
- PASS — the design remains inside existing compare and capture operations and does not add new remote-call paths or lifecycle mutations.
- PASS — inventory-first semantics are strengthened because inventory-backed subjects are no longer mislabeled as missing policy records.
- PASS — operator surfaces stay on existing pages and remain DB-only at render time.
- PASS — development cleanup is explicit and bounded; the new contract remains the only forward-looking runtime shape.
- PASS — no Action Surface or UX-001 exemptions are needed because action topology and layouts remain intact.
Phase 2 — Implementation Plan
Step 1 — Subject classification and runtime capability foundation
Goal: implement FR-001 through FR-003, FR-008, FR-015, and FR-016 by creating a deterministic subject-resolution foundation shared by compare and capture.
Changes:
- Introduce a dedicated subject-resolution support layer under app/Support/Baselines/ that defines:
  - subject classes
  - resolution paths
  - resolution outcomes
  - operator action categories
  - structural versus operational versus transient classification
- Extend InventoryPolicyTypeMeta and related metadata accessors so baseline support can express whether a type is policy-backed, inventory-backed, foundation-backed, or limited.
- Add a runtime capability guard used by BaselineScope, BaselineCompareService, and BaselineCaptureService so types only enter compare or capture on a truthful path.
- Keep the guard deterministic and explicit in logs or run context when support is limited or excluded.
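One way to picture the runtime capability guard is as a pure partition of baseline-supported types into those with a truthful resolver path, those with limited support, and those that must be excluded. The sketch below is an assumption-laden illustration: the function name `guardBaselineTypes()`, the path labels, and the array shapes are hypothetical and do not describe the actual guard API.

```php
<?php

declare(strict_types=1);

// Hypothetical sketch: function name, path labels, and array shapes are assumptions.

/**
 * Partition baseline-supported types by whether the runtime has a truthful
 * resolver path for them.
 *
 * @param list<string>          $supportedTypes types the configuration marks as baseline-supported
 * @param array<string, string> $resolverPaths  assumed map of type to 'policy', 'inventory', or 'limited'
 *
 * @return array{allowed: list<string>, limited: list<string>, excluded: list<string>}
 */
function guardBaselineTypes(array $supportedTypes, array $resolverPaths): array
{
    $result = ['allowed' => [], 'limited' => [], 'excluded' => []];

    // De-duplicate and sort so the same configuration always yields the same output.
    $types = array_values(array_unique($supportedTypes));
    sort($types, SORT_STRING);

    foreach ($types as $type) {
        $path = $resolverPaths[$type] ?? null;

        $bucket = match ($path) {
            'policy', 'inventory' => 'allowed',
            'limited' => 'limited',
            default => 'excluded', // no truthful path: must not enter compare or capture
        };

        $result[$bucket][] = $type;
    }

    return $result;
}
```

The `limited` and `excluded` buckets are returned rather than silently dropped so a caller could record them in logs or run context, in line with the last bullet above.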
Tests:
- Add unit tests for subject-class derivation, resolution-path derivation, and runtime-capability guard behavior.
- Add golden-style tests covering supported, limited, and structurally invalid foundation types.
Step 2 — Capture-path resolution and gap taxonomy upgrade
Goal: implement FR-004 through FR-010 on the capture side so structural resolver mismatches are no longer emitted as generic missing-policy cases.
Changes:
- Refactor BaselineContentCapturePhase so it resolves subjects through the new subject contract rather than assuming a policy lookup for all subjects.
- Replace broad policy_not_found capture gaps with precise structured outcomes such as:
  - policy record missing
  - inventory record missing
  - foundation-backed via inventory path
  - resolution type mismatch
  - unresolvable subject
- Preserve existing transient outcomes like throttled, capture_failed, and budget_exhausted unchanged except for richer structured metadata.
- Persist new structured gap-subject objects for new runs and remove any requirement to keep broad legacy reason shapes alive for future writes.
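The capture-side taxonomy change above can be sketched as a small mapping function. The transient reason codes throttled, capture_failed, and budget_exhausted are taken from this plan; the remaining reason-code spellings, the category assigned to each outcome, the `retryable` flag values, and the function names are assumptions made for illustration.

```php
<?php

declare(strict_types=1);

// Hypothetical sketch: reason-code spellings other than the three transient
// ones, the category per outcome, and the retryable values are assumptions.

const TRANSIENT_REASONS = ['throttled', 'capture_failed', 'budget_exhausted'];

/** @return array{reason_code: string, category: string, retryable: bool} */
function gap(string $reasonCode, string $category, bool $retryable): array
{
    return ['reason_code' => $reasonCode, 'category' => $category, 'retryable' => $retryable];
}

/**
 * Map a capture gap to a precise outcome instead of a broad policy_not_found.
 *
 * @param string|null $subjectClass 'policy_backed', 'inventory_backed', 'foundation_backed', or null if unclassifiable
 * @param bool        $typeMatches  false when the located record has a different type than expected
 *
 * @return array{reason_code: string, category: string, retryable: bool}
 */
function classifyCaptureGap(string $legacyReason, ?string $subjectClass, bool $typeMatches = true): array
{
    // Transient reasons keep their existing code; only metadata is added.
    if (in_array($legacyReason, TRANSIENT_REASONS, true)) {
        return gap($legacyReason, 'transient', true);
    }

    if ($subjectClass === null) {
        return gap('unresolvable_subject', 'structural', false);
    }

    if (! $typeMatches) {
        return gap('resolution_type_mismatch', 'structural', false);
    }

    return match ($subjectClass) {
        'policy_backed' => gap('policy_record_missing', 'operational', false),
        'inventory_backed' => gap('inventory_record_missing', 'operational', false),
        'foundation_backed' => gap('foundation_inventory_only', 'structural', false),
        default => gap('unresolvable_subject', 'structural', false),
    };
}
```

With this shape, a foundation-backed subject that previously surfaced as policy_not_found would be reported as a structural outcome, while a throttled capture keeps its original code.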
Tests:
- Add feature and unit coverage for capture-path classification across policy-backed, inventory-backed, foundation-backed, duplicate, invalid, and transient cases.
- Add deterministic replay coverage proving unchanged capture inputs produce unchanged outcomes.
- Add regressions proving structural foundation subjects no longer produce new generic policy_not_found gaps.
Step 3 — Compare-path resolution and evidence-gap detail contract
Goal: implement FR-004 through FR-014 on the compare side by aligning current-evidence resolution, evidence-gap reasoning, and persisted run context with the new contract.
Changes:
- Refactor CompareBaselineToTenantJob so baseline item interpretation and current-state resolution produce explicit resolution_outcome records rather than only count buckets and raw subject keys.
- Add structured evidence-gap subject records under baseline_compare.evidence_gaps.subjects for new runs, including subject class, resolution path, resolution outcome, reason code, operator action category, and retryability or structural flags.
- Preserve already precise compare reasons such as missing_current, ambiguous_match, and role-definition-specific gap families while separating them from structural non-policy-backed outcomes.
- Ensure baseline compare reason translation remains aligned with the new detailed reason taxonomy instead of flattening distinct root causes.
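A possible shape for the structured records under baseline_compare.evidence_gaps.subjects is sketched below. The context path and the keys subject_class and resolution_outcome appear in this plan; the other key spellings (for example `subject_key`, `operator_action_category`, `retryable`, `structural`) and the builder function are assumptions derived from the field list above.

```php
<?php

declare(strict_types=1);

// Hypothetical sketch: key spellings beyond subject_class and resolution_outcome,
// and the builder itself, are assumptions based on the field list in this step.

const GAP_SUBJECT_KEYS = [
    'subject_key',
    'subject_class',
    'resolution_path',
    'resolution_outcome',
    'reason_code',
    'operator_action_category',
    'retryable',
    'structural',
];

/**
 * Build the evidence-gap fragment of a run context from subject-level records.
 *
 * @param list<array<string, mixed>> $subjects
 *
 * @return array{baseline_compare: array{evidence_gaps: array{subjects: list<array<string, mixed>>}}}
 */
function buildEvidenceGapContext(array $subjects): array
{
    $rows = [];

    foreach ($subjects as $subject) {
        $missing = array_diff(GAP_SUBJECT_KEYS, array_keys($subject));

        if ($missing !== []) {
            throw new InvalidArgumentException('Gap subject is missing: '.implode(', ', $missing));
        }

        // Copy only the known keys, in a fixed order, so raw payloads never leak in.
        $row = [];
        foreach (GAP_SUBJECT_KEYS as $key) {
            $row[$key] = $subject[$key];
        }
        $rows[] = $row;
    }

    // Stable ordering keeps the persisted shape independent of processing order.
    usort($rows, fn (array $a, array $b): int => [$a['subject_class'], $a['subject_key']] <=> [$b['subject_class'], $b['subject_key']]);

    return ['baseline_compare' => ['evidence_gaps' => ['subjects' => $rows]]];
}
```

Restricting each row to a fixed key list is one way to honor the data-minimization point in the Constitution Check: classification and stable identifiers are stored, and anything else on the input record is dropped.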
Tests:
- Add feature tests for mixed compare runs containing structural, operational, transient, and successful subjects.
- Add deterministic compare tests proving identical inputs yield identical resolution outcomes.
- Add regressions for evidence-gap persistence shape and compare-surface rendering against the new structured contract.
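A determinism regression of the kind listed above could compare fingerprints of two runs over the same inputs. The helper below is a minimal sketch under the assumption that resolution outcomes can be represented as nested arrays and that the order of list entries carries no meaning; the function names `canonicalize()` and `outcomeFingerprint()` are hypothetical.

```php
<?php

declare(strict_types=1);

// Hypothetical sketch: assumes outcomes are plain nested arrays and that the
// order of entries inside a list is not semantically meaningful.

/**
 * Recursively normalize a value: sort associative keys and order list
 * entries by their encoded form.
 */
function canonicalize(mixed $value): mixed
{
    if (! is_array($value)) {
        return $value;
    }

    $value = array_map('canonicalize', $value);

    if (array_is_list($value)) {
        usort($value, fn (mixed $a, mixed $b): int => strcmp(
            json_encode($a, JSON_THROW_ON_ERROR),
            json_encode($b, JSON_THROW_ON_ERROR),
        ));

        return $value;
    }

    ksort($value);

    return $value;
}

/**
 * Hash the canonical form so two runs can be compared with one assertion.
 *
 * @param array<mixed> $outcomes
 */
function outcomeFingerprint(array $outcomes): string
{
    return hash('sha256', json_encode(canonicalize($outcomes), JSON_THROW_ON_ERROR));
}
```

A test would then run compare twice over identical fixtures and assert that both fingerprints are equal, while a changed outcome for any subject produces a different fingerprint.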
Step 4 — Development cleanup and operator-surface adoption
Goal: implement FR-011 through FR-014 and the User Story 3 acceptance scenarios by moving existing read surfaces to the new gap contract and treating stale development data as disposable.
Changes:
- Extend BaselineCompareEvidenceGapDetails, BaselineCompareStats, OperationRunResource, BaselineCompareLanding, and any related Livewire gap tables so they read the new structured gap subject records consistently.
- Add an explicit development cleanup mechanism for stale local run payloads, preferably a dedicated development-only Artisan command plus fixture regeneration steps, so old broad string-only gap subjects can be purged instead of preserved.
- Introduce operator-facing labels that answer root cause before action advice while keeping diagnostics secondary.
- Keep existing pages and sections, but expose structural versus operational versus transient semantics consistently across dense and detailed surfaces.
- Update snapshot and compare summary surfaces where old broad reason aggregations would otherwise misread the new taxonomy.
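The core of a development cleanup mechanism is deciding which stored run contexts still carry the old shape. The sketch below shows one such check in plain PHP. It assumes, hypothetically, that a legacy payload stores something other than keyed subject objects (for example, reason codes mapped to lists of raw strings) and that a structured subject always carries subject_class and resolution_outcome; the function names are illustrative and no Artisan wiring is shown.

```php
<?php

declare(strict_types=1);

// Hypothetical sketch: the assumed legacy shape (string-only or reason-keyed
// subjects) and both function names are illustrative, not the real command.

/**
 * True when a run context still carries pre-contract gap subjects.
 *
 * @param array<string, mixed> $context decoded run context
 */
function hasLegacyGapSubjects(array $context): bool
{
    $subjects = $context['baseline_compare']['evidence_gaps']['subjects'] ?? null;

    if (! is_array($subjects) || $subjects === []) {
        return false; // nothing stored, so nothing to clean up
    }

    foreach ($subjects as $entry) {
        // A structured subject is an array carrying both classification fields.
        if (! is_array($entry) || ! isset($entry['subject_class'], $entry['resolution_outcome'])) {
            return true;
        }
    }

    return false;
}

/**
 * Select the run ids whose payloads should be purged or regenerated.
 *
 * @param array<int, array<string, mixed>> $contextsByRunId
 *
 * @return list<int>
 */
function selectStaleRunIds(array $contextsByRunId): array
{
    $ids = array_keys(array_filter($contextsByRunId, 'hasLegacyGapSubjects'));
    sort($ids);

    return $ids;
}
```

Keeping the detection as a pure function makes it straightforward to cover with unit tests before attaching it to a development-only command.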
Tests:
- Add or update Filament feature tests for canonical run detail and tenant baseline compare landing against the new structured run shape.
- Add cleanup-oriented tests proving the development cleanup mechanism removes or invalidates stale broad-reason run payloads without extending production semantics.
Step 5 — Focused validation pack and rollout safety
Goal: protect the foundation from semantic regressions and make follow-on fidelity work safe.
Changes:
- Add a focused regression pack spanning compare, capture, capability guard, and development-safe contract cleanup.
- Review every touched reason-label and badge usage to ensure structural, operational, and transient meanings remain centralized.
- Document the new backend contract shape in code-level PHPDoc and tests so follow-on specs can build on stable semantics.
- Keep rollout bounded to baseline compare and capture semantics without adding renderer-richness work from Spec 164.
Tests:
- Run the focused Pest pack in quickstart.md.
- Add one regression proving no render-time Graph calls occur on affected run-detail or landing surfaces.