ahmido c17255f854 feat: implement baseline subject resolution semantics (#193)

## Summary
- add the structured subject-resolution foundation for baseline compare and baseline capture, including capability guards, subject descriptors, resolution outcomes, and operator action categories
- persist structured evidence-gap subject records and update compare/capture surfaces, landing projections, and cleanup tooling to use the new contract
- add Spec 163 artifacts and focused Pest coverage for classification, determinism, cleanup, and DB-only rendering

## Validation
- `vendor/bin/sail bin pint --dirty --format agent`
- `vendor/bin/sail artisan test --compact tests/Unit/Support/Baselines tests/Feature/Baselines tests/Feature/Filament/OperationRunEnterpriseDetailPageTest.php`

## Notes
- verified locally that a fresh post-restart baseline compare run now writes structured `baseline_compare.evidence_gaps.subjects` records instead of the legacy broad payload shape
- excluded the separate `docs/product/spec-candidates.md` worktree change from this branch commit and PR

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #193

2026-03-25 12:40:45 +00:00


Implementation Plan: 163 — Baseline Subject Resolution and Evidence Gap Semantics Foundation

Branch: 163-baseline-subject-resolution | Date: 2026-03-24 | Spec: specs/163-baseline-subject-resolution/spec.md
Input: Feature specification from specs/163-baseline-subject-resolution/spec.md

Note: This template is filled in by the /speckit.plan command. See .specify/scripts/ for helper scripts.

Summary

Introduce an explicit backend subject-resolution contract for baseline compare and baseline capture. With it, the system classifies each subject before resolution, selects the correct local model path, and persists precise operator-safe gap semantics instead of collapsing structural, operational, and transient causes into broad policy_not_found-style states. The implementation extends existing baseline scope, inventory policy-type metadata, compare and capture jobs, baseline evidence-gap detail parsing, and OperationRun context persistence rather than introducing a parallel execution stack. A bounded runtime support guard prevents baseline-supported types from entering compare or capture on a resolver path that cannot truthfully classify them.

Technical Context

Language/Version: PHP 8.4
Primary Dependencies: Laravel 12, Filament v5, Livewire v4
Storage: PostgreSQL via existing application tables, especially operation_runs.context and baseline snapshot summary JSON
Testing: Pest v4 on PHPUnit 12
Target Platform: Dockerized Laravel web application running through Sail locally and Dokploy in deployment
Project Type: Web application
Performance Goals: Preserve DB-only render behavior for Monitoring and tenant review surfaces, add no render-time Graph calls, and keep evidence-gap interpretation deterministic and lightweight enough for existing run-detail and landing surfaces
Constraints:

No new render-time remote work and no bypass of GraphClientInterface
No change to OperationRun lifecycle ownership, notification channels, or summary-count rules
No new operator screen; existing surfaces must present richer semantics
Existing development-only run payloads may be deleted or regenerated if that simplifies migration to the new structured contract
Baseline-supported configuration must not overpromise runtime capability

Scale/Scope: Cross-cutting backend semantic work across baseline compare and capture pipelines, support-layer parsers and translators, OperationRun context contracts, tenant and canonical read surfaces, and focused Pest coverage for deterministic classification and development-safe contract cleanup

Constitution Check

GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.

Inventory-first: PASS — the design keeps inventory as last-observed truth and distinguishes inventory-backed evidence from policy-backed evidence rather than conflating them.
Read/write separation: PASS — this feature changes classification and persisted run semantics inside existing compare and capture flows; it does not add new write or restore actions.
Graph contract path: PASS — no new Graph contract or direct endpoint use is introduced; existing capture and sync services remain the only remote paths.
Deterministic capabilities: PASS — subject-class derivation, resolution outcome mapping, and support-capability guards are explicitly designed to be deterministic and testable.
RBAC-UX: PASS — existing /admin and tenant-context authorization boundaries remain unchanged; only read semantics improve.
Workspace isolation: PASS — no new workspace leakage is introduced and canonical run-detail remains tenant-safe.
RBAC confirmations: PASS — no new destructive actions are added.
Global search: PASS — unaffected.
Tenant isolation: PASS — all compare, capture, inventory, and run data remain tenant-bound and entitlement-checked.
Run observability: PASS — compare and capture continue to use existing OperationRun types; this slice enriches context semantics only.
Ops-UX 3-surface feedback: PASS — no new toast, progress, or terminal-notification channels are added.
Ops-UX lifecycle: PASS — OperationRun.status and OperationRun.outcome remain service-owned; only context enrichment changes.
Ops-UX summary counts: PASS — no non-numeric values are moved into summary_counts; richer semantics live in context and read models.
Ops-UX guards: PASS — focused regression tests can protect classification determinism and development cleanup behavior without relaxing existing CI rules.
Ops-UX system runs: PASS — unchanged.
Automation: PASS — existing queue, retry, and backoff behavior stays intact; transient outcomes are classified more precisely, not re-executed differently.
Data minimization: PASS — the new gap detail contract stores classification and stable identifiers, not raw policy payloads or secrets.
Badge semantics (BADGE-001): PASS — if structural, operational, or transient labels surface as badges, they must route through centralized badge or presentation helpers rather than ad hoc maps.
UI naming (UI-NAMING-001): PASS — the feature exists to replace implementation-first broad error prose with domain-first operator meaning.
Operator surfaces (OPSURF-001): PASS — existing run detail and tenant review surfaces remain operator-first and diagnostics-secondary.
Filament UI Action Surface Contract: PASS — action topology stays unchanged; this is a read-surface semantics upgrade.
Filament UI UX-001 (Layout & IA): PASS — existing layouts remain, but sections become more semantically truthful. No exemption required.

Project Structure

Documentation (this feature)

specs/163-baseline-subject-resolution/
├── plan.md
├── research.md
├── data-model.md
├── quickstart.md
├── contracts/
│   └── openapi.yaml
├── checklists/
│   └── requirements.md
└── tasks.md

Source Code (repository root)

app/
├── Filament/
│   ├── Pages/
│   │   └── BaselineCompareLanding.php
│   ├── Resources/
│   │   ├── OperationRunResource.php
│   │   └── BaselineSnapshotResource.php
│   └── Widgets/
├── Jobs/
│   ├── CompareBaselineToTenantJob.php
│   └── CaptureBaselineSnapshotJob.php
├── Services/
│   ├── Baselines/
│   │   ├── BaselineCompareService.php
│   │   ├── BaselineCaptureService.php
│   │   ├── BaselineContentCapturePhase.php
│   │   └── Evidence/
│   ├── Intune/
│   │   └── PolicySyncService.php
│   └── Inventory/
│       └── InventorySyncService.php
├── Support/
│   ├── Baselines/
│   ├── Inventory/
│   ├── OpsUx/
│   └── Ui/
├── Livewire/
└── Models/
config/
├── tenantpilot.php
└── graph_contracts.php
tests/
├── Feature/
│   ├── Baselines/
│   ├── Filament/
│   └── Monitoring/
└── Unit/
    └── Support/

Structure Decision: Web application. The work stays inside existing baseline jobs and services, support-layer value objects and presenters, current Filament surfaces, and focused Pest coverage. No new top-level architecture area is required.

Complexity Tracking

This feature introduces no constitution violations, so there is nothing to justify here.

Phase 0 — Outline & Research (DONE)

Outputs:

specs/163-baseline-subject-resolution/research.md

Key decisions captured:

Introduce a first-class subject-resolution contract in the backend instead of solving the problem with UI-only relabeling.
Persist both subject_class and resolution_outcome because they answer different operator questions.
Keep foundation-backed subjects eligible only when the runtime can truthfully classify them through an inventory-backed or limited-capability path.
Add a runtime consistency guard during scope or resolver preparation so baseline_compare.supported cannot silently overpromise structural capability.
Preserve transient reasons such as throttling and capture failure as precise operational outcomes rather than absorbing them into structural taxonomy.
Treat broad legacy gap shapes as development-only cleanup candidates rather than a compatibility requirement for the new runtime contract.

Phase 1 — Design & Contracts (DONE)

Outputs:

specs/163-baseline-subject-resolution/data-model.md
specs/163-baseline-subject-resolution/contracts/openapi.yaml
specs/163-baseline-subject-resolution/quickstart.md

Design highlights:

The core semantic unit is a SubjectDescriptor that is classified before resolution and yields a deterministic ResolutionOutcomeRecord.
OperationRun.context remains the canonical persisted contract for compare and capture evidence-gap semantics, but new runs store richer subject-level objects instead of only a reason code plus a raw subject string.
The runtime support guard sits before compare and capture execution so unsupported structural mismatches are blocked or reclassified before misleading policy_not_found-style outcomes are emitted.
Existing detail and landing surfaces are updated for the new structured gap contract, and development fixtures or stale local run data are regenerated instead of driving a permanent compatibility layer.
Compare and capture share the same root-cause vocabulary, but retain operation-specific outcome families where needed.
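
As an illustration of the first design highlight, here is a minimal sketch of a subject that is classified before resolution and then yields a deterministic outcome record. The plan names SubjectDescriptor and ResolutionOutcomeRecord but does not specify their fields, so the SubjectClass enum, the property names, the resolveSubject() helper, and the outcome strings are all assumptions made for this example.

```php
<?php

declare(strict_types=1);

// Hypothetical enum: the plan mentions these subject classes, not their codes.
enum SubjectClass: string
{
    case PolicyBacked = 'policy_backed';
    case InventoryBacked = 'inventory_backed';
    case FoundationBacked = 'foundation_backed';
}

// Immutable descriptor: the class is fixed before any lookup happens.
final readonly class SubjectDescriptor
{
    public function __construct(
        public string $policyType,
        public string $subjectKey,
        public SubjectClass $subjectClass,
    ) {}
}

final readonly class ResolutionOutcomeRecord
{
    public function __construct(
        public SubjectDescriptor $subject,
        public string $resolutionOutcome,
    ) {}

    /** @return array<string, string> */
    public function toArray(): array
    {
        return [
            'policy_type' => $this->subject->policyType,
            'subject_key' => $this->subject->subjectKey,
            'subject_class' => $this->subject->subjectClass->value,
            'resolution_outcome' => $this->resolutionOutcome,
        ];
    }
}

// Pure function: the same descriptor and lookup result always give the same record.
function resolveSubject(SubjectDescriptor $subject, bool $localRecordExists): ResolutionOutcomeRecord
{
    $outcome = match (true) {
        $localRecordExists => 'resolved',
        $subject->subjectClass === SubjectClass::PolicyBacked => 'policy_record_missing',
        default => 'inventory_record_missing',
    };

    return new ResolutionOutcomeRecord($subject, $outcome);
}
```

Because the record carries both subject_class and resolution_outcome, a missing inventory-backed subject is never reported as a missing policy record.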

Phase 1 — Agent Context Update (REQUIRED)

Run:

.specify/scripts/bash/update-agent-context.sh copilot

Constitution Check — Post-Design Re-evaluation

PASS — the design remains inside existing compare and capture operations and does not add new remote-call paths or lifecycle mutations.
PASS — inventory-first semantics are strengthened because inventory-backed subjects are no longer mislabeled as missing policy records.
PASS — operator surfaces stay on existing pages and remain DB-only at render time.
PASS — development cleanup is explicit and bounded; the new contract remains the only forward-looking runtime shape.
PASS — no Action Surface or UX-001 exemptions are needed because action topology and layouts remain intact.

Phase 2 — Implementation Plan

Step 1 — Subject classification and runtime capability foundation

Goal: implement FR-001 through FR-003, FR-008, FR-015, and FR-016 by creating a deterministic subject-resolution foundation shared by compare and capture.

Changes:

Introduce a dedicated subject-resolution support layer under app/Support/Baselines/ that defines:
- subject classes
- resolution paths
- resolution outcomes
- operator action categories
- structural versus operational versus transient classification
Extend InventoryPolicyTypeMeta and related metadata accessors so baseline support can express whether a type is policy-backed, inventory-backed, foundation-backed, or limited.
Add a runtime capability guard used by BaselineScope, BaselineCompareService, and BaselineCaptureService so types only enter compare or capture on a truthful path.
Keep the guard deterministic and explicit in logs or run context when support is limited or excluded.
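
The guard described above could be sketched roughly as follows. This is not the project's implementation: guardBaselineTypes(), the support-mode strings, and the bucket names are hypothetical, and the real guard would read its metadata from InventoryPolicyTypeMeta rather than from a plain array.

```php
<?php

declare(strict_types=1);

/**
 * Splits requested policy types into buckets before compare or capture runs.
 *
 * @param list<string>          $requestedTypes  types selected by the baseline scope
 * @param array<string, string> $supportModes    type => support mode (assumed metadata shape)
 * @param list<string>          $resolvableModes modes the runtime can truthfully resolve
 *
 * @return array{allowed: list<string>, limited: list<string>, excluded: list<string>}
 */
function guardBaselineTypes(array $requestedTypes, array $supportModes, array $resolvableModes): array
{
    $result = ['allowed' => [], 'limited' => [], 'excluded' => []];

    // De-duplicate and sort so the result does not depend on input order.
    $types = array_values(array_unique($requestedTypes));
    sort($types);

    foreach ($types as $type) {
        $mode = $supportModes[$type] ?? null;

        if ($mode === null || ! in_array($mode, $resolvableModes, true)) {
            // Unknown type, or no truthful resolver path: keep it out of the run.
            $result['excluded'][] = $type;
        } elseif ($mode === 'limited') {
            $result['limited'][] = $type;
        } else {
            $result['allowed'][] = $type;
        }
    }

    return $result;
}
```

The returned buckets are plain data, so they can be logged or written into run context to make limited or excluded support explicit.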

Tests:

Add unit tests for subject-class derivation, resolution-path derivation, and runtime-capability guard behavior.
Add golden-style tests covering supported, limited, and structurally invalid foundation types.

Step 2 — Capture-path resolution and gap taxonomy upgrade

Goal: implement FR-004 through FR-010 on the capture side so structural resolver mismatches are no longer emitted as generic missing-policy cases.

Changes:

Refactor BaselineContentCapturePhase so it resolves subjects through the new subject contract rather than assuming a policy lookup for all subjects.
Replace broad policy_not_found capture gaps with precise structured outcomes such as:
- policy record missing
- inventory record missing
- foundation-backed via inventory path
- resolution type mismatch
- unresolvable subject
Preserve existing transient outcomes like throttled, capture_failed, and budget_exhausted unchanged except for richer structured metadata.
Persist new structured gap-subject objects for new runs and remove any requirement to keep broad legacy reason shapes alive for future writes.
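
To show how a broad policy_not_found gap might be split into the outcomes listed above, here is a small sketch. The function name, the snake_case codes, and the rule that foundation-backed subjects resolve through the inventory path are assumptions for illustration; the plan lists the outcome concepts but not their codes.

```php
<?php

declare(strict_types=1);

/**
 * Maps a subject class and the attempted resolver path to a precise outcome.
 * All codes are hypothetical names for the outcomes the plan lists.
 */
function classifyCaptureOutcome(string $subjectClass, string $resolverPath, bool $recordExists): string
{
    // Which local path can truthfully resolve this class of subject.
    $expectedPath = match ($subjectClass) {
        'policy_backed' => 'policy',
        'inventory_backed', 'foundation_backed' => 'inventory',
        default => null,
    };

    if ($expectedPath === null) {
        return 'unresolvable_subject';
    }

    if ($resolverPath !== $expectedPath) {
        // Structural problem: the lookup could never have succeeded.
        return 'resolution_type_mismatch';
    }

    if (! $recordExists) {
        return $expectedPath === 'policy'
            ? 'policy_record_missing'
            : 'inventory_record_missing';
    }

    return $subjectClass === 'foundation_backed'
        ? 'foundation_via_inventory'
        : 'resolved';
}
```

Note that a foundation-backed subject looked up on the policy path yields resolution_type_mismatch rather than a generic missing-policy gap.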

Tests:

Add feature and unit coverage for capture-path classification across policy-backed, inventory-backed, foundation-backed, duplicate, invalid, and transient cases.
Add deterministic replay coverage proving unchanged capture inputs produce unchanged outcomes.
Add regressions proving structural foundation subjects no longer produce new generic policy_not_found gaps.

Step 3 — Compare-path resolution and evidence-gap detail contract

Goal: implement FR-004 through FR-014 on the compare side by aligning current-evidence resolution, evidence-gap reasoning, and persisted run context with the new contract.

Changes:

Refactor CompareBaselineToTenantJob so baseline item interpretation and current-state resolution produce explicit resolution_outcome records rather than only count buckets and raw subject keys.
Add structured evidence-gap subject records under baseline_compare.evidence_gaps.subjects for new runs, including subject class, resolution path, resolution outcome, reason code, operator action category, and retryability or structural flags.
Preserve already precise compare reasons such as missing_current, ambiguous_match, and role-definition-specific gap families while separating them from structural non-policy-backed outcomes.
Ensure baseline compare reason translation remains aligned with the new detailed reason taxonomy instead of flattening distinct root causes.
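
A structured evidence-gap subject record under baseline_compare.evidence_gaps.subjects might look like the sketch below. Only that context path comes from the plan; the key names inside each record and both helper functions are assumptions.

```php
<?php

declare(strict_types=1);

/**
 * Builds one structured gap-subject record.
 * Key names are illustrative; the plan lists the concepts, not the exact fields.
 *
 * @return array<string, string|bool>
 */
function buildGapSubject(
    string $policyType,
    string $subjectKey,
    string $subjectClass,
    string $resolutionPath,
    string $resolutionOutcome,
    string $reasonCode,
    string $operatorActionCategory,
    bool $retryable,
    bool $structural,
): array {
    return [
        'policy_type' => $policyType,
        'subject_key' => $subjectKey,
        'subject_class' => $subjectClass,
        'resolution_path' => $resolutionPath,
        'resolution_outcome' => $resolutionOutcome,
        'reason_code' => $reasonCode,
        'operator_action_category' => $operatorActionCategory,
        'retryable' => $retryable,
        'structural' => $structural,
    ];
}

/**
 * Appends a record to the run context without touching other context keys.
 *
 * @param array<string, mixed>       $context
 * @param array<string, string|bool> $subject
 *
 * @return array<string, mixed>
 */
function appendGapSubject(array $context, array $subject): array
{
    // Nested keys are created on demand if this is the first gap subject.
    $context['baseline_compare']['evidence_gaps']['subjects'][] = $subject;

    return $context;
}
```

Keeping retryable and structural as booleans, rather than encoding them in the reason string, lets read surfaces filter on them without parsing.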

Tests:

Add feature tests for mixed compare runs containing structural, operational, transient, and successful subjects.
Add deterministic compare tests proving identical inputs yield identical resolution outcomes.
Add regressions for evidence-gap persistence shape and compare-surface rendering against the new structured contract.
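
One way a determinism test could compare two runs is to fingerprint the outcome records after normalizing their order. The sketch below assumes each record is an array with policy_type and subject_key keys; fingerprintOutcomes() is a hypothetical helper, not part of the codebase.

```php
<?php

declare(strict_types=1);

/**
 * Produces an order-independent fingerprint of resolution outcome records.
 *
 * @param list<array<string, string|bool>> $records
 */
function fingerprintOutcomes(array $records): string
{
    // Sort keys inside each record so key order cannot change the hash.
    $normalized = array_map(static function (array $record): array {
        ksort($record);

        return $record;
    }, $records);

    // Sort records by (policy_type, subject_key) so processing order is irrelevant.
    usort(
        $normalized,
        static fn (array $a, array $b): int =>
            [$a['policy_type'], $a['subject_key']] <=> [$b['policy_type'], $b['subject_key']],
    );

    return hash('sha256', json_encode($normalized, JSON_THROW_ON_ERROR));
}
```

A test can then assert that two runs over identical inputs produce equal fingerprints, and that changing any single outcome produces a different one.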

Step 4 — Development cleanup and operator-surface adoption

Goal: implement FR-011 through FR-014 and the User Story 3 acceptance scenarios by moving existing read surfaces to the new gap contract and treating stale development data as disposable.

Changes:

Extend BaselineCompareEvidenceGapDetails, BaselineCompareStats, OperationRunResource, BaselineCompareLanding, and any related Livewire gap tables so they read the new structured gap subject records consistently.
Add an explicit development cleanup mechanism for stale local run payloads, preferably a dedicated development-only Artisan command plus fixture regeneration steps, so old broad string-only gap subjects can be purged instead of preserved.
Introduce operator-facing labels that answer root cause before action advice while keeping diagnostics secondary.
Keep existing pages and sections, but expose structural versus operational versus transient semantics consistently across dense and detailed surfaces.
Update snapshot and compare summary surfaces where old broad reason aggregations would otherwise misread the new taxonomy.
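
For the development cleanup mechanism, the core check is deciding whether a stored run context still uses the old string-only shape. The sketch below assumes a structured subject is an array carrying subject_class and resolution_outcome, and treats anything else as legacy; the exact legacy shape is not described in this plan, and hasLegacyGapSubjects() is a hypothetical helper.

```php
<?php

declare(strict_types=1);

/**
 * Returns true when a run context contains gap subjects that are not in the
 * structured shape, which makes the run a candidate for development cleanup.
 *
 * @param array<string, mixed> $context
 */
function hasLegacyGapSubjects(array $context): bool
{
    $subjects = $context['baseline_compare']['evidence_gaps']['subjects'] ?? [];

    if (! is_array($subjects)) {
        return true;
    }

    foreach ($subjects as $subject) {
        // Raw strings, or arrays without the structured keys, count as legacy.
        if (! is_array($subject) || ! isset($subject['subject_class'], $subject['resolution_outcome'])) {
            return true;
        }
    }

    return false;
}
```

A development-only command could run this check over local runs and delete or flag the matches, leaving production semantics untouched.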

Tests:

Add or update Filament feature tests for canonical run detail and tenant baseline compare landing against the new structured run shape.
Add cleanup-oriented tests proving the development cleanup mechanism removes or invalidates stale broad-reason run payloads without extending production semantics.

Step 5 — Focused validation pack and rollout safety

Goal: protect the foundation from semantic regressions and make follow-on fidelity work safe.

Changes:

Add a focused regression pack spanning compare, capture, capability guard, and development-safe contract cleanup.
Review every touched reason-label and badge usage to ensure structural, operational, and transient meanings remain centralized.
Document the new backend contract shape in code-level PHPDoc and tests so follow-on specs can build on stable semantics.
Keep rollout bounded to baseline compare and capture semantics without adding renderer-richness work from Spec 164.
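
Keeping structural, operational, and transient meanings centralized could be as simple as a single lookup that every surface calls. In this sketch the reason codes, the category assigned to each code, and the label wording are assumptions; only the three category names and the transient examples (throttled, capture_failed, budget_exhausted) come from the plan.

```php
<?php

declare(strict_types=1);

// Single source of truth for how a reason code is categorized and labeled.
// Codes and labels are illustrative, not the project's actual taxonomy.
const GAP_PRESENTATION = [
    'resolution_type_mismatch' => ['category' => 'structural', 'label' => 'Subject cannot be resolved on this path'],
    'policy_record_missing' => ['category' => 'operational', 'label' => 'Policy record missing locally'],
    'inventory_record_missing' => ['category' => 'operational', 'label' => 'Inventory record missing locally'],
    'throttled' => ['category' => 'transient', 'label' => 'Request was throttled'],
    'capture_failed' => ['category' => 'transient', 'label' => 'Capture failed'],
    'budget_exhausted' => ['category' => 'transient', 'label' => 'Capture budget exhausted'],
];

/**
 * @return array{category: string, label: string}
 */
function presentGap(string $reasonCode): array
{
    // Unknown codes get an explicit fallback instead of a misleading label.
    return GAP_PRESENTATION[$reasonCode]
        ?? ['category' => 'unknown', 'label' => 'Unclassified gap'];
}
```

Routing badges and labels through one function like this avoids ad hoc per-page maps drifting apart.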

Tests:

Run the focused Pest pack in quickstart.md.
Add one regression proving no render-time Graph calls occur on affected run-detail or landing surfaces.
