## Summary

- add baseline compare evidence gap detail modeling and a dedicated Livewire table surface
- extend baseline compare landing and operation run detail surfaces to expose evidence gap details and stats
- add spec artifacts for feature 162 and expand feature coverage with focused Filament and baseline tests

## Notes

- branch: `162-baseline-gap-details`
- commit: `a92dd812`
- working tree was clean after push

## Validation

- tests were not run in this step

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #192
# Research: Enterprise Evidence Gap Details for Baseline Compare
## Decision 1: Persist evidence-gap subjects inside `OperationRun.context`
- Decision: Store concrete evidence-gap subject detail in `baseline_compare.evidence_gaps.subjects` within the existing JSONB `OperationRun.context` payload.
- Rationale: The canonical operator review surface is already backed by `OperationRun`, Monitoring pages must remain DB-only at render time, and compare runs are immutable operational artifacts. Extending the existing context preserves observability without introducing a new persistence model.
- Alternatives considered:
- Create a dedicated relational evidence-gap table: rejected because the feature needs a bounded, immutable run snapshot rather than an independently mutable dataset.
- Recompute detail on demand from inventory and baseline state: rejected because it would violate DB-only render expectations and risk drift between recorded compare outcome and later inventory state.
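As a sketch, the persisted context could take the following shape. Only the `baseline_compare.evidence_gaps.subjects` path is stated by this decision; the aggregate count and every field inside a subject entry are hypothetical illustrations, not the recorded contract:

```php
<?php
// Illustrative shape of the OperationRun.context JSONB payload after a
// compare run. Only baseline_compare.evidence_gaps.subjects comes from
// this document; all other keys below are assumed for illustration.
$context = [
    'baseline_compare' => [
        'evidence_gaps' => [
            'total' => 2, // hypothetical aggregate count
            'subjects' => [
                [
                    'reason'      => 'policy_not_found',   // reason-grouped entry
                    'policy_type' => 'compliance',         // hypothetical field
                    'subject_key' => 'example-policy-a',   // hypothetical field
                ],
                [
                    'reason'      => 'missing_current',
                    'policy_type' => 'configuration',
                    'subject_key' => 'example-policy-b',
                ],
            ],
        ],
    ],
];
```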
## Decision 2: Merge all evidence-gap subject sources before persistence
- Decision: Consolidate gap subjects from ambiguous current-inventory matches, capture-phase failures, and drift-time missing evidence into one reason-grouped structure.
- Rationale: Operators need one coherent explanation of why confidence is reduced. Splitting detail across multiple internal sources would force the UI to know too much about compare internals and would create inconsistent trust messaging.
- Alternatives considered:
- Persist only ambiguous matches: rejected because it would leave `policy_not_found`, `missing_current`, and similar reasons as counts-only.
- Persist per-phase fragments separately: rejected because the UI contract is reason-oriented, not phase-oriented.
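The consolidation described above amounts to a reason-grouping pass over the three internal sources. A minimal sketch, assuming each subject already carries a `reason` field; the helper name and sample data are illustrations, not the shipped compare job:

```php
<?php
// Sketch: merge gap subjects from the three internal sources into one
// reason-grouped structure before persistence. The helper and all sample
// values are assumptions for illustration.
function groupGapSubjectsByReason(array ...$sources): array
{
    $grouped = [];
    foreach ($sources as $subjects) {
        foreach ($subjects as $subject) {
            $grouped[$subject['reason']][] = $subject;
        }
    }
    return $grouped;
}

$grouped = groupGapSubjectsByReason(
    // ambiguous current-inventory matches
    [['reason' => 'ambiguous_match', 'subject_key' => 'policy-a']],
    // capture-phase failures
    [['reason' => 'capture_failed', 'subject_key' => 'policy-b']],
    // drift-time missing evidence
    [['reason' => 'missing_current', 'subject_key' => 'policy-c']],
);
// $grouped is now keyed by reason, e.g. $grouped['missing_current']
```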
## Decision 3: Keep filtering local to the rendered detail surface
- Decision: Use local filtering across reason, policy type, and subject key in the evidence-gap detail surface.
- Rationale: The payload is intentionally bounded by the compare job, the operator workflow is investigative rather than analytical, and local filtering avoids new server requests or additional read endpoints in the initial user experience.
- Alternatives considered:
- Server-side filtering via new query endpoints: rejected for the initial slice because it adds API surface without solving a current scale bottleneck.
- No filtering at all: rejected because enterprise runs can accumulate enough subjects to make manual scanning too slow.
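The shipped surface applies this filtering client-side; as an illustration of the intended predicate across reason, policy type, and subject key, here is a PHP sketch. The filter semantics (exact match on reason and policy type, substring match on subject key) are assumptions:

```php
<?php
// Sketch of the local filter across reason, policy type, and subject key.
// null means "no filter" for that dimension; semantics are assumed.
function filterGapSubjects(array $subjects, ?string $reason, ?string $policyType, string $search = ''): array
{
    return array_values(array_filter($subjects, function (array $s) use ($reason, $policyType, $search) {
        return ($reason === null || $s['reason'] === $reason)
            && ($policyType === null || $s['policy_type'] === $policyType)
            && ($search === '' || str_contains($s['subject_key'], $search));
    }));
}

$subjects = [
    ['reason' => 'missing_current', 'policy_type' => 'compliance', 'subject_key' => 'alpha'],
    ['reason' => 'policy_not_found', 'policy_type' => 'configuration', 'subject_key' => 'beta'],
];
$matched = filterGapSubjects($subjects, 'missing_current', null);
```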
## Decision 4: Preserve operator-first information hierarchy
- Decision: Keep result meaning, trust, and next-step guidance ahead of evidence-gap detail, and keep raw JSON diagnostics last.
- Rationale: The constitution requires operator-first `/admin` surfaces. Evidence-gap detail is important, but it supports the decision already summarized by the run outcome and explanation layers.
- Alternatives considered:
- Show raw JSON only: rejected because it fails the operator-first requirement.
- Put evidence-gap rows ahead of result meaning: rejected because it would over-prioritize diagnostics and weaken the page contract.
## Decision 5: Explicitly model legacy and partial-detail runs
- Decision: Differentiate among runs with no evidence gaps, runs with gaps and recorded subjects, and runs with gaps but no recorded subject detail.
- Rationale: Historical compare runs already exist, and silence must not be interpreted as health. The UI needs an explicit fallback state to preserve trust in old data.
- Alternatives considered:
- Treat missing subjects as empty subjects: rejected because it misrepresents historical/partial runs.
- Hide the section when subjects are missing: rejected because operators would lose the signal that detail quality differs across runs.
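The three states above can be made explicit in code rather than inferred from an empty list. A minimal sketch, assuming the gap payload carries a `total` count (hypothetical name) alongside the optional `subjects` key:

```php
<?php
// Sketch: distinguish the three run states explicitly instead of treating
// a missing subjects key as an empty list. The enum and the 'total' key
// are assumptions for illustration.
enum EvidenceGapDetailState
{
    case NoGaps;              // run recorded zero evidence gaps
    case SubjectsRecorded;    // gaps with persisted subject detail
    case SubjectsUnavailable; // legacy/partial run: gaps, but no detail
}

function gapDetailState(array $evidenceGaps): EvidenceGapDetailState
{
    if (($evidenceGaps['total'] ?? 0) === 0) {
        return EvidenceGapDetailState::NoGaps;
    }
    return array_key_exists('subjects', $evidenceGaps)
        ? EvidenceGapDetailState::SubjectsRecorded
        : EvidenceGapDetailState::SubjectsUnavailable;
}
```

Keeping `SubjectsUnavailable` as its own state lets the UI render the explicit fallback instead of silently showing an empty table for historical runs.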
## Decision 6: Use existing Filament/Blade patterns rather than new assets
- Decision: Implement the detail surface with existing Filament resource sections, Blade partials, and Alpine-powered filtering only.
- Rationale: The feature does not require a new panel plugin, custom published asset, or heavy client library. Existing Filament v5 and Livewire v4 patterns already support the interaction.
- Alternatives considered:
- Introduce a custom JS table package: rejected because it adds operational overhead and does not materially improve the bounded use case.
- Publish or override Filament internal views: rejected because render hooks and custom entries are sufficient.
## Decision 7: Validate with persistence and render-path regression tests
- Decision: Anchor verification in Pest feature tests for compare persistence, capture-phase subject storage, and run-detail rendering.
- Rationale: The root failure was data not being persisted, not just a missing view. The test plan must cover both the job path and the operator surface.
- Alternatives considered:
- UI-only assertions: rejected because they would not prove the persistence contract.
- Queue smoke tests only: rejected because they are too broad to protect the specific JSON contract.
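A Pest feature test along these lines could pin the persistence side of the contract; the run-producing helper is hypothetical, and only the `baseline_compare.evidence_gaps.subjects` context path comes from this document:

```php
<?php
// Sketch of a Pest feature test for the persistence contract. The
// runBaselineCompare() helper (dispatching the compare job and returning
// the resulting OperationRun) is an assumed fixture, not shipped code.
it('persists evidence-gap subjects on the compare run context', function () {
    $run = runBaselineCompare(); // hypothetical helper

    expect($run->context['baseline_compare']['evidence_gaps'])
        ->toHaveKey('subjects')
        ->and($run->context['baseline_compare']['evidence_gaps']['subjects'])
        ->toBeArray();
});
```

A companion test would then render the run-detail surface and assert the reason-grouped rows and the explicit legacy fallback state, covering the render path named above.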