ahmido 8426741068 feat: add baseline snapshot truth guards (#189)

## Summary
- add explicit BaselineSnapshot lifecycle truth with conservative backfill and a shared truth resolver
- block baseline compare from building, incomplete, or superseded snapshots and align workspace/tenant UI truth surfaces with effective snapshot state
- surface artifact truth separately from operation outcome across baseline profile, snapshot, compare, and operation run pages

## Testing
- integrated browser smoke test on the active feature surfaces
- `vendor/bin/sail artisan test --compact tests/Feature/Filament/BaselineSnapshotTruthSurfaceTest.php tests/Feature/Filament/BaselineProfileCompareStartSurfaceTest.php`
- targeted baseline lifecycle and compare guard coverage added in Pest
- `vendor/bin/sail bin pint --dirty --format agent`

## Notes
- Livewire v4 compliance preserved
- no panel provider registration changes were needed; Laravel 12 providers remain in `bootstrap/providers.php`
- global search remains disabled for the affected baseline resources by design
- destructive actions remain confirmation-gated; capture and compare actions keep their existing authorization and confirmation behavior
- no new panel assets were added; existing deploy flow for `filament:assets` is unchanged

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #189

2026-03-23 11:32:00 +00:00


Feature Specification: Artifact Truth & Downstream Consumption Guards for BaselineSnapshot

Feature Branch: 159-baseline-snapshot-truth
Created: 2026-03-23
Status: Draft
Input: User description: "Introduce explicit artifact-truth semantics for BaselineSnapshot so the platform no longer conflates operation success with artifact completeness and usability."

Spec Scope Fields (mandatory)

Scope: workspace, tenant, canonical-view
Primary Routes: /admin/baseline-profiles, /admin/baseline-profiles/{record}, /admin/baseline-snapshots, /admin/baseline-snapshots/{record}, /admin/t/{tenant}/baseline-compare, Monitoring → Operations → Run Detail for baseline.capture and baseline.compare
Data Ownership: Workspace-owned records: BaselineProfile, BaselineSnapshot, BaselineSnapshotItem, profile-to-current-snapshot truth. Tenant-owned consumers: OperationRun records for compare/capture execution and compare outputs that depend on baseline truth.
RBAC: Workspace membership plus WORKSPACE_BASELINES_VIEW for snapshot/profile truth surfaces; WORKSPACE_BASELINES_MANAGE for capture-start and profile mutation surfaces; tenant membership plus tenant compare capability for compare-start surfaces. Non-members continue to receive 404; members without the required capability continue to receive 403.
Canonical View Default Filter Behavior: Monitoring → Operations → Run Detail follows the active tenant context when one is selected; without tenant context, canonical Monitoring access may resolve baseline capture/compare runs only after workspace entitlement is established and any referenced tenant-owned run is filtered by tenant entitlement before disclosure.
Canonical View Entitlement Checks: Canonical Monitoring routes MUST deny-as-not-found for actors lacking workspace membership or lacking entitlement to the tenant referenced by a tenant-owned OperationRun. No canonical run detail may reveal cross-tenant baseline operation existence.
List Surface Review Standard: Because this feature changes the Baseline Profiles and Baseline Snapshots list surfaces, implementation and review must follow docs/product/standards/list-surface-review-checklist.md.
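The 404-versus-403 ordering described in the RBAC and canonical-view fields can be sketched as below. This is a minimal, framework-neutral illustration in Python, not the project's Laravel/Filament policy code; `Actor` and `authorize_surface` are hypothetical names, and the only identifier taken from the spec is `WORKSPACE_BASELINES_VIEW`. The point it shows is that membership and tenant entitlement are checked before capability, so a non-member never learns whether a record exists.

```python
from dataclasses import dataclass

# Capability constant named in the spec; everything else here is hypothetical.
WORKSPACE_BASELINES_VIEW = "WORKSPACE_BASELINES_VIEW"


@dataclass(frozen=True)
class Actor:
    workspace_ids: frozenset = frozenset()
    tenant_ids: frozenset = frozenset()
    capabilities: frozenset = frozenset()


def authorize_surface(actor, workspace_id, tenant_id, capability):
    """Return the HTTP status an entry point would answer with.

    tenant_id is None for workspace-owned records and set for tenant-owned
    records such as an OperationRun opened from canonical Monitoring.
    """
    # Membership first: deny-as-not-found must not disclose existence.
    if workspace_id not in actor.workspace_ids:
        return 404
    # Tenant entitlement is also a not-found denial, even without a tenant route segment.
    if tenant_id is not None and tenant_id not in actor.tenant_ids:
        return 404
    # Only entitled members can learn that they lack a capability.
    if capability not in actor.capabilities:
        return 403
    return 200
```

For example, a workspace member who is not entitled to the tenant that owns a run gets 404 rather than 403, which keeps cross-tenant run existence hidden.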

Operator Surface Contract (mandatory when operator-facing surfaces are changed)

| Surface | Primary Persona | Surface Type | Primary Operator Question | Default-visible Information | Diagnostics-only Information | Status Dimensions Used | Mutation Scope | Primary Actions | Dangerous Actions |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Baselines list/detail | Workspace manager | List/detail | Which baseline is current, and is its current snapshot trustworthy? | Active baseline profile, latest complete snapshot, latest attempted snapshot when it differs, compare readiness, clear next step | Item-level gap reasons, retry details, diagnostic counts, run context payloads | baseline lifecycle, snapshot completeness, snapshot usability, execution outcome | TenantPilot only | Create baseline, Edit baseline, Capture baseline, Compare now, View current snapshot | Archive baseline profile |
| Baseline Snapshots list/detail | Workspace manager | List/detail | Can this snapshot be trusted as baseline truth, and if not, why not? | Lifecycle state, consumability label, current vs historical status, captured time, profile linkage, next-step guidance | Partial item details, integrity diagnostics, gap breakdowns, related run diagnostics | lifecycle, usability, derived historical status, execution outcome | TenantPilot only | View snapshot, Open related record | None |
| Baseline Compare landing | Tenant operator or manager | Tenant-scoped landing page | Can this tenant be compared right now, and which baseline truth will be used? | Assigned baseline profile, effective baseline snapshot, compare availability, current warning/block reason, last compare summary | Detailed evidence-gap reasons, operation-run diagnostics, duplicate-subject diagnostics | compare readiness, snapshot usability, evidence coverage, execution outcome | Simulation only | Compare now, View findings, View run | None |
| Monitoring run detail for baseline capture/compare | Workspace manager or tenant operator with run access | Canonical run detail | Did the run finish, and did it produce a consumable snapshot? | Separate run outcome and produced-artifact truth, final snapshot state, snapshot link when present, operator-safe explanation | Failure summary, gap counts, retry context, resume metadata | execution outcome, artifact existence, artifact completeness, artifact usability | TenantPilot only | View related snapshot, View related profile | Resume capture remains separately governed |

User Scenarios & Testing (mandatory)

User Story 1 - Trust only complete baselines (Priority: P1)

A workspace manager captures a baseline and needs the system to promote it as effective baseline truth only when the snapshot is explicitly complete and safe for downstream comparison.

Why this priority: This closes the core governance-integrity gap. If this story fails, downstream findings can be materially wrong even when the UI appears healthy.

Independent Test: Start a baseline capture that succeeds, then verify the resulting snapshot is marked complete, becomes the effective current snapshot, and is eligible for compare without relying on run status inference.

Acceptance Scenarios:

Given an active baseline profile without a current snapshot, When a capture completes successfully, Then the produced snapshot is marked complete, is consumable, and becomes the profile's effective current baseline snapshot.
Given an active baseline profile with an older complete snapshot, When a newer capture creates a snapshot row but fails before completion, Then the new snapshot is marked incomplete, is not consumable, and the profile continues to point to the older complete snapshot as effective truth.
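The two acceptance scenarios above can be sketched as a single finalization step. This is an illustrative Python model under stated assumptions, not the project's implementation: `Snapshot`, `Profile`, `finalize_capture`, and the reason code `item_count_mismatch` are hypothetical, and the completion check is reduced to "no error and the persisted count equals the expected deduplicated count".

```python
from dataclasses import dataclass
from typing import Optional

BUILDING, COMPLETE, INCOMPLETE = "building", "complete", "incomplete"


@dataclass
class Snapshot:
    id: int
    lifecycle: str = BUILDING  # every new snapshot starts in building state
    incomplete_reason: Optional[str] = None


@dataclass
class Profile:
    current_snapshot_id: Optional[int] = None


def finalize_capture(profile, snapshot, expected_items, persisted_items, error=None):
    """Finalize a building snapshot and promote it only when it is complete."""
    if error is None and persisted_items == expected_items:
        snapshot.lifecycle = COMPLETE
        # Pointer promotion happens only on the complete transition.
        profile.current_snapshot_id = snapshot.id
    else:
        snapshot.lifecycle = INCOMPLETE
        # Hypothetical reason code when no explicit error was recorded.
        snapshot.incomplete_reason = error or "item_count_mismatch"
        # The profile pointer is deliberately left untouched.
    return snapshot
```

Because the pointer moves only inside the complete branch, a failed newer capture leaves the older complete snapshot as effective truth without any extra rollback logic.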

User Story 2 - Block unsafe compare input (Priority: P2)

A tenant operator starts baseline compare and needs the platform to refuse building or incomplete baseline snapshots instead of producing untrustworthy drift findings.

Why this priority: Unsafe compare input turns a partial capture defect into false governance output. Blocking compare is safer than silently producing findings from incomplete truth.

Independent Test: Attempt compare against a building snapshot, an incomplete snapshot, and a complete snapshot; verify only the complete snapshot is accepted and every blocked case returns a clear operator-safe reason.

Acceptance Scenarios:

Given a tenant assigned to a baseline profile whose latest attempt is incomplete and no explicit override is provided, When compare starts, Then the system uses the latest complete snapshot if one exists, or blocks with a clear no-consumable-snapshot reason.
Given an explicit snapshot selection that is building, incomplete, or a historically superseded complete snapshot that is no longer the effective current truth, When compare starts or the compare job resolves its input, Then the compare flow refuses to proceed and records the rejection reason without generating normal drift output.
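A compare-input guard matching these scenarios might look like the following Python sketch. The function name, the dictionary shape, and the reason codes are hypothetical, and "latest" is assumed to mean the highest snapshot id, which the spec does not state. The same function is intended to be callable both when compare starts and when the job resolves its input.

```python
def resolve_compare_input(snapshots, explicit_id=None):
    """Return (snapshot_id, block_reason); exactly one of the two is None.

    snapshots: list of dicts with "id" and "lifecycle" keys for one profile.
    """
    complete_ids = [s["id"] for s in snapshots if s["lifecycle"] == "complete"]
    # Assumption: ids grow with capture order, so max() is the latest complete one.
    effective_id = max(complete_ids) if complete_ids else None

    if explicit_id is None:
        if effective_id is None:
            return None, "no_consumable_snapshot"
        return effective_id, None

    chosen = next((s for s in snapshots if s["id"] == explicit_id), None)
    if chosen is None:
        return None, "snapshot_missing"
    if chosen["lifecycle"] != "complete":
        # Yields "snapshot_building" or "snapshot_incomplete".
        return None, "snapshot_" + chosen["lifecycle"]
    if chosen["id"] != effective_id:
        # Complete, but no longer the effective current truth.
        return None, "snapshot_superseded"
    return chosen["id"], None
```

Returning a reason instead of raising keeps the rejection recordable on the run without producing normal drift output.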

User Story 3 - See run truth and artifact truth separately (Priority: P3)

An operator reviewing baselines, snapshots, or runs needs to distinguish execution outcome from artifact usability without opening low-level diagnostics.

Why this priority: Operators need to act on the correct truth quickly. A failed run with an incomplete snapshot should not look equivalent to a failed run with no artifact, and a completed run with a complete snapshot should read as trustworthy.

Independent Test: Review the baseline profile detail, baseline snapshot detail, compare landing page, and run detail for successful, building, and incomplete cases; verify each surface shows run outcome and snapshot usability as separate concepts.

Acceptance Scenarios:

Given a run that failed after partial snapshot creation, When an operator opens the related snapshot or run detail, Then the UI shows the run as failed and the snapshot as incomplete and not usable for compare.
Given a snapshot that has been replaced by a newer complete snapshot, When an operator opens the older snapshot, Then the UI labels it as superseded or historical through derived presentation status and does not present it as current baseline truth.
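One way to keep run truth and artifact truth apart on a surface is to compute them as separate fields of a view model. The Python sketch below is hypothetical (`present_truth` and the dictionary keys are not from the source); the labels Building, Complete, Incomplete, and Superseded are the operator-facing terms the spec names, and "superseded" is derived at presentation time rather than stored on the snapshot.

```python
LIFECYCLE_LABELS = {
    "building": "Building",
    "complete": "Complete",
    "incomplete": "Incomplete",
}


def present_truth(run_outcome, snapshot, effective_snapshot_id):
    """Build a view model with run outcome and artifact truth as separate fields."""
    view = {
        "run_outcome": run_outcome,            # execution truth, passed through unchanged
        "artifact_exists": snapshot is not None,
        "artifact_label": None,
        "usable_for_compare": False,
    }
    if snapshot is None:
        return view

    is_complete = snapshot["lifecycle"] == "complete"
    is_effective = snapshot["id"] == effective_snapshot_id
    label = LIFECYCLE_LABELS[snapshot["lifecycle"]]
    if is_complete and not is_effective:
        # Derived presentation status; the stored lifecycle stays "complete".
        label = "Superseded"
    view["artifact_label"] = label
    view["usable_for_compare"] = is_complete and is_effective
    return view
```

A failed run with a partial snapshot and a failed run with no artifact then differ in `artifact_exists` and `artifact_label` even though `run_outcome` is the same.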

Edge Cases

A capture creates the snapshot row and some items, then fails during later item persistence. The snapshot must end incomplete and must never become the effective current snapshot.
A retry or rerun encounters already persisted logical subjects. The resulting snapshot must not be marked complete unless duplicates are handled deterministically and the final artifact passes completion checks.
The latest attempted snapshot is building while an older complete snapshot exists. Compare and profile truth must continue to resolve to the older complete snapshot until the new attempt is finalized complete.
Legacy snapshots that cannot be proven complete during backfill must default to incomplete or unavailable-for-compare rather than being assumed trustworthy.
Unknown or ambiguous completion state, including interrupted finalization, must be treated as not consumable.
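The partial-persistence and retry edge cases can be illustrated with an upsert keyed by logical subject plus a separate completion check. This Python sketch makes several assumptions that are not in the source: items are keyed by a string subject key, a later write for the same key replaces the earlier one, and the completion check is only the count comparison and error condition from the minimum completion check.

```python
def assemble_items(existing, incoming):
    """Upsert items by logical subject key so retries cannot add duplicate rows.

    existing: dict mapping subject_key -> payload already persisted.
    incoming: iterable of (subject_key, payload) pairs from the current attempt.
    """
    merged = dict(existing)
    for subject_key, payload in incoming:
        # Assumed strategy: the latest payload for a subject replaces the older one.
        merged[subject_key] = payload
    return merged


def passes_completion_check(items, expected_subject_keys, unresolved_error=None):
    """Count-based check: persisted count must equal the expected deduplicated count."""
    expected_count = len(set(expected_subject_keys))
    return unresolved_error is None and len(items) == expected_count
```

Here a rerun that re-sends already persisted subjects converges on the same item set, and an interrupted finalization still fails the check because the unresolved error is reported separately from the counts.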

Requirements (mandatory)

Constitution alignment (required): This feature changes long-running baseline capture and compare behavior but does not introduce new Microsoft Graph endpoints or new contract-registry object types. Existing baseline capture and compare safety gates remain in place: capture and compare start actions stay confirmation-gated, execution remains auditable, tenant/workspace isolation is unchanged, and downstream baseline truth moves from inferred run outcome to explicit artifact state. No new DB-only security mutation is introduced.

Constitution alignment (OPS-UX): baseline.capture and baseline.compare continue to use the existing three feedback surfaces only: queued toast, active progress surfaces, and one terminal DB notification per run. OperationRun.status and OperationRun.outcome remain service-owned via OperationRunService. summary_counts remain numeric-only and continue to describe execution progress rather than artifact completeness. Scheduled or initiator-null behavior remains unchanged: no terminal DB notification is emitted without an initiator, and monitoring stays the audit surface. Regression coverage must include at least one guard proving artifact finalization does not bypass service-owned run transitions.

Constitution alignment (RBAC-UX): This feature touches the workspace-admin plane (/admin/baseline-profiles, /admin/baseline-snapshots), the tenant plane (/admin/t/{tenant}/baseline-compare), and the canonical Monitoring view for run detail. Cross-plane access remains deny-as-not-found. Non-members of the relevant workspace or tenant scope receive 404. Members lacking the required capability receive 403. Server-side enforcement remains required for capture start, compare start, profile mutation, and any related snapshot navigation. Canonical Monitoring run detail must additionally enforce tenant entitlement before revealing tenant-owned baseline runs when no tenant route segment is present. No raw capability strings or role-name checks may be introduced. Global search remains disabled for the affected baseline resources.

Constitution alignment (OPS-EX-AUTH-001): Not applicable beyond reaffirming that this feature does not add any auth-handshake HTTP behavior.

Constitution alignment (BADGE-001): Snapshot lifecycle, derived historical-status, and usability labels must be driven by centralized badge or presenter mappings. No page may introduce ad-hoc color, icon, or wording decisions for building, complete, incomplete, derived superseded, or compare-usability states. Tests must cover the new or changed mappings.

Constitution alignment (UI-NAMING-001): Operator-facing copy must consistently use baseline vocabulary: Capture baseline, Compare now, Building, Complete, Incomplete, Superseded, Not usable for compare, and Current baseline unavailable. Internal terms such as partial write, resume token, or integrity meta may appear only in diagnostics, not as primary labels.

Constitution alignment (OPSURF-001): The modified surfaces remain operator-first by showing default-visible truth in this order: effective baseline snapshot, lifecycle/usability, and the operator next step. Diagnostics such as gap reasons, duplicate handling, and resume metadata remain secondary. Status dimensions must be shown separately where relevant: execution outcome, artifact existence, artifact completeness, and compare readiness. Capture and compare continue to communicate their mutation scope before execution: TenantPilot only for profile/snapshot truth updates and simulation only for compare.

Constitution alignment (Filament Action Surfaces): The Action Surface Contract remains satisfied for the modified baseline resources and page. Existing exemptions remain valid for the immutable snapshot resource: no list-header actions, no bulk actions, and no empty-state CTA. This feature changes truth presentation and action availability, not the overall action topology.

Constitution alignment (UI-STD-001): The modified Baseline Profiles and Baseline Snapshots list surfaces MUST be reviewed against docs/product/standards/list-surface-review-checklist.md as part of implementation and review.

Constitution alignment (UX-001 — Layout & Information Architecture): Existing Baseline Profile create/edit and view layouts remain in place. Snapshot and run detail pages continue to use structured detail sections or infolist-style presentation. This feature adds or changes status sections, labels, and action availability states rather than introducing new free-form inputs. Existing immutable-snapshot exemptions remain documented.

Functional Requirements

FR-001: The system MUST add an explicit lifecycle state to every BaselineSnapshot with the supported V1 values building, complete, and incomplete.
FR-002: The system MUST create every new BaselineSnapshot in building state before item assembly begins.
FR-003: The system MUST finalize a BaselineSnapshot to complete only as the final successful step after the snapshot has passed this feature's minimum completion check: the persisted BaselineSnapshotItem count matches the expected deduplicated item count, no unresolved assembly or finalization error remains, and the finalization step records successful completion metadata.
FR-004: The system MUST finalize a BaselineSnapshot to incomplete whenever capture fails or terminates after snapshot creation but before successful completion, including failures after only part of the snapshot items have been persisted.
FR-005: The system MUST treat consumability as a single authoritative rule derived from snapshot lifecycle state, where only complete is consumable in V1.
FR-006: The system MUST provide a single domain-level helper for snapshot consumability so compare flows, profile truth resolution, presenters, and UI surfaces do not duplicate lifecycle checks.
FR-007: The system MUST prevent any compare start path or compare execution path from consuming a BaselineSnapshot whose lifecycle state is not complete or whose complete snapshot is no longer the effective current truth for its profile.
FR-008: The system MUST return or record a clear operator-safe reason when compare is blocked because the selected or resolved snapshot is building, incomplete, missing, historically superseded, or otherwise not consumable.
FR-009: The system MUST resolve effective baseline truth for a profile as the latest complete snapshot, never merely the latest attempted snapshot.
FR-010: The system MUST NOT advance a profile's current or active snapshot pointer to a building or incomplete snapshot.
FR-011: The system MUST derive the previously effective complete snapshot as superseded or historical in operator-facing truth presentation only after a newer snapshot for the same profile becomes complete, without mutating the older snapshot away from its recorded terminal lifecycle state.
FR-012: The system MUST ensure a derived superseded or historical snapshot remains viewable while remaining non-consumable as current baseline truth.
FR-013: The system MUST use a deterministic persistence strategy for snapshot item assembly so retries or reruns do not create duplicate logical-subject rows that could falsely imply completion.
FR-014: The system MUST treat database uniqueness rules as safety nets, not as the primary definition of snapshot completeness.
FR-015: The system MUST preserve the completion-proof metadata needed to justify the complete transition and later backfill decisions, including at minimum the expected deduplicated item count, the persisted item count, the producing operation run identifier or linkable reference, the completion or failure timestamp, and the best-available incomplete reason when the snapshot becomes incomplete.
FR-016: The system MUST backfill pre-existing BaselineSnapshot rows conservatively according to a deterministic decision table. The first matching rule wins, contradictory or partial evidence is treated as no proof, and rows without proof MUST default to incomplete or require recapture before compare.
FR-017: The system MUST keep run truth and artifact truth separate on operator-facing surfaces so a run can be shown as failed, successful, or in progress independently from snapshot usability.
FR-018: The system MUST show baseline compare availability from effective snapshot consumability, not merely from snapshot existence.
FR-019: The system MUST preserve existing capture and compare start confirmations and existing authorization boundaries while updating the reasons and availability states shown to operators, including keeping ->requiresConfirmation() on the existing start actions and preserving the existing 404-for-non-members and 403-for-members-without-capability behavior on all affected entry points.
FR-020: The system MUST keep snapshot lifecycle and artifact-truth badge semantics centralized so list, detail, compare, and monitoring surfaces render the same state labels and meanings.
FR-021: The system MUST preserve auditability for snapshot lifecycle transitions by retaining which run produced the snapshot, whether the snapshot became consumable, the best-available reason when it became incomplete, and whether it is later rendered historical because a newer complete snapshot exists.
FR-022: The system MUST cover the motivating regression path in automated tests: snapshot row exists, item persistence fails partway, snapshot is not complete, profile truth does not advance, and compare refuses the snapshot.

Assumptions

V1 remains intentionally local to BaselineSnapshot and BaselineSnapshotItem; it does not introduce a generic artifact-lifecycle framework.
The existing profile-level current snapshot pointer remains the main way the product resolves effective baseline truth, but its semantics are tightened so it can reference only a complete snapshot.
Resume or repair flows remain separate follow-up work. This feature only guarantees that incomplete artifacts are explicitly marked unusable and blocked from downstream consumption.
Existing compare assignment rules and baseline scope rules remain unchanged.

Legacy Backfill Decision Table

| Priority | Proof Rule | Required Evidence | Classification | Notes |
| --- | --- | --- | --- | --- |
| 1 | Count proof | `summary_jsonb.total_items` or equivalent expected-item metadata is present, persisted BaselineSnapshotItem count is known, and both counts match exactly | `complete` | Use only when counts are non-null and non-contradictory |
| 2 | Producing-run success proof | Linkable producing run exists, its terminal outcome proves successful capture finalization, expected item count is present, and persisted item count matches that expected item count | `complete` | Run success alone is insufficient without count reconciliation |
| 3 | Proven empty capture proof | Linkable producing run exists, its terminal outcome proves successful capture finalization for the recorded scope, persisted item count is `0`, and the scope evidence proves zero items were expected | `complete` | Empty snapshots require explicit proof that zero items were expected |
| 4 | Contradictory or partial evidence | Any required evidence above is missing, null, inconsistent, or disagrees with persisted item counts | `incomplete` | No tie-breaker may elevate contradictory evidence to `complete` |
| 5 | No proof available | No qualifying summary metadata or producing-run proof is available | `incomplete` | Operator guidance must point to recapture before compare |
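The decision table can be read as a first-match classifier. The Python sketch below is one possible reading, not the project's backfill code: the function and parameter names are hypothetical, the two expected-count sources are modeled as `summary_expected` and `run_expected`, and zero-item rows are assumed to qualify only through rule 3, which is an interpretation of the note that empty snapshots need explicit proof.

```python
def classify_legacy_snapshot(persisted, summary_expected=None, run_succeeded=False,
                             run_expected=None, scope_proves_empty=False):
    """Return (classification, matched_rule) for one legacy snapshot row."""
    expected_values = {v for v in (summary_expected, run_expected) if v is not None}
    # Disagreeing expected counts are contradictory and can never prove completeness.
    contradictory = len(expected_values) > 1

    if persisted is not None and not contradictory:
        # Rule 1: count proof from summary metadata (non-empty snapshots only here).
        if summary_expected is not None and persisted > 0 and persisted == summary_expected:
            return "complete", 1
        # Rule 2: run success is only proof together with count reconciliation.
        if run_succeeded and run_expected is not None and persisted > 0 and persisted == run_expected:
            return "complete", 2
        # Rule 3: empty capture needs a successful run and scope evidence of zero items.
        if run_succeeded and persisted == 0 and scope_proves_empty:
            return "complete", 3

    # Rule 4: some evidence exists, but it is missing pieces or inconsistent.
    if expected_values or run_succeeded:
        return "incomplete", 4
    # Rule 5: nothing to reason from; recapture is required before compare.
    return "incomplete", 5
```

Returning the matched rule alongside the classification keeps the backfill auditable, since each row can record why it was or was not treated as complete.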

UI Action Matrix (mandatory when Filament is changed)

| Surface | Location | Header Actions | Inspect Affordance (List/Table) | Row Actions (max 2 visible) | Bulk Actions (grouped) | Empty-State CTA(s) | View Header Actions | Create/Edit Save+Cancel | Audit log? | Notes / Exemptions |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Baseline profiles resource | `app/Filament/Resources/BaselineProfileResource.php` and `app/Filament/Resources/BaselineProfileResource/Pages/ViewBaselineProfile.php` | `Create baseline profile` on list | Dedicated `View` inspect affordance on list | `View`; `Edit` and `Archive` remain under `More` | None | Existing create CTA remains | `View current snapshot`, `Capture baseline`, `Compare now`, `Edit` | Existing save/cancel unchanged | Yes | `Capture baseline` and `Compare now` remain confirmation-gated. This spec changes when `View current snapshot` resolves and when compare is allowed. `Archive baseline profile` remains the only destructive action and still requires confirmation. |
| Baseline snapshots resource | `app/Filament/Resources/BaselineSnapshotResource.php` and `app/Filament/Resources/BaselineSnapshotResource/Pages/ViewBaselineSnapshot.php` | None | Clickable row | `Open related record` | None | None by design | `Open related record` | Not applicable | No direct mutation audit | Action Surface Contract remains satisfied through existing immutable-resource exemptions. This spec changes lifecycle/usability badges, filters, and explanatory text only. |
| Baseline compare landing | `app/Filament/Pages/BaselineCompareLanding.php` | `Compare now` | Not applicable | None | None | Existing empty or blocked guidance remains, but messaging must distinguish `no consumable baseline` from `no assignment` or `in progress` | Not applicable | Not applicable | Yes, via compare run | `Compare now` remains confirmation-gated and simulation-only. The page must disable or block it when no consumable snapshot exists and explain the next operator step. |
| Monitoring run detail | Existing operation run detail surface for baseline capture and baseline compare | Existing run-detail header actions unchanged | Not applicable | Not applicable | Not applicable | Not applicable | Existing related-artifact navigation unchanged | Not applicable | Yes | No new actions are introduced. The detail body must show run outcome separately from produced snapshot lifecycle and compare usability. |

Key Entities (include if feature involves data)

BaselineSnapshot: The workspace-owned artifact that represents a captured baseline for a profile, including lifecycle state, completion timestamps, integrity summary, and whether it can serve as effective baseline truth.
BaselineSnapshotItem: The immutable or deterministic item set that makes up a snapshot's captured baseline content for specific logical subjects.
BaselineProfile: The workspace-owned governance definition whose effective baseline truth resolves to the latest complete snapshot and whose compare/capture actions depend on snapshot consumability.
OperationRun: The execution record for capture and compare. It remains the source of truth for execution history but not for artifact completeness.

Success Criteria (mandatory)

Measurable Outcomes

SC-001: In automated regression coverage, 100% of compare attempts targeting building or incomplete snapshots are blocked before normal drift output is produced.
SC-002: In automated regression coverage, 100% of partial-capture failure scenarios leave the produced snapshot non-consumable and preserve the previously effective complete snapshot when one exists.
SC-003: In automated surface coverage, the modified baseline profile detail shows the effective baseline truth label and next-step guidance, the baseline snapshot detail shows lifecycle plus current-vs-historical status, the compare landing shows compare availability plus a block reason or effective-snapshot label, and the run detail shows run outcome and artifact truth as separate visible assertions without opening diagnostics.
SC-004: Historical backfill classifies existing baseline snapshots conservatively enough that ambiguous legacy rows do not become effective baseline truth without a rule-based proof of completeness.
SC-005: The product no longer derives effective baseline truth from run outcome alone anywhere in the baseline capture or baseline compare workflow.
