ahmido 02e75e1cda feat: harden baseline compare summary trust surfaces (#196)

## Summary
- add a shared baseline compare summary assessment and assessor for compact trust propagation
- harden dashboard, landing, and banner baseline compare surfaces against false all-clear claims
- add focused Pest coverage for dashboard, landing, banner, reason translation, and canonical detail parity

## Validation
- vendor/bin/sail bin pint --dirty --format agent
- vendor/bin/sail artisan test --compact tests/Feature/Baselines/BaselineCompareSummaryAssessmentTest.php tests/Feature/Baselines/BaselineCompareExplanationFallbackTest.php tests/Feature/Filament/BaselineCompareNowWidgetTest.php tests/Feature/Filament/NeedsAttentionWidgetTest.php tests/Feature/Filament/BaselineCompareExplanationSurfaceTest.php tests/Feature/Filament/BaselineCompareLandingWhyNoFindingsTest.php tests/Feature/Filament/BaselineCompareCoverageBannerTest.php tests/Feature/Filament/BaselineCompareSummaryConsistencyTest.php tests/Feature/Filament/OperationRunBaselineTruthSurfaceTest.php tests/Feature/ReasonTranslation/ReasonTranslationExplanationTest.php

## Notes
- Livewire compliance: Filament v5 / Livewire v4 stack unchanged
- Provider registration: unchanged, Laravel 12 providers remain in bootstrap/providers.php
- Global search: no searchable resource behavior changed
- Destructive actions: none introduced by this change
- Assets: no new assets registered; existing deploy process remains unchanged

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #196

2026-03-27 00:19:53 +00:00


Quickstart: Baseline Compare Summary Trust Propagation & Compliance Claim Hardening

Goal

Verify that compact baseline and drift summary surfaces stop issuing false compliant or all-clear claims when the underlying compare result is limited, incomplete, stale, in progress, unavailable, suppressed, or otherwise not decision-grade.

Preconditions

Start Sail and ensure the tenant panel is accessible.
Use a tenant with an assigned baseline profile and an active baseline snapshot.
Prepare representative compare scenarios using existing factories or fixtures:
- trustworthy no-drift result
- limited-confidence zero-findings result
- evidence-gap-affected result with no open findings
- stale compare history with no new confirmed drift
- failed compare result
- no compare yet, compare in progress, or no snapshot result

Manual Verification Flow

Scenario 1: Trustworthy no-drift result

Open the tenant dashboard.
Confirm the baseline summary widget may show a positive aligned state and still links to the deeper Baseline Compare or run-detail path.
Confirm the same tenant's Baseline Compare landing page shows compatible no-drift semantics and preserves its findings or run-detail drilldowns.
Confirm the canonical run detail for the same compare is equally confident or more detailed than the summary, and never reports a less aligned state.

Scenario 2: Limited-confidence zero-findings result

Open the tenant dashboard for a compare result with zero visible findings but limited confidence or suppressed output.
Confirm the baseline summary widget does not show compliant or all-clear wording.
Confirm Needs Attention does not fall back to a blanket healthy message and does not introduce a new drilldown path if the surface is intentionally non-navigational.
Open the landing page and verify the primary explanation remains cautionary while its drilldowns still resolve to the expected findings or run-detail surface.
Open the run detail and confirm the summary is not more optimistic than the detail surface.

Scenario 3: Evidence gaps with no open findings

Open a tenant with evidence gaps recorded but no open drift findings.
Confirm a compact summary surface visibly signals caution or review.
Confirm the coverage or evidence banner appears when appropriate and offers the expected drilldown path to landing or run detail.
Confirm the landing page still exposes deeper evidence-gap detail and diagnostics.

Scenario 4: Missing, stale, or unusable result

Verify the stale-history state stays distinct from no-result and does not render as healthy.
Verify the compare-in-progress state is visibly in progress rather than unavailable or healthy.
Verify the no-snapshot or no-compare-yet state remains unavailable rather than in progress or healthy.
Verify the failed-compare state gives an investigation-oriented next step.
Verify the existing Compare now action remains available only where already authorized and correctly guarded.
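
The state distinctions in Scenario 4 can be summarized as a small lookup. This is a minimal sketch for orientation, not the application's code: the state names (`stale`, `in_progress`, `no_snapshot`, `no_compare_yet`, `failed`) and the category labels are hypothetical placeholders chosen for illustration.

```shell
# Sketch only: state names and category labels are hypothetical, not taken
# from the codebase. Maps a compare state to the summary category that
# Scenario 4 expects. No branch ever yields "healthy".
expected_summary_category() {
  case "$1" in
    stale)                        echo "stale" ;;
    in_progress)                  echo "in_progress" ;;
    no_snapshot | no_compare_yet) echo "unavailable" ;;
    failed)                       echo "investigate" ;;
    *)                            echo "unknown" ;;
  esac
}
```

For example, `expected_summary_category no_compare_yet` prints `unavailable`, while `expected_summary_category stale` prints `stale`, which keeps the stale-history state distinct from the no-result state.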

Automated Verification

Run the same focused verification pack referenced by tasks.md through Sail:

```shell
vendor/bin/sail artisan test --compact tests/Feature/Baselines/BaselineCompareSummaryAssessmentTest.php tests/Feature/Baselines/BaselineCompareWhyNoFindingsReasonCodeTest.php tests/Feature/Baselines/BaselineCompareStatsTest.php tests/Feature/Baselines/BaselineCompareExplanationFallbackTest.php
vendor/bin/sail artisan test --compact tests/Feature/Filament/BaselineCompareNowWidgetTest.php tests/Feature/Filament/NeedsAttentionWidgetTest.php
vendor/bin/sail artisan test --compact tests/Feature/Filament/BaselineCompareExplanationSurfaceTest.php tests/Feature/Filament/BaselineCompareLandingWhyNoFindingsTest.php tests/Feature/Filament/BaselineCompareCoverageBannerTest.php
vendor/bin/sail artisan test --compact tests/Feature/Filament/BaselineCompareSummaryConsistencyTest.php tests/Feature/Filament/BaselineCompareLandingStartSurfaceTest.php tests/Feature/ReasonTranslation/ReasonTranslationExplanationTest.php
vendor/bin/sail artisan test --compact tests/Feature/Filament/OperationRunBaselineTruthSurfaceTest.php tests/Feature/Filament/TenantDashboardDbOnlyTest.php
vendor/bin/sail bin pint --dirty --format agent
```
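
When the groups are run one at a time, the repeated command prefix can be wrapped in a small helper that reports which group failed. This is an optional sketch: the function name `run_group` and the `SAIL` environment override are assumptions introduced here (for example, `SAIL=echo` gives a dry run that only prints the command), not part of the project.

```shell
# Sketch only: run_group and the SAIL override are hypothetical helpers.
# Runs one group of test files through Sail and reports a failing group
# on stderr. SAIL defaults to the project's vendor/bin/sail binary.
run_group() {
  "${SAIL:-vendor/bin/sail}" artisan test --compact "$@" || {
    echo "Group failed: $*" >&2
    return 1
  }
}
```

A call such as `run_group tests/Feature/Filament/BaselineCompareNowWidgetTest.php tests/Feature/Filament/NeedsAttentionWidgetTest.php` is then equivalent to the second command above.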

Expected Outcome

No in-scope summary surface presents compliant or equivalent all-clear copy for limited-confidence, incomplete-evidence, stale, in-progress, suppressed-result, failed, or unavailable scenarios.
Trustworthy no-drift scenarios can still present a positive aligned state.
Dashboard, landing, banner, and canonical detail surfaces remain semantically aligned.
Existing compare action behavior, lazy widget behavior, and DB-only dashboard rendering remain intact.
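
The first two outcomes amount to a single gate: a positive aligned claim is acceptable only for a trustworthy result with zero findings. The sketch below illustrates that rule under stated assumptions. The function name `may_claim_aligned` and the state labels are hypothetical and do not come from the codebase.

```shell
# Sketch only: function name and state labels are hypothetical.
# Succeeds (exit status 0) only when the compare result is trustworthy
# and has zero findings. Every other state must not produce all-clear copy.
may_claim_aligned() {
  state="$1"
  findings="$2"
  [ "$state" = "trustworthy" ] && [ "$findings" -eq 0 ]
}
```

Under this rule, `may_claim_aligned trustworthy 0` succeeds, whereas `may_claim_aligned limited_confidence 0` and `may_claim_aligned stale 0` fail even though no findings are visible.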
