## Summary
- add the tenant review domain with tenant-scoped review library, canonical workspace review register, lifecycle actions, and review-derived executive pack export
- extend review pack, operations, audit, capability, and badge infrastructure to support review composition, publication, export, and recurring review cycles
- add product backlog and audit documentation updates for tenant review and semantic-clarity follow-up candidates

## Testing
- `vendor/bin/sail bin pint --dirty --format agent`
- `vendor/bin/sail artisan test --compact --filter="TenantReview"`
- `CI=1 vendor/bin/sail artisan test --compact`

## Notes
- Livewire v4+ compliant via existing Filament v5 stack
- panel providers remain in `bootstrap/providers.php` via existing Laravel 12 structure; no provider registration moved to `bootstrap/app.php`
- `TenantReviewResource` is not globally searchable, so the Filament edit/view global-search constraint does not apply
- destructive review actions use action handlers with confirmation and policy enforcement

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #185
2026-03-21 22:03:01 +00:00


# Semantic Clarity & Operator-Language Audit

Product: TenantPilot / TenantAtlas
Date: 2026-03-21
Scope: System-wide audit of operator-facing semantics, terminology collisions, technical leakage, false urgency, and missing semantic separation
Auditor role: Senior Staff Engineer / Enterprise SaaS UX Architect


## 1. Executive Summary

Severity: HIGH — systemic, not localized.

TenantPilot has a pervasive semantic collision problem: at least eight distinct meaning axes are collapsed into the same small vocabulary. The product uses ~12 overloaded words ("failed", "partial", "missing", "gaps", "unsupported", "stale", "blocked", "complete", "ready", "reference only", "metadata only", "needs attention") to communicate fundamentally different things across different domains.

Why it matters for enterprise/MSP trust:

  1. False alarm risk is real and frequent. An operator scanning a baseline snapshot sees "Unsupported" (gray badge) and "Gaps present" (yellow badge) and cannot distinguish a product renderer limitation from a genuine governance gap. Every scan of every resource containing fallback-rendered policy types will surface these warnings, training operators to ignore them — which then masks real issues.

  2. The problem is structural, not cosmetic. The badge system (BadgeDomain, 43 domains, ~500 case values) is architecturally sound and well-centralized. But it maps heterogeneous semantic axes onto a single severity/color channel (success/warning/danger/gray). The infrastructure is good; the taxonomy feeding it is wrong.

  3. The problem spans every major domain. Operations, Baselines, Evidence, Findings, Reviews, Inventory, Restore, Onboarding, and Alerts all exhibit the same pattern: a single status badge or label is asked to communicate execution outcome + data quality + product support maturity + operator urgency simultaneously.

  4. The heaviest damage is in three areas:

    • Baseline Snapshots — "Unsupported", "Reference only", "Partial", "Gaps present" all describe product/renderer limitations but read as data quality failures
    • Restore Runs — "Partial", "Manual required", "Dry run", "Completed with errors", "Skipped" create a 5-way ambiguity about what actually happened
    • Evidence/Review Completeness — "Partial", "Missing", "Stale" collapse three distinct axes (coverage, age, depth) into one badge

Estimated impact: ~60% of warning-colored badges in the product communicate something that is NOT a governance problem. This erodes the signal-to-noise ratio for the badges that actually matter.


## 2. Semantic Taxonomy Problems

The product needs to distinguish at minimum 8 independent meaning axes. Currently it conflates most of them.

### 2.1 Required Semantic Axes

| # | Axis | What it answers | Who cares |
|---|------|-----------------|-----------|
| 1 | Execution outcome | Did the operation run succeed, fail, or partially complete? | Operator |
| 2 | Data completeness | Is the captured data set complete for the scope requested? | Operator |
| 3 | Evidence depth / fidelity | How much detail was captured per item (full settings vs. metadata envelope)? | Auditor / Compliance |
| 4 | Product support maturity | Does TenantPilot have a specialized renderer/handler for this policy type, or is it using a generic fallback? | Product team (not operator) |
| 5 | Governance deviation | Is this finding, drift, or posture gap a real compliance/security concern? | Risk officer |
| 6 | Publication readiness | Can this review/report be published to stakeholders? | Review author |
| 7 | Data freshness | When was this data last updated, and is it still within acceptable thresholds? | Operator |
| 8 | Operator actionability | Does this state require operator intervention, or is it informational? | Operator |
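To make the separation concrete, here is a minimal sketch of axis-separated state in illustrative Python (the product is PHP; every class and method name here is hypothetical), contrasted with a single overloaded status enum:

```python
from dataclasses import dataclass
from enum import Enum

# Each axis gets its own enum instead of collapsing into one "status".
class ExecutionOutcome(Enum):          # Axis 1
    SUCCEEDED = "succeeded"
    COMPLETED_WITH_ISSUES = "completed_with_issues"
    FAILED = "failed"
    PREVENTED = "prevented"

class DataCoverage(Enum):              # Axis 2
    FULL = "full"
    INCOMPLETE = "incomplete"
    NOT_COLLECTED = "not_collected"

class SupportTier(Enum):               # Axis 4, diagnostics only, never a badge
    DEDICATED_RENDERER = "dedicated"
    STANDARD_RENDERER = "standard"

@dataclass
class SnapshotState:
    outcome: ExecutionOutcome
    coverage: DataCoverage
    support: SupportTier

    def operator_badge(self) -> str:
        """Only axes 1 and 2 drive the operator-facing badge; axis 4 never does."""
        if self.outcome is not ExecutionOutcome.SUCCEEDED:
            return self.outcome.value
        return "complete" if self.coverage is DataCoverage.FULL else "incomplete coverage"

# A snapshot captured via a fallback renderer, but with no real data gap:
state = SnapshotState(ExecutionOutcome.SUCCEEDED, DataCoverage.FULL, SupportTier.STANDARD_RENDERER)
print(state.operator_badge())  # "complete": renderer maturity does not taint the badge
```

The point of the sketch is the type system itself: because `SupportTier` is a separate field that `operator_badge()` never consults, a fallback renderer cannot produce a warning badge by construction.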

### 2.2 Where They Are Currently Conflated

| Conflation example | Axes mixed | Problem |
|---|---|---|
| Baseline snapshot with fallback renderer shows "Unsupported" badge + "Gaps present" | 4 + 5 | Product support maturity presented as governance deviation |
| Evidence completeness shows "Missing" (red/danger) when no findings exist yet | 2 + 8 | Completeness treated as urgency |
| Restore run shows "Partial" (yellow) | 1 + 2 | Execution outcome mixed with scope completion |
| Review completeness shows "Stale" | 2 + 7 | Completeness mixed with freshness |
| Operation outcome "Partially succeeded" | 1 + 2 | Execution result mixed with item-level coverage |
| "Blocked" used for both operation outcomes and verification reports | 1 + 8 | Execution state mixed with actionability |
| "Failed" used for 10+ different badge domains | 1 + 4 + 5 | Everything that isn't success collapses to "failed" |

## 3. Findings by Domain

### 3.1 Operations / Runs

Affected components:

  • OperationRunOutcome enum + OperationRunOutcomeBadge
  • OperationUxPresenter (terminal notifications)
  • SummaryCountsNormalizer (run summary line)
  • RunFailureSanitizer (failure message formatting)
  • OperationRunResource (table, infolist)
  • OperationRunQueued / OperationRunCompleted notifications

Current problematic states:

| State | Badge | Problem |
|---|---|---|
| PartiallySucceeded | "Partially succeeded" (yellow) | Does not explain how many items succeeded vs. failed, whether core operation goals were met, or which items need attention. Operator cannot distinguish "99 of 100 policies synced" from "1 of 100 policies synced." |
| Blocked | "Blocked" (yellow) | Same color as PartiallySucceeded. Does not explain what blocked it (RBAC? provider? gate?) or what to do next. |
| Failed | "Failed" (red) | Reasonable for execution failure, but failure messages are truncated to 140 chars and sanitized to internal reason codes (RateLimited, ProviderAuthFailed). The operator gets "Failed. Rate limited." with no retry guidance. |

Notification problems:

  • "Completed with warnings" — what warnings? Where are they?
  • "Execution was blocked." — no explanation or recovery action
  • Summary counts show raw keys like errors_recorded, posture_score, report_deduped without context
  • Queued notification says "Running in the background" — no ETA or queue position

Classification: Systemic (affects all run types across all domains)
Operator impact: HIGH — every failed or partial operation leaves operator uncertain about actual state
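A partial outcome becomes interpretable the moment the summary states scope. A hypothetical formatter (illustrative Python; the function name and message wording are invented, not the product's) shows the shape such a summary could take:

```python
def partial_outcome_summary(succeeded: int, total: int, failed_items: list[str]) -> str:
    """Hypothetical replacement for a bare 'Partially succeeded' badge:
    always state scope (N of M) and point at the items needing attention."""
    line = f"{succeeded} of {total} items processed"
    if failed_items:
        preview = ", ".join(failed_items[:3])
        line += f"; needs attention: {preview}"
        if len(failed_items) > 3:
            line += f" (+{len(failed_items) - 3} more)"
    return line

print(partial_outcome_summary(99, 100, ["Device Compliance Policy A"]))
# "99 of 100 items processed; needs attention: Device Compliance Policy A"
```

With this, "99 of 100 synced" and "1 of 100 synced" are no longer the same yellow badge.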

### 3.2 Baselines / Snapshots / Compare

Affected components:

  • FidelityState enum (Full / Partial / ReferenceOnly / Unsupported)
  • BaselineSnapshotFidelityBadge
  • BaselineSnapshotGapStatusBadge
  • BaselineSnapshotPresenter + RenderedSnapshot* DTOs
  • GapSummary class
  • FallbackSnapshotTypeRenderer
  • SnapshotTypeRendererRegistry
  • baseline-snapshot-summary-table.blade.php
  • baseline-snapshot-technical-detail.blade.php
  • BaselineSnapshotResource, BaselineProfileResource

This is the single worst-affected domain. Four separate problems compound:

Problem 1: FidelityState mixes product maturity with data quality

| FidelityState | What it actually means | What the operator reads |
|---|---|---|
| Full | Specialized renderer produced structured output | "Data is complete" |
| Partial | Some items rendered fully, others fell back | "Data is partially missing" |
| ReferenceOnly | Only a metadata envelope captured, no settings payload | "Only a reference exists" (alarming) |
| Unsupported | No specialized renderer exists; fallback used | "This policy type is broken" |

The coverageHint texts are better but still technical:

  • "Mixed evidence fidelity across this group."
  • "Metadata-only evidence is available."
  • "Fallback metadata rendering is being used."

Problem 2: "Gaps" conflates multiple causes

GapSummary messages include:

  • "Metadata-only evidence was captured for this item." — this is a capture depth choice, not a gap
  • "A fallback renderer is being used for this item." — this is a product maturity issue, not a data gap
  • Actual capture errors (e.g., Graph returned an error) — these ARE real gaps

All three are counted as "gaps" and surfaced under the same "Gaps present" (yellow) badge. An operator cannot distinguish "we chose to capture light" from "the API returned an error" from "we don't have a renderer for this type yet."
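Separating the three causes is mechanically simple once they are tagged at capture time. A sketch (illustrative Python; the enum and function names are hypothetical, not the product's GapSummary API):

```python
from enum import Enum

class CaptureNoteKind(Enum):
    PRODUCT_LIMITATION = "product_limitation"   # fallback renderer in use
    CAPTURE_CHOICE = "capture_choice"           # metadata-only capture was intended
    CAPTURE_ERROR = "capture_error"             # the API actually returned an error

def is_governance_gap(kind: CaptureNoteKind) -> bool:
    """Only real capture errors should surface as operator-facing gaps;
    the other two kinds stay in diagnostics."""
    return kind is CaptureNoteKind.CAPTURE_ERROR

# A snapshot whose only notes are renderer fallback + intended light capture:
notes = [CaptureNoteKind.PRODUCT_LIMITATION, CaptureNoteKind.CAPTURE_CHOICE]
print(any(is_governance_gap(n) for n in notes))  # False: no "Gaps present" badge
```

Under this split, "we chose to capture light" and "we don't have a renderer yet" can never trigger the same badge as "the API returned an error".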

Problem 3: "Captured with gaps" state label

BaselineSnapshotPresenter.stateLabel returns either "Complete" or "Captured with gaps". This is the top-level snapshot state that operators see first. If any policy type uses a fallback renderer, the entire snapshot is labeled "Captured with gaps" — even if every single policy was successfully captured and the only "gap" is that the product's renderer coverage hasn't been built out yet.

Problem 4: Technical detail exposure

The baseline-snapshot-technical-detail.blade.php template includes the comment: "Technical payloads are secondary on purpose. Use them for debugging capture fidelity and renderer fallbacks." This is good intent, but the primary view still surfaces "Unsupported" and "Gaps present" badges that carry the same connotation.

Classification: P0 — actively damages operator trust
Operator impact: CRITICAL — every baseline snapshot containing non-specialized policy types (which is most of them in a real tenant) will show yellow warnings that are not governance problems

### 3.3 Evidence / Evidence Snapshots

Affected components:

  • EvidenceCompletenessState enum (Complete / Partial / Missing / Stale)
  • EvidenceCompletenessBadge
  • EvidenceSnapshotStatus enum + EvidenceSnapshotStatusBadge
  • EvidenceCompletenessEvaluator
  • EvidenceSnapshotResource
  • Evidence source classes (FindingsSummarySource, OperationsSummarySource)

Current problematic states:

| State | Badge color | Problem |
|---|---|---|
| Missing | Red/danger | Used when a required evidence dimension has no data. But "missing" could mean never collected, collection failed, or data source empty (no findings exist yet, so there is no evidence to collect). A new tenant with zero findings will show "Missing" in red for the findings dimension, which reads as a failure. |
| Partial | Yellow/warning | Means some dimensions are incomplete. Does not explain which, or why. |
| Stale | Gray | Evidence collected more than 30 days ago. Gray suggests it is fine/archived, but stale evidence is actually operationally risky. Wrong severity mapping. |

Source-level problems:

  • FindingsSummarySource: returns Complete if findings exist, Missing otherwise. A tenant with no findings is not "missing evidence" — it has zero findings, which is valid.
  • OperationsSummarySource: returns Complete if operations ran in last 30 days, Missing otherwise. A stable tenant with no recent operations is not "missing" — it's stable.

Classification: P0 — "Missing" (red) for valid empty states is a false alarm
Operator impact: HIGH — new tenants or stable tenants will always show red/yellow completeness badges that are not problems

### 3.4 Findings / Exceptions / Risk Acceptance

Affected components:

  • FindingResource (form, infolist, table)
  • FindingExceptionResource
  • FindingStatusBadge, FindingSeverityBadge
  • FindingExceptionStatusBadge
  • FindingRiskGovernanceValidityBadge

Current problematic states:

| Term | Location | Problem |
|---|---|---|
| "Validity" | Finding infolist | Label for the governance validity badge. Values include missing_support, which reads as a product defect, not a governance state. |
| "Missing support" | FindingRiskGovernanceValidityBadge | Means "no exception record exists for this risk-accepted finding." This is a governance gap, not a product support issue. The label suggests the product is broken. |
| "Exception status" | Finding infolist section | Section heading uses "exception" (an internal concept) instead of operator-facing language like "Risk approval status". |
| "Risk governance" | Finding infolist section | An appropriate term, but unexplained; grouped with "Validity", which is confusing. |

Drift diff unavailable messages leak technical language:

  • "RBAC evidence unavailable — normalized role definition evidence is missing."
  • "Diff unavailable — missing baseline policy version reference."
  • "Diff unavailable — missing current policy version reference."

These are accurate but incomprehensible to operators. They should say what happened and what action to take.

Classification: P1 — confusing but less frequently seen than baselines/evidence
Operator impact: MEDIUM — affects finding detail pages, which are high-attention moments

### 3.5 Tenant Reviews / Reports / Publication

Affected components:

  • TenantReviewStatus enum + TenantReviewStatusBadge
  • TenantReviewCompletenessState enum + badge
  • TenantReviewReadinessGate
  • TenantReviewComposer / TenantReviewSectionFactory
  • ReviewPackResource, TenantReviewResource

Current problematic states:

| State | Problem |
|---|---|
| "Partial" completeness | Same conflation as Evidence: partial coverage vs. partial depth vs. partial freshness, all collapsed |
| "Stale" completeness | A review section based on old evidence should be orange/yellow (action needed), not gray (archived/inert) |
| "{Section} is stale and must be refreshed before publication" | Accurate blocker message, but no guidance on HOW to refresh |
| "{Section} is missing" | Confuses "section text hasn't been generated yet" with "underlying data doesn't exist" |
| "Anchored evidence snapshot" | Used in form helper text and empty states — jargon for "the snapshot this review is based on" |

Publication readiness gate is well-designed — it correctly separates blockers from completeness. But the messages surfaced to operators still use the collapsed vocabulary.

Highlight bullets are good: "3 open risks from 12 tracked findings" is clear operator language. The recommended next actions are also well-written. This domain is ahead of baselines/evidence in semantic clarity.

Classification: P1 — partially solved, but completeness/freshness conflation remains
Operator impact: MEDIUM — review authors are typically more sophisticated users

### 3.6 Restore Runs

Affected components:

  • RestoreRunStatus enum + badge (13 states including legacy)
  • RestoreResultStatusBadge (7 states)
  • RestorePreviewDecisionBadge (5 states)
  • RestoreCheckSeverityBadge
  • RestoreRunResource (form, infolist)
  • restore-results.blade.php
  • restore-run-checks.blade.php

This is the second-worst domain for semantic confusion. The restore workflow has:

  • 13 lifecycle states (most products have 4-6) — including legacy states that are still displayable
  • 7 result statuses per item — three of which are yellow/warning with different meanings
  • 5 preview decisions — including "Created copy" (why?) and "Skipped" (intentional or forced?)

Specific problems:

| State | Badge | Problem |
|---|---|---|
| Partial (run-level) | Yellow | "Some items restored; some failed." But which? How many? Is the tenant in a consistent state? |
| CompletedWithErrors | Yellow | Legacy state. Does "completed" mean all items were attempted, or that all succeeded and the errors are secondary? |
| Partial (item-level) | Yellow | Means something different from run-level partial: "item partially applied" |
| Manual required | Yellow | What manual action? Where? By whom? |
| Dry run | Blue/info | Could be confused with success. "Applied" is green, "Dry run" is blue — but the operation did NOT apply. |
| Skipped | Yellow | Intentional skip (user chose to exclude) conflated with forced skip (precondition failed) |
| Created copy | Yellow | Implies a naming conflict was resolved by creating a copy. The warning color suggests this is bad, but it might be expected behavior. |
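The skip ambiguity in particular disappears if intent is recorded with the result. A sketch (illustrative Python; the class and label strings are hypothetical, not the product's restore result model):

```python
from dataclasses import dataclass

@dataclass
class SkipResult:
    """Hypothetical item result that keeps skip intent explicit instead of
    showing one ambiguous yellow 'Skipped' badge."""
    item: str
    intentional: bool  # operator excluded it vs. a precondition forced it
    reason: str

    def badge(self) -> tuple[str, str]:
        if self.intentional:
            return ("Excluded by operator", "gray")   # no action needed
        return (f"Skipped: {self.reason}", "yellow")  # action may be needed

print(SkipResult("Policy A", True, "excluded in scope").badge())
# ('Excluded by operator', 'gray')
```

An intentional exclusion renders gray (informational); a forced skip stays yellow and carries its reason inline.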

Form helper text is especially problematic:

  • "Preview-only types stay in dry-run" — operators don't know what "preview-only types" are
  • "Include foundations (scope tags, assignment filters) with policies to re-map IDs" — three jargon terms in one sentence
  • "Paste the target Entra ID group Object ID (GUID)" — technical protocol language

Restore check display:

  • "Blocking" (red) — operator may think the system is frozen, not that a validation check needs attention
  • "Unmapped groups" — no explanation of what mapping means or how to do it

Classification: P0 — restore is a high-risk, high-attention operation where confusion has real consequences
Operator impact: CRITICAL — incorrect interpretation of restore state can lead to incorrect remediation decisions

### 3.7 Inventory / Coverage / Sync / Provider Health

Affected components:

  • InventoryKpiBadges / InventoryKpiHeader widget
  • InventoryCoverage page
  • InventorySyncService
  • PolicySnapshotModeBadge ("Full" / "Metadata only")
  • ProviderConnectionStatusBadge, ProviderConnectionHealthBadge
  • PolicyResource (sync action, snapshot mode helper)

Current problematic states:

| State | Problem |
|---|---|
| "Partial" items (yellow) in KPI | Means "preview-only restore mode" — a product support maturity fact, not a data quality issue. |
| "Risk" items (red) in KPI | Correct usage — these are genuinely higher-risk policy types. |
| "Metadata only" snapshot mode (yellow + warning icon) | A technical capture mode presented as if something is wrong. For some policy types, metadata-only capture IS the correct behavior. |
| Sync error: "Inventory sync reported rate limited." | Mechanically constructed from the error code (`str_replace('_', ' ', $errorCode)`). No retry guidance. |
| Provider "Needs consent" | No context about what consent, where to grant it, or who can grant it. |
| Provider "Degraded" | No detail on what is degraded or whether operations are affected. |

Graph error leakage in PolicyResource:

  • "Graph returned {status} for this policy type. Only local metadata was saved; settings and restore are unavailable until Graph works again."
  • This is accurate but uses internal API naming ("Graph") and makes the scope unclear (is restore blocked for one policy or all policies?).

Classification: P1 — inventory is read-frequently and yellow badges train operators to ignore warnings
Operator impact: MEDIUM-HIGH — scanning inventory with perpetual yellow badges erodes warning credibility
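Replacing mechanical code-to-label formatting with a curated message table is a small change with outsized effect. A sketch (illustrative Python; the error codes shown mirror the reason codes mentioned in this audit, but the mapping, wording, and function name are hypothetical):

```python
# Hypothetical map from internal error codes to operator guidance,
# replacing str_replace('_', ' ', $errorCode)-style label construction.
OPERATOR_MESSAGES = {
    "rate_limited": ("The provider is throttling requests.",
                     "Retries happen automatically in a few minutes; no action needed."),
    "provider_auth_failed": ("The provider connection could not authenticate.",
                             "Reconnect the provider from its settings page."),
}

def sync_error_message(error_code: str) -> str:
    what, action = OPERATOR_MESSAGES.get(
        error_code,
        ("Inventory sync hit an unexpected error.", "Open the run details for diagnostics."),
    )
    return f"{what} {action}"

print(sync_error_message("rate_limited"))
```

Every message pairs "what happened" with "what to do", and unknown codes degrade to a generic-but-actionable fallback instead of leaking the raw code.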

### 3.8 Onboarding / Verification

Affected components:

  • VerificationReportOverall + badge ("Ready" / "Needs attention" / "Blocked")
  • VerificationCheckStatus + badge ("Pass" / "Fail" / "Warn" / "Skip")
  • ManagedTenantOnboardingVerificationStatusBadge
  • VerificationAssistViewModelBuilder
  • ManagedTenantOnboardingWizard (notifications, modals)
  • verification-required-permissions-assist.blade.php

Current problematic states:

| State | Problem |
|---|---|
| "Needs attention" | Non-specific. Used for both missing delegated permissions and stale data. |
| "Blocked" | Means "missing required application permissions" but reads as "the system is blocked." |
| "Fail" check status | Ambiguous: did the check's execution fail, or did its assertion fail? Both readings are possible. |
| Wizard notifications: "Tenant not available", "Draft is not resumable" | No explanation of why, or what to do. |
| "Verification required" | No detail on what verification is needed or the estimated effort. |

The verification assist panel is well-designed — it categorizes missing permissions by application vs. delegated and shows feature impact. This is a positive example of good semantic separation that could be replicated elsewhere.

Classification: P2 — onboarding is a one-time flow with guided steps
Operator impact: LOW-MEDIUM — operators encounter this during setup, not daily operations

### 3.9 Alerts / Notifications

Affected components:

  • EmailAlertNotification
  • AlertDeliveryStatusBadge
  • AlertDestinationLastTestStatusBadge

Current problematic states:

| State | Problem |
|---|---|
| Email body includes "Delivery ID: {id}" and "Event type: {type}" | Internal metadata exposed to operators without context. |
| "Failed" delivery status (red) | No distinction between transient (retryable) and permanent failure. |
| Queued notification: "Running in the background." | No ETA, no queue position, and no way to check progress until the terminal notification arrives. |

Classification: P2 — alerts are less frequently viewed
Operator impact: LOW — these are supplementary signals, not primary operator workflows


## 4. Cross-Cutting Terminology Map

### 4.1 Most Problematic Terms

| Term | Appearances | Meanings it carries | Why dangerous | Should belong to |
|---|---|---|---|---|
| "Failed" | OperationRunOutcome, RestoreRunStatus, RestoreResultStatus, ReviewPackStatus, EvidenceSnapshotStatus, AlertDeliveryStatus, TenantReviewStatus, AlertDestinationLastTestStatus, BackupSetStatus, TenantRbacStatus, AuditOutcome | (1) Operation execution failure, (2) validation check assertion failure, (3) generation/composition failure, (4) delivery failure, (5) catch-all for non-success | Undifferentiated — operator cannot assess severity, retryability, or scope without drilling in. Red badge for all cases regardless of impact. | Axis 1 (execution outcome) only. Other uses need distinct terms: "check failed" → "not met"; delivery failed → "undeliverable"; generation failed → "generation error" |
| "Partial" | OperationRunOutcome (PartiallySucceeded), RestoreRunStatus, RestoreResultStatus, EvidenceCompletenessState, TenantReviewCompletenessState, FidelityState, BackupSetStatus, AuditOutcome, InventoryKpiBadges | (1) Some items in a batch succeeded, others didn't, (2) evidence depth is limited, (3) capture coverage is incomplete, (4) product renderer coverage is partial, (5) restore mode is preview-only | 8+ different meanings across the product. Yellow badge always. An operator reading "Partial" has no way to know which kind without drilling into details. | Axis-specific terms: "N of M items processed" (Axis 1), "Limited detail" (Axis 3), "Incomplete coverage" (Axis 2), "Preview mode" (Axis 4) |
| "Missing" | EvidenceCompletenessState, TenantReviewCompletenessState, TenantPermissionStatus, ReferenceResolutionState (DeletedOrMissing), FindingRiskGovernanceValidity (missing_support) | (1) Evidence dimension has no data, (2) required review section not generated, (3) permission not granted, (4) referenced record deleted/not found, (5) no exception exists for risk-accepted finding | "Missing" in red across all uses. But "zero findings" is not "missing findings" — it's an empty valid state. | Separate into: "Not collected" (evidence), "Not generated" (sections), "Not granted" (permissions), "Not found" (references), "No exception" (governance) |
| "Gaps" | BaselineSnapshotGapStatus, GapSummary, BaselineSnapshotPresenter ("Captured with gaps") | (1) Fallback renderer was used, (2) metadata-only capture mode, (3) actual capture error, (4) missing coverage | Merges product limitation with data quality problem. Every snapshot with unsupported types shows "Gaps present" perpetually. | "Coverage notes" or "Capture notes" — with sub-categories for product limitation vs. actual data gap |
| "Unsupported" | FidelityState, ReferenceClass | (1) No specialized renderer for this policy type (product maturity), (2) reference class not recognized | Reads as "this policy type is broken" when it actually means "we haven't built a dedicated viewer yet, but the data is captured." | "Generic view" or "Standard rendering" — product support state should never be operator-alarming |
| "Stale" | EvidenceCompletenessState, TenantReviewCompletenessState, TenantRbacStatus | (1) Evidence older than 30 days, (2) review section based on old snapshot, (3) RBAC check results outdated | Gray badge color (passive) for freshness issues that may require action. Conflated with completeness (same enum). | A separate axis: "Last updated: 45 days ago" with age-based coloring. Not part of the completeness axis. |
| "Blocked" | OperationRunOutcome, VerificationReportOverall, ExecutionDenialClass, RestoreCheckSeverity | (1) Operation prevented by gate/RBAC, (2) missing required permissions, (3) pre-flight check preventing execution | Reads as "system is frozen" regardless of cause. Yellow for operation outcome, red for verification, red for restore checks. | "Prevented" or "Requires action", with the specific cause shown |
| "Reference only" | FidelityState | Only metadata envelope captured, no settings payload | Reads as "we only have a pointer, not the actual data" — technically correct but alarming. For some policy types, this IS the full capture because Graph doesn't expose settings. | "Metadata captured" or "Summary view" |
| "Metadata only" | PolicySnapshotMode, restore-results.blade.php | (1) Capture mode was metadata-only, (2) restore created a policy shell without settings | Two different uses. In capture, it's a valid mode. In restore, it means manual work is needed. | Separate: "Summary capture" (mode) vs. "Shell created — configure settings manually" (restore result) |
| "Ready" | TenantReviewStatus, VerificationReportOverall, ReviewPackStatus, OnboardingLifecycleState (ReadyForActivation) | (1) Review cleared publication blockers, (2) permissions verified, (3) review pack downloadable, (4) onboarding ready for activation | Overloaded but less dangerous than the others — context usually makes it clear. | Acceptable as-is, but should never be used alone without domain context |
| "Complete" | EvidenceCompletenessState, TenantReviewCompletenessState, OnboardingDraftStatus/Stage/Lifecycle, OperationRunStatus (Completed) | (1) All evidence dimensions collected, (2) all review sections generated, (3) onboarding finished, (4) operation execution finished | A "Completed" operation does not mean it succeeded — the outcome must be checked separately. | Keep for lifecycle completion. Don't use it for quality — use "Full coverage" instead. |

### 4.2 Terms That Are Correctly Used

| Term | Domain | Assessment |
|---|---|---|
| "Drift" | Findings | Clear, domain-appropriate |
| "Superseded" | Reviews, Snapshots | Correct lifecycle term |
| "Archived" | Multiple | Consistent and clear |
| "New" / "Triaged" / "In progress" / "Resolved" | Findings | Standard issue lifecycle |
| "Risk accepted" | Findings | Precise governance term |
| "Applied" / "Mapped" | Restore results | Clear restore outcomes |
| "Pass" / "Warn" | Verification checks | Appropriate for boolean checks |

## 5. Priority Ranking

### P0 — Actively damages operator trust / causes false alarms

| # | Issue | Domain | Description |
|---|---|---|---|
| P0-1 | FidelityState "Unsupported" displayed as badge | Baselines | Every snapshot with a fallback-rendered policy type shows "Unsupported" (gray) and "Gaps present" (yellow). This is a product maturity fact, not operator-actionable. |
| P0-2 | "Captured with gaps" top-level snapshot state | Baselines | One fallback renderer taints the entire snapshot label. Most snapshots will show this. |
| P0-3 | Evidence "Missing" (red) for valid empty states | Evidence | A new tenant with zero findings shows a red "Missing" badge. Zero findings is a valid state, not missing evidence. |
| P0-4 | Restore "Partial" ambiguity at run and item level | Restore | Operator cannot determine the scope of partial success or whether the tenant is in a consistent state. |
| P0-5 | OperationRunOutcome "Partially succeeded" with no item breakdown | Operations | Yellow badge with no way to assess impact without drilling into run details. |

### P1 — Strongly confusing, should be fixed soon

| # | Issue | Domain | Description |
|---|---|---|---|
| P1-1 | "Stale" gray badge for time-sensitive freshness | Evidence, Reviews | Gray implies inert/archived; stale data requires action. Wrong color mapping. |
| P1-2 | "Blocked" with no explanation across 4 domains | Operations, Verification, Restore, Execution | Operator reads "blocked" and cannot determine cause or action. |
| P1-3 | "Metadata only" in snapshot mode (yellow warning) | Inventory, Policies | A valid capture mode presented as if something is wrong. |
| P1-4 | Restore "Manual required" without specifying what | Restore | Yellow badge with no link to instructions or affected items. |
| P1-5 | "Gaps present" conflates 3+ causes | Baselines | Renderer fallback, metadata-only capture, and actual errors all counted as "gaps." |
| P1-6 | "Missing support" in governance validity | Findings | Means "no exception exists" but reads as "the product doesn't support this." |
| P1-7 | Technical diff-unavailable messages | Findings | "normalized role definition evidence is missing" is incomprehensible to operators. |
| P1-8 | Graph error codes in restore results | Restore | "Graph error: {message} Code: {code}" shown directly to operators. |
| P1-9 | "Partial" in inventory KPI for preview-only types | Inventory | A product maturity fact displayed as a data quality warning. |

### P2 — Quality issue, not urgent

| # | Issue | Domain | Description |
|---|---|---|---|
| P2-1 | Notification failure messages truncated to 140 chars | Operations | Important context may be lost. Sanitized reason codes are still technical. |
| P2-2 | "Needs attention" without specifying what | Onboarding, Verification | Non-specific label across multiple domains. |
| P2-3 | "Dry run" in blue/info for restore results | Restore | Could be confused with success. Should be more clearly distinct. |
| P2-4 | "Anchored evidence snapshot" jargon | Reviews | Used in form helpers and empty states without explanation. |
| P2-5 | Empty state messages lack next-step guidance | Multiple | Most empty states describe what the feature does, not how to start. |
| P2-6 | Queued notifications lack progress indicators | Operations | "Running in the background" with no ETA or monitoring guidance. |
| P2-7 | Email alerts include raw internal IDs | Alerts | "Delivery ID: {id}" and "Event type: {type}" shown to operators. |

### P3 — Polish / later cleanup

| # | Issue | Domain | Description |
|---|---|---|---|
| P3-1 | "Archiving is permanent in v1" | Baselines | Version reference in modal text. |
| P3-2 | "foundations" term unexplained | Baselines, Restore | Internal grouping concept used in helper text. |
| P3-3 | "Skipped" in restore without intent distinction | Restore | Intentional vs. forced skip use the same badge. |
| P3-4 | Summary count keys like "report_deduped" | Operations | Internal metric name in an operator-facing summary. |

## 6.1 State Axes — Mandatory Separation

The product must maintain independent state tracking for these axes. They must never be flattened into a single badge or single enum:

| Axis | Scope | Operator-facing? | Values |
|---|---|---|---|
| Execution lifecycle | Per operation run | Yes | Queued → Running → Completed |
| Execution outcome | Per operation run | Yes | Succeeded / Completed with issues / Failed / Prevented |
| Item-level result | Per item in a run | Yes (on drill-in) | Applied / Skipped (with reason) / Failed (with reason) |
| Data coverage | Per evidence/review dimension | Yes | Full / Incomplete / Not collected |
| Evidence depth | Per captured item | Yes (secondary) | Detailed / Summary / Metadata envelope |
| Product support tier | Per policy type | Diagnostics only | Dedicated renderer / Standard renderer |
| Data freshness | Per evidence item/dimension | Yes (separate from coverage) | Current (< N days) / Aging / Overdue |
| Governance status | Per finding/exception | Yes | Valid / Expiring / Expired / None |
| Publication readiness | Per review | Yes | Publishable / Blocked (with reasons) |
| Operator actionability | Cross-cutting | Yes (determines badge urgency color) | Action required / Informational / No action needed |

## 6.2 Diagnostic-Only Terms (Never in Primary Operator View)

These terms should appear ONLY in expandable/collapsible diagnostic panels, never in primary badges or summary text:

  • "Fallback renderer" → show "Standard rendering" in diagnostic, nothing in primary
  • "Metadata only" (as capture detail) → show "Summary view" in primary, mode detail in diagnostic
  • "Renderer registry" → never shown
  • "FidelityState" → replaced by "Evidence depth" in operator language
  • "Gap summary" → replaced by "Coverage notes"
  • Raw Graph error codes → replaced by categorized operator messages
  • Delivery IDs, event type codes → never in email body
  • errors_recorded, report_deduped → never in summary line; use operator terms

### 6.3 Warning vs. Info vs. Error — Decision Rules

| Color | When to use | Never use for |
|---|---|---|
| Red (danger) | Execution failed, governance violation active, permission denied, data loss risk | Product support limitations, empty valid states, expired but inactive records |
| Yellow (warning) | Operator action recommended, approaching threshold, mixed outcome | Product maturity facts, informational states, things the operator cannot change |
| Blue (info) | In progress, background activity, informational detail | States that could be confused with success outcomes |
| Green (success) | Operation succeeded, check passed, data complete | States that could change (success should be stable) |
| Gray (neutral) | Archived, not applicable, skipped | Stale data (should be yellow/orange), degraded states that need attention |
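These rules reduce to an ordered decision checked from most to least severe, with gray as the fallback for everything unclassified. A hedged Python sketch; the boolean flag names are mine, not the product's:

```python
from enum import Enum

class BadgeColor(Enum):
    RED = "danger"      # execution failed, active violation, data-loss risk
    YELLOW = "warning"  # operator action recommended
    BLUE = "info"       # in progress / background activity
    GREEN = "success"   # stable, succeeded, complete
    GRAY = "neutral"    # archived, not applicable, intentionally skipped

def badge_color(*, failed=False, action_recommended=False,
                in_progress=False, succeeded=False):
    """Severity-ordered color decision (illustrative flag names)."""
    if failed:
        return BadgeColor.RED
    if action_recommended:
        return BadgeColor.YELLOW
    if in_progress:
        return BadgeColor.BLUE
    if succeeded:
        return BadgeColor.GREEN
    return BadgeColor.GRAY

# A product-maturity fact ("standard renderer") is neither a failure nor
# operator-actionable, so it can never reach yellow or red:
assert badge_color() is BadgeColor.GRAY
assert badge_color(failed=True, action_recommended=True) is BadgeColor.RED
```

The strict ordering encodes the "never use for" column: a state only gets a color's urgency if no more severe condition applies, and nothing defaults to a success or warning color.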

### 6.4 Where "Next Action" Is Mandatory

Every state that is NOT green or gray MUST include at least one of:

- An inline explanation (one sentence) of what happened and whether action is needed
- A link to a resolution path
- An explicit "No action needed — this is expected" label

Mandatory coverage:

- Failed operations → what failed, whether it is retryable, link to run details
- Partial operations → N of M items processed, link to item-level details
- Blocked operations → what blocked it, link to resolution (permissions page, provider settings, etc.)
- Missing evidence → why it is missing (no data source, not yet collected, collection error), link to trigger collection
- Stale data → when last updated, link to refresh action
- Publication blockers → which sections, link directly to each section's refresh action
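This policy is mechanically checkable: a lint pass over badge definitions can reject any non-green, non-gray state that ships without an explanation, a resolution link, or an explicit no-action marker. A sketch under assumed names (none of these fields are from the actual codebase):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class BadgeSpec:
    """Illustrative badge definition; field names are assumptions."""
    color: str  # "danger" | "warning" | "info" | "success" | "neutral"
    explanation: Optional[str] = None       # one-sentence inline explanation
    resolution_link: Optional[str] = None   # path to the resolution action
    no_action_expected: bool = False        # explicit "No action needed" label

def satisfies_next_action_policy(badge):
    """Green and gray are exempt; every other state needs at least one
    of: explanation, resolution link, or an explicit no-action label."""
    if badge.color in ("success", "neutral"):
        return True
    return bool(badge.explanation
                or badge.resolution_link
                or badge.no_action_expected)

# A bare "Failed" badge violates the policy; one carrying a link passes.
assert not satisfies_next_action_policy(BadgeSpec(color="danger"))
assert satisfies_next_action_policy(
    BadgeSpec(color="danger", resolution_link="/operations/runs/123"))
```

Running such a check in CI turns the policy from a review guideline into an enforced invariant.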

## 7. Spec Candidate Recommendations

The problem has two layers:

  1. A cross-cutting taxonomy problem that must be solved once, centrally, before domains can adopt it
  2. Domain-specific adoption that requires touching each resource/presenter/badge/view

### Spec Package

| Spec | Scope | Priority | Dependencies |
|---|---|---|---|
| Spec: Operator Semantic Taxonomy | Define the target state axes, term dictionary, color rules, diagnostic vs. primary classification, and "next action" policy. Produce a shared reference that all domain specs reference. Update `FidelityState`, `EvidenceCompletenessState`, `TenantReviewCompletenessState` enums to align with the new taxonomy. Restructure `GapSummary` into axis-separated notes. | P0 | None |
| Spec: Baseline Snapshot Semantic Cleanup | Apply taxonomy to `BaselineSnapshotPresenter`, `FidelityState` badge, `GapSummary`, snapshot summary table. Move renderer-tier info to diagnostics. Redefine "gaps" as categorized coverage notes. Fix "Captured with gaps" label. | P0 | Taxonomy spec |
| Spec: Evidence Completeness Reclassification | Fix "Missing" for valid-empty states. Separate freshness from coverage. Add "Not applicable" state for dimensions with no expected data. Fix "Stale" color. | P0 | Taxonomy spec |
| Spec: Operation Outcome Clarity | Replace "Partially succeeded" with item-breakdown-aware messaging. Add cause-specific "Blocked" explanations. Add next-action guidance to terminal notifications. Sanitize failure messages to operator language. | P1 | Taxonomy spec |
| Spec: Restore Semantic Cleanup | Reduce 13 states to a clear operator lifecycle. Distinguish intentional vs. forced skips. Replace "Manual required" with specific guidance. Fix "Partial" ambiguity. Move Graph errors behind a diagnostic expandable. | P1 | Taxonomy spec |
| Spec: Inventory Coverage Language | Replace "Partial" KPI badge for preview-only types. Replace "Metadata only" warning badge with a neutral product-tier indicator. Fix sync error messages. | P1 | Taxonomy spec |
| Spec: Finding Governance Language | Replace `missing_support`, fix diff-unavailable messages, clarify the "Validity" section. | P2 | Taxonomy spec |
| Spec: Notifications & Alerts Language | Remove internal IDs from emails, add next-action guidance to terminal notifications, fix summary count keys. | P2 | Taxonomy spec |

The foundation spec (Operator Semantic Taxonomy) comes first: it establishes the shared vocabulary, color rules, and the diagnostic/primary boundary. The 6–7 domain specs then apply the taxonomy to each area. The foundation spec should be small and decisive: it defines rules, not implementations. The domain specs do the actual refactoring.

This avoids a monolithic spec that would be too large to review or implement incrementally, while ensuring consistency through the shared taxonomy.