## Summary

- add the tenant review domain with tenant-scoped review library, canonical workspace review register, lifecycle actions, and review-derived executive pack export
- extend review pack, operations, audit, capability, and badge infrastructure to support review composition, publication, export, and recurring review cycles
- add product backlog and audit documentation updates for tenant review and semantic-clarity follow-up candidates

## Testing

- `vendor/bin/sail bin pint --dirty --format agent`
- `vendor/bin/sail artisan test --compact --filter="TenantReview"`
- `CI=1 vendor/bin/sail artisan test --compact`

## Notes

- Livewire v4+ compliant via the existing Filament v5 stack
- panel providers remain in `bootstrap/providers.php` per the existing Laravel 12 structure; no provider registration moved to `bootstrap/app.php`
- `TenantReviewResource` is not globally searchable, so the Filament edit/view global-search constraint does not apply
- destructive review actions use action handlers with confirmation and policy enforcement

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #185
# Semantic Clarity & Operator-Language Audit

**Product:** TenantPilot / TenantAtlas

**Date:** 2026-03-21

**Scope:** System-wide audit of operator-facing semantics, terminology collisions, technical leakage, false urgency, and missing semantic separation

**Auditor role:** Senior Staff Engineer / Enterprise SaaS UX Architect

---
## 1. Executive Summary

**Severity: HIGH — systemic, not localized.**

TenantPilot has a **pervasive semantic collision problem** where at least 5–7 distinct meaning axes are collapsed into the same small vocabulary of terms. The product uses ~12 overloaded words ("failed", "partial", "missing", "gaps", "unsupported", "stale", "blocked", "complete", "ready", "reference only", "metadata only", "needs attention") to communicate fundamentally different things across different domains.

**Why it matters for enterprise/MSP trust:**

1. **False alarm risk is real and frequent.** An operator scanning a baseline snapshot sees "Unsupported" (gray badge) and "Gaps present" (yellow badge) and cannot distinguish a product renderer limitation from a genuine governance gap. Every scan of every resource containing fallback-rendered policy types will surface these warnings, training operators to ignore them — which then masks real issues.

2. **The problem is structural, not cosmetic.** The badge system (`BadgeDomain`, 43 domains, ~500 case values) is architecturally sound and well-centralized. But it maps heterogeneous semantic axes onto a single severity/color channel (success/warning/danger/gray). The infrastructure is good; the taxonomy feeding it is wrong.

3. **The problem spans every major domain.** Operations, Baselines, Evidence, Findings, Reviews, Inventory, Restore, Onboarding, and Alerts all exhibit the same pattern: a single status badge or label is asked to communicate execution outcome + data quality + product support maturity + operator urgency simultaneously.

4. **The heaviest damage is in three areas:**
   - **Baseline Snapshots** — "Unsupported", "Reference only", "Partial", "Gaps present" all describe product/renderer limitations but read as data quality failures
   - **Restore Runs** — "Partial", "Manual required", "Dry run", "Completed with errors", "Skipped" create a 5-way ambiguity about what actually happened
   - **Evidence/Review Completeness** — "Partial", "Missing", "Stale" collapse three distinct axes (coverage, age, depth) into one badge

**Estimated impact:** ~60% of warning-colored badges in the product communicate something that is NOT a governance problem. This erodes the signal-to-noise ratio for the badges that actually matter.

---
## 2. Semantic Taxonomy Problems

The product needs to distinguish **at minimum 8 independent meaning axes**. Currently it conflates most of them.

### 2.1 Required Semantic Axes

| # | Axis | What it answers | Who cares |
|---|------|----------------|-----------|
| 1 | **Execution outcome** | Did the operation run succeed, fail, or partially complete? | Operator |
| 2 | **Data completeness** | Is the captured data set complete for the scope requested? | Operator |
| 3 | **Evidence depth / fidelity** | How much detail was captured per item (full settings vs. metadata envelope)? | Auditor / Compliance |
| 4 | **Product support maturity** | Does TenantPilot have a specialized renderer/handler for this policy type or is it using a generic fallback? | Product team (not operator) |
| 5 | **Governance deviation** | Is this finding, drift, or posture gap a real compliance/security concern? | Risk officer |
| 6 | **Publication readiness** | Can this review/report be published to stakeholders? | Review author |
| 7 | **Data freshness** | When was this data last updated, and is it still within acceptable thresholds? | Operator |
| 8 | **Operator actionability** | Does this state require operator intervention, or is it informational? | Operator |
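
If the badge layer carried these axes as separate fields instead of one overloaded status value, each axis could render to its own badge with its own color policy. A minimal sketch of the shape, in Python for brevity; every name below is hypothetical, not an existing product enum:

```python
from dataclasses import dataclass
from enum import Enum


class ExecutionOutcome(Enum):   # Axis 1
    SUCCEEDED = "succeeded"
    FAILED = "failed"


class SupportMaturity(Enum):    # Axis 4
    SPECIALIZED = "specialized"
    GENERIC_FALLBACK = "generic_fallback"


class Actionability(Enum):      # Axis 8
    ACTION_REQUIRED = "action_required"
    INFORMATIONAL = "informational"


@dataclass(frozen=True)
class ItemState:
    outcome: ExecutionOutcome
    support: SupportMaturity
    actionability: Actionability

    def badges(self) -> list[tuple[str, str]]:
        """Each axis maps to its own badge; colors never cross axes."""
        out: list[tuple[str, str]] = []
        if self.outcome is ExecutionOutcome.FAILED:
            out.append(("Failed", "danger"))
        if self.support is SupportMaturity.GENERIC_FALLBACK:
            # Product maturity is neutral information, never a warning.
            out.append(("Generic view", "gray"))
        if self.actionability is Actionability.ACTION_REQUIRED:
            out.append(("Action required", "warning"))
        return out
```

Under this shape a fallback renderer can never escalate severity: the product-maturity axis simply has no warning or danger states to reach.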

### 2.2 Where They Are Currently Conflated

| Example | Axes mixed | Effect |
|---------|-----------|--------|
| Baseline snapshot with fallback renderer shows "Unsupported" badge + "Gaps present" | Axes 4 + 5 | Product support maturity presented as governance deviation |
| Evidence completeness shows "Missing" (red/danger) when no findings exist yet | Axes 2 + 8 | Completeness treated as urgency |
| Restore run shows "Partial" (yellow) | Axes 1 + 2 | Execution outcome mixed with scope completion |
| Review completeness shows "Stale" | Axes 2 + 7 | Completeness mixed with freshness |
| Operation outcome "Partially succeeded" | Axes 1 + 2 | Execution result mixed with item-level coverage |
| "Blocked" used for both operation outcomes and verification reports | Axes 1 + 8 | Execution state mixed with actionability |
| "Failed" used for 10+ different badge domains | Axes 1 + 4 + 5 | Everything that isn't success collapses to "failed" |

---

## 3. Findings by Domain

### 3.1 Operations / Runs

**Affected components:**

- `OperationRunOutcome` enum + `OperationRunOutcomeBadge`
- `OperationUxPresenter` (terminal notifications)
- `SummaryCountsNormalizer` (run summary line)
- `RunFailureSanitizer` (failure message formatting)
- `OperationRunResource` (table, infolist)
- `OperationRunQueued` / `OperationRunCompleted` notifications

**Current problematic states:**

| State | Badge | Problem |
|-------|-------|---------|
| `PartiallySucceeded` | "Partially succeeded" (yellow) | Does not explain: how many items succeeded vs. failed, whether core operation goals were met, or what items need attention. Operator cannot distinguish "99 of 100 policies synced" from "1 of 100 policies synced." |
| `Blocked` | "Blocked" (yellow) | Same color as `PartiallySucceeded`. Does not explain: blocked by what (RBAC? provider? gate?), or what to do next. |
| `Failed` | "Failed" (red) | Reasonable for execution failure, but failure messages are truncated to 140 chars and sanitized to internal reason codes (`RateLimited`, `ProviderAuthFailed`). Operator gets "Failed. Rate limited." with no retry guidance. |
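
The "99 of 100 vs. 1 of 100" ambiguity disappears if the outcome label is built from item counts the run already tracks. A hedged sketch; `partial_outcome_label` is a hypothetical helper, and in the product the counts would presumably come from the `SummaryCountsNormalizer` data:

```python
def partial_outcome_label(succeeded: int, failed: int, noun: str = "policies") -> str:
    """Build an outcome label that carries the item breakdown."""
    total = succeeded + failed
    if failed == 0:
        return f"Succeeded: all {total} {noun} synced"
    if succeeded == 0:
        return f"Failed: 0 of {total} {noun} synced"
    return f"Partially succeeded: {succeeded} of {total} {noun} synced ({failed} failed)"


print(partial_outcome_label(99, 1))
# Partially succeeded: 99 of 100 policies synced (1 failed)
```

The same counts can then drive badge severity (for example, warning only when the failed share crosses a threshold), instead of one fixed yellow for every partial run.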

**Notification problems:**

- "Completed with warnings" — what warnings? Where are they?
- "Execution was blocked." — no explanation or recovery action
- Summary counts show raw keys like `errors_recorded`, `posture_score`, `report_deduped` without context
- Queued notification says "Running in the background" — no ETA or queue position

**Classification:** Systemic (affects all run types across all domains)

**Operator impact:** HIGH — every failed or partial operation leaves the operator uncertain about actual state

### 3.2 Baselines / Snapshots / Compare

**Affected components:**

- `FidelityState` enum (Full / Partial / ReferenceOnly / Unsupported)
- `BaselineSnapshotFidelityBadge`
- `BaselineSnapshotGapStatusBadge`
- `BaselineSnapshotPresenter` + `RenderedSnapshot*` DTOs
- `GapSummary` class
- `FallbackSnapshotTypeRenderer`
- `SnapshotTypeRendererRegistry`
- `baseline-snapshot-summary-table.blade.php`
- `baseline-snapshot-technical-detail.blade.php`
- `BaselineSnapshotResource`, `BaselineProfileResource`

**This is the single worst-affected domain.** Four separate problems compound:

**Problem 1: FidelityState mixes product maturity with data quality**

| FidelityState | What it actually means | What operator reads |
|--------------|----------------------|-------------------|
| `Full` | Specialized renderer produced structured output | "Data is complete" |
| `Partial` | Some items rendered fully, others fell back | "Data is partially missing" |
| `ReferenceOnly` | Only metadata envelope captured, no settings payload | "Only a reference exists" (alarming) |
| `Unsupported` | No specialized renderer exists; fallback used | "This policy type is broken" |

The `coverageHint` texts are better but still technical:

- "Mixed evidence fidelity across this group."
- "Metadata-only evidence is available."
- "Fallback metadata rendering is being used."

**Problem 2: "Gaps" conflates multiple causes**

`GapSummary` messages include:

- "Metadata-only evidence was captured for this item." — this is a capture depth choice, not a gap
- "A fallback renderer is being used for this item." — this is a product maturity issue, not a data gap
- Actual capture errors (e.g., Graph returned an error) — these ARE real gaps

All three are counted as "gaps" and surfaced under the same "Gaps present" (yellow) badge. An operator cannot distinguish "we chose to capture light" from "the API returned an error" from "we don't have a renderer for this type yet."
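
Keeping the three causes apart at the source would let only real capture errors drive the warning channel, while the other two collapse into the neutral "Coverage notes" treatment proposed in section 4.1. A sketch with hypothetical names, not the actual `GapSummary` structure:

```python
from enum import Enum


class GapCause(Enum):
    CAPTURE_ERROR = "capture_error"          # the API/provider returned an error
    METADATA_ONLY = "metadata_only"          # deliberate light capture
    FALLBACK_RENDERER = "fallback_renderer"  # product maturity, not data loss


# Only genuine capture errors should ever color the badge yellow/red.
OPERATOR_WARNING_CAUSES = {GapCause.CAPTURE_ERROR}


def gap_badge(causes: set[GapCause]) -> tuple[str, str]:
    """Map the set of gap causes for a snapshot to a (label, color) badge."""
    if causes & OPERATOR_WARNING_CAUSES:
        return ("Capture gaps present", "warning")
    if causes:
        return ("Coverage notes", "gray")  # informational, never alarming
    return ("Complete", "success")
```

With this split, a snapshot whose only "gaps" are fallback renderers stays neutral, and "Capture gaps present" regains its meaning.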

**Problem 3: "Captured with gaps" state label**

`BaselineSnapshotPresenter.stateLabel` returns either "Complete" or "Captured with gaps". This is the top-level snapshot state that operators see first. If any policy type uses a fallback renderer, the entire snapshot is labeled "Captured with gaps" — even if every single policy was successfully captured and the only "gap" is that the product's renderer coverage hasn't been built out yet.

**Problem 4: Technical detail exposure**

The `baseline-snapshot-technical-detail.blade.php` template includes the comment: "Technical payloads are secondary on purpose. Use them for debugging capture fidelity and renderer fallbacks." This is good intent, but the primary view still surfaces "Unsupported" and "Gaps present" badges that carry the same connotation.

**Classification:** P0 — actively damages operator trust

**Operator impact:** CRITICAL — every baseline snapshot containing non-specialized policy types (which is most of them in a real tenant) will show yellow warnings that are not governance problems

### 3.3 Evidence / Evidence Snapshots

**Affected components:**

- `EvidenceCompletenessState` enum (Complete / Partial / Missing / Stale)
- `EvidenceCompletenessBadge`
- `EvidenceSnapshotStatus` enum + `EvidenceSnapshotStatusBadge`
- `EvidenceCompletenessEvaluator`
- `EvidenceSnapshotResource`
- Evidence source classes (`FindingsSummarySource`, `OperationsSummarySource`)

**Current problematic states:**

| State | Badge Color | Problem |
|-------|------------|---------|
| `Missing` | Red/danger | Used when a required evidence dimension has no data. But "missing" could mean: never collected, collection failed, data source empty (no findings exist yet = no evidence to collect). A new tenant with zero findings will show "Missing" in red for the findings dimension, which reads as a failure. |
| `Partial` | Yellow/warning | Means some dimensions are incomplete. Does not explain which or why. |
| `Stale` | Gray | Evidence collected > 30 days ago. Gray color suggests it's fine / archived, but stale evidence is actually operationally risky. Wrong severity mapping. |

**Source-level problems:**

- `FindingsSummarySource`: returns `Complete` if findings exist, `Missing` otherwise. A tenant with no findings is not "missing evidence" — it has zero findings, which is valid.
- `OperationsSummarySource`: returns `Complete` if operations ran in the last 30 days, `Missing` otherwise. A stable tenant with no recent operations is not "missing" — it's stable.
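
Both sources could distinguish "collection ran and the source is genuinely empty" from "collection never ran or failed". A sketch with hypothetical names; only the latter state deserves a red badge:

```python
from enum import Enum


class Completeness(Enum):
    COMPLETE = "complete"            # collected, dimension populated
    EMPTY_VALID = "empty_valid"      # collected; the source truly has nothing
    NOT_COLLECTED = "not_collected"  # collection never ran, or it failed


def findings_completeness(collection_ran: bool, findings_count: int) -> Completeness:
    """Evaluate the findings dimension without treating 'empty' as 'broken'."""
    if not collection_ran:
        return Completeness.NOT_COLLECTED  # the only red-worthy state
    if findings_count == 0:
        return Completeness.EMPTY_VALID    # "zero findings" is a healthy result
    return Completeness.COMPLETE
```

`EMPTY_VALID` would render as a neutral "No findings recorded" rather than a red "Missing", which removes the false alarm for new and stable tenants.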

**Classification:** P0 — "Missing" (red) for valid empty states is a false alarm

**Operator impact:** HIGH — new tenants or stable tenants will always show red/yellow completeness badges that are not problems

### 3.4 Findings / Exceptions / Risk Acceptance

**Affected components:**

- `FindingResource` (form, infolist, table)
- `FindingExceptionResource`
- `FindingStatusBadge`, `FindingSeverityBadge`
- `FindingExceptionStatusBadge`
- `FindingRiskGovernanceValidityBadge`

**Current problematic states:**

| Term | Location | Problem |
|------|----------|---------|
| "Validity" | Finding infolist | Label for governance validity badge. Values include `missing_support`, which reads as a product defect, not a governance state. |
| "Missing support" | `FindingRiskGovernanceValidityBadge` | Means "no exception record exists for this risk-accepted finding." This is a governance gap, not a product support issue. The label suggests the product is broken. |
| "Exception status" | Finding infolist section | Section heading uses "exception" (internal concept) instead of operator-facing language like "Risk approval status" |
| "Risk governance" | Finding infolist section | Appropriate term but not explained; grouped with "Validity", which is confusing |

**Drift diff unavailable messages leak technical language:**

- "RBAC evidence unavailable — normalized role definition evidence is missing."
- "Diff unavailable — missing baseline policy version reference."
- "Diff unavailable — missing current policy version reference."

These are accurate but incomprehensible to operators. They should say what happened and what action to take.

**Classification:** P1 — confusing, but less frequently seen than baselines/evidence

**Operator impact:** MEDIUM — affects finding detail pages, which are high-attention moments

### 3.5 Tenant Reviews / Reports / Publication

**Affected components:**

- `TenantReviewStatus` enum + `TenantReviewStatusBadge`
- `TenantReviewCompletenessState` enum + badge
- `TenantReviewReadinessGate`
- `TenantReviewComposer` / `TenantReviewSectionFactory`
- `ReviewPackResource`, `TenantReviewResource`

**Current problematic states:**

| State | Problem |
|-------|---------|
| "Partial" completeness | Same conflation as Evidence: partial coverage vs. partial depth vs. partial freshness all collapsed |
| "Stale" completeness | A review section based on old evidence should be orange/yellow (action needed), not gray (archived/inert) |
| "{Section} is stale and must be refreshed before publication" | Accurate blocker message, but no guidance on HOW to refresh |
| "{Section} is missing" | Confuses "section text hasn't been generated yet" with "underlying data doesn't exist" |
| "Anchored evidence snapshot" | Used in form helper text and empty states — jargon for "the snapshot this review is based on" |

**Publication readiness gate is well-designed** — it correctly separates blockers from completeness. But the messages surfaced to operators still use the collapsed vocabulary.
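
The gate's blocker-versus-completeness separation can be made concrete roughly like this (a sketch, not the actual `TenantReviewReadinessGate` implementation; the threshold and messages are illustrative). Note that each blocker message names its remedy, which addresses the "no guidance on HOW to refresh" problem:

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    name: str
    generated: bool
    age_days: int


@dataclass
class Readiness:
    blockers: list[str] = field(default_factory=list)  # must be cleared to publish
    notes: list[str] = field(default_factory=list)     # informational only


def evaluate(sections: list[Section], max_age_days: int = 30) -> Readiness:
    """Separate publication blockers from informational completeness notes."""
    result = Readiness()
    for s in sections:
        if not s.generated:
            result.blockers.append(f"{s.name}: not generated yet. Run section generation.")
        elif s.age_days > max_age_days:
            result.blockers.append(
                f"{s.name}: evidence is {s.age_days} days old. "
                "Refresh the evidence snapshot, then regenerate this section."
            )
        else:
            result.notes.append(f"{s.name}: ready")
    return result
```

A review is publishable exactly when `blockers` is empty; the notes never block and never use warning colors.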

**Highlight bullets are good:** "3 open risks from 12 tracked findings" is clear operator language. The recommended next actions are also well-written. This domain is ahead of baselines/evidence in semantic clarity.

**Classification:** P1 — partially solved, but completeness/freshness conflation remains

**Operator impact:** MEDIUM — review authors are typically more sophisticated users

### 3.6 Restore Runs

**Affected components:**

- `RestoreRunStatus` enum + badge (13 states including legacy)
- `RestoreResultStatusBadge` (7 states)
- `RestorePreviewDecisionBadge` (5 states)
- `RestoreCheckSeverityBadge`
- `RestoreRunResource` (form, infolist)
- `restore-results.blade.php`
- `restore-run-checks.blade.php`

**This is the second-worst domain for semantic confusion.** The restore workflow has:

- **13 lifecycle states** (most products have 4–6) — including legacy states that are still displayable
- **7 result statuses** per item — three of which are yellow/warning with different meanings
- **5 preview decisions** — including "Created copy" (why?) and "Skipped" (intentional or forced?)

**Specific problems:**

| State | Badge | Problem |
|-------|-------|---------|
| `Partial` (run-level) | Yellow | "Some items restored; some failed." But which? How many? Is the tenant in a consistent state? |
| `CompletedWithErrors` | Yellow | Legacy state. Does "completed" mean all items were attempted, or all succeeded and errors are secondary? |
| `Partial` (item-level) | Yellow | Means something different from run-level partial — "item partially applied" |
| `Manual required` | Yellow | What manual action? Where? By whom? |
| `Dry run` | Blue/info | Could be confused with success. "Applied" is green, "Dry run" is blue — but the operation did NOT apply. |
| `Skipped` | Yellow | Intentional skip (user chose to exclude) vs. forced skip (precondition failed) conflated |
| `Created copy` | Yellow | Implies a naming conflict was resolved by creating a copy. Warning color suggests this is bad, but it might be expected behavior. |

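
The "Skipped" conflation in particular is cheap to fix if the skip reason is recorded at decision time rather than reconstructed afterwards. A sketch with hypothetical names:

```python
from enum import Enum


class SkipReason(Enum):
    EXCLUDED_BY_OPERATOR = "excluded_by_operator"  # intentional: left out of scope
    PRECONDITION_FAILED = "precondition_failed"    # forced: e.g. an unmapped group


def skipped_badge(reason: SkipReason) -> tuple[str, str]:
    """Render the two kinds of skip with different labels and severities."""
    if reason is SkipReason.EXCLUDED_BY_OPERATOR:
        return ("Excluded (by choice)", "gray")           # expected, neutral
    return ("Skipped: precondition not met", "warning")   # needs attention
```

The same pattern generalizes to "Created copy" (expected conflict resolution vs. surprise) and "Manual required" (which should carry the specific manual step as part of the recorded reason).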
**Form helper text is especially problematic:**

- "Preview-only types stay in dry-run" — operators don't know what "preview-only types" are
- "Include foundations (scope tags, assignment filters) with policies to re-map IDs" — three jargon terms in one sentence
- "Paste the target Entra ID group Object ID (GUID)" — technical protocol language

**Restore check display:**

- "Blocking" (red) — operator may think the system is frozen, not that a validation check needs attention
- "Unmapped groups" — no explanation of what mapping means or how to do it

**Classification:** P0 — restore is a high-risk, high-attention operation where confusion has real consequences

**Operator impact:** CRITICAL — incorrect interpretation of restore state can lead to incorrect remediation decisions

### 3.7 Inventory / Coverage / Sync / Provider Health

**Affected components:**

- `InventoryKpiBadges` / `InventoryKpiHeader` widget
- `InventoryCoverage` page
- `InventorySyncService`
- `PolicySnapshotModeBadge` ("Full" / "Metadata only")
- `ProviderConnectionStatusBadge`, `ProviderConnectionHealthBadge`
- `PolicyResource` (sync action, snapshot mode helper)

**Current problematic states:**

| State | Problem |
|-------|---------|
| "Partial" items (yellow) in KPI | Means "preview-only restore mode" — a product support maturity fact, not a data quality issue. |
| "Risk" items (red) in KPI | Correct usage — these are genuinely higher-risk policy types. |
| "Metadata only" snapshot mode (yellow + warning icon) | Technical capture mode presented as if something is wrong. For some policy types, metadata-only capture IS the correct behavior. |
| Sync error: "Inventory sync reported rate limited." | Mechanically constructed from the error code (`str_replace('_', ' ', $errorCode)`). No retry guidance. |
| Provider "Needs consent" | No context about what consent, where to grant it, or who can grant it. |
| Provider "Degraded" | No detail on what is degraded or whether operations are affected. |
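
Instead of mechanically reformatting the error code, a small catalog could map each internal code to operator wording plus explicit retry guidance. A sketch; the codes and texts here are illustrative, not the actual `InventorySyncService` codes:

```python
# Hypothetical catalog keyed by internal error code.
SYNC_ERROR_CATALOG: dict[str, tuple[str, str]] = {
    "rate_limited": (
        "The provider is throttling requests.",
        "Safe to retry: the next scheduled sync will pick this up automatically.",
    ),
    "provider_auth_failed": (
        "The tenant connection could not authenticate.",
        "Re-verify the tenant connection (consent may need to be granted again), then retry.",
    ),
}


def sync_error_message(code: str) -> str:
    """Turn an internal error code into operator wording with a next step."""
    what, next_step = SYNC_ERROR_CATALOG.get(
        code,
        ("Inventory sync hit an unexpected error.", "Retry once; if it persists, contact support."),
    )
    return f"{what} {next_step}"
```

The catalog also gives "Needs consent" and "Degraded" a natural home: each code carries its own "where to go and who can act" sentence.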
**Graph error leakage in PolicyResource:**

- "Graph returned {status} for this policy type. Only local metadata was saved; settings and restore are unavailable until Graph works again."
- This is accurate but uses internal API naming ("Graph") and makes the scope unclear (is restore blocked for one policy or all policies?).

**Classification:** P1 — inventory is read-frequently and yellow badges train operators to ignore warnings

**Operator impact:** MEDIUM-HIGH — scanning inventory with perpetual yellow badges erodes warning credibility

### 3.8 Onboarding / Verification

**Affected components:**

- `VerificationReportOverall` + badge ("Ready" / "Needs attention" / "Blocked")
- `VerificationCheckStatus` + badge ("Pass" / "Fail" / "Warn" / "Skip")
- `ManagedTenantOnboardingVerificationStatusBadge`
- `VerificationAssistViewModelBuilder`
- `ManagedTenantOnboardingWizard` (notifications, modals)
- `verification-required-permissions-assist.blade.php`

**Current problematic states:**

| State | Problem |
|-------|---------|
| "Needs attention" | Non-specific. Used for both missing delegated permissions and stale data. |
| "Blocked" | Means "missing required application permissions" but reads as "system is blocked." |
| "Fail" check status | Ambiguous: check execution failed, or check assertion failed? Both are possible meanings. |
| Wizard notifications: "Tenant not available", "Draft is not resumable" | No explanation of why or what to do. |
| "Verification required" | No detail on what verification is needed or estimated effort. |

**The verification assist panel is well-designed** — it categorizes missing permissions by application vs. delegated and shows feature impact. This is a positive example of good semantic separation that could be replicated elsewhere.

**Classification:** P2 — onboarding is a one-time flow with guided steps

**Operator impact:** LOW-MEDIUM — operators encounter this during setup, not daily operations

### 3.9 Alerts / Notifications

**Affected components:**

- `EmailAlertNotification`
- `AlertDeliveryStatusBadge`
- `AlertDestinationLastTestStatusBadge`

**Current problematic states:**

| State | Problem |
|-------|---------|
| Email body includes "Delivery ID: {id}" and "Event type: {type}" | Internal metadata exposed to operators without context. |
| "Failed" delivery status (red) | No distinction between transient (retryable) and permanent failure. |
| Queued notification: "Running in the background." | No ETA, no queue position, no way to check progress until the terminal notification arrives. |

**Classification:** P2 — alerts are less frequently viewed

**Operator impact:** LOW — these are supplementary signals, not primary operator workflows

---

## 4. Cross-Cutting Terminology Map

### 4.1 Most Problematic Terms

| Term | Appearances | Meanings it carries | Why dangerous | Should belong to |
|------|-------------|-------------------|---------------|-----------------|
| **"Failed"** | OperationRunOutcome, RestoreRunStatus, RestoreResultStatus, ReviewPackStatus, EvidenceSnapshotStatus, AlertDeliveryStatus, TenantReviewStatus, AlertDestinationLastTestStatus, BackupSetStatus, TenantRbacStatus, AuditOutcome | (1) Operation execution failure, (2) validation check assertion failure, (3) generation/composition failure, (4) delivery failure, (5) catch-all for non-success | Undifferentiated — operator cannot assess severity, retryability, or scope without drilling in. Red badge for all cases regardless of impact. | Axis 1 (execution outcome) only. Other uses need distinct terms: "check failed" → "not met"; delivery failed → "undeliverable"; generation failed → "generation error" |
| **"Partial"** | OperationRunOutcome (PartiallySucceeded), RestoreRunStatus, RestoreResultStatus, EvidenceCompletenessState, TenantReviewCompletenessState, FidelityState, BackupSetStatus, AuditOutcome, InventoryKpiBadges | (1) Some items in a batch succeeded, others didn't, (2) evidence depth is limited, (3) capture coverage is incomplete, (4) product renderer coverage is partial, (5) restore mode is preview-only | 8+ different meanings across the product. Yellow badge always. An operator reading "Partial" has no way to know which kind without drilling into details. | Axis-specific terms: "N of M items processed" (Axis 1), "Limited detail" (Axis 3), "Incomplete coverage" (Axis 2), "Preview mode" (Axis 4) |
| **"Missing"** | EvidenceCompletenessState, TenantReviewCompletenessState, TenantPermissionStatus, ReferenceResolutionState (DeletedOrMissing), FindingRiskGovernanceValidity (missing_support) | (1) Evidence dimension has no data, (2) required review section not generated, (3) permission not granted, (4) referenced record deleted/not found, (5) no exception exists for risk-accepted finding | "Missing" in red across all uses. But "zero findings" is not "missing findings" — it's an empty valid state. | Separate into: "Not collected" (evidence), "Not generated" (sections), "Not granted" (permissions), "Not found" (references), "No exception" (governance) |
| **"Gaps"** | BaselineSnapshotGapStatus, GapSummary, BaselineSnapshotPresenter ("Captured with gaps") | (1) Fallback renderer was used, (2) metadata-only capture mode, (3) actual capture error, (4) missing coverage | Merges product limitation with data quality problem. Every snapshot with unsupported types shows "Gaps present" perpetually. | "Coverage notes" or "Capture notes" — with sub-categories for product limitation vs. actual data gap |
| **"Unsupported"** | FidelityState, ReferenceClass | (1) No specialized renderer for this policy type (product maturity), (2) reference class not recognized | Reads as "this policy type is broken" when it actually means "we haven't built a dedicated viewer yet, but the data is captured." | "Generic view" or "Standard rendering" — product support state should never be operator-alarming |
| **"Stale"** | EvidenceCompletenessState, TenantReviewCompletenessState, TenantRbacStatus | (1) Evidence older than 30 days, (2) review section based on old snapshot, (3) RBAC check results outdated | Gray badge color (passive) for freshness issues that may require action. Conflates with completeness (same enum). | Separate axis: "Last updated: 45 days ago" with age-based coloring. Not part of completeness axis. |
| **"Blocked"** | OperationRunOutcome, VerificationReportOverall, ExecutionDenialClass, RestoreCheckSeverity | (1) Operation prevented by gate/RBAC, (2) missing required permissions, (3) pre-flight check preventing execution | Reads as "system is frozen" regardless of cause. Yellow for operation outcome, red for verification, red for restore checks. | "Prevented" or "Requires action" with specific cause shown |
| **"Reference only"** | FidelityState | Only metadata envelope captured, no settings payload | Reads as "we only have a pointer, not the actual data" — technically correct but alarming. For some policy types, this IS the full capture because Graph doesn't expose settings. | "Metadata captured" or "Summary view" |
| **"Metadata only"** | PolicySnapshotMode, restore-results.blade.php | (1) Capture mode was metadata-only, (2) restore created policy shell without settings | Two different uses. In capture, it's a valid mode. In restore, it means manual work is needed. | Separate: "Summary capture" (mode) vs. "Shell created — configure settings manually" (restore result) |
| **"Ready"** | TenantReviewStatus, VerificationReportOverall, ReviewPackStatus, OnboardingLifecycleState (ReadyForActivation) | (1) Review cleared publication blockers, (2) permissions verified, (3) review pack downloadable, (4) onboarding ready for activation | Overloaded but less dangerous than others — context usually makes it clear. | Acceptable as-is, but should never be used alone without domain context |
| **"Complete"** | EvidenceCompletenessState, TenantReviewCompletenessState, OnboardingDraftStatus/Stage/Lifecycle, OperationRunStatus (Completed) | (1) All evidence dimensions collected, (2) all review sections generated, (3) onboarding finished, (4) operation execution finished | "Completed" operation does not mean "succeeded" — the outcome must be checked separately. | Keep for lifecycle completion. Don't use for quality — use "Full coverage" instead. |
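
The "Last updated: 45 days ago" treatment proposed for "Stale" in the table above amounts to a dedicated freshness axis with age-based severity, kept out of the completeness enum entirely. A sketch; the thresholds are illustrative:

```python
def freshness_badge(age_days: int, warn_after: int = 30, danger_after: int = 90) -> tuple[str, str]:
    """Render freshness as its own badge with age-driven severity."""
    label = f"Last updated: {age_days} days ago"
    if age_days > danger_after:
        return (label, "danger")
    if age_days > warn_after:
        return (label, "warning")
    return (label, "success")
```

Because the label always states the age, the badge stays informative even when the color says nothing is wrong, and "Stale" never needs to share an enum with coverage states again.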
### 4.2 Terms That Are Correctly Used

| Term | Domain | Assessment |
|------|--------|-----------|
| "Drift" | Findings | Clear, domain-appropriate |
| "Superseded" | Reviews, Snapshots | Correct lifecycle term |
| "Archived" | Multiple | Consistent and clear |
| "New" / "Triaged" / "In progress" / "Resolved" | Findings | Standard issue lifecycle |
| "Risk accepted" | Findings | Precise governance term |
| "Applied" / "Mapped" | Restore results | Clear restore outcomes |
| "Pass" / "Warn" | Verification checks | Appropriate for boolean checks |

---

## 5. Priority Ranking

### P0 — Actively damages operator trust / causes false alarms

| # | Issue | Domain | Description |
|---|-------|--------|-------------|
| P0-1 | **FidelityState "Unsupported" displayed as a badge** | Baselines | Every snapshot with a fallback-rendered policy type shows "Unsupported" (gray) and "Gaps present" (yellow). This is a product maturity fact, not operator-actionable. |
| P0-2 | **"Captured with gaps" top-level snapshot state** | Baselines | One fallback renderer taints the entire snapshot label. Most snapshots will show this. |
| P0-3 | **Evidence "Missing" (red) for valid empty states** | Evidence | A new tenant with zero findings shows a red "Missing" badge. Zero findings is a valid state, not missing evidence. |
| P0-4 | **Restore "Partial" ambiguity at run and item level** | Restore | Operator cannot determine scope of partial success or whether the tenant is in a consistent state. |
| P0-5 | **OperationRunOutcome "Partially succeeded" with no item breakdown** | Operations | Yellow badge with no way to assess impact without drilling into run details. |

### P1 — Strongly confusing, should be fixed soon

| # | Issue | Domain | Description |
|---|-------|--------|-------------|
| P1-1 | **"Stale" gray badge for time-sensitive freshness** | Evidence, Reviews | Gray implies inert/archived; stale data requires action. Wrong color mapping. |
| P1-2 | **"Blocked" with no explanation across 4 domains** | Operations, Verification, Restore, Execution | Operator reads "blocked" and cannot determine cause or action. |
| P1-3 | **"Metadata only" in snapshot mode (yellow warning)** | Inventory, Policies | Valid capture mode presented as if something is wrong. |
| P1-4 | **Restore "Manual required" without specifying what** | Restore | Yellow badge with no link to instructions or affected items. |
| P1-5 | **"Gaps present" conflates 3+ causes** | Baselines | Renderer fallback, metadata-only capture, and actual errors all counted as "gaps." |
| P1-6 | **"Missing support" in governance validity** | Findings | Means "no exception exists" but reads as "product doesn't support this." |
| P1-7 | **Technical diff-unavailable messages** | Findings | "normalized role definition evidence is missing" is incomprehensible to operators. |
| P1-8 | **Graph error codes in restore results** | Restore | "Graph error: {message} Code: {code}" shown directly to operators. |
| P1-9 | **"Partial" in inventory KPI for preview-only types** | Inventory | Product maturity fact displayed as data quality warning. |

### P2 — Quality issue, not urgent

| # | Issue | Domain | Description |
|---|-------|--------|-------------|
| P2-1 | **Notification failure messages truncated to 140 chars** | Operations | Important context may be lost. Sanitized reason codes are still technical. |
| P2-2 | **"Needs attention" without specifying what** | Onboarding, Verification | Non-specific label across multiple domains. |
| P2-3 | **"Dry run" in blue/info for restore results** | Restore | Could be confused with success. Should be more clearly distinct. |
| P2-4 | **"Anchored evidence snapshot" jargon** | Reviews | Used in form helpers and empty states without explanation. |
| P2-5 | **Empty state messages lack next-step guidance** | Multiple | Most empty states describe what the feature does, not how to start. |
| P2-6 | **Queued notifications lack progress indicators** | Operations | "Running in the background" with no ETA or monitoring guidance. |
| P2-7 | **Email alerts include raw internal IDs** | Alerts | "Delivery ID: {id}" and "Event type: {type}" shown to operators. |

### P3 — Polish / later cleanup

| # | Issue | Domain | Description |
|---|-------|--------|-------------|
| P3-1 | **"Archiving is permanent in v1"** | Baselines | Version reference in modal text. |
| P3-2 | **"foundations" term unexplained** | Baselines, Restore | Internal grouping concept used in helper text. |
| P3-3 | **"Skipped" in restore without intent distinction** | Restore | Intentional vs. forced skip use the same badge. |
| P3-4 | **Summary count keys like "report_deduped"** | Operations | Internal metric name in operator-facing summary. |

---
## 6. Recommended Target Model

### 6.1 State Axes — Mandatory Separation

The product must maintain **independent** state tracking for these axes. They must never be flattened into a single badge or single enum:

| Axis | Scope | Operator-facing? | Values |
|------|-------|-----------------|--------|
| **Execution lifecycle** | Per operation run | Yes | Queued → Running → Completed |
| **Execution outcome** | Per operation run | Yes | Succeeded / Completed with issues / Failed / Prevented |
| **Item-level result** | Per item in a run | Yes (on drill-in) | Applied / Skipped (with reason) / Failed (with reason) |
| **Data coverage** | Per evidence/review dimension | Yes | Full / Incomplete / Not collected |
| **Evidence depth** | Per captured item | Yes (secondary) | Detailed / Summary / Metadata envelope |
| **Product support tier** | Per policy type | Diagnostics only | Dedicated renderer / Standard renderer |
| **Data freshness** | Per evidence item/dimension | Yes (separate from coverage) | Current (< N days) / Aging / Overdue |
| **Governance status** | Per finding/exception | Yes | Valid / Expiring / Expired / None |
| **Publication readiness** | Per review | Yes | Publishable / Blocked (with reasons) |
| **Operator actionability** | Cross-cutting | Yes (determines badge urgency color) | Action required / Informational / No action needed |

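To make the separation concrete, the axes above can be sketched as independent narrow types that a display model carries side by side. This is a hypothetical illustration in TypeScript (the product itself is PHP/Laravel, and all names here — `RunDisplayModel`, the axis unions — are invented for this sketch), showing the shape the taxonomy spec would mandate rather than any existing implementation:

```typescript
// Each axis is its own narrow union type. None of them may be merged
// into a single "status" enum at any layer.
type ExecutionLifecycle = "queued" | "running" | "completed";
type ExecutionOutcome = "succeeded" | "completed_with_issues" | "failed" | "prevented";
type DataCoverage = "full" | "incomplete" | "not_collected";
type DataFreshness = "current" | "aging" | "overdue";
type Actionability = "action_required" | "informational" | "no_action_needed";

// A run's display model keeps the axes independent; a badge renderer may
// read several axes, but never writes a combined state back.
interface RunDisplayModel {
  lifecycle: ExecutionLifecycle;
  outcome?: ExecutionOutcome; // present only once lifecycle is "completed"
  coverage: DataCoverage;
  freshness: DataFreshness;
  actionability: Actionability; // drives badge urgency color, nothing else
}

// Example: a finished, successful run whose evidence is complete but aging.
// Aging freshness alone does not force "action_required".
const run: RunDisplayModel = {
  lifecycle: "completed",
  outcome: "succeeded",
  coverage: "full",
  freshness: "aging",
  actionability: "informational",
};

console.log(run.actionability); // "informational"
```

The point of the sketch is that "aging" freshness and "succeeded" outcome coexist without either overwriting the other, which is exactly what the current flattened badges cannot express.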
### 6.2 Diagnostic-Only Terms (Never in Primary Operator View)

These terms should appear ONLY in expandable/collapsible diagnostic panels, never in primary badges or summary text:

- "Fallback renderer" → show "Standard rendering" in diagnostic, nothing in primary
- "Metadata only" (as capture detail) → show "Summary view" in primary, mode detail in diagnostic
- "Renderer registry" → never shown
- "FidelityState" → replaced by "Evidence depth" in operator language
- "Gap summary" → replaced by "Coverage notes"
- Raw Graph error codes → replaced by categorized operator messages
- Delivery IDs, event type codes → never in email body
- `errors_recorded`, `report_deduped` → never in summary line; use operator terms

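One way the taxonomy spec could enforce this list is a single term dictionary that every presenter consults before emitting operator-facing text. The following TypeScript sketch is hypothetical (the map contents come from the list above; the function name `toOperatorLabel` and the `null`-means-suppress convention are assumptions of this sketch, not existing product API):

```typescript
// Internal vocabulary on the left, operator-facing replacement on the right.
// null means "suppress entirely": the term must never reach the operator.
const operatorTerm: Record<string, string | null> = {
  "fallback renderer": "Standard rendering", // diagnostic panel only
  "metadata only": "Summary view",
  "renderer registry": null,                 // never shown
  "FidelityState": "Evidence depth",
  "gap summary": "Coverage notes",
  "report_deduped": "Duplicate reports merged",
};

// Unknown terms pass through unchanged so new internal vocabulary fails
// loudly in review rather than silently leaking a mistranslation.
function toOperatorLabel(internal: string): string | null {
  return internal in operatorTerm ? operatorTerm[internal] : internal;
}

console.log(toOperatorLabel("FidelityState"));     // "Evidence depth"
console.log(toOperatorLabel("renderer registry")); // null
```

Centralizing the dictionary also gives the taxonomy spec a single reviewable artifact: adding an operator-facing string means adding a dictionary entry, not editing scattered badge code.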
### 6.3 Warning vs. Info vs. Error — Decision Rules

| Color | When to use | Never use for |
|-------|------------|--------------|
| **Red (danger)** | Execution failed, governance violation active, permission denied, data loss risk | Product support limitations, empty valid states, expired but inactive records |
| **Yellow (warning)** | Operator action recommended, approaching threshold, mixed outcome | Product maturity facts, informational states, things the operator cannot change |
| **Blue (info)** | In progress, background activity, informational detail | States that could be confused with success outcomes |
| **Green (success)** | Operation succeeded, check passed, data complete | States that could change (success should be stable) |
| **Gray (neutral)** | Archived, not applicable, skipped | Stale data (should be yellow/orange), degraded states that need attention |

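The decision rules above imply a strict precedence: danger beats warning, warning beats in-progress, in-progress beats success, and everything else is neutral. A minimal TypeScript sketch of such a pure decision function follows — the `BadgeInput` flags and `badgeColor` name are hypothetical, chosen to mirror the table, not taken from product code:

```typescript
type BadgeColor = "red" | "yellow" | "blue" | "green" | "gray";

interface BadgeInput {
  failed: boolean;            // execution failed, active violation, data loss risk
  actionRecommended: boolean; // operator action recommended or mixed outcome
  inProgress: boolean;        // queued, running, background activity
  succeeded: boolean;         // stable, completed success
}

// Precedence order matches the table: each rule only fires when no
// higher-severity rule applies, so a failed-but-running run is still red.
function badgeColor(s: BadgeInput): BadgeColor {
  if (s.failed) return "red";
  if (s.actionRecommended) return "yellow";
  if (s.inProgress) return "blue";
  if (s.succeeded) return "green";
  return "gray"; // archived, not applicable, intentionally skipped
}

const partial = badgeColor({
  failed: false,
  actionRecommended: true, // e.g. mixed outcome needing review
  inProgress: false,
  succeeded: false,
});
console.log(partial); // "yellow"
```

Because the function takes actionability-style flags rather than raw domain enums, product maturity facts (fallback renderer, preview-only types) simply never set `actionRecommended`, which mechanically enforces the "never use yellow for product maturity facts" rule.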
### 6.4 Where "Next Action" Is Mandatory

Every state that is NOT green or gray MUST include at least one of:

- An inline explanation (1 sentence) of what happened and whether action is needed
- A link to a resolution path
- An explicit "No action needed — this is expected" label

Mandatory coverage:

- Failed operations → what failed, whether it is retryable, link to run details
- Partial operations → N of M items processed, link to item-level details
- Blocked operations → what blocked it, link to resolution (permissions page, provider settings, etc.)
- Missing evidence → why missing (no data source, not yet collected, collection error), link to trigger collection
- Stale data → when last updated, link to refresh action
- Publication blockers → which sections, link directly to each section's refresh action

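This "at least one affordance" rule can be made unskippable at the type level: if the next-action field is a required union rather than an optional string, omitting all three affordances becomes a compile error. A hypothetical TypeScript sketch (the `NextAction` and `AttentionBadge` names are invented for illustration; the product's actual implementation would live in its PHP presenters):

```typescript
// Exactly one of three affordances, per the rule in section 6.4.
type NextAction =
  | { kind: "explanation"; text: string }                     // one-sentence inline note
  | { kind: "resolution_link"; label: string; href: string }  // direct path to fix
  | { kind: "expected"; text: "No action needed — this is expected" };

// A non-green, non-gray badge. nextAction is required, not optional,
// so a "Completed with issues" badge cannot ship without guidance.
interface AttentionBadge {
  label: string;
  color: "red" | "yellow" | "blue";
  nextAction: NextAction;
}

// Example: a partial run pointing straight at its item-level breakdown.
const partialRun: AttentionBadge = {
  label: "Completed with issues",
  color: "yellow",
  nextAction: {
    kind: "resolution_link",
    label: "View 3 skipped items",
    href: "/runs/42/items",
  },
};

console.log(partialRun.nextAction.kind); // "resolution_link"
```

The structural trick, regardless of language, is that the mandatory-coverage list above becomes a constructor requirement rather than a review checklist.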
---
## 7. Spec Candidate Recommendations

### Recommended approach: **Foundation spec + domain follow-up specs**

The problem has two layers:

1. A **cross-cutting taxonomy problem** that must be solved once, centrally, before domains can adopt it
2. **Domain-specific adoption** that requires touching each resource/presenter/badge/view

### Spec Package

| Spec | Scope | Priority | Dependencies |
|------|-------|----------|-------------|
| **Spec: Operator Semantic Taxonomy** | Define the target state axes, term dictionary, color rules, diagnostic vs. primary classification, and "next action" policy. Produce a shared reference that all domain specs reference. Update `FidelityState`, `EvidenceCompletenessState`, `TenantReviewCompletenessState` enums to align with the new taxonomy. Restructure `GapSummary` into axis-separated notes. | P0 | None |
| **Spec: Baseline Snapshot Semantic Cleanup** | Apply taxonomy to BaselineSnapshotPresenter, FidelityState badge, GapSummary, snapshot summary table. Move renderer-tier info to diagnostics. Redefine "gaps" as categorized coverage notes. Fix "Captured with gaps" label. | P0 | Taxonomy spec |
| **Spec: Evidence Completeness Reclassification** | Fix "Missing" for valid-empty states. Separate freshness from coverage. Add "Not applicable" state for dimensions with no expected data. Fix "Stale" color. | P0 | Taxonomy spec |
| **Spec: Operation Outcome Clarity** | Replace "Partially succeeded" with item-breakdown-aware messaging. Add cause-specific "Blocked" explanations. Add next-action guidance to terminal notifications. Sanitize failure messages to operator language. | P1 | Taxonomy spec |
| **Spec: Restore Semantic Cleanup** | Reduce 13 states to a clear operator lifecycle. Distinguish intentional vs. forced skips. Replace "Manual required" with specific guidance. Fix "Partial" ambiguity. Move Graph errors behind a diagnostic expandable. | P1 | Taxonomy spec |
| **Spec: Inventory Coverage Language** | Replace "Partial" KPI badge for preview-only types. Replace "Metadata only" warning badge with a neutral product-tier indicator. Fix sync error messages. | P1 | Taxonomy spec |
| **Spec: Finding Governance Language** | Replace "missing_support", fix diff-unavailable messages, clarify "Validity" section. | P2 | Taxonomy spec |
| **Spec: Notifications & Alerts Language** | Remove internal IDs from emails, add next-action to terminal notifications, fix summary count keys. | P2 | Taxonomy spec |

### Recommended approach summary

**Foundation spec first** (Operator Semantic Taxonomy) to establish the shared vocabulary, color rules, and diagnostic/primary boundary. Then **6–7 domain specs** that apply the taxonomy to each area. The foundation spec should be small and decisive — it defines rules, not implementations. The domain specs do the actual refactoring.

This avoids a monolithic spec that would be too large to review or implement incrementally, while ensuring consistency through the shared taxonomy.