TenantAtlas/docs/product/spec-candidates.md
ahmido 2fa8fc0f87
# Spec Candidates
> Repo-based next-spec queue for TenantPilot.
> This file is not a wishlist. It tracks only open gaps that are still worth turning into new or refreshed specs.
>
> **Last reviewed**: 2026-04-28
>
> **Basis**: `implementation-ledger.md`, `roadmap.md`, current `specs/` truth
---
## Candidate Rules
- Work repo-based, not roadmap-aspirational.
- Do not keep implemented features as active candidates.
- Do not keep already-specced foundations as active candidates unless a narrower follow-up gap remains.
- P0 is reserved for blockers to the next sellable release.
- P1 is for enterprise and product maturity gaps.
- P2 is for commercial and scale readiness.
- P3 is for later platform ambitions after current release blockers close.
- Existing candidate history is preserved through `Promoted to Spec`, `Deferred`, and `Superseded / Removed` notes rather than silent deletion.
## Active Candidate Queue
### P0 — Release Blockers
### Customer Review Workspace v1
- **Priority**: P0
- **Why this stays active**: The repo already has strong internal review foundations: tenant reviews, evidence snapshots, review packs, redaction paths, entitlements, audit, and RBAC-aware surfaces. What is still missing is the customer-safe read-only consumption layer that turns those internal assets into a clearly sellable review product.
- **Roadmap relationship**: R2 completion / customer-facing review consumption.
- **Dependencies**:
  - `TenantReview`
  - `EvidenceSnapshot`
  - `ReviewPack`
  - existing redaction behavior
  - workspace entitlements
  - tenant/workspace RBAC and audit foundations
- **Scope**:
  - customer-safe read-only workspace or view for latest review state
  - latest findings and accepted risks in customer-safe form
  - review-pack download surface with existing redaction rules
  - explicit absence of admin or remediation actions
  - clear authorization boundaries for customer and read-only viewers
- **Non-scope**:
  - admin settings
  - remediation actions
  - raw operator diagnostics
  - a broader customer portal rewrite
  - billing or contract workflows
- **Acceptance criteria**:
  - an authorized customer or read-only actor can open the review workspace
  - latest review status, accepted risks, and key findings are visible without exposing admin controls
  - review-pack downloads respect existing redaction and entitlement rules
  - tenant and workspace isolation are enforced and tested
  - audit-sensitive or operator-only data is not exposed through this surface
- **Notes**: This is the clearest repo-derived blocker between current internal review strength and a cleaner sellable release.
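
The customer-safe boundary described above can be pictured as an allow-list projection. The following is a minimal sketch in Python, written purely for illustration and not taken from the repo. The field names (`status`, `accepted_risks`, `key_findings`, `operator_notes`) and the `to_customer_view` helper are hypothetical.

```python
from typing import Any

# Hypothetical allow-list: only these fields may reach a customer-facing view.
CUSTOMER_SAFE_FIELDS = ("status", "accepted_risks", "key_findings")


def to_customer_view(review: dict[str, Any]) -> dict[str, Any]:
    """Project an internal review record onto customer-safe fields only."""
    return {key: review[key] for key in CUSTOMER_SAFE_FIELDS if key in review}


# Illustrative record; the values are made up for the example.
internal_review = {
    "status": "published",
    "accepted_risks": ["legacy auth exception"],
    "key_findings": ["2 high-severity drift findings"],
    "operator_notes": "raw diagnostics, operator-only",  # must never leak
}

customer_view = to_customer_view(internal_review)
```

An allow-list is used here rather than a deny-list so that newly added internal fields stay hidden until someone explicitly marks them as safe. Whether the real surface should work this way is a design decision for the spec.
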
### P1 — Enterprise Maturity
### Decision-Based Governance Inbox v1
- **Priority**: P1
- **Why this stays active**: Findings, alerts, operation runs, review-pack generation, and portfolio triage already exist, but operators still work across several surfaces. The next maturity step is a single decision-oriented work surface, not more raw detail pages.
- **Roadmap relationship**: Findings workflow maturity; later MSP Portfolio OS prerequisite.
- **Dependencies**:
  - findings workflow semantics and inbox foundations from Specs 219, 221, 222, 224, 225, 230, 231
  - alert routing foundation
  - `OperationRun` truth
  - portfolio triage continuity
  - contextual help and reason-code surfaces where helpful
- **Scope**:
  - one operator-facing inbox for high-signal governance work
  - grouping or prioritization across findings, alerts, stale runs, and related attention signals
  - direct action links into compare, finding review, review-pack generation, or triage paths
  - auditable state changes such as snooze, assign, or acknowledge where already supported
- **Non-scope**:
  - autonomous remediation
  - AI-generated recommendations
  - customer-facing inboxes
  - full cross-tenant workboard redesign
- **Acceptance criteria**:
  - one surface shows prioritized governance work from more than one underlying signal family
  - actions route to existing product truth rather than duplicating state
  - visibility is capability-aware and workspace-safe
  - auditable state changes are recorded where the inbox mutates work state
  - tests prove signal grouping and authorization boundaries
- **Notes**: Important, but not a P0 release blocker while Customer Review Workspace is still missing.
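
As a rough illustration of "one surface, several signal families", the sketch below merges signals into a single prioritized list whose entries only link back to the surface that owns the state. It is Python written for illustration. The `Signal` shape, the severity scale, and the route strings are all assumptions, not repo facts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    family: str    # e.g. "finding", "alert", "stale_run" (illustrative names)
    severity: int  # assumed scale: higher means more urgent
    link: str      # route into the existing surface that owns the state


def build_inbox(signals: list[Signal]) -> list[Signal]:
    """Return one list across all families, most severe first.

    Entries carry a link only; the inbox does not copy or own work state.
    """
    return sorted(signals, key=lambda s: (-s.severity, s.family))


# Hypothetical sample data and routes.
signals = [
    Signal("finding", 2, "/findings/17"),
    Signal("alert", 3, "/alerts/4"),
    Signal("stale_run", 1, "/operations/9"),
]
inbox = build_inbox(signals)
```
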
### Cross-Tenant Compare and Promotion v1
- **Priority**: P1
- **Why this stays active**: Portfolio triage exists, but portfolio action does not. The repo already contains an older draft spec for this direction, yet the capability is not repo-proven as a finished product workflow.
- **Roadmap relationship**: MSP Portfolio & Operations.
- **Existing spec**: Spec 043 exists and should be refreshed against current repo truth rather than replaced by a new broad direction.
- **Dependencies**:
  - inventory foundations
  - baseline compare truth
  - restore and execution guardrails
  - audit log foundation
  - tenant and workspace isolation plus RBAC
- **Scope**:
  - choose source and target tenants within allowed scope
  - show a structured compare preview
  - support a dry-run or promotion preflight before any write path
  - preserve auditability and scope boundaries
- **Non-scope**:
  - blind one-click promotion
  - autonomous rollout
  - multi-cloud or multi-provider compare
  - full MSP control-plane redesign
- **Acceptance criteria**:
  - operator can produce a compare preview between two allowed tenants
  - promotion path includes explicit preflight or dry-run semantics
  - authorization and tenant isolation are enforced and tested
  - audit trail exists for compare and promotion entry points
  - the slice refreshes or narrows Spec 043 instead of reopening it as a vague ambition
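
The compare-preview and dry-run semantics above might look roughly like the following Python sketch. It is illustrative only: tenants are reduced to plain dictionaries, and the setting names and the `compare_preview` and `promote` helpers are hypothetical.

```python
from typing import Any


def compare_preview(source: dict[str, Any], target: dict[str, Any]) -> dict[str, list[str]]:
    """Classify source keys as added, changed, or unchanged relative to the target."""
    preview: dict[str, list[str]] = {"added": [], "changed": [], "unchanged": []}
    for key in sorted(source):
        if key not in target:
            preview["added"].append(key)
        elif source[key] != target[key]:
            preview["changed"].append(key)
        else:
            preview["unchanged"].append(key)
    return preview


def promote(source: dict[str, Any], target: dict[str, Any], *, dry_run: bool = True) -> dict[str, list[str]]:
    """Always return the preview; write to the target only when dry_run is False."""
    preview = compare_preview(source, target)
    if not dry_run:
        for key in preview["added"] + preview["changed"]:
            target[key] = source[key]
    return preview


# Hypothetical tenant settings.
source_tenant = {"mfa": "required", "legacy_auth": "blocked"}
target_tenant = {"mfa": "optional"}
preflight = promote(source_tenant, target_tenant)  # dry run by default
```

Defaulting `dry_run` to `True` mirrors the "no blind one-click promotion" non-scope item: a write requires an explicit opt-in after the preview has been produced.
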
### Localization v1
- **Priority**: P1
- **Why this stays active**: The repo and roadmap both indicate this is still absent. It is not a backend foundation gap; it is a product maturity gap that will get more expensive as the governance surface grows.
- **Roadmap relationship**: R1.9 Platform Localization v1.
- **Dependencies**:
  - existing status and terminology catalogs
  - contextual help boundaries
  - notification and UI copy inventory on critical surfaces
  - locale resolution rules for workspace, user, and system context
- **Scope**:
  - `de` and `en` on core governance surfaces
  - locale resolution order and fallback behavior
  - locale-aware formatting for dates, times, and numbers
  - stable machine and export formats that remain non-localized
- **Non-scope**:
  - public website localization
  - broad documentation translation
  - retrospective translation of every legacy free-text record
  - marketing copy systems
- **Acceptance criteria**:
  - core navigation, dashboard, findings, baseline compare, alerts, and operations surfaces support `de` and `en`
  - no raw translation keys appear on critical UI paths
  - fallback to English is controlled and predictable
  - locale-aware formatting does not affect audit or export truth
  - targeted regression coverage exists for fallback and key critical flows
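
A locale resolution order with a controlled English fallback could be sketched as below. This is Python for illustration only. The user, then workspace, then system order is an assumption (this document does not fix the order), and the catalog keys and strings are made up.

```python
SUPPORTED_LOCALES = ("de", "en")
FALLBACK_LOCALE = "en"

# Illustrative catalog; keys and strings are placeholders, not repo content.
CATALOG = {
    "en": {"findings.title": "Findings", "alerts.title": "Alerts"},
    "de": {"findings.title": "Befunde"},
}


def resolve_locale(user=None, workspace=None, system=None):
    """Return the first supported locale in user -> workspace -> system order (assumed)."""
    for candidate in (user, workspace, system):
        if candidate in SUPPORTED_LOCALES:
            return candidate
    return FALLBACK_LOCALE


def translate(key, locale):
    """Look up the key in the locale, then in the English fallback catalog."""
    for loc in (locale, FALLBACK_LOCALE):
        text = CATALOG.get(loc, {}).get(key)
        if text is not None:
            return text
    # Fail loudly in tests rather than render a raw key on a UI path.
    raise KeyError(key)
```

For example, `resolve_locale(user="fr", workspace="de")` skips the unsupported user preference and lands on the workspace locale, while a key missing from the `de` catalog resolves to its English text instead of surfacing the raw key.
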
### Remove Findings Lifecycle Backfill Runtime Surfaces
- **Priority**: P1
- **Why this stays active**: Repo audit shows visible runtime surfaces for a pre-production findings lifecycle repair path even though active finding generators already write the relevant lifecycle fields directly. The remaining path is not just ballast; it appears partially detached from current operational-control truth and keeps internal repair tooling productized.
- **Roadmap relationship**: Findings workflow cleanup / legacy removal.
- **Dependencies**:
  - current finding generators that already set lifecycle fields directly
  - system runbook registry and execution surfaces
  - tenant findings actions
  - operation catalog, capability, and seeder bindings
  - backfill jobs, runbook service, and deploy hooks
- **Scope**:
  - remove the system runbook `Rebuild Findings Lifecycle`
  - remove the tenant action `Backfill findings lifecycle`
  - remove the command `tenantpilot:findings:backfill-lifecycle`
  - remove findings lifecycle backfill jobs, runbook services, and deploy/runtime hooks
  - remove operation-catalog, capability, seeder, and test traces that exist only for this backfill path
- **Non-scope**:
  - removing the legacy `acknowledged` status or related compatibility helpers
  - changing normal finding workflow actions such as triage, assignment, progress, resolve, or risk acceptance
  - changing ownership, assignee, SLA, due-date, or risk-governance semantics
  - changing historical migrations or adding replacement backfills
- **Acceptance criteria**:
  - no `/admin` surface exposes `Backfill findings lifecycle`
  - no system runbook exposes `Rebuild Findings Lifecycle`
  - `tenantpilot:findings:backfill-lifecycle` is no longer a supported command
  - deploy or operational hooks do not start a findings lifecycle backfill
  - `findings.lifecycle.backfill` is no longer used as an operational-control key, operation type, or capability
  - tests no longer expect backfill preflight, start, or completion behavior
  - normal finding workflows keep working unchanged for triage, assignment, start progress, resolve, and risk acceptance
- **Notes**: This is the first and most important cleanup candidate because it removes visible product ballast without changing the canonical findings workflow semantics.
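
The "surface absence" criteria above lend themselves to a simple removal check. The Python sketch below is illustrative only: the retired identifiers are the ones named in this candidate, while the registry names and the other registry entries are hypothetical placeholders.

```python
# Identifiers named in this candidate that should no longer appear anywhere.
RETIRED_IDENTIFIERS = {
    "findings.lifecycle.backfill",
    "tenantpilot:findings:backfill-lifecycle",
    "Rebuild Findings Lifecycle",
    "Backfill findings lifecycle",
}


def find_leftovers(registries: dict[str, set[str]]) -> dict[str, set[str]]:
    """Return retired identifiers still present, keyed by registry name."""
    leftovers = {}
    for name, entries in registries.items():
        hits = entries & RETIRED_IDENTIFIERS
        if hits:
            leftovers[name] = hits
    return leftovers


# Hypothetical registries; "baseline.compare" is a placeholder entry.
registries = {
    "operation_types": {"baseline.compare"},
    "commands": {"tenantpilot:findings:backfill-lifecycle"},
}
leftovers = find_leftovers(registries)
```

A removal verification test would then assert that `find_leftovers(...)` returns an empty mapping for every registry the runtime exposes.
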
### Remove Legacy Acknowledged Finding Status Compatibility
- **Priority**: P1
- **Why this stays active**: Repo audit indicates that `acknowledged` compatibility still survives in status helpers, filters, badges, capabilities, and tests even though the current operator workflow is centered on `triaged`. Keeping both semantics alive weakens workflow clarity and RBAC consistency.
- **Roadmap relationship**: Findings workflow semantics / RBAC cleanup.
- **Dependencies**:
  - finding status constants and model helpers
  - badge and filter catalogs
  - role capability mappings and capability aliases
  - workflow and bulk-action tests that still speak in acknowledge semantics
- **Scope**:
  - remove `Finding::STATUS_ACKNOWLEDGED`
  - remove or simplify compatibility helpers that only map `acknowledged` to `triaged`
  - remove `openStatusesForQuery()` compatibility for `acknowledged`
  - remove legacy capability aliases such as `tenant_findings.acknowledge`
  - rename, adapt, or remove tests that only protect the old acknowledge vocabulary
  - ensure active workflow actions consistently use `triage` / `triaged`
- **Non-scope**:
  - removing findings lifecycle backfill runtime surfaces in the same slice
  - changing SLA, ownership, assignee, or risk-acceptance behavior
  - introducing new workflow states or new customer-facing workflow surfaces
  - changing finding generators unless they still emit `acknowledged`
- **Acceptance criteria**:
  - no production code path writes `acknowledged`
  - no production code path expects `acknowledged` as a valid workflow status
  - `tenant_findings.acknowledge` no longer exists as a capability or alias
  - workflow actions, filters, badges, and tests consistently use `triage` / `triaged`
  - existing finding flows remain functional from `new` to `triaged`, `in_progress`, `resolved`, and risk-accepted outcomes
- **Notes**: Keep this separate from backfill removal because it reaches deeper into workflow semantics, queries, badges, and RBAC mappings.
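
The target state after this cleanup can be pictured as a closed status set that rejects the legacy value instead of silently mapping it. The Python sketch below is an illustration, not the repo's model code. The `risk_accepted` spelling and the choice of which statuses count as open are assumptions.

```python
from enum import Enum


class FindingStatus(str, Enum):
    NEW = "new"
    TRIAGED = "triaged"
    IN_PROGRESS = "in_progress"
    RESOLVED = "resolved"
    RISK_ACCEPTED = "risk_accepted"  # assumed spelling for the risk-accepted outcome


def open_statuses_for_query() -> list[str]:
    """Statuses treated as open; the exact membership is an assumption here."""
    return [
        FindingStatus.NEW.value,
        FindingStatus.TRIAGED.value,
        FindingStatus.IN_PROGRESS.value,
    ]


def parse_status(raw: str) -> FindingStatus:
    """Reject unknown values rather than mapping legacy ones to a canonical status."""
    try:
        return FindingStatus(raw)
    except ValueError:
        raise ValueError(f"unsupported finding status: {raw!r}") from None
```

With no compatibility branch left, `parse_status("acknowledged")` raises, which makes any remaining writer of the old vocabulary fail visibly in tests.
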
### Enforce Creation-Time Finding Invariants
- **Priority**: P1
- **Why this stays active**: Removing lifecycle backfills only stays safe if new findings are always created in a lifecycle-ready state. The repo already hints at good direct-write behavior, but those invariants still need explicit protection so future generators do not recreate the need for repair jobs.
- **Roadmap relationship**: Findings data integrity / workflow hardening.
- **Dependencies**:
  - drift and baseline compare finding generation
  - permission posture finding generation
  - Entra admin roles finding generation
  - rediscovery, reopen, and deduplication behavior around recurrence keys and lifecycle timestamps
- **Scope**:
  - review active finding generators and verify lifecycle-ready creation
  - add or tighten invariant tests around canonical status, first/last seen timestamps, `times_seen`, `sla_days`, and `due_at` where applicable
  - verify reopen and rediscovery behavior
  - verify drift idempotency and recurrence-key semantics
  - consider a tightly bounded DB constraint only if the repo proves a safe, narrow case
- **Non-scope**:
  - reintroducing any backfill or repair runtime surface
  - historical data migration work
  - forcing owner or assignee fields to become mandatory
  - introducing new finding types or broader customer review workflow changes
- **Acceptance criteria**:
  - repo-verified finding generators have tests that prove lifecycle-ready creation
  - no new finding generation path relies on a later backfill or repair run
  - repeated drift detection does not create uncontrolled canonical duplicates
  - reopen or rediscovery behavior updates lifecycle fields correctly
  - accountability remains a governance state rather than a forced owner/assignee requirement
- **Notes**: This should follow the visible cleanup work and protects the target state so findings do not regress back into repair-job dependency.
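
An invariant test for lifecycle-ready creation might check something like the following. This Python sketch is illustrative. `times_seen`, `sla_days`, and `due_at` are named in this candidate, while `first_seen_at`, `last_seen_at`, the `Finding` shape, and the rule that `due_at` must accompany `sla_days` are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Finding:
    status: str
    first_seen_at: Optional[datetime]  # hypothetical field name
    last_seen_at: Optional[datetime]   # hypothetical field name
    times_seen: int
    sla_days: Optional[int] = None
    due_at: Optional[datetime] = None


def lifecycle_violations(finding: Finding) -> list[str]:
    """List creation-time invariant violations; an empty list means lifecycle-ready."""
    problems = []
    if finding.status != "new":
        problems.append("status must be 'new' at creation")
    if finding.first_seen_at is None or finding.last_seen_at is None:
        problems.append("first/last seen timestamps must be set")
    elif finding.last_seen_at < finding.first_seen_at:
        problems.append("last_seen_at precedes first_seen_at")
    if finding.times_seen < 1:
        problems.append("times_seen must be at least 1")
    if finding.sla_days is not None and finding.due_at is None:
        problems.append("due_at must be set when sla_days is set")  # assumed rule
    return problems


created = datetime(2026, 4, 28, tzinfo=timezone.utc)
ready = Finding("new", created, created, 1, sla_days=30, due_at=created + timedelta(days=30))
broken = Finding("new", created, None, 0, sla_days=30)
```

Running the same check against every generator's output keeps a new generator from quietly depending on a later repair run.
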
### P2 — Commercial / Scale
### Commercial Entitlements and Billing-State Maturity
- **Priority**: P2
- **Why this stays active**: The repo already has a real entitlement foundation and an existing spec for plans and billing readiness. The remaining gap is narrower: commercial lifecycle maturity, not inventing entitlements from scratch.
- **Roadmap relationship**: Product Scalability & Self-Service Foundation.
- **Existing spec context**: Spec 247 exists for `Plans, Entitlements & Billing Readiness`. This candidate is the follow-up gap after the current entitlement substrate, not a duplicate foundation spec.
- **Dependencies**:
  - existing `WorkspaceEntitlementResolver`
  - workspace settings surfaces
  - review-pack entitlement gates
  - audit foundation
  - customer-facing read-only and suspension semantics where applicable
- **Scope**:
  - commercial lifecycle states such as trial, grace, suspended/read-only, and active paid usage
  - clearer enforcement at key product gates
  - explicit disabled and read-only messaging distinct from authorization failures
  - audited state changes and overrides
- **Non-scope**:
  - payment provider integration
  - invoicing
  - tax or accounting workflows
  - public pricing pages
- **Acceptance criteria**:
  - central commercial state can be resolved for a workspace
  - at least two real behaviors are gated by the central lifecycle state rather than by scattered conditionals
  - read-only or suspended behavior preserves safe access to needed history or evidence while blocking disallowed actions
  - changes and overrides are audited
  - tests cover blocked and allowed paths
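
One way to keep commercial gating central and distinct from authorization failures is sketched below in Python. It is illustrative only. The state names follow the scope list above, while the action names, the `gate` helper, and its return values are hypothetical.

```python
from enum import Enum


class CommercialState(Enum):
    TRIAL = "trial"
    GRACE = "grace"
    SUSPENDED = "suspended"  # treated as read-only in this sketch
    ACTIVE = "active"


# Hypothetical write actions; read paths stay available in every state.
WRITE_ACTIONS = {"generate_review_pack", "start_restore"}


def gate(state: CommercialState, action: str, *, authorized: bool) -> str:
    """Return 'forbidden' (authorization), 'read_only' (commercial), or 'allowed'."""
    if not authorized:
        return "forbidden"
    if state is CommercialState.SUSPENDED and action in WRITE_ACTIONS:
        return "read_only"
    return "allowed"
```

Returning a separate `read_only` outcome lets the UI show commercial messaging instead of a generic permission error, and a suspended workspace can still reach history or evidence because only write actions are blocked.
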
### External Support Desk / PSA Handoff
- **Priority**: P2
- **Why this stays active**: In-app support requests are already repo-real. The remaining gap is external handoff and visible ticket linkage, not support-request creation itself.
- **Roadmap relationship**: R2 support follow-through; later commercial scale.
- **Dependencies**:
  - support request context flow from Spec 246
  - support diagnostic pack
  - audit logging
  - tenant and workspace authorization boundaries
- **Scope**:
  - outbound adapter seam for one support desk or PSA target
  - store and display external ticket reference
  - auditable create or link actions
  - visible product linkage back from support requests to external references
- **Non-scope**:
  - full bidirectional sync
  - SLA engine
  - generic helpdesk product
  - AI support automation
- **Acceptance criteria**:
  - a support request can create or link an external ticket through one bounded adapter
  - resulting ticket reference is stored and visible in the right context
  - failures are explicit and auditable
  - tenant and workspace scope are enforced and tested
  - the slice extends the existing support-request model instead of replacing it
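
The "one bounded adapter" seam could be sketched as a small interface with explicit, audited failure handling. The Python below is an illustration under stated assumptions. `TicketAdapter`, `SupportRequest`, `hand_off`, and the audit strings are hypothetical and do not describe the existing support-request model.

```python
from dataclasses import dataclass, field
from typing import Optional, Protocol


class TicketAdapter(Protocol):
    """Outbound seam: one implementation per support desk or PSA target."""

    def create_ticket(self, subject: str, body: str) -> str: ...


class HandoffError(Exception):
    """Raised when the external system could not create the ticket."""


@dataclass
class SupportRequest:
    subject: str
    body: str
    external_ref: Optional[str] = None
    audit: list[str] = field(default_factory=list)  # stand-in for real audit logging


def hand_off(request: SupportRequest, adapter: TicketAdapter) -> None:
    """Create an external ticket, store its reference, and audit either outcome."""
    try:
        request.external_ref = adapter.create_ticket(request.subject, request.body)
    except Exception as exc:
        request.audit.append(f"handoff_failed: {exc}")
        raise HandoffError(str(exc)) from exc
    request.audit.append(f"handoff_created: {request.external_ref}")


class FakeDesk:
    """Test double standing in for a real desk integration."""

    def create_ticket(self, subject: str, body: str) -> str:
        return "DESK-1"
```

Calling `hand_off(SupportRequest("Sync issue", "details"), FakeDesk())` stores `DESK-1` on the request and records one audit entry. A failing adapter leaves `external_ref` unset, records the failure, and raises `HandoffError`.
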
### P3 — Later Platform Ambitions
- No active P3 candidate from the current focus set.
- `Private AI Execution & Policy Foundation` is already promoted as Spec 248 and should no longer remain in the open candidate queue.
- Broader AI-assisted customer operations can return later as a follow-up only after Spec 248 and the current customer-facing release gaps are materially closed.
## Deferred / Existing Drafts Outside the Current Queue
These items are still useful, but they are not the next best open specs from the current repo state.
- `Policy Lifecycle / Ghost Policies`: still a valid gap, but not ahead of Customer Review Workspace or Cross-Tenant Compare.
- `Workspace-level PII override for review packs`: bounded deferred follow-up from Spec 109.
- `CSV export for filtered run metadata`: valid system-console follow-up, but not near the top of the queue.
- `Raw error/context drilldowns for system console`: useful operator enhancement, but not ahead of current P0-P2 gaps.
- UI polish snippets such as dashboard sparklines, density toggles, louder attention cards, or chooser refinements: keep out of the active spec queue until they become bounded release work.
## Promoted to Spec
Historical ledger for candidates that are no longer open. Keep them here so prioritization stays clean without losing decision history.
- Canonical Operation Type Source of Truth -> Spec 239 (`canonical-operation-type-source-of-truth`)
- Self-Service Tenant Onboarding & Connection Readiness -> Spec 240 (`tenant-onboarding-readiness`)
- Support Diagnostic Pack -> Spec 241 (`support-diagnostic-pack`)
- Operational Controls & Feature Flags -> Spec 242 (`operational-controls`)
- Product Usage & Adoption Telemetry -> Spec 243 (`product-usage-adoption-telemetry`)
- Product Knowledge & Contextual Help -> Spec 244 (`product-knowledge-contextual-help`)
- Customer Health Score -> Spec 245 (`customer-health-score`)
- In-App Support Request with Context -> Spec 246 (`support-request-context`)
- Plans, Entitlements & Billing Readiness -> Spec 247 (`plans-entitlements-billing-readiness`)
- Private AI Execution & Policy Foundation -> Spec 248 (`private-ai-policy-foundation`)
- Queued Execution Reauthorization and Scope Continuity -> Spec 149 (`queued-execution-reauthorization`)
- Livewire Context Locking and Trusted-State Reduction -> Spec 152 (`livewire-context-locking`)
- Evidence Domain Foundation -> Spec 153 (`evidence-domain-foundation`)
- Exception / Risk-Acceptance Workflow for Findings -> Spec 154 (`finding-risk-acceptance`)
- Operator Outcome Taxonomy and Cross-Domain State Separation -> Spec 156 (`operator-outcome-taxonomy`)
- Operator Reason Code Translation and Humanization Contract -> Spec 157 (`reason-code-translation`)
- Governance Artifact Truthful Outcomes & Fidelity Semantics -> Spec 158 (`artifact-truth-semantics`)
- Operator Explanation Layer for Degraded / Partial / Suppressed Results -> Spec 161 (`operator-explanation-layer`)
- Request-Scoped Derived State and Resolver Memoization -> Spec 167 (`derived-state-memoization`)
- Tenant Governance Aggregate Contract -> Spec 168 (`tenant-governance-aggregate-contract`)
- Record Page Header Discipline & Contextual Navigation -> Spec 192 (`record-header-discipline`)
- Monitoring Surface Action Hierarchy & Workbench Semantics -> Spec 193 (`monitoring-action-hierarchy`)
- Governance Friction & Operator Vocabulary Hardening -> Spec 194 (`governance-friction-hardening`)
- Governance Operator Outcome Compression -> Spec 214 (`governance-outcome-compression`)
- Provider-Backed Action Preflight and Dispatch Gate Unification -> Spec 216 (`provider-dispatch-gate`)
- Finding Ownership Semantics Clarification -> Spec 219 (`finding-ownership-semantics`)
- Humanized Diagnostic Summaries for Governance Operations -> Spec 220 (`governance-run-summaries`)
- Findings Operator Inbox v1 -> Spec 221 (`findings-operator-inbox`)
- Findings Intake & Team Queue v1 -> Spec 222 (`findings-intake-team-queue`)
- Findings Notifications & Escalation v1 -> Spec 224 (`findings-notifications-escalation`)
- Assignment Hygiene & Stale Work Detection -> Spec 225 (`assignment-hygiene`)
- Findings Notification Presentation Convergence -> Spec 230 (`findings-notification-convergence`)
- Finding Outcome Taxonomy & Verification Semantics -> Spec 231 (`finding-outcome-taxonomy`)
- Operation Run Link Contract Enforcement -> Spec 232 (`operation-run-link-contract`)
- Operation Run Active-State Visibility & Stale Escalation -> Spec 233 (`stale-run-visibility`)
- Provider Boundary Hardening -> Spec 237 (`provider-boundary-hardening`)
## Superseded / Removed From Active Queue
These items were previously open candidates or roadmap-fit ideas, but should no longer stay in the active queue.
- `R2.0 Canonical Control Catalog Foundation`: remove from active candidates because the ledger shows a repo-real catalog, config, bindings, review integration, and test coverage. This is no longer an open candidate; it is an implemented foundation.
- `Self-Service Tenant Onboarding & Connection Readiness`: remove from active candidates because it is already Spec 240 and the repo already shows meaningful adoption.
- `Support Diagnostic Pack`: remove from active candidates because it is already Spec 241 and repo-adopted.
- `Operational Controls & Feature Flags`: remove from active candidates because it is already Spec 242 and repo-adopted.
- `Product Usage & Adoption Telemetry`: remove from active candidates because it is already Spec 243 and repo-adopted.
- `Product Knowledge & Contextual Help`: remove from active candidates because it is already Spec 244; any remaining work should be narrower follow-ups, not a repeated top-level candidate.
- `Customer Health Score`: remove from active candidates because it is already Spec 245 and repo-adopted.
- `In-App Support Request with Context`: remove from active candidates because it is already Spec 246 and repo-implemented.
- `Plans, Entitlements & Billing Readiness`: remove as a broad active candidate because Spec 247 already exists and the remaining open gap is narrower commercial lifecycle maturity.
- `Private AI Execution & Policy Foundation`: remove from the active queue because Spec 248 already exists.
- Company-ops items such as `Lead Capture & CRM Pipeline`, `AVV / DPA / TOM / Legal Pack`, `Vendor Questionnaire Answer Bank`, `Business Continuity / Founder Backup Plan`, and similar operating artifacts should remain outside the active product-spec queue unless a concrete product slice emerges.