TenantAtlas/specs/053-unify-runs-monitoring/checklists/writing.md

# Requirements Writing Checklist: Unified Operations Runs + Monitoring Hub (053)

**Purpose**: “Unit tests for English” that validate the clarity, completeness, and internal consistency of `spec.md` before implementation planning
**Created**: 2026-01-16
**Feature**: [spec.md](../spec.md)

**Note**: These items validate what the requirements *say* (or don’t say). They are not implementation/QA tests.

## Requirement Completeness

- [x] CHK001 Are all required run attributes explicitly enumerated (type, scope/target, status, timestamps, initiator, label)? [Completeness, Spec §FR-001] — Evidence: Spec §FR-001 “...run type, scope/target (when applicable), status, timestamps (created/started/finished), initiator (user or system), and a human-readable label...”
- [x] CHK002 Are list requirements complete for Monitoring/Operations (default sort, default time window, and filterable fields/values)? [Completeness, Spec §FR-002] — Evidence: Spec §FR-002 “...sorted most-recent-first by default, defaulting to... (last 30 days), and supporting filtering by run type, status, and time range...”
- [x] CHK003 Are run detail requirements complete, including which summary counts are required and when they apply (total/succeeded/failed/skipped)? [Completeness, Spec §FR-003] — Evidence: Spec §FR-003 “For itemized operations... counts MUST include `total`, `succeeded`, `failed`, and `skipped` (if applicable).”
- [x] CHK004 Are Phase 1 included operations explicitly listed, and are all other candidate operations explicitly deferred? [Completeness, Spec §Scope & Assumptions] — Evidence: Spec §Scope & Assumptions “Phase 1 supported operations are: Drift generation; Backup Set ‘Add Policies’. All other candidate operations are explicitly deferred to Phase 2+...”
- [x] CHK005 Are “related artifact” link requirements defined per Phase 1 operation (drift findings link; backup set context link)? [Completeness, Spec §FR-012, Spec §FR-013] — Evidence: Spec §FR-012 “...run detail view MUST provide a link to that artifact.” + Spec §FR-013 “...link back to the related backup set context.”
- [x] CHK006 Are notification requirements complete across lifecycle events, including who receives them (initiator only vs tenant operators)? [Completeness, Spec §FR-011] — Evidence: Spec §FR-011 “Phase 1 notifications MUST be delivered to the initiating user...”; Spec §User Story 2 scenarios 4–5 define queued + completion notifications.
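
The list defaults quoted for FR-002 above (most-recent-first, last 30 days, filtering by run type, status, and time range) can be illustrated with a minimal sketch. The `Run` record, its field names, the run type identifiers, and the choice of `created_at` as the timestamp the window applies to are all assumptions made for illustration; `spec.md` does not prescribe them.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# "Last 30 days" default window, as quoted from FR-002.
DEFAULT_WINDOW = timedelta(days=30)


@dataclass(frozen=True)
class Run:
    # Hypothetical record shape; field names are illustrative only.
    run_type: str
    status: str
    created_at: datetime
    initiator: str
    label: str


def list_runs(runs, now, run_type=None, status=None, since=None, until=None):
    """Filter by run type, status, and time range, then sort newest first."""
    start = since if since is not None else now - DEFAULT_WINDOW
    end = until if until is not None else now
    selected = [
        r for r in runs
        if start <= r.created_at <= end
        and (run_type is None or r.run_type == run_type)
        and (status is None or r.status == status)
    ]
    return sorted(selected, key=lambda r: r.created_at, reverse=True)
```

Calling `list_runs(runs, now)` with no filters returns only runs from the default window, newest first; passing `since` widens or narrows the window explicitly.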

## Requirement Clarity

- [x] CHK007 Is “supported long-running operation” bounded and unambiguous for Phase 1 (what is included/excluded)? [Clarity, Spec §Scope & Assumptions] — Evidence: Spec §Scope & Assumptions “Phase 1 supported operations are: Drift generation; Backup Set ‘Add Policies’...”
- [x] CHK008 Is the run status set closed or open, and are optional statuses (e.g., canceled/aborted) explicitly addressed as included or excluded? [Clarity, Spec §FR-004] — Evidence: Spec §FR-004 “Phase 1 status set: `queued`, `running`, `succeeded`, `partially succeeded`, `failed`... Cancellation/abort outcomes are deferred to Phase 2.”
- [x] CHK009 Is the dedupe definition of “identical run” unambiguous (scope components, time window, and whether initiator matters)? [Clarity, Spec §FR-006] — Evidence: Spec §FR-006 “Identical means... same tenant... same run type... same scope/target... same effective inputs... The initiator MUST NOT be part of the identity...”
- [x] CHK010 Are “sanitized” and “minimized” failure detail requirements defined with explicit redaction expectations and allowed/forbidden content? [Clarity, Spec §FR-010] — Evidence: Spec §FR-010 “MUST NOT include secrets, credentials, tokens, PII, or full external payload dumps...” and defines allowed per-item references.
- [x] CHK011 Is “human-readable label” defined (format, stability, and whether localization is required) or explicitly deferred? [Clarity, Spec §FR-001] — Evidence: Spec §FR-001 “label MUST be a stable operator-facing description combining run type and scope/target (English-only in Phase 1; localization deferred to Phase 2).”
- [x] CHK012 Is the “View run” link requirement specific about its destination and the surfaces where it must appear (immediate confirmation vs lifecycle notifications)? [Clarity, Spec §FR-005, Spec §FR-011] — Evidence: Spec §FR-005 “...‘View run’ link that opens the run detail view...” + Spec §FR-011 “Notifications MUST include a ‘View run’ link that opens the run detail view.”
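
To make the FR-006 identity definition checked in CHK009 concrete, the following is a minimal sketch of a dedupe key built from tenant, run type, scope/target, and effective inputs, with the initiator deliberately left out. The function names, the dictionary-based run store, and the hashing scheme are hypothetical; the spec defines the identity components, not how they are stored or compared. Any time-window component is not modelled here.

```python
import hashlib
import json

# Statuses for which an existing run is reused (queued/running per FR-006).
ACTIVE_STATUSES = {"queued", "running"}


def run_identity(tenant_id, run_type, scope, effective_inputs):
    """Build a stable key. The initiator is intentionally not part of it."""
    payload = json.dumps(
        {"tenant": tenant_id, "type": run_type, "scope": scope, "inputs": effective_inputs},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def start_or_reuse(active_runs, tenant_id, run_type, scope, effective_inputs, initiator):
    """Return (run, reused). `active_runs` is a hypothetical identity -> run mapping."""
    key = run_identity(tenant_id, run_type, scope, effective_inputs)
    existing = active_runs.get(key)
    if existing is not None and existing["status"] in ACTIVE_STATUSES:
        return existing, True
    run = {"identity": key, "status": "queued", "initiator": initiator}
    active_runs[key] = run
    return run, False
```

Because the initiator is recorded on the run but excluded from the key, a second operator starting the same operation receives the first operator's active run instead of a duplicate.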

## Requirement Consistency

- [x] CHK013 Is status vocabulary consistent across the spec (e.g., “completed” vs “succeeded/partially succeeded/failed”)? [Consistency, Spec §FR-004, Spec §SC-001] — Evidence: Spec §FR-004 and Spec §SC-001 use the same status set: `queued` / `running` / `succeeded` / `partially succeeded` / `failed`.
- [x] CHK014 Are view-only constraints consistent across Clarifications, acceptance scenarios, and FR-014 (no manage controls in hub)? [Consistency, Spec §Clarifications, Spec §User Story 1, Spec §FR-014] — Evidence: Spec §Clarifications “Monitoring/Operations is view-only in Phase 1...” + Spec §User Story 1 scenario 5 “...view-only...” + Spec §FR-014 “...MUST be view-only...”
- [x] CHK015 Are permissions consistent across scenarios and FR-009 (who can view vs start), including `Readonly` restrictions? [Consistency, Spec §Scope & Assumptions, Spec §User Story 2, Spec §FR-009] — Evidence: Spec §Scope & Assumptions roles bullet defines `Readonly` view-only; Spec §User Story 2 scenario 2 denies start; Spec §FR-009 forbids `Readonly` start/manage.
- [x] CHK016 Are dedupe semantics consistent between user story scenarios, FR-006, and SC-003? [Consistency, Spec §User Story 3, Spec §FR-006, Spec §SC-003] — Evidence: Spec §User Story 3 scenario 2 (reuses existing run) + Spec §FR-006 (reuse queued/running) + Spec §SC-003 (no more than one active run).

## Acceptance Criteria Quality

- [x] CHK017 Does each user story have acceptance scenarios specific enough to validate without unstated assumptions (filters, links, permissions)? [Measurability, Spec §User Scenarios & Testing] — Evidence: Spec §User Story 1 scenarios specify default sort/window + filters + cross-tenant denial; Spec §User Story 2 scenarios specify notifications + background-unavailable behavior; Spec §User Story 3 scenarios specify findings link + failure summary.
- [x] CHK018 Are success criteria measurable with defined measurement methods/proxies (e.g., how “under 30 seconds” is assessed)? [Measurability, Spec §Success Criteria] — Evidence: Spec §SC-001 “measured via timed operator walkthroughs...” + Spec §SC-002/SC-003/SC-004 include measurement notes.
- [x] CHK019 Is “within 2 seconds under normal conditions” defined with explicit conditions/examples so it can be measured consistently? [Clarity, Spec §SC-002] — Evidence: Spec §SC-002 defines “normal conditions” (no active service degradation; typical tenant dataset sizes; excludes maintenance/outage windows).
- [x] CHK020 Are “99% of repeated-start attempts” and “95% of cases” scoped precisely (population, timeframe, and measurement approach)? [Clarity, Spec §SC-003, Spec §SC-004] — Evidence: Spec §SC-003 and §SC-004 specify Phase 1 population, rolling 30-day window, and measurement approaches.

## Scenario Coverage

- [x] CHK021 Are requirements/scenarios present for system-initiated runs (no interactive initiator) and how they appear in Monitoring/Operations? [Coverage, Spec §Scope & Assumptions, Spec §User Story 1] — Evidence: Spec §Scope & Assumptions “System-initiated runs may exist... initiator shown as ‘System’.” + Spec §User Story 1 scenario 9.
- [x] CHK022 Are scenarios present for cross-tenant access attempts for both list and detail views (and expected denial behavior)? [Coverage, Spec §User Story 1, Spec §FR-008] — Evidence: Spec §User Story 1 scenario 7 “...access is denied and no run data is disclosed.” + Spec §FR-008 “...MUST NOT disclose run existence or details.”
- [x] CHK023 Do scenarios cover related-artifact behavior across outcomes (results link on success; safe failure summary on failure)? [Coverage, Spec §User Story 3, Spec §FR-012] — Evidence: Spec §User Story 3 scenario 3 (findings link on success) + scenario 4 (safe failure summary on failure) + Spec §FR-012.
- [x] CHK024 Is the “permissions changed mid-run” scenario defined with explicit expected outcomes (viewability and notifications)? [Completeness, Spec §Edge Cases] — Evidence: Spec §Edge Cases “visibility is evaluated at time of access; and completion notifications are delivered only if the recipient remains authorized...”

## Edge Case Coverage

- [x] CHK025 Are Edge Cases written as explicit expected behaviors (not only open questions), or explicitly deferred with ownership/timing? [Completeness, Spec §Edge Cases] — Evidence: Spec §Edge Cases bullets are written as “If... MUST...” statements (no open placeholders).
- [x] CHK026 Is drift eligibility defined (minimum data needed) and the user-facing outcome when eligibility is not met? [Completeness, Spec §User Story 3, Spec §Edge Cases] — Evidence: Spec §User Story 3 scenario 5 “...not enough eligible data...” + Spec §Edge Cases references failing/refusing with reason code (e.g., `insufficient_data`) and actionable message.
- [x] CHK027 Are “very large scope” partial completion expectations defined (counts semantics; when partial vs failed applies)? [Clarity, Spec §User Story 1, Spec §FR-004] — Evidence: Spec §User Story 1 scenario 8 defines partial vs failed; Spec §FR-004 defines status semantics; Spec §Edge Cases addresses large scopes.
- [x] CHK028 Are failure-display requirements defined for sensitive underlying errors (what is shown vs redacted)? [Clarity, Spec §Edge Cases, Spec §FR-010] — Evidence: Spec §Edge Cases “...only stable reason codes and short sanitized messages are shown...” + Spec §FR-010 “MUST NOT include secrets... tokens, PII, or full external payload dumps.”
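
The “stable reason codes and short sanitized messages” expectation from Edge Cases and FR-010 might look roughly like the sketch below. The redaction patterns, the length limit, and the function name are assumptions chosen for illustration; pattern-based redaction is a heuristic and does not by itself satisfy the MUST NOT constraints in FR-010.

```python
import re

# Hypothetical limit for a "short" message; the spec does not give a number.
MAX_MESSAGE_LENGTH = 200

# Illustrative patterns only; a real deny-list would need review against FR-010.
_REDACTIONS = [
    (re.compile(r"(?i)bearer\s+[A-Za-z0-9._~+/=-]+"), "[redacted-token]"),
    (re.compile(r"(?i)\b(password|secret|token|api[_-]?key)\s*[=:]\s*\S+"), r"\1=[redacted]"),
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "[redacted-email]"),
]


def sanitize_failure(reason_code, raw_message):
    """Return a reason code plus a redacted, whitespace-normalized, truncated message."""
    message = raw_message
    for pattern, replacement in _REDACTIONS:
        message = pattern.sub(replacement, message)
    message = " ".join(message.split())  # collapse newlines and repeated spaces
    if len(message) > MAX_MESSAGE_LENGTH:
        message = message[: MAX_MESSAGE_LENGTH - 3] + "..."
    return {"reason_code": reason_code, "message": message}
```

Truncation also guards against full external payload dumps ending up in the stored summary, although the safer design is to build the message from the reason code instead of from raw error text.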

## Non-Functional Requirements

- [x] CHK029 Are confidentiality constraints for stored/displayed failures explicit and framed as hard requirements (no secrets/tokens/full payload dumps)? [Non-Functional, Security, Spec §FR-010] — Evidence: Spec §FR-010 “MUST NOT include secrets, credentials, tokens, PII, or full external payload dumps.”
- [x] CHK030 Are reliability expectations defined for background execution and notification delivery, including user-visible behavior when background execution is unavailable? [Completeness, Spec §Scope & Assumptions, Spec §FR-005, Spec §FR-011] — Evidence: Spec §FR-005 defines behavior when background execution is unavailable; Spec §FR-011 “If a notification cannot be delivered, Monitoring/Operations remains the source of truth...”
- [x] CHK031 Are scalability expectations defined for Monitoring/Operations usage (expected run volume, retention horizon, and list size constraints) or explicitly deferred? [Completeness, Spec §Scope & Assumptions] — Evidence: Spec §Scope & Assumptions “Run retention horizon and scale targets... are deferred to Phase 2 (Owner: Product)...”
- [x] CHK032 Are auditability requirements explicit beyond “initiator metadata” (what constitutes the audit trail for start/outcome/view)? [Completeness, Spec §Scope & Assumptions] — Evidence: Spec §Scope & Assumptions “Auditability for Phase 1 is achieved via the run record... and lifecycle notifications. Auditing ‘who viewed which run’ is deferred to Phase 2...”

## Dependencies & Assumptions

- [x] CHK033 Are Phase 1 adoption assumptions (which operations adopt the unified run model now vs later) stable, complete, and traceable to requirements? [Assumption, Spec §Scope & Assumptions, Spec §FR-002, Spec §FR-007, Spec §FR-013] — Evidence: Spec §Scope & Assumptions lists Phase 1 supported ops; Spec §FR-002 requires filtering for them; Spec §FR-007 and §FR-013 require run tracking for them.
- [x] CHK034 Are environment dependencies (background execution) mapped to requirement-level behavior when unmet (operator guidance, degraded mode)? [Completeness, Spec §Scope & Assumptions, Spec §FR-005] — Evidence: Spec §Scope & Assumptions describes dependency; Spec §FR-005 defines clear error + no misleading “queued” confirmation when unavailable.
- [x] CHK035 Is tenant isolation defined beyond “forbidden cross-tenant” with enough specificity to guide acceptance and review (e.g., scoping expectations)? [Clarity, Spec §FR-008, Spec §User Story 1] — Evidence: Spec §FR-008 “Tenant scoping MUST be applied before any filtering... MUST NOT disclose run existence or details.” + Spec §User Story 1 scenario 7.
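
The FR-008 wording cited in CHK035 (tenant scoping applied before any filtering, and no disclosure of run existence) can be sketched as follows. The dictionary-based run shape, the helper names, and the decision to report cross-tenant access as “not found” are assumptions; the spec only requires that access is denied and no run data is disclosed.

```python
class RunNotFound(Exception):
    """Raised for both missing and cross-tenant runs, so existence is not revealed."""


def visible_runs(all_runs, tenant_id, predicate=None):
    """Scope to the tenant first, then apply any caller-supplied filter."""
    scoped = [r for r in all_runs if r["tenant_id"] == tenant_id]
    if predicate is None:
        return scoped
    return [r for r in scoped if predicate(r)]


def get_run(all_runs, tenant_id, run_id):
    """Look up a run only within the tenant's scoped set."""
    for run in visible_runs(all_runs, tenant_id):
        if run["id"] == run_id:
            return run
    # Same outcome whether the run does not exist or belongs to another tenant.
    raise RunNotFound(run_id)
```

Routing the detail lookup through the same scoped set as the list view keeps the two surfaces consistent, which is the behavior CHK022 checks for.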

## Ambiguities & Conflicts

- [x] CHK036 Are the roles `Owner`/`Manager`/`Operator`/`Readonly` defined or referenced so permission requirements are interpretable to reviewers? [Clarity, Spec §Scope & Assumptions] — Evidence: Spec §Scope & Assumptions roles bullet defines view/start capabilities per role.
- [x] CHK037 Is “Monitoring/Operations” named consistently and distinguished from per-feature start surfaces (avoid competing synonyms)? [Consistency, Spec §User Story 1, Spec §Scope & Assumptions, Spec §FR-014] — Evidence: Spec consistently uses “Monitoring/Operations” for the hub and states start controls remain in feature areas (Spec §Scope & Assumptions; Spec §FR-014).
- [x] CHK038 Is “run type” terminology consistent between Key Entities and filtering/permissions language (avoid conflicting synonyms like type/resource/action)? [Consistency, Spec §Key Entities, Spec §FR-002, Spec §FR-009] — Evidence: Spec §Key Entities defines “Run Type”; Spec §FR-002 and §FR-009 use “run type” for filtering/permissions.
- [x] CHK039 Is traceability complete: does every “MUST” requirement have at least one corresponding scenario and/or measurable success criterion (or an explicit rationale if not)? [Traceability, Spec §Functional Requirements, Spec §User Scenarios & Testing, Spec §Success Criteria] — Evidence: Spec §User Story 1 scenarios cover FR-002/003/004/008/014; Spec §User Story 2 scenarios cover FR-005/011/013; Spec §User Story 3 scenarios cover FR-006/007/012; Spec §Success Criteria (SC-001–SC-005) provide measurable outcomes for the overall feature.

## Notes

- Mark items as completed: `[x]`
- Capture findings inline under the relevant item(s) (add links/quotes from `spec.md` as needed)