# Feature Specification: Unified Managed Tenant Onboarding Wizard (073)

**Feature Branch**: `073-unified-managed-tenant-onboarding-wizard`  
**Created**: 2026-02-03  
**Status**: Draft  
**Input**: User description: "Single, unified onboarding wizard for Managed Tenants (create/attach connection, verify, optional bootstrap), removing all legacy entry points."

## Clarifications

### Session 2026-02-03

- Q: Which workspace roles can start the onboarding wizard? → A: Only `owner` and `manager`.
- Q: If Provider Connections already exist, what should Step 2 do? → A: Auto-use the existing default connection (and allow switching).
- Q: What is the canonical uniqueness key for a Managed Tenant? → A: Unique globally by `tenant_id` (Entra tenant ID) and bound to exactly one workspace.
- Q: Which Managed Tenant status values exist in v1? → A: `pending`, `active`, `archived`.
- Q: Who can resume an existing onboarding session? → A: Any workspace `owner/manager` with the onboarding capability (shared session per tenant).

## User Scenarios & Testing *(mandatory)*

### User Story 1 - Start Managed Tenant onboarding (Priority: P1)

As a workspace member with the required capability, I can start a single guided onboarding flow that creates (or resumes) a Managed Tenant in the current workspace, so that the tenant is always created consistently and safely.

**Why this priority**: This is the primary entry point and eliminates inconsistent/unsafe creation paths.

**Independent Test**: Can be fully tested by starting the onboarding in an empty workspace, completing step 1, and confirming a single Managed Tenant exists and is bound to that workspace.

**Acceptance Scenarios**:

1. **Given** a user has selected a workspace and has permission to onboard tenants, **When** they complete “Identify Managed Tenant”, **Then** exactly one Managed Tenant record exists for that tenant identifier and it is bound to that workspace.
2. **Given** a user repeats the same step with the same tenant identifier, **When** they submit again, **Then** no duplicate Managed Tenant is created and the existing onboarding session is continued.
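
The two scenarios above can be sketched as an idempotent "identify" operation. This is a minimal illustration, not the implementation: `TenantRegistry`, `ManagedTenant`, and `TenantNotFound` are hypothetical names, and the in-memory dictionary stands in for a table with a unique index on `tenant_id`.

```python
from dataclasses import dataclass, field


class TenantNotFound(Exception):
    """404 semantics: never reveals that the tenant exists in another workspace."""


@dataclass
class ManagedTenant:
    tenant_id: str          # Entra tenant ID, unique globally
    workspace_id: str       # the single workspace this tenant is bound to
    status: str = "pending"


@dataclass
class TenantRegistry:
    # Hypothetical in-memory stand-in for a table with a unique index on tenant_id.
    _by_tenant_id: dict = field(default_factory=dict)

    def identify(self, workspace_id: str, tenant_id: str) -> ManagedTenant:
        """Create the tenant on first call; return the same record on repeats."""
        existing = self._by_tenant_id.get(tenant_id)
        if existing is None:
            tenant = ManagedTenant(tenant_id=tenant_id, workspace_id=workspace_id)
            self._by_tenant_id[tenant_id] = tenant
            return tenant
        if existing.workspace_id != workspace_id:
            # Bound to a different workspace: deny-as-not-found.
            raise TenantNotFound()
        return existing
```

Repeating `identify` with the same workspace and tenant identifier returns the existing record, while the same identifier from a different workspace raises `TenantNotFound` without revealing why.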

---

### User Story 2 - Configure a connection and verify access (Priority: P2)

As a workspace member with the required capability, I can configure (or attach) a Provider Connection for the Managed Tenant and trigger a verification run, so that connectivity and permissions are validated without exposing secrets.

**Why this priority**: Without a validated connection, the tenant cannot be safely managed.

**Independent Test**: Can be tested by completing the “Connection” step and starting a verification run, then asserting the run is created with the expected scope and that no secrets appear in run outputs.

**Acceptance Scenarios**:

1. **Given** a Managed Tenant exists in the current workspace, **When** a user configures a connection, **Then** the system stores the connection as configured without ever showing stored secret material back to the user.
2. **Given** a user confirms they granted consent, **When** they trigger verification, **Then** a background verification run is started and is visible as “queued / running / succeeded / failed” with a sanitized outcome.
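
One way to keep the verification outcome sanitized is to never pass raw provider errors through and to map reason codes to pre-written messages instead. The sketch below assumes a hypothetical set of reason codes and a hypothetical `sanitize_failure` helper; neither is defined by this spec.

```python
from dataclasses import dataclass

# Hypothetical reason codes mapped to safe, pre-written messages.
SAFE_MESSAGES = {
    "consent_missing": "Admin consent has not been granted for this connection.",
    "permission_missing": "The connection lacks a required permission.",
    "provider_unreachable": "The provider could not be reached. Try again later.",
}
FALLBACK_CODE = "unknown_error"
FALLBACK_MESSAGE = "Verification failed for an unexpected reason."


@dataclass(frozen=True)
class VerificationOutcome:
    status: str
    reason_code: str
    message: str


def sanitize_failure(reason_code: str, raw_error: str) -> VerificationOutcome:
    """Build a failed outcome from a reason code only.

    raw_error is deliberately discarded because it may contain tokens or secrets.
    """
    if reason_code not in SAFE_MESSAGES:
        return VerificationOutcome("failed", FALLBACK_CODE, FALLBACK_MESSAGE)
    return VerificationOutcome("failed", reason_code, SAFE_MESSAGES[reason_code])
```

Because the message is looked up rather than derived from the provider response, secret material cannot reach the run output by construction.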

---

### User Story 3 - Resume and complete onboarding (Priority: P3)

As a workspace member with the required capability, I can resume an incomplete onboarding session and complete optional bootstrap actions, so that interrupted onboarding does not create duplicates and finishes with the tenant in the `active` (ready) state.

**Why this priority**: Real onboarding often pauses for consent/approvals; resumability reduces rework and errors.

**Independent Test**: Can be tested by starting onboarding, leaving it incomplete, resuming, and finishing; then verifying the tenant status is `active` and each optional action created its own separate run.

**Acceptance Scenarios**:

1. **Given** onboarding was started but not completed, **When** the user returns later, **Then** they can resume at the correct step with previously entered (non-secret) state.
2. **Given** verification succeeded, **When** the user chooses optional bootstrap actions, **Then** each selected action starts its own background run and onboarding can still be completed.

---

### Edge Cases

- Cross-workspace isolation: a tenant identifier that exists in a different workspace must not be attachable or discoverable (deny-as-not-found).
- Missing capability: members without the required capability see disabled UI affordances, and server-side requests are denied.
- Roles and capabilities: `operator` and `readonly` members cannot start onboarding by default.
- Resume permissions: onboarding can be resumed by any authorized workspace `owner/manager` (not only the initiator).
- Verification failures: outcomes must be actionable (reason code + safe message) and never leak tokens/secrets.
- Idempotency: repeated submissions or refreshes must not create duplicate tenants, duplicate default connections, or a runaway number of active verification runs.
- Last-owner protections: demoting/removing the last owner (workspace or managed tenant) is blocked and recorded for audit.
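
The last-owner edge case can be illustrated with a small guard that blocks the change and records the attempt. This is a sketch under stated assumptions: `MembershipScope`, `LastOwnerBlocked`, and the audit event shape are hypothetical, and the role comparison lives inside the membership model itself rather than in feature code.

```python
from dataclasses import dataclass, field
from typing import Optional


class LastOwnerBlocked(Exception):
    """Raised when a change would leave the scope without an owner."""


@dataclass
class MembershipScope:
    # Hypothetical stand-in for either a workspace or a managed tenant scope.
    name: str
    roles: dict = field(default_factory=dict)      # user_id -> role
    audit_log: list = field(default_factory=list)  # recorded audit events

    def change_role(self, actor_id: str, user_id: str, new_role: Optional[str]) -> None:
        """Set a member's role; new_role=None removes the member."""
        owners = [uid for uid, role in self.roles.items() if role == "owner"]
        removes_last_owner = owners == [user_id] and new_role != "owner"
        if removes_last_owner:
            # Record the blocked attempt before refusing it.
            self.audit_log.append({
                "event": "last_owner_change_blocked",
                "scope": self.name,
                "actor": actor_id,
                "target": user_id,
            })
            raise LastOwnerBlocked(f"{self.name} must keep at least one owner")
        if new_role is None:
            self.roles.pop(user_id, None)
        else:
            self.roles[user_id] = new_role
```

The same guard would apply at both workspace scope and managed tenant scope, since the check depends only on the set of owners within the scope.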

## Requirements *(mandatory)*

**Constitution alignment (required):** If this feature introduces any Microsoft Graph calls, any write/change behavior,
or any long-running/queued/scheduled work, the spec MUST describe contract registry updates, safety gates
(preview/confirmation/audit), tenant isolation, run observability (`OperationRun` type/identity/visibility), and tests.
If security-relevant DB-only actions intentionally skip `OperationRun`, the spec MUST describe `AuditLog` entries.

**Constitution alignment (RBAC-UX):** If this feature introduces or changes authorization behavior, the spec MUST:
- state which authorization plane(s) are involved (tenant `/admin/t/{tenant}` vs platform `/system`),
- ensure any cross-plane access is deny-as-not-found (404),
- explicitly define 404 vs 403 semantics:
  - non-member / not entitled to tenant scope → 404 (deny-as-not-found)
  - member but missing capability → 403
- describe how authorization is enforced server-side (Gates/Policies) for every mutation/operation-start/credential change,
- reference the canonical capability registry (no raw capability strings; no role-string checks in feature code),
- ensure global search is tenant-scoped and non-member-safe (no hints; inaccessible results treated as 404 semantics),
- ensure destructive-like actions require confirmation (`->requiresConfirmation()`),
- include at least one positive and one negative authorization test, and note any RBAC regression tests added/updated.
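
The 404 vs 403 semantics listed above reduce to a check order: membership first, capability second. The sketch below is illustrative only; the `Capability` enum and its value stand in for the canonical capability registry and are assumptions, not real identifiers.

```python
from enum import Enum


class Capability(Enum):
    # Hypothetical registry entry; the real capability name is not defined here.
    TENANT_ONBOARD = "managed_tenant.onboard"


def authorization_status(is_member: bool, granted: frozenset, required: Capability) -> int:
    """Return the HTTP status semantics for an onboarding action."""
    if not is_member:
        # Non-member or not entitled to the tenant scope: deny-as-not-found.
        return 404
    if required not in granted:
        # Member, but missing the capability.
        return 403
    return 200
```

Checking membership before capability matters: a non-member who happens to hold the capability elsewhere still receives 404, so the response never hints that the tenant exists.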

**Constitution alignment (OPS-EX-AUTH-001):** OIDC/SAML login handshakes may perform synchronous outbound HTTP (e.g., token exchange)
on `/auth/*` endpoints without an `OperationRun`. This MUST NOT be used for Monitoring/Operations pages.

**Constitution alignment (BADGE-001):** If this feature changes status-like badges (status/outcome/severity/risk/availability/boolean),
the spec MUST describe how badge semantics stay centralized (no ad-hoc mappings) and which tests cover any new/changed values.

### Scope & Assumptions

**In scope (v1)**

- A single onboarding wizard to create or resume onboarding of a Managed Tenant within a selected workspace.
- Configure or attach a Provider Connection, guide consent, start verification runs, and optionally start bootstrap runs.
- Completion sets the tenant status to `active` (ready) and routes the user to the tenant details view.
- Removal of all legacy UI entry points for creating/onboarding tenants (no redirects).

**Out of scope (v1)**

- User invitation workflows.
- Group-based auto-provisioning.
- Full compliance/evidence reporting.
- Cloud resource provisioning.

**Dependencies**

- Workspace selection/context and workspace membership.
- A managed-tenant concept bound to exactly one workspace.
- Provider Connections and secure credential storage.
- A run system to track verification and bootstrap actions.
- Audit logging and a canonical capability registry.

**Assumptions**

- Default policy: the onboarding initiator becomes workspace manager and managed-tenant owner (or the closest minimum-privilege equivalents).
- “Not found” behavior is used to avoid leaking the existence of out-of-scope tenants.

### Acceptance Coverage

The following acceptance coverage is required to treat the feature as complete:

- Legacy entry points removed (not found behavior).
- Workspace isolation enforced (cross-workspace attach/visibility prevented).
- Idempotency verified (no duplicates created by repeated submissions).
- Verification run creation and sanitized failure reporting.
- Last-owner protections enforced and auditable.

### Functional Requirements

- **FR-001 (Single entry point)**: System MUST provide exactly one UI flow to onboard a Managed Tenant (the onboarding wizard), and all other “add tenant” entry points MUST be removed and behave as “not found”.
- **FR-002 (Workspace-first enforcement)**: System MUST require an active workspace context for onboarding and tenant-scoped access.
- **FR-003 (Hard isolation)**: System MUST deny-as-not-found (404 semantics) when a Managed Tenant does not belong to the current workspace, including for attempts to attach an existing tenant identifier from another workspace.
- **FR-004 (Authorization semantics)**: System MUST enforce authorization server-side for all onboarding mutations and run-start actions. Non-member / not entitled to tenant scope MUST be treated as 404 semantics; a member lacking the required capability MUST be treated as 403 semantics. By default, only workspace `owner` and `manager` can start the onboarding wizard.
- **FR-005 (Capabilities-first)**: System MUST authorize via canonical capabilities (not role string comparisons in feature code).
- **FR-006 (Idempotent tenant identification)**: System MUST upsert tenant identification keyed by the stable tenant identifier (`tenant_id`, see FR-006a), so repeating step 1 in the owning workspace never creates duplicates.
- **FR-006a (Tenant uniqueness key)**: System MUST enforce a single Managed Tenant globally per `tenant_id` (Entra tenant ID) and bind it to exactly one workspace.
- **FR-007 (Onboarding session resumability)**: System MUST persist onboarding state (excluding secret material) so the flow can be resumed after interruption without data inconsistency.
- **FR-007a (Shared resumability)**: An onboarding session MUST be resumable by any authorized workspace `owner/manager` with the onboarding capability (not only the user who started it).
- **FR-008 (Connection handling)**: System MUST allow creating or attaching a Provider Connection during onboarding and MUST never display stored secret material back to users; UI MUST only show safe configuration indicators (e.g., configured yes/no, last rotation timestamp).
- **FR-008a (Default connection selection)**: If one or more Provider Connections already exist for the Managed Tenant, Step 2 MUST auto-select the default connection and MUST allow the user to switch to a different existing connection (per the 2026-02-03 clarification).
- **FR-009 (Verification as runs)**: System MUST start verification as a background run with clear status and a sanitized result (reason code + short safe message).
- **FR-010 (DB-only UI rendering)**: System MUST render onboarding UI using only stored data; any external calls required for verification MUST occur only in background work.
- **FR-011 (Operational clarity)**: System MUST display verification outcomes and missing requirements in a user-actionable way (what is missing, what to do next) without leaking sensitive details.
- **FR-012 (Optional bootstrap actions)**: System MUST support optional post-verify bootstrap actions that each start their own background run and do not block completion unless explicitly selected.
- **FR-013 (Completion state)**: System MUST set the Managed Tenant status to `active` (ready) only after successful verification, and MUST redirect users to the Managed Tenant details view upon completion.
- **FR-013a (Status model)**: System MUST use a v1 Managed Tenant lifecycle with statuses: `pending` (created/onboarding), `active` (ready), `archived` (no longer managed).
- **FR-014 (Membership bootstrap)**: System MUST ensure the onboarding initiator receives the minimum required memberships in the workspace and the managed tenant scope according to policy (default: workspace manager + tenant owner).
- **FR-015 (Last-owner protections)**: System MUST block demotion/removal of the last owner at both workspace scope and managed tenant scope, and MUST record the blocked attempt for audit.
- **FR-016 (Auditability)**: System MUST record audit events for tenant creation, connection creation/rotation, verification start/result, membership changes, and last-owner blocks.
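
The completion rule in FR-013 and the status model in FR-013a can be sketched as a guarded transition. The function name, the exception, and the choice to reject tenants that are not `pending` are assumptions made for illustration; the spec only fixes the status values and the "after successful verification" condition.

```python
class OnboardingNotCompletable(Exception):
    """Raised when the tenant cannot be moved to the active status."""


def complete_onboarding(tenant_status: str, verification_status: str) -> str:
    """Return the new tenant status, or raise if completion is not allowed."""
    if tenant_status != "pending":
        # Assumption: only tenants still in onboarding can be completed.
        raise OnboardingNotCompletable(f"tenant is {tenant_status}, not pending")
    if verification_status != "succeeded":
        # queued, running, and failed verifications never activate the tenant.
        raise OnboardingNotCompletable("verification has not succeeded")
    return "active"
```

Optional bootstrap runs do not appear in the guard, which reflects FR-012: they start their own runs and do not block completion.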

### Key Entities *(include if feature involves data)*

- **Workspace**: A portfolio/customer context that owns memberships and one or more Managed Tenants.
- **Managed Tenant**: A managed Entra/Intune tenant, identified by an external tenant identifier (`tenant_id`) and carrying a lifecycle status.
  - Uniqueness: exactly one globally per `tenant_id` (Entra tenant ID), bound to exactly one workspace.
  - Status values (v1): `pending`, `active`, `archived`.
- **Provider Connection**: A technical connection configuration that enables access to a Managed Tenant; includes secure credentials/configuration metadata and enabled/default flags.
- **Onboarding Session**: A persistent record of onboarding progress and safe state to support resumability and idempotency.
- **Verification Run**: A background run that validates connectivity and required permissions and produces a sanitized outcome.
- **Membership (Workspace-scoped / Tenant-scoped)**: Defines who can see and operate within a workspace and on a specific managed tenant.

## Success Criteria *(mandatory)*

### Measurable Outcomes

- **SC-001 (Time-to-onboard)**: A workspace `owner` or `manager` can complete the wizard up to starting verification in under 3 minutes (excluding external consent/approval waiting time).
- **SC-002 (Idempotency)**: Re-running any wizard step does not create duplicates (0 duplicate tenants per `tenant_id`; 0 duplicate default connections per tenant).
- **SC-003 (Authorization correctness)**: For all onboarding endpoints/actions, non-members see no discoverability and get 404 semantics; members without capability get 403 semantics; authorized users can complete the flow.
- **SC-004 (Secret safety)**: No secrets/tokens are present in run outputs, notifications, audit entries, or error messages (validated by automated tests that assert redaction/sanitization behavior).
- **SC-005 (Operational clarity)**: When verification fails, users can identify the failure reason category (via reason code + safe message) and see the next step without contacting support.

### Badge Semantics (BADGE-001)

- Managed Tenant status badges MUST map from the canonical status set (`pending`, `active`, `archived`) using a centralized mapping (no ad-hoc per-page mapping).
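
A centralized mapping could look like the sketch below, where every page calls a single lookup instead of defining its own labels. The labels and color names are placeholders, and `TenantStatus` and `badge_for` are hypothetical names.

```python
from enum import Enum


class TenantStatus(Enum):
    PENDING = "pending"
    ACTIVE = "active"
    ARCHIVED = "archived"


# Single source of truth: (label, color) per status. Values are placeholders.
STATUS_BADGES = {
    TenantStatus.PENDING: ("Pending", "warning"),
    TenantStatus.ACTIVE: ("Active", "success"),
    TenantStatus.ARCHIVED: ("Archived", "gray"),
}


def badge_for(status: str) -> tuple:
    """Look up the badge for a stored status; unknown values raise ValueError."""
    return STATUS_BADGES[TenantStatus(status)]
```

A test that iterates over `TenantStatus` and asserts each member has an entry in `STATUS_BADGES` would catch a new status added without a badge.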