TenantAtlas/specs/069-managed-tenant-onboarding-wizard/spec.md
2026-02-01 12:20:09 +01:00

Feature Specification: Managed Tenant Onboarding Wizard v1

Feature Branch: 069-managed-tenant-onboarding-wizard
Created: 2026-01-31
Status: Draft
Input: User description: "Spec 069 — Managed Tenant Onboarding Wizard v1 (Single Front Door, DB-only render, enqueue-only runs, resumable onboarding session, RBAC-UX enforcement, remove legacy entry points)."

Clarifications

Session 2026-01-31

  • Q: Do we need to store local app credentials (client_id/client_secret) for Managed Tenants in v1? → A: Conditional — Step 3 only when a config/driver says “credentials required”.
  • Q: When a user is a workspace member but lacks a capability and tries the action/server endpoint, what should the server return? → A: 403 Forbidden.
  • Q: For the legacy URL /admin/new (old managed tenant create entry), where should it redirect? → A: Redirect to “Choose workspace” (then start wizard from there).
  • Q: Who is allowed to resume an existing onboarding session for a Managed Tenant? → A: Any workspace member with managed_tenants.create (and tenant-scoped access).
  • Q: If a user starts the wizard again for the same workspace + tenant ID while an active onboarding session already exists, what should happen? → A: Auto-resume the existing active session.

Terminology (Repository Mapping)

  • In this repository, the spec's term Workspace maps to the existing Tenant concept (tenant-plane container + memberships).
  • Capability names shown in this spec (e.g. managed_tenants.create) are conceptual for stakeholders; implementation MUST map them onto the canonical capability registry and MUST NOT introduce new raw capability strings in feature code.

User Scenarios & Testing (mandatory)

User Story 1 - Onboard a managed tenant end-to-end (Priority: P1)

As a workspace Owner, I can onboard a new Managed Tenant through a consistent, guided wizard so onboarding is repeatable and results in a tenant that is ready to run verification/health operations.

Why this priority: This is the primary business outcome: reliable onboarding and operational readiness.

Independent Test: Can be fully tested by completing the wizard and observing that the system marks onboarding complete and allows runs to be started.

Acceptance Scenarios:

  1. Given a user is a workspace Owner and no Managed Tenant exists for the target tenant ID, When they start the wizard and complete the steps, Then a Managed Tenant record exists and onboarding is marked complete.
  2. Given a user started onboarding and leaves mid-way, When they return, Then they can resume the wizard at the last completed step with their previously entered (non-secret) data.
  3. Given a Managed Tenant already exists in the workspace with the same tenant ID, When the user enters that tenant ID, Then the wizard prevents creating a duplicate and guides the user to the existing tenant's onboarding/resume state.
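
The resume and duplicate-prevention scenarios above can be illustrated with a minimal in-memory sketch. It is not the repository's implementation: `OnboardingSession`, `SessionStore`, and `start_or_resume` are hypothetical names, and a real system would presumably back this with a database uniqueness constraint rather than a dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Statuses treated as "active" here are an assumption for illustration.
ACTIVE_STATUSES = {"draft", "in_progress"}


@dataclass
class OnboardingSession:
    workspace_id: str
    tenant_id: str
    status: str = "draft"
    current_step: int = 1
    payload: Dict[str, str] = field(default_factory=dict)  # non-secret data only


class SessionStore:
    """Hypothetical store keyed by (workspace, tenant ID)."""

    def __init__(self) -> None:
        self._by_key: Dict[Tuple[str, str], OnboardingSession] = {}

    def start_or_resume(
        self, workspace_id: str, tenant_id: str
    ) -> Tuple[OnboardingSession, bool]:
        # Normalize case so the same GUID cannot create two sessions.
        key = (workspace_id, tenant_id.lower())
        existing = self._by_key.get(key)
        if existing is not None and existing.status in ACTIVE_STATUSES:
            return existing, True  # auto-resume the active session
        session = OnboardingSession(workspace_id, tenant_id.lower())
        self._by_key[key] = session
        return session, False
```

The second element of the returned tuple tells the caller whether to route the user to "resume" rather than to step 1.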

User Story 2 - Run verification checks without blocking page loads (Priority: P2)

As an authorized operator, I can trigger verification/health operations for a Managed Tenant so the system checks permissions and connectivity without performing external calls during page rendering.

Why this priority: Operational safety and predictability; the UI must remain responsive and all outbound work must be observable.

Independent Test: Can be tested by loading wizard steps (no outbound activity on render) and then triggering a verification action that creates a run.

Acceptance Scenarios:

  1. Given a Managed Tenant is in onboarding, When the user clicks “Verify permissions”, Then a background run is queued and the page does not perform synchronous external calls.
  2. Given the last verification run reported missing permissions, When the user visits the permissions step, Then they see the stored “Granted/Missing” status from the last run.
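
As a rough sketch of the two scenarios above, the trigger action only records a queued run, while the render path reads nothing but stored results. All names (`OperationRun` fields, `RunQueue`, `render_permissions_step`) are hypothetical, and the "Not verified yet" label for a tenant without any completed run is an assumption, not something this spec defines.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, List


@dataclass
class OperationRun:
    run_id: int
    tenant_id: str
    run_type: str
    status: str = "queued"


class RunQueue:
    """Hypothetical queue: enqueue() records work and returns immediately."""

    def __init__(self) -> None:
        self.pending: Deque[OperationRun] = deque()
        self._next_id = 1

    def enqueue(self, tenant_id: str, run_type: str) -> OperationRun:
        run = OperationRun(self._next_id, tenant_id, run_type)
        self._next_id += 1
        self.pending.append(run)  # no external call happens here
        return run


def render_permissions_step(
    stored_results: Dict[str, Dict[str, bool]],
    tenant_id: str,
    required: List[str],
) -> Dict[str, str]:
    """Build the step view from persisted results only (no outbound HTTP)."""
    last = stored_results.get(tenant_id)
    if last is None:
        # Assumed placeholder label when no completed run exists yet.
        return {perm: "Not verified yet" for perm in required}
    return {
        perm: "Granted" if last.get(perm, False) else "Missing"
        for perm in required
    }
```

A background worker (not shown) would consume `pending`, perform the external checks, and write the outcome into the stored results that the render function reads.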

User Story 3 - RBAC-UX enforcement and safe access semantics (Priority: P3)

As a tenant-plane user, I can only see and interact with wizard and tenant actions I am entitled to, with deny-as-not-found for non-members and server-side enforcement for every action.

Why this priority: Prevents information leakage across tenants/workspaces and ensures policy-compliant enforcement.

Independent Test: Can be tested by attempting to access the wizard as a non-member, and as a member lacking specific capabilities.

Acceptance Scenarios:

  1. Given a user is not a member of the workspace scope, When they attempt to access the onboarding wizard or tenant pages, Then they receive a 404 response (deny-as-not-found).
  2. Given a user is a member but lacks the relevant capability, When they view the wizard step, Then restricted actions are disabled with an explanatory tooltip and server-side attempts are rejected with 403.
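
The 404-versus-403 rule in these scenarios reduces to a two-step check, with membership evaluated first so that non-members learn nothing about the resource. The sketch below is illustrative only: the `Capability` enum stands in for the canonical capability registry, and `authorize` is a hypothetical helper, not an existing Gate or Policy.

```python
from enum import Enum
from http import HTTPStatus
from typing import Set


class Capability(Enum):
    # Stand-in for the canonical registry; values mirror the conceptual
    # names used in this spec.
    MANAGED_TENANTS_CREATE = "managed_tenants.create"
    MANAGED_TENANTS_MANAGE = "managed_tenants.manage"
    MANAGED_TENANTS_VIEW = "managed_tenants.view"
    OPERATIONS_RUN = "operations.run"


def authorize(
    is_member: bool, granted: Set[Capability], required: Capability
) -> HTTPStatus:
    """Return the status the server should answer with for an action."""
    if not is_member:
        # Deny-as-not-found: do not reveal that the workspace exists.
        return HTTPStatus.NOT_FOUND
    if required not in granted:
        return HTTPStatus.FORBIDDEN
    return HTTPStatus.OK
```

Checking membership before capability matters: a non-member who happens to hold a capability elsewhere still receives 404, never 403.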

Edge Cases

  • Invalid tenant ID format entered (not a UUID/GUID).
  • Attempt to create a second Managed Tenant with the same tenant ID within the same workspace.
  • Two users start onboarding the same Managed Tenant concurrently.
  • A user loses membership/capabilities while an onboarding session is in progress.
  • Verification run fails (transient error) and surfaces a stored error code/status without breaking page rendering.
  • Credentials are required but not yet set; wizard shows “missing” state.
  • Credentials were set previously; wizard shows “set” state without revealing secret values.
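
For the first edge case, a minimal format check might look like the following. It assumes that only the canonical hyphenated 8-4-4-4-12 hexadecimal form is accepted; the spec itself says only "UUID/GUID", so whether braces or unhyphenated input should pass is an open assumption. `is_valid_tenant_id` is a hypothetical helper name.

```python
import re

# Canonical textual GUID form: 8-4-4-4-12 hexadecimal digits.
_GUID_PATTERN = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def is_valid_tenant_id(value: str) -> bool:
    """Return True if value, ignoring surrounding whitespace, is a GUID."""
    return _GUID_PATTERN.fullmatch(value.strip()) is not None
```

A regular expression is used here instead of `uuid.UUID(...)` because the standard-library parser also accepts braces and unhyphenated strings, which may be looser than intended.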

Requirements (mandatory)

Constitution alignment (required): If this feature introduces any Microsoft Graph calls, any write/change behavior, or any long-running/queued/scheduled work, the spec MUST describe contract registry updates, safety gates (preview/confirmation/audit), tenant isolation, run observability (OperationRun type/identity/visibility), and tests. If security-relevant DB-only actions intentionally skip OperationRun, the spec MUST describe AuditLog entries.

Constitution alignment (RBAC-UX): If this feature introduces or changes authorization behavior, the spec MUST:

  • state which authorization plane(s) are involved (tenant /admin/t/{tenant} vs platform /system),
  • ensure any cross-plane access is deny-as-not-found (404),
  • explicitly define 404 vs 403 semantics:
    • non-member / not entitled to tenant scope → 404 (deny-as-not-found)
    • member but missing capability → 403 (Forbidden)
  • describe how authorization is enforced server-side (Gates/Policies) for every mutation/operation-start/credential change,
  • reference the canonical capability registry (no raw capability strings; no role-string checks in feature code),
  • ensure global search is tenant-scoped and non-member-safe (no hints; inaccessible results treated as 404 semantics),
  • ensure destructive-like actions require confirmation (->requiresConfirmation()),
  • include at least one positive and one negative authorization test, and note any RBAC regression tests added/updated.

Constitution alignment (OPS-EX-AUTH-001): OIDC/SAML login handshakes may perform synchronous outbound HTTP (e.g., token exchange) on /auth/* endpoints without an OperationRun. This MUST NOT be used for Monitoring/Operations pages.

Constitution alignment (BADGE-001): If this feature changes status-like badges (status/outcome/severity/risk/availability/boolean), the spec MUST describe how badge semantics stay centralized (no ad-hoc mappings) and which tests cover any new/changed values.

Assumptions & Dependencies

  • Depends on the existing workspace + managed tenant foundations from Spec 068 v2 (including canonical naming and tenant-plane routing).
  • The onboarding wizard lives in the tenant-plane admin area (not the platform/system area).
  • Credential capture is required only if the product uses local credentials for managed tenants; otherwise that step is skipped/hidden.
  • A single configuration/driver flag determines whether credentials are required for the current environment.
  • Permission/connection status displayed in the wizard is based on stored results from the latest completed verification run.
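
The "single configuration/driver flag" assumption can be sketched as one function that derives the visible step list, so that no page needs its own check. The function name `wizard_steps` and the boolean parameter are hypothetical; the step titles come from this spec.

```python
from typing import List


def wizard_steps(credentials_required: bool) -> List[str]:
    """Derive the ordered wizard steps from the single credentials flag."""
    steps = ["Welcome/Requirements", "Tenant Details"]
    if credentials_required:
        # Included only when the config/driver says credentials are required.
        steps.append("App/Credentials Setup")
    steps += ["Admin Consent & Permissions", "Verification / First Run"]
    return steps
```

Every consumer (navigation, progress indicator, resume logic) would read from this one list rather than re-evaluating the flag.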

Functional Requirements

  • FR-001 (Single Front Door): The system MUST allow creation of a new Managed Tenant only via the onboarding wizard.
  • FR-002 (Disable Legacy Entry Points): The system MUST remove or disable all previous “Add Tenant/Create” entry points and MUST redirect any legacy creation URLs to an onboarding-appropriate destination.
  • FR-002a (Legacy /admin/new Redirect): Requests to /admin/new MUST NOT create a managed tenant and MUST redirect to the “Choose workspace” entry point.
  • FR-003 (DB-only Render): Loading any wizard step MUST NOT trigger outbound HTTP calls; step pages MUST render exclusively from persisted data (including latest known run results).
  • FR-004 (Wizard Steps): The wizard MUST provide up to 5 steps in this order: (1) Welcome/Requirements, (2) Tenant Details, (3) App/Credentials Setup (included only when credentials are required, per FR-008b), (4) Admin Consent & Permissions, (5) Verification / First Run.
  • FR-005 (Tenant Details Validation): The wizard MUST require a tenant ID (UUID/GUID) and validate its format.
  • FR-005a (Tenant Details Fields): The tenant details step MUST capture: display name, tenant ID (required), optional domain, and an environment label (dev/staging/prod/other).
  • FR-006 (Uniqueness): The system MUST prevent duplicate Managed Tenants by enforcing uniqueness on the pair (workspace, tenant ID).
  • FR-007 (Onboarding State): The system MUST track onboarding state per Managed Tenant and set initial state to “onboarding” when created/updated via the wizard.
  • FR-008 (Credentials - Optional Step): If the product requires local credentials for managed tenants, the wizard MUST support setting them as part of onboarding. If not required, the wizard MUST skip this step.
  • FR-008b (Credentials Decision Rule): The wizard MUST decide whether to include the credentials step based on a single configuration/driver rule (no ad-hoc per-page checks).
  • FR-008a (Credential Fields): When the credentials step is applicable, it MUST allow setting a client identifier and a client secret, and MAY allow optional labeling/notes without exposing secret values.
  • FR-009 (Credentials Security): When credentials are used, the system MUST store secrets encrypted at rest and MUST never display secret values after they are saved; the UI MUST only show “secret set” vs “missing”.
  • FR-010 (Credentials RBAC): The system MUST allow only users with the managed_tenants.manage capability to set or rotate credentials.
  • FR-011 (Runs Canonical / Enqueue-only): “Verify permissions”, “Check connection”, and optional “Run inventory sync” MUST enqueue background runs and MUST NOT perform external calls synchronously.
  • FR-012 (Admin Consent & Permissions UX): The permissions step MUST show a required permissions list, MUST display “Granted/Missing” derived from the latest completed verification run, and MUST provide a link for administrators to grant consent.
  • FR-013 (Resume / Session Persistence): The system MUST persist onboarding sessions and allow users to resume an in-progress onboarding flow; persisted session payload MUST exclude secrets.
  • FR-014 (Session Dedupe): The system MUST ensure at most one active onboarding session exists per Managed Tenant and deduplicate accordingly.
  • FR-014a (Session Dedupe Behavior): When a user attempts to start onboarding for a tenant with an existing active session, the system MUST reuse that session and route the user to resume it.
  • FR-015 (Completion Criteria): The wizard MUST mark onboarding “complete” when the Managed Tenant exists, required credentials (if applicable) are present, and the permissions verification is successful.
  • FR-016 (Resume Link): The Managed Tenant view MUST show a “Resume wizard” entry point when onboarding is not complete.
  • FR-016a (Resume Authorization): Resuming an onboarding session MUST be allowed for any workspace member who has managed_tenants.create within that workspace scope.
  • FR-017 (Capabilities v1): The system MUST support these minimum capabilities: managed_tenants.create (start wizard), managed_tenants.manage (credentials/edit), managed_tenants.view, operations.run (start verify/health/inventory runs).
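
The completion rule in FR-015 combines three conditions, with the credentials condition applying only when credentials are required. A minimal sketch, using a hypothetical function name and boolean inputs that a real implementation would derive from persisted state:

```python
def is_onboarding_complete(
    tenant_exists: bool,
    credentials_required: bool,
    credentials_present: bool,
    permissions_verified: bool,
) -> bool:
    """Evaluate FR-015 from already-persisted facts (no external calls)."""
    # Credentials only gate completion when the environment requires them.
    credentials_ok = credentials_present or not credentials_required
    return tenant_exists and credentials_ok and permissions_verified
```

For example, `is_onboarding_complete(True, False, False, True)` evaluates to `True`: when credentials are not required, their absence does not block completion.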

Key Entities (include if feature involves data)

  • Workspace: Customer/organization container; owns Managed Tenants; defines membership scope.
  • Managed Tenant: A Microsoft/Entra/Intune tenant managed within a workspace; identified by a tenant ID; includes onboarding state and metadata (display name, optional domain, environment label).
  • Onboarding Session: A resumable onboarding state container with: workspace, optional managed tenant reference, creator, status (draft/in progress/completed/abandoned), current step, non-secret payload, last error code, timestamps.
  • Operation Run: An observable, queued execution record for verification/health/sync actions initiated from the wizard.
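
The "non-secret payload" on the Onboarding Session can be illustrated by filtering form data before it is persisted and exposing only a set/missing indicator to the UI. This is a sketch under assumptions: `SECRET_KEYS`, `to_persistable_payload`, and `credential_state` are hypothetical names, and the field name `client_secret` follows the wording used in the clarifications.

```python
from typing import Dict, FrozenSet

# Keys that must never be written into the session payload (assumed list).
SECRET_KEYS: FrozenSet[str] = frozenset({"client_secret"})


def to_persistable_payload(form_data: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the form data with secret fields removed."""
    return {k: v for k, v in form_data.items() if k not in SECRET_KEYS}


def credential_state(secret_stored: bool) -> str:
    """UI-facing indicator; the secret value itself is never returned."""
    return "secret set" if secret_stored else "missing"
```

The secret itself would be stored separately, encrypted at rest, and only its presence would be reported back to the wizard.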

Success Criteria (mandatory)

Measurable Outcomes

  • SC-001: Workspace Owners can complete onboarding for a new Managed Tenant in under 10 minutes (excluding time waiting for admin consent).
  • SC-002: 100% of wizard step page loads complete without initiating outbound HTTP calls (outbound activity occurs only when a user triggers a run action).
  • SC-003: Users can resume an in-progress wizard in 2 clicks or fewer from the Managed Tenant view.
  • SC-004: After onboarding completion, authorized users can start verification/health runs successfully for the tenant.
  • SC-005: Non-members receive deny-as-not-found behavior (404) for tenant-plane onboarding/managed tenant pages; members lacking a required capability see restricted actions disabled, and server-side attempts are rejected with 403.