Ahmed Darrazi ab0ffff1d1 feat(onboarding): enterprise wizard + tenantless run viewer

- Canonical /admin/onboarding entry point; legacy routes 404
- Tenantless run viewer at /admin/operations/{run} with membership-based 404
- RBAC UX (disabled controls + tooltips) and server-side 403
- DB-only rendering/refresh; contract registry enforced
- Adds migrations + tests + spec artifacts

2026-02-04 23:00:06 +01:00


Feature Specification: Managed Tenant Onboarding Wizard V1 (Enterprise)

Feature Branch: 073-unified-managed-tenant-onboarding-wizard
Created: 2026-02-04
Status: Draft
Input: User description: "Spec 073 — Managed Tenant Onboarding Wizard V1 (Enterprise): single workspace-first wizard as source of truth, tenantless until activation; legacy entry points removed; strict 404/403 semantics; verification checklist with tenantless run page; optional bootstrap; enterprise-grade UX and regression tests."

Clarifications

Session 2026-02-04

Q: Capability granularity for the wizard? → A: Per-step/per-action capabilities (least-privilege). Activation is owner-only; bootstrap actions are separately gated.
Q: For members without capability, should actions be hidden or disabled? → A: Visible but disabled, with tooltip/explanation; server-side remains authoritative.
Q: What is the tenantless “View run” URL pattern? → A: /admin/operations/{run} (no workspace in path), access-controlled by run.workspace membership (non-member → 404), no auto workspace switching.
Q: What is the canonical onboarding entry point URL? → A: /admin/onboarding (sole entry point in V1; no aliases).

User Scenarios & Testing (mandatory)

User Story 1 - Start onboarding from a single entry point (Priority: P1)

As a workspace member, I can open a single onboarding entry point and start (or resume) onboarding for a Managed Tenant in the currently selected workspace, so that tenant onboarding is consistent, workspace-first, and safe.

Why this priority: This is the foundation for all onboarding work and replaces fragmented legacy flows.

Independent Test: Can be fully tested by visiting /admin/onboarding with and without a selected workspace, completing Step 1, and verifying that a single tenant is created or resumed without duplicates.

Acceptance Scenarios:

Given no workspace is selected, When a user visits /admin/onboarding, Then they are redirected to choose a workspace.
Given a workspace is selected and has no active tenants, When a user visits the onboarding entry point, Then the onboarding wizard opens directly.
Given a workspace is selected and has at least one active tenant, When a user visits the onboarding entry point, Then the onboarding wizard is still reachable via an “Add managed tenant” call-to-action.
Given the user identifies a tenant using an Entra Tenant ID that already exists in the same workspace, When they submit Step 1 again, Then the wizard stays on Step 1 and shows a notification that the tenant already exists with a link to open it.
Given the user provides an Entra Tenant ID that exists in a different workspace, When they submit Step 1, Then the system responds with deny-as-not-found behavior and the UI shows a generic “Not found” notification (no details leaked).
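The last two scenarios can be illustrated with a minimal sketch of the Step 1 lookup. It assumes a hypothetical in-memory `Registry` that maps each Entra Tenant ID to its owning workspace; the class, method, and return values are illustrative names, not part of the specified implementation.

```python
from __future__ import annotations

from dataclasses import dataclass, field


class NotFound(Exception):
    """Generic deny-as-not-found outcome; carries no details about the tenant."""


@dataclass
class Registry:
    # Hypothetical store: Entra Tenant ID -> owning workspace ID.
    owners: dict[str, str] = field(default_factory=dict)

    def identify(self, workspace_id: str, entra_tenant_id: str) -> str:
        owner = self.owners.get(entra_tenant_id)
        if owner is None:
            # First sighting: bind the tenant to the selected workspace.
            self.owners[entra_tenant_id] = workspace_id
            return "created"
        if owner == workspace_id:
            # Same workspace: no duplicate is created; the UI can link to the tenant.
            return "already_exists"
        # Different workspace: same response as if the tenant did not exist.
        raise NotFound("Not found")
```

The cross-workspace branch deliberately raises the same generic error a missing record would, so the caller cannot infer that the ID is registered elsewhere.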

User Story 2 - Attach or create a provider connection safely (Priority: P2)

As a workspace member, I can choose an existing provider connection or create a new one during onboarding, so that the system has a valid technical connection without exposing secret material.

Why this priority: Without a valid connection, verification and activation cannot be completed safely.

Independent Test: Can be tested by selecting “Use existing connection” vs “Create new connection”, ensuring secrets are masked and never displayed again, and verifying that onboarding state stores no secrets.

Acceptance Scenarios:

Given the user chooses “Use existing connection”, When they select a connection and proceed, Then onboarding records the chosen connection and continues.
Given the user chooses “Create new connection”, When they input connection details, Then any secret input is masked and is not retrievable from the UI later.
Given the user starts Step 2 but leaves before finishing, When they resume onboarding later, Then only non-secret inputs are prefilled and secret material is never shown.
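One way to satisfy the resume scenario is to filter secret fields out before anything is written to the onboarding session. The sketch below assumes a hypothetical field name `client_secret` and a hypothetical helper `resumable_state`; the real connection form may use different fields.

```python
from __future__ import annotations

# Hypothetical names of inputs that count as secret material.
SECRET_FIELDS = frozenset({"client_secret"})


def resumable_state(form_input: dict[str, str]) -> dict[str, str]:
    """Return only the non-secret inputs that may be stored and prefilled on resume."""
    return {key: value for key, value in form_input.items() if key not in SECRET_FIELDS}
```

Because the secret never enters the session state, there is nothing to mask or redact when the user resumes.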

User Story 3 - Verify access and review results without tenant-scoped context (Priority: P3)

As a workspace member, I can start a verification run, manually refresh its status, and view a stored checklist report (including a tenantless “View run” page), so that verification works even before the tenant is activated and without using tenant-scoped routes.

Why this priority: Verification is the safety gate that enables activation, and it must work in empty workspaces and pre-activation flows.

Independent Test: Can be tested by starting verification, asserting idempotent dedupe while a run is active, verifying the viewer renders using stored data only, and verifying the “View run” link is tenantless.

Acceptance Scenarios:

Given verification has not been started, When the user clicks “Start verification”, Then a new verification run is started and the UI shows that verification is in progress.
Given a verification run is active, When the user clicks “Start verification” again, Then the system dedupes the request and does not create a second active run.
Given a verification run is active, When the user clicks “Refresh”, Then the UI updates status using stored run state.
Given verification completes with any blocking failures, When the report is shown, Then the step status is “Blocked”.
Given verification completes with warnings but no blocking failures, When the report is shown, Then the step status is “Needs attention”.
Given verification completes with no warnings and no failures, When the report is shown, Then the step status is “Ready”.
Given the UI shows a “View run” link, When the user clicks it, Then it opens a tenantless operations URL (not a tenant-scoped URL).
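The status mapping and the dedupe behavior in these scenarios can be sketched as follows. The per-check outcome strings (`"pass"`, `"warn"`, `"fail"`) and the `VerificationRuns` class are assumptions made for illustration; the spec only fixes the three step statuses.

```python
from __future__ import annotations


def step_status(check_outcomes: list[str]) -> str:
    """Map per-check outcomes to the step status named in the scenarios."""
    if "fail" in check_outcomes:  # any blocking failure wins
        return "Blocked"
    if "warn" in check_outcomes:  # warnings, but no blocking failures
        return "Needs attention"
    return "Ready"


class VerificationRuns:
    """Hypothetical in-memory store allowing one active run per onboarding session."""

    def __init__(self) -> None:
        self._active: dict[str, int] = {}
        self._next_id = 1

    def start(self, session_key: str) -> int:
        # Dedupe: hand back the active run instead of creating a second one.
        if session_key in self._active:
            return self._active[session_key]
        run_id = self._next_id
        self._next_id += 1
        self._active[session_key] = run_id
        return run_id

    def complete(self, session_key: str) -> None:
        # Once the run is no longer active, a later start creates a new run.
        self._active.pop(session_key, None)
```

Checking for failures before warnings keeps a report with both from being downgraded to "Needs attention".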

Edge Cases

Visiting legacy entry points returns “not found” behavior (no redirects).
A non-member of the selected workspace receives deny-as-not-found behavior for the onboarding entry point.
A workspace member without the required capability can see the page, but action controls are disabled and show a tooltip; server-side action attempts are denied with 403.
Activation is owner-only: non-owners can see Step 5 but cannot activate; the UI explains “Owner required”, and server-side attempts are denied.
Bootstrap actions are optional and gated independently per action; non-authorized users cannot start them.
The wizard must not generate or require tenant-scoped links before activation.
Manual refresh should not trigger external network calls; it may only re-read stored status/report.
Verification report content must never contain secrets/tokens, raw headers, or credential material.
Completing onboarding while verification is blocked is prevented unless an explicit override policy applies.

Requirements (mandatory)

Constitution alignment (required): If this feature introduces any Microsoft Graph calls, any write/change behavior, or any long-running/queued/scheduled work, the spec MUST describe contract registry updates, safety gates (preview/confirmation/audit), tenant isolation, run observability (OperationRun type/identity/visibility), and tests. If security-relevant DB-only actions intentionally skip OperationRun, the spec MUST describe AuditLog entries.

Constitution alignment (RBAC-UX): If this feature introduces or changes authorization behavior, the spec MUST:

state which authorization plane(s) are involved (tenant /admin/t/{tenant} vs platform /system),
ensure any cross-plane access is deny-as-not-found (404),
explicitly define 404 vs 403 semantics:
- non-member / not entitled to tenant scope → 404 (deny-as-not-found)
- member but missing capability → 403
describe how authorization is enforced server-side (Gates/Policies) for every mutation/operation-start/credential change,
reference the canonical capability registry (no raw capability strings; no role-string checks in feature code),
ensure global search is tenant-scoped and non-member-safe (no hints; inaccessible results treated as 404 semantics),
ensure destructive-like actions require confirmation (->requiresConfirmation()),
include at least one positive and one negative authorization test, and note any RBAC regression tests added/updated.
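The 404 vs 403 rule above depends on the order of checks: membership is evaluated first, so a non-member never learns whether a capability exists. A minimal sketch, with a hypothetical `authorize` function returning plain HTTP status codes:

```python
def authorize(is_member: bool, has_capability: bool) -> int:
    """Return the HTTP status for an action attempt under the 404/403 semantics."""
    if not is_member:
        return 404  # deny-as-not-found: no hint that the resource exists
    if not has_capability:
        return 403  # member, but missing the required capability
    return 200
```

A non-member who happens to hold the capability elsewhere still receives 404, which is the case a negative authorization test would typically pin down.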

Authorization plane(s) involved (filled for this feature):

Tenant plane (Entra users) only. This feature adds tenantless, workspace-scoped routes under /admin/* (/admin/onboarding, /admin/operations/{run}) that must still enforce tenant-plane membership and capability rules.
Platform plane (/system) is out of scope. No cross-plane navigation is introduced; deny-as-not-found (404) semantics remain the default for non-members / not entitled.
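For the tenantless route /admin/operations/{run}, access derives from the run's own workspace rather than from a selected workspace. The following sketch assumes hypothetical `Run` and `view_run` names and an in-memory lookup; it is meant only to show that a missing run and a non-member request are indistinguishable.

```python
from __future__ import annotations

from dataclasses import dataclass


class NotFound(Exception):
    """Raised for both unknown runs and runs in workspaces the user cannot see."""


@dataclass(frozen=True)
class Run:
    id: int
    workspace_id: str


def view_run(runs: dict[int, Run], user_workspaces: set[str], run_id: int) -> Run:
    run = runs.get(run_id)
    # Same outcome whether the run is missing or belongs to another workspace.
    if run is None or run.workspace_id not in user_workspaces:
        raise NotFound("Not found")
    return run
```

The function takes no "current workspace" argument, which mirrors the requirement that the page neither needs a pre-selected workspace nor switches workspaces automatically.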

Constitution alignment (OPS-EX-AUTH-001): OIDC/SAML login handshakes may perform synchronous outbound HTTP (e.g., token exchange) on /auth/* endpoints without an OperationRun. This MUST NOT be used for Monitoring/Operations pages.

Constitution alignment (BADGE-001): If this feature changes status-like badges (status/outcome/severity/risk/availability/boolean), the spec MUST describe how badge semantics stay centralized (no ad-hoc mappings) and which tests cover any new/changed values.

Functional Requirements

FR-001 (Single onboarding entry point): The system MUST provide a single onboarding entry point at /admin/onboarding that is the source of truth for onboarding.
FR-002 (Workspace required): If no workspace is selected, the onboarding entry point MUST redirect the user to a workspace chooser.
FR-003 (Workspace landing behavior): With a selected workspace, the system MUST:
- open the wizard directly when the workspace has zero active tenants, and
- keep the wizard reachable via an “Add managed tenant” call-to-action when the workspace has one or more active tenants.
FR-004 (Remove legacy entry points): The following legacy entry points MUST NOT exist and MUST return “not found” behavior (no redirects):
- /admin/new
- any legacy tenant-scoped create entry point
- /admin/managed-tenants/onboarding (legacy)
FR-005 (Membership boundary): A non-member of the selected workspace MUST always receive deny-as-not-found behavior for onboarding and for any workspace-visible operations.
FR-006 (Capability boundary): A workspace member without the required capability MUST be able to view the page, but action controls MUST be disabled with an explanatory tooltip; server-side action attempts MUST be denied with 403.
FR-006d (Discoverability default): In V1, capability-gated controls SHOULD remain visible but disabled with an explanation (rather than being hidden), to support enterprise operator workflows.
FR-006a (Least-privilege capability model): The wizard MUST gate each step and each action by canonical capabilities (no ad-hoc role string checks).
FR-006b (Wizard capability breakdown): The system MUST support, at minimum, distinct capability gates for:
- identifying / creating / resuming onboarding for a managed tenant,
- viewing/selecting a provider connection,
- creating/editing a provider connection,
- starting verification,
- running each optional bootstrap action (inventory sync, policy sync, backup bootstrap) independently,
- activating a tenant.
FR-006c (Viewer visibility): Viewing verification reports and operation-run results MUST be permitted to workspace members (subject to workspace membership), even when they cannot start runs.
FR-007 (Workspace↔tenant match hard rule): For any tenant-scoped route, if the tenant does not belong to the currently selected workspace, the system MUST return deny-as-not-found behavior.
FR-008 (Tenantless wizard until activation): The wizard MUST NOT require tenant-scoped pages, routes, or links before the final “Complete / Activate” step.
FR-009 (Identify managed tenant inputs): Step 1 MUST capture, at minimum:
- tenant name,
- environment,
- Entra Tenant ID,
- optional primary domain,
- optional notes.
FR-010 (Idempotent identification): Step 1 MUST be idempotent for the same tenant identifier within the same workspace and MUST resume an active onboarding session when applicable.
FR-011 (Uniqueness of Entra Tenant ID): The system MUST enforce Entra Tenant ID uniqueness globally, and each Entra Tenant ID MUST be bound to exactly one workspace in V1.
FR-012 (Tenant status model): Managed Tenants MUST support a V1 lifecycle including: draft, onboarding, active, archived.
FR-013 (Provider connection choice): Step 2 MUST let the user either use an existing connection or create a new connection.
FR-014 (Secret safety): Any secret material entered during connection creation MUST be masked and stored securely, and MUST never be displayed again. Onboarding session state MUST NOT store secret material.
FR-015 (Verification run start): Step 3 MUST allow starting a verification run and MUST dedupe requests while an active verification run exists.
FR-016 (Verification viewer behavior): Step 3 MUST display a stored checklist report with:
- an “in progress” banner while a run is active,
- a manual “Refresh” control,
- status mapping: blocking failures → Blocked; warnings-only → Needs attention; otherwise → Ready,
- “Next steps” as links only (no server-side actions in V1).
FR-017 (Tenantless operations page): The wizard’s “View run” link MUST point to /admin/operations/{run} and MUST never use a tenant-scoped operations URL.
FR-017a (Tenantless access semantics): Access to /admin/operations/{run} MUST be granted only if the user is a member of the run’s workspace; otherwise the system MUST respond with deny-as-not-found behavior. The page MUST NOT require a pre-selected workspace context and MUST NOT auto-switch workspaces.
FR-018 (Workspace-visible operations): Operation runs started by the wizard MUST be safely viewable in a workspace context without tenant-scoped routing and MUST honor the same deny-as-not-found membership boundary.
FR-019 (Optional bootstrap step): Step 4 MAY offer optional bootstrap actions (e.g., inventory sync, policy sync, baseline creation) with per-action capability gating; each selected action MUST start its own operation run and be viewable tenantlessly.
FR-020 (Complete / Activate gate): The wizard MUST allow activation only when a provider connection exists and verification is not Blocked, except when a workspace owner explicitly overrides the block.
FR-020a (Override requirements): When overriding a blocked verification, the system MUST require a human-entered reason and MUST record an audit event capturing the override decision and reason.
FR-020b (Owner-only activation): Activation MUST be restricted to workspace owners (non-owner members may not activate, even if they can run earlier steps).
FR-021 (Activation outcome): On activation, the tenant MUST become visible in the workspace tenant switcher and the user MUST be redirected either to the tenant home (open now) or back to the workspace managed tenant list.
FR-022 (Connection ownership model): Provider connections MUST be workspace-owned.
FR-022a (Safe default binding): By default in V1, a provider connection MUST be bound to exactly one managed tenant.
FR-022b (Reuse safety gate): Reuse of an existing provider connection for additional managed tenants MUST be disabled by default and MUST only be possible via an explicit opt-in that clearly communicates risk and is policy-gated.
FR-023 (Auditability): The system MUST record audit events for: tenant identification, connection creation/updates, verification start/completion, bootstrap run start/completion, and activation.
FR-024 (DB-only rendering): The wizard and the verification viewer MUST render using stored data only; any external checks MUST run as background work.
FR-025 (Badge semantics): Step-status and verification-result chips MUST use centralized badge semantics (no per-page ad-hoc mappings), and changes MUST be covered by automated tests.
FR-026 (Graph contract path): Any Microsoft Graph call made by verification/bootstrap runs MUST go through the canonical contract registry path (GraphClientInterface + config/graph_contracts.php). Feature code MUST NOT hardcode ad-hoc endpoints; missing contracts MUST fail safe and be covered by automated tests.
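The activation gate in FR-020, FR-020a, and FR-020b can be read as a single decision function. The sketch below is an illustration under assumptions: `ActivationDecision` and `decide_activation` are hypothetical names, and the returned reason stands in for the audit event that the real system would record.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ActivationDecision:
    allowed: bool
    # Set only when a blocked verification was overridden; would feed the audit event.
    audit_override_reason: Optional[str] = None


def decide_activation(
    *,
    is_owner: bool,
    has_connection: bool,
    verification_status: str,
    override_reason: Optional[str] = None,
) -> ActivationDecision:
    # FR-020b: owner-only. FR-020: a provider connection must exist.
    if not is_owner or not has_connection:
        return ActivationDecision(False)
    if verification_status != "Blocked":
        return ActivationDecision(True)
    # FR-020a: overriding a block requires a human-entered, non-empty reason.
    reason = (override_reason or "").strip()
    if not reason:
        return ActivationDecision(False)
    return ActivationDecision(True, audit_override_reason=reason)
```

Returning the reason alongside the decision keeps the override and its audit payload together, so a caller cannot activate over a block without also having something to record.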

Key Entities (include if feature involves data)

Workspace: A portfolio context that a user selects; controls membership and owns one or more managed tenants.
Managed Tenant: A record representing a Microsoft tenant managed by the organization; includes identity (Entra Tenant ID), environment, and lifecycle status.
Onboarding Session: A resumable record of onboarding progress and safe, non-secret state.
Provider Connection: A technical connection configuration used to access tenant data; includes secret material that must never be displayed after capture.
Operation Run: A trackable background run started by the wizard (verification and optional bootstrap actions) with a stored report suitable for safe, tenantless viewing.
Verification Report: A stored checklist result with per-check statuses, safe messages, evidence pointers, and “next steps” links.

Success Criteria (mandatory)

Measurable Outcomes

SC-001 (Single entry point adoption): 100% of managed-tenant onboarding starts from the single onboarding entry point; legacy URLs return “not found” behavior.
SC-002 (Time to first verification): A workspace admin can reach “verification started” within 3 minutes of opening onboarding (excluding external consent/approval wait time).
SC-003 (No pre-activation tenant-scoped routing): Before activation, the wizard never generates tenant-scoped URLs; this is validated by regression tests.
SC-004 (Authorization correctness): Non-members consistently receive deny-as-not-found behavior; members lacking capability receive 403 on action attempts; authorized users complete onboarding.
SC-005 (Idempotency): For repeated Step 1 submissions with the same Entra Tenant ID in the same workspace, no duplicates are created and the user resumes the existing onboarding session.
SC-006 (Secret safety): No secret material appears in UI, reports, notifications, logs, or audit events; validated by automated tests.
SC-007 (Operational clarity): When verification is blocked, at least 90% of users can identify the reason category and next step from the report without opening a support ticket (measured via internal feedback or support tagging).
