# Feature Specification: Provider Connection Full Cutover

**Feature Branch**: `081-provider-connection-cutover`  
**Created**: 2026-02-07  
**Status**: Draft (implementation-ready)  
**Input**: Spec 081: Provider Connection Full Cutover (single source of truth, enterprise suite)

## Clarifications

### Session 2026-02-07

- Q: Provider scope for “default ProviderConnection” enforcement? → A: All providers (generic rule), but backfill only creates Microsoft defaults.
- Q: When a provider-backed operation is started but the connection/credential is missing, should we create an OperationRun? → A: Yes. Create an OperationRun with a `blocked` outcome/state and store `reason_code` + link-only next steps.
- Q: Backfill behavior when Microsoft connections exist but none is default? → A: If exactly one exists, set it default; if multiple exist, do not auto-select (leave blocked + remediation).
- Q: Legacy tenant credential fields (`tenants.app_*`) after cutover? → A: Forbidden in runtime; allowed only in explicit backfill tooling for one-time copy.
- Q: Legacy tenant credential columns lifecycle? → A: Keep columns for now (deprecated/unused), defer dropping to a follow-up spec.

## User Scenarios & Testing *(mandatory)*

### User Story 1 - Deterministic provider operations (Priority: P1)

As an operator, I can run provider-backed operations (inventory, sync, backup, restore, verification) and the system always uses the same, workspace-managed provider connection for the selected managed tenant. If the tenant is not configured, the system blocks the action with a clear reason and a guided path to remediation.

**Why this priority**: This removes “verify green, restore red” drift and makes the suite reliable and auditable.

**Independent Test**: Start a provider-backed operation for a managed tenant (a) with a default provider connection and (b) without one; verify the first runs using the default connection and the second is blocked with a stable reason and next-step links.
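
The two cases in the independent test can be sketched as a small resolver. This is a minimal illustration under stated assumptions, not the project's implementation: `ProviderConnection`, `Resolution`, `resolve_default_connection`, the field names, and the `"microsoft"` provider key are all hypothetical. Only the reason codes `provider_connection_missing` and `provider_connection_invalid` come from this spec. The multi-default branch mirrors the invalid-configuration acceptance scenario for this user story.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ProviderConnection:
    # Hypothetical shape; field names are assumptions, not the real schema.
    id: int
    tenant_id: int
    provider: str
    is_default: bool


@dataclass(frozen=True)
class Resolution:
    connection: Optional[ProviderConnection]
    reason_code: Optional[str]  # None means the operation may proceed


def resolve_default_connection(
    connections: Sequence[ProviderConnection], tenant_id: int, provider: str
) -> Resolution:
    """Pick the single default connection for (tenant, provider), or say why not."""
    defaults = [
        c
        for c in connections
        if c.tenant_id == tenant_id and c.provider == provider and c.is_default
    ]
    if not defaults:
        # No default configured: block with the "missing" reason code.
        return Resolution(None, "provider_connection_missing")
    if len(defaults) > 1:
        # More than one default is an invalid configuration: do not proceed.
        return Resolution(None, "provider_connection_invalid")
    return Resolution(defaults[0], None)


# Example data: tenant 10 has one default connection, tenant 11 has none.
connections = [
    ProviderConnection(id=1, tenant_id=10, provider="microsoft", is_default=True),
    ProviderConnection(id=2, tenant_id=10, provider="microsoft", is_default=False),
]
ok = resolve_default_connection(connections, tenant_id=10, provider="microsoft")
blocked = resolve_default_connection(connections, tenant_id=11, provider="microsoft")
```

Here `ok.connection` is the connection with `id=1`, while `blocked.reason_code` is `provider_connection_missing`. Returning a result object instead of raising keeps the blocked path as an ordinary, recordable outcome, which fits the requirement that blocked attempts still produce an operation run.
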

**Acceptance Scenarios**:

1. **Given** a managed tenant with exactly one default provider connection for a provider, **When** an operator starts a provider-backed operation, **Then** the operation uses that default connection and records the connection identity for traceability.
2. **Given** a managed tenant with no default provider connection for a provider, **When** an operator starts a provider-backed operation, **Then** the operation is blocked deterministically with reason code `provider_connection_missing` and a remediation link.
3. **Given** a managed tenant with more than one “default” provider connection (invalid configuration), **When** an operator starts a provider-backed operation, **Then** the operation is blocked/failed deterministically with reason code `provider_connection_invalid` (optional extension detail such as `ext.multiple_defaults_detected`) and does not proceed.

---

### User Story 2 - Safe credential management with audit (Priority: P2)

As an admin, I can manage provider connections and rotate credentials with explicit confirmation and complete auditability, without secrets ever being shown or stored in logs, reports, or audit payloads.

**Why this priority**: Credential handling is security-critical; enterprise operations require least privilege, safe UI flows, and reliable audit trails.

**Independent Test**: Update a provider credential and confirm (a) confirmation is required, (b) an audit event is created, and (c) secret values are never persisted outside the encrypted credential store.

**Acceptance Scenarios**:

1. **Given** an admin with the required capability, **When** they update provider credentials, **Then** the action requires explicit confirmation and produces an audit event with redacted metadata.
2. **Given** a non-member attempting to access provider connection management for a tenant, **When** they load the page, **Then** they receive deny-as-not-found behavior (404 semantics) with no tenant hints.
3. **Given** a member without the credential-management capability, **When** they attempt to update credentials, **Then** the system denies the mutation with a forbidden response (403 semantics).

---

### User Story 3 - Troubleshoot failures using stable reason codes (Priority: P3)

As an operator, I can understand why a provider-backed operation is blocked or failed through stable, machine-readable reason codes and consistent “next steps” links.

**Why this priority**: Stable reason codes enable predictable UX, support workflows, and long-term suite consistency.

**Independent Test**: Trigger a blocked/failing operation and verify reason codes and next steps appear consistently and contain no secrets.

**Acceptance Scenarios**:

1. **Given** a missing credential, **When** an operation runs, **Then** the outcome is blocked with reason code `provider_credential_missing` and a next-step link to update credentials.
2. **Given** an authentication failure at the provider, **When** an operation runs, **Then** the outcome is failed with reason code `provider_auth_failed` and a documentation link for troubleshooting.

### Edge Cases

- Default provider connection is missing for a tenant/provider pair.
- More than one default provider connection exists for the same tenant/provider pair.
- A provider connection exists but is disabled/unusable.
- Provider credential is missing.
- Provider credential is present but rejected by the provider.
- Admin consent is missing / cannot be detected.
- Required permissions are missing.
- Provider returns forbidden/insufficient privileges.
- Provider target tenant does not match the managed tenant target.
- Network is unreachable / timeouts occur.
- Rate limiting/throttling occurs.
- A user without membership tries to view tenant/provider connection details (deny-as-not-found).

## Requirements *(mandatory)*

**Constitution alignment (required):** This feature affects provider calls, long-running operations, and credential management.
The solution must preserve run observability, tenant isolation, safety/confirmations for sensitive actions, and auditable credential handling.

**Constitution alignment (RBAC-UX):** Authorization behavior must be explicit:

- Non-member / not entitled to tenant scope → deny-as-not-found (404 semantics)
- Member but missing capability → forbidden (403 semantics)

### Functional Requirements

- **FR-081-001 (Default required)**: For every managed tenant and provider, the system MUST have exactly one default provider connection OR block provider-backed flows with a clear “missing connection” reason and remediation link.
- **FR-081-002 (Single source of truth)**: All provider-facing runtime flows MUST use provider connections + credentials as the only authoritative credential source.
- **FR-081-003 (No tenant credential runtime use)**: Tenant-stored application credential fields MUST NOT be used at runtime for provider calls.
- **FR-081-003a (Legacy reads are tooling-only)**: Reads of legacy tenant credential fields are permitted only inside explicit backfill tooling (migration/command) for a one-time copy into provider credentials. Runtime flows MUST NOT read legacy tenant credential fields under any circumstances.
- **FR-081-004 (No tenant credential write path)**: The system MUST NOT provide any UI or service flow that writes provider secrets into tenant fields.
- **FR-081-005 (Single provider call entry point)**: All provider calls MUST go through a single, centralized provider gateway/factory layer that accepts a provider connection as the primary identifier.
- **FR-081-006 (Operation traceability)**: Every provider-backed operation MUST record, at minimum, provider identity, provider connection identity, managed tenant identity, and the provider tenant target scope so operators can trace runs.
- **FR-081-007 (Deterministic failure semantics)**: When a provider connection/credential is missing or invalid, operations MUST be blocked or failed deterministically with stable reason codes.
- **FR-081-007a (Blocked operations are observable)**: When an operator attempts to start a provider-backed operation but it cannot proceed due to configuration/credential reasons, the system MUST still create an operation run record in a `blocked` state and store a safe `reason_code` plus link-only next steps.
- **FR-081-008 (No secret leakage)**: Secrets MUST NOT appear in audit metadata, operation context, verification reports, application logs, or exception messages.
- **FR-081-009 (DB-only viewing)**: “View” pages MUST render only stored data and MUST NOT perform provider calls during rendering.

### Security & Authorization Requirements

- **SR-081-001 (Least privilege)**: The system MUST separate permissions for viewing vs managing provider connections/credentials and enforce them server-side.
- **SR-081-002 (Deny-as-not-found)**: Non-members MUST experience deny-as-not-found boundaries for tenant/provider-connection scoped resources.
- **SR-081-003 (Confirmed credential mutations)**: Credential changes MUST require explicit confirmation and generate auditable events with redacted payloads.

### Data & Migration Requirements

- **FR-081-010 (Backfill defaults, idempotent)**: The system MUST provide a one-time backfill that ensures every managed tenant has a default provider connection for the Microsoft provider, except where the rules below require manual remediation. The backfill MUST follow these rules:
  - If a default provider connection already exists, backfill MUST leave it unchanged.
  - If no default exists and exactly one Microsoft provider connection exists, backfill MUST set it as the default.
  - If no default exists and multiple Microsoft provider connections exist, backfill MUST NOT auto-select a default and MUST leave the tenant in a blocked/remediation-required state.
  - If no Microsoft provider connection exists, backfill MUST create one and set it default.
  - If legacy tenant credentials exist and backfill creates a new Microsoft provider connection, it MUST copy those legacy credentials into the provider credential store for that new connection.
  - Running the backfill multiple times MUST NOT create duplicates.
- **FR-081-011 (Uniqueness invariant)**: The system MUST enforce the invariant “exactly one default provider connection per (managed tenant, provider)” for all providers.

### UX Requirements (minimal changes)

- **UX-081-001 (Blocked state guidance)**: When blocked due to missing default provider connection, the UI MUST clearly state the blocked reason and provide a primary remediation link to manage provider connections.
- **UX-081-002 (Single management surface)**: Tenants MUST NOT have a second credential edit surface; provider connection management is the only supported place to manage provider credentials.
- **UX-081-003 (Link-only next steps)**: Verification “next steps” MUST be navigation-only (links), not server-side “fix-it” actions.

### Scope Boundaries

- **NG-081-001**: This spec does not introduce new credential types (e.g., certificates) or redesign token caching.
- **NG-081-002**: This spec does not change the canonical, tenantless operation run URL structure.
- **NG-081-003**: This spec does not drop legacy tenant credential columns; removal is deferred to a follow-up spec once cutover is proven stable.

### Key Entities *(include if feature involves data)*

- **Managed Tenant**: A workspace-owned tenant target used for provider operations.
- **Provider Connection**: A managed integration asset bound to a managed tenant and provider.
- **Provider Credential**: An encrypted credential payload owned by a provider connection.
- **Default Provider Connection**: The single connection designated as default for a managed tenant + provider.
- **Operation Run**: A canonical record representing a provider-backed operation’s identity, state, and outcome.
- **Audit Event**: An immutable record of credential changes and other sensitive actions.
- **Verification Report**: Stored results of readiness checks with stable reason codes and link-only next steps.

## Success Criteria *(mandatory)*

### Measurable Outcomes

- **SC-081-001 (Eliminate drift)**: Provider-backed operations for a managed tenant never rely on tenant-stored credential fields at runtime; the system consistently uses the default provider connection.
- **SC-081-002 (Blocked determinism)**: 100% of attempts to start provider-backed operations without a default provider connection are blocked with a stable reason code and a remediation link.
- **SC-081-003 (Audit coverage)**: 100% of credential mutations produce auditable events with no secret material included.
- **SC-081-004 (No secret leakage)**: Secrets appear in 0 verification reports, 0 audit payloads, and 0 operator-visible error messages.
- **SC-081-005 (Backfill completeness)**: After backfill, every managed tenant either has exactly one default provider connection for the Microsoft provider or is left in an explicit remediation-required state (`provider_connection_missing` / `provider_connection_invalid`) per FR-081-010 decision rules.

## Appendix A: Reason Code Taxonomy (v1 baseline)

**Purpose:** Stable, machine-readable classification for provider/credential/auth/permission failures.

| Reason code | Category | Typical status | Meaning |
|---|---|---:|---|
| `provider_connection_missing` | configuration | `block` | No default provider connection configured for this managed tenant/provider. |
| `provider_connection_invalid` | configuration | `fail` | Provider connection exists but is inconsistent/disabled/cannot be used (including multi-default corruption). |
| `provider_credential_missing` | credentials | `block` | Connection exists, but no provider credential (secret) is present. |
| `provider_credential_invalid` | credentials | `fail` | Credential exists but is unusable (bad secret, wrong app, expired, etc.). |
| `provider_consent_missing` | consent | `block` | Admin consent not granted (or not detected). |
| `provider_auth_failed` | auth | `fail` | Authentication/token exchange failed. |
| `provider_permission_missing` | permissions | `block` | Required application permissions are not granted. |
| `provider_permission_denied` | permissions | `fail` | Provider denied access for an attempted call. |
| `provider_permission_refresh_failed` | permissions | `warn` | Permission refresh did not run or failed; observed permissions may be stale. |
| `tenant_target_mismatch` | integrity | `block` | Connection/credential is bound to a different tenant than the managed tenant target. |
| `network_unreachable` | transport | `fail` | Network/DNS/timeout prevents reaching provider endpoints. |
| `rate_limited` | transport | `warn` | Provider throttling / rate limiting encountered. |
| `unknown_error` | fallback | `fail` | Unclassified failure. |

### Extension Namespace (`ext.*`)

Extension codes MAY be added as secondary details without breaking consumers (e.g., provider-specific or error-code subtyping). Viewers MUST degrade gracefully for unknown codes.

## Appendix B: Next Steps Registry (link-only)

**Purpose:** Make remediation links consistent across onboarding, verification, and error screens.

**Rule (v1):** Next steps are navigation-only (links). They do not trigger server-side “fix” actions.

### Default next steps (examples)

- `provider_connection_missing`: Link to manage provider connections and set a default.
- `provider_credential_missing`: Link to update credentials.
- `provider_permission_missing`: Link to required permissions guidance.
- `provider_auth_failed`: Link to connection review and troubleshooting documentation.
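
One way a consumer might combine the Appendix A taxonomy with the Appendix B registry is a pair of lookup tables and a small classifier, sketched below. The reason codes, categories, and typical statuses are copied from the table above. Everything else is an assumption for illustration: the names `REASON_CODES`, `NEXT_STEPS`, `Classified`, and `classify`, the placeholder link paths, and the choice to map unrecognized codes to `unknown_error` and to drop malformed extension details as one reading of "degrade gracefully".

```python
from dataclasses import dataclass
from typing import Optional

# Reason code -> (category, typical status), copied from the Appendix A table.
REASON_CODES = {
    "provider_connection_missing": ("configuration", "block"),
    "provider_connection_invalid": ("configuration", "fail"),
    "provider_credential_missing": ("credentials", "block"),
    "provider_credential_invalid": ("credentials", "fail"),
    "provider_consent_missing": ("consent", "block"),
    "provider_auth_failed": ("auth", "fail"),
    "provider_permission_missing": ("permissions", "block"),
    "provider_permission_denied": ("permissions", "fail"),
    "provider_permission_refresh_failed": ("permissions", "warn"),
    "tenant_target_mismatch": ("integrity", "block"),
    "network_unreachable": ("transport", "fail"),
    "rate_limited": ("transport", "warn"),
    "unknown_error": ("fallback", "fail"),
}

# Reason code -> navigation link. These paths are placeholders, not real routes.
NEXT_STEPS = {
    "provider_connection_missing": "/hypothetical/provider-connections",
    "provider_credential_missing": "/hypothetical/provider-connections/credentials",
    "provider_permission_missing": "/hypothetical/docs/required-permissions",
    "provider_auth_failed": "/hypothetical/docs/troubleshooting",
}


@dataclass(frozen=True)
class Classified:
    reason_code: str
    category: str
    status: str
    next_step_url: Optional[str]  # link only; never a server-side action
    extension: Optional[str] = None


def classify(reason_code: str, extension: Optional[str] = None) -> Classified:
    """Look up a reason code; unrecognized codes fall back to `unknown_error`."""
    if reason_code not in REASON_CODES:
        reason_code = "unknown_error"
    category, status = REASON_CODES[reason_code]
    # Keep extension details only when they sit in the `ext.*` namespace.
    if extension is not None and not extension.startswith("ext."):
        extension = None
    return Classified(
        reason_code, category, status, NEXT_STEPS.get(reason_code), extension
    )


result = classify("provider_connection_invalid", "ext.multiple_defaults_detected")
fallback = classify("some_future_code")
```

In this sketch `result.status` is `fail` and `fallback.reason_code` is `unknown_error`. Keeping the next step as a plain URL string, with no callable attached, is a simple way to honor the link-only rule.
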