TenantAtlas/specs/081-provider-connection-cutover/spec.md
ahmido 4db8030f2a Spec 081: Provider connection cutover (#98)
Implements Spec 081 provider-connection cutover.

Highlights:
- Adds provider connection resolution + gating for operations/verification.
- Adds provider credential observer wiring.
- Updates Filament tenant verify flow to block with next-steps when provider connection isn’t ready.
- Adds spec docs under specs/081-provider-connection-cutover/ and extensive Spec081 test coverage.

Tests:
- vendor/bin/sail artisan test --compact tests/Feature/Filament/TenantSetupTest.php
- Focused suites for ProviderConnections/Verification ran during implementation (see local logs).

Co-authored-by: Ahmed Darrazi <ahmeddarrazi@MacBookPro.fritz.box>
Reviewed-on: #98
2026-02-08 11:28:51 +00:00


Feature Specification: Provider Connection Full Cutover

Feature Branch: 081-provider-connection-cutover
Created: 2026-02-07
Status: Draft (implementation-ready)
Input: Spec 081 — Provider Connection Full Cutover (single source of truth, enterprise suite)

Clarifications

Session 2026-02-07

  • Q: Provider scope for “default ProviderConnection” enforcement? → A: All providers (generic rule), but backfill only creates Microsoft defaults.
  • Q: When a provider-backed operation is started but the connection/credential is missing, should we create an OperationRun? → A: Yes — create an OperationRun with a blocked outcome/state and store reason_code + link-only next steps.
  • Q: Backfill behavior when Microsoft connections exist but none is default? → A: If exactly one exists, set it default; if multiple exist, do not auto-select (leave blocked + remediation).
  • Q: Legacy tenant credential fields (tenants.app_*) after cutover? → A: Forbidden in runtime; allowed only in explicit backfill tooling for one-time copy.
  • Q: Legacy tenant credential columns lifecycle? → A: Keep columns for now (deprecated/unused), defer dropping to a follow-up spec.

User Scenarios & Testing (mandatory)

User Story 1 - Deterministic provider operations (Priority: P1)

As an operator, I can run provider-backed operations (inventory, sync, backup, restore, verification) and the system always uses the same, workspace-managed provider connection for the selected managed tenant.

If the tenant is not configured, the system blocks the action with a clear reason and a guided path to remediation.

Why this priority: This removes “verify green, restore red” drift and makes the suite reliable and auditable.

Independent Test: Start a provider-backed operation for a managed tenant (a) with a default provider connection and (b) without one; verify the first runs using the default connection and the second is blocked with a stable reason and next-step links.

Acceptance Scenarios:

  1. Given a managed tenant with exactly one default provider connection for a provider, When an operator starts a provider-backed operation, Then the operation uses that default connection and records the connection identity for traceability.
  2. Given a managed tenant with no default provider connection for a provider, When an operator starts a provider-backed operation, Then the operation is blocked deterministically with reason code provider_connection_missing and a remediation link.
  3. Given a managed tenant with more than one “default” provider connection (invalid configuration), When an operator starts a provider-backed operation, Then the operation is blocked/failed deterministically with reason code provider_connection_invalid (optional extension detail such as ext.multiple_defaults_detected) and does not proceed.
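
The three scenarios above can be sketched as a single resolution step. This is a minimal, language-agnostic illustration in Python, not the project's implementation; the `ProviderConnection` and `Resolution` shapes and the function name are assumptions made for the example, while the reason codes come from this spec.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ProviderConnection:
    # Hypothetical shape; field names are assumptions for this sketch.
    id: int
    tenant_id: int
    provider: str
    is_default: bool


@dataclass(frozen=True)
class Resolution:
    connection: Optional[ProviderConnection]
    reason_code: Optional[str] = None
    ext_code: Optional[str] = None


def resolve_default_connection(
    connections: List[ProviderConnection], tenant_id: int, provider: str
) -> Resolution:
    """Return the single default connection, or a stable reason code."""
    defaults = [
        c
        for c in connections
        if c.tenant_id == tenant_id and c.provider == provider and c.is_default
    ]
    if not defaults:
        # Scenario 2: nothing to use, so the operation is blocked.
        return Resolution(None, "provider_connection_missing")
    if len(defaults) > 1:
        # Scenario 3: invalid configuration, never pick one arbitrarily.
        return Resolution(
            None, "provider_connection_invalid", "ext.multiple_defaults_detected"
        )
    # Scenario 1: exactly one default; the caller records its id on the run.
    return Resolution(defaults[0])
```

The resolver never falls back to a non-default connection, which is what keeps the outcome deterministic for a given configuration.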

User Story 2 - Safe credential management with audit (Priority: P2)

As an admin, I can manage provider connections and rotate credentials with explicit confirmation and complete auditability, without secrets ever being shown or stored in logs, reports, or audit payloads.

Why this priority: Credential handling is security-critical; enterprise operations require least privilege, safe UI flows, and reliable audit trails.

Independent Test: Update a provider credential and confirm (a) confirmation is required, (b) an audit event is created, and (c) secret values are never persisted outside the encrypted credential store.

Acceptance Scenarios:

  1. Given an admin with the required capability, When they update provider credentials, Then the action requires explicit confirmation and produces an audit event with redacted metadata.
  2. Given a non-member attempting to access provider connection management for a tenant, When they load the page, Then they receive deny-as-not-found behavior (404 semantics) with no tenant hints.
  3. Given a member without the credential-management capability, When they attempt to update credentials, Then the system denies the mutation with a forbidden response (403 semantics).

User Story 3 - Troubleshoot failures using stable reason codes (Priority: P3)

As an operator, I can understand why a provider-backed operation is blocked or failed through stable, machine-readable reason codes and consistent “next steps” links.

Why this priority: Stable reason codes enable predictable UX, support workflows, and long-term suite consistency.

Independent Test: Trigger a blocked/failing operation and verify reason codes and next steps appear consistently and contain no secrets.

Acceptance Scenarios:

  1. Given a missing credential, When an operation runs, Then the outcome is blocked with reason code provider_credential_missing and a next-step link to update credentials.
  2. Given an authentication failure at the provider, When an operation runs, Then the outcome is failed with reason code provider_auth_failed and a documentation link for troubleshooting.

Edge Cases

  • Default provider connection is missing for a tenant/provider pair.
  • More than one default provider connection exists for the same tenant/provider pair.
  • A provider connection exists but is disabled/unusable.
  • Provider credential is missing.
  • Provider credential is present but rejected by the provider.
  • Admin consent is missing / cannot be detected.
  • Required permissions are missing.
  • Provider returns forbidden/insufficient privileges.
  • Provider target tenant does not match the managed tenant target.
  • Network is unreachable / timeouts occur.
  • Rate limiting/throttling occurs.
  • A user without membership tries to view tenant/provider connection details (deny-as-not-found).

Requirements (mandatory)

Constitution alignment (required): This feature affects provider calls, long-running operations, and credential management. The solution must preserve run observability, tenant isolation, safety/confirmations for sensitive actions, and auditable credential handling.

Constitution alignment (RBAC-UX): Authorization behavior must be explicit:

  • Non-member / not entitled to tenant scope → deny-as-not-found (404 semantics)
  • Member but missing capability → forbidden (403 semantics)
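
The ordering of these two checks matters: membership is evaluated first so that a non-member cannot learn whether a tenant exists. A minimal sketch of that ordering, assuming a hypothetical capability name and a plain status-code return value:

```python
from typing import Set

# Hypothetical capability identifier; the spec does not name the real one.
MANAGE_CREDENTIALS = "provider_credentials.manage"


def authorize(is_member: bool, capabilities: Set[str], required: str) -> int:
    """Return the HTTP status semantics for a tenant-scoped request."""
    if not is_member:
        # Deny-as-not-found: no hint that the tenant or resource exists.
        return 404
    if required not in capabilities:
        # Known member, but the capability is missing.
        return 403
    return 200
```

Checking the capability before membership would leak a 403 to outsiders, which is exactly the tenant hint the rule is meant to prevent.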

Functional Requirements

  • FR-081-001 (Default required): For every managed tenant and provider, the system MUST have exactly one default provider connection OR block provider-backed flows with a clear “missing connection” reason and remediation link.

  • FR-081-002 (Single source of truth): All provider-facing runtime flows MUST use provider connections + credentials as the only authoritative credential source.

  • FR-081-003 (No tenant credential runtime use): Tenant-stored application credential fields MUST NOT be used at runtime for provider calls.

  • FR-081-003a (Legacy reads are tooling-only): Reads of legacy tenant credential fields are permitted only inside explicit backfill tooling (migration/command) for a one-time copy into provider credentials. Runtime flows MUST NOT read legacy tenant credential fields under any circumstances.

  • FR-081-004 (No tenant credential write path): The system MUST NOT provide any UI or service flow that writes provider secrets into tenant fields.

  • FR-081-005 (Single provider call entry point): All provider calls MUST go through a single, centralized provider gateway/factory layer that accepts a provider connection as the primary identifier.

  • FR-081-006 (Operation traceability): Every provider-backed operation MUST record, at minimum, provider identity, provider connection identity, managed tenant identity, and the provider tenant target scope so operators can trace runs.

  • FR-081-007 (Deterministic failure semantics): When a provider connection/credential is missing or invalid, operations MUST be blocked or failed deterministically with stable reason codes.

  • FR-081-007a (Blocked operations are observable): When an operator attempts to start a provider-backed operation but it cannot proceed due to configuration/credential reasons, the system MUST still create an operation run record in a blocked state and store a safe reason_code plus link-only next steps.

  • FR-081-008 (No secret leakage): Secrets MUST NOT appear in audit metadata, operation context, verification reports, application logs, or exception messages.

  • FR-081-009 (DB-only viewing): “View” pages MUST render only stored data and MUST NOT perform provider calls during rendering.
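
FR-081-006, FR-081-007, and FR-081-007a together imply that a run record is written whether or not the operation can proceed. The following Python sketch illustrates that flow under stated assumptions: the `OperationRun` fields, the `queued` state name, and the management URL are hypothetical, and only the `blocked` state and the reason codes are taken from this spec.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical route; the real management URL is not given in this spec.
MANAGE_CONNECTIONS_URL = "/provider-connections"


@dataclass
class OperationRun:
    operation: str
    managed_tenant_id: int
    provider: str
    state: str
    provider_connection_id: Optional[int] = None
    reason_code: Optional[str] = None
    next_steps: List[dict] = field(default_factory=list)


def start_operation(
    runs: List[OperationRun],
    operation: str,
    managed_tenant_id: int,
    provider: str,
    connection_id: Optional[int],
    has_credential: bool,
) -> OperationRun:
    """Always persist a run; mark it blocked when prerequisites are missing."""
    run = OperationRun(
        operation,
        managed_tenant_id,
        provider,
        state="queued",  # assumed name for a run that may proceed
        provider_connection_id=connection_id,
    )
    if connection_id is None:
        run.state = "blocked"
        run.reason_code = "provider_connection_missing"
    elif not has_credential:
        run.state = "blocked"
        run.reason_code = "provider_credential_missing"
    if run.state == "blocked":
        # Link-only next steps: a label and a URL, never an action.
        run.next_steps = [
            {"label": "Manage provider connections", "url": MANAGE_CONNECTIONS_URL}
        ]
    runs.append(run)  # stands in for a database insert
    return run
```

Because the blocked attempt is stored like any other run, operators can find it later without relying on application logs.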

Security & Authorization Requirements

  • SR-081-001 (Least privilege): The system MUST separate permissions for viewing vs managing provider connections/credentials and enforce them server-side.
  • SR-081-002 (Deny-as-not-found): Non-members MUST experience deny-as-not-found boundaries for tenant/provider-connection scoped resources.
  • SR-081-003 (Confirmed credential mutations): Credential changes MUST require explicit confirmation and generate auditable events with redacted payloads.
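
One way to approach the redacted payloads required by SR-081-003 and FR-081-008 is to scrub metadata before it reaches the audit store. The sketch below is illustrative only: the list of sensitive key names is an assumption, and key-based redaction is a single layer that does not replace keeping secrets out of the payload in the first place.

```python
# Assumed key names; a real deployment would maintain its own list.
SENSITIVE_KEYS = {"client_secret", "secret", "password", "token", "access_token"}


def redact(payload):
    """Recursively replace values stored under sensitive keys."""
    if isinstance(payload, dict):
        return {
            key: "[REDACTED]" if str(key).lower() in SENSITIVE_KEYS else redact(value)
            for key, value in payload.items()
        }
    if isinstance(payload, list):
        return [redact(item) for item in payload]
    return payload
```

For example, `redact({"client_id": "abc", "client_secret": "s3cret"})` keeps `client_id` and replaces the value of `client_secret` with `[REDACTED]`.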

Data & Migration Requirements

  • FR-081-010 (Backfill defaults, idempotent): The system MUST provide a one-time, idempotent backfill that establishes a default provider connection for the Microsoft provider on every managed tenant, except where the rules below leave the tenant in a remediation-required state.

    • If a default provider connection already exists, backfill MUST leave it unchanged.
    • If no default exists and exactly one Microsoft provider connection exists, backfill MUST set it as the default.
    • If no default exists and multiple Microsoft provider connections exist, backfill MUST NOT auto-select a default and MUST leave the tenant in a blocked/remediation-required state.
    • If no Microsoft provider connection exists, backfill MUST create one and set it default.
    • If legacy tenant credentials exist and backfill creates a new Microsoft provider connection, it MUST copy those legacy credentials into the provider credential store for that new connection.
    • Running the backfill multiple times MUST NOT create duplicates.
  • FR-081-011 (Uniqueness invariant): The system MUST enforce the invariant “exactly one default provider connection per (managed tenant, provider)” for all providers.
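
The FR-081-010 decision rules can be read as a pure function over the Microsoft connections of one tenant. The sketch below is a hedged illustration: the dictionary shapes and action names are assumptions for the example, and the function only plans the change rather than applying it.

```python
def plan_backfill(connections: list, has_legacy_credentials: bool) -> dict:
    """Decide the backfill action for one managed tenant.

    `connections` holds that tenant's Microsoft provider connections as
    dicts with "id" and "is_default" keys (an assumed shape).
    """
    if any(c["is_default"] for c in connections):
        # A default already exists: leave it untouched.
        return {"action": "leave_unchanged"}
    if len(connections) == 1:
        # Exactly one candidate: promote it.
        return {"action": "set_default", "connection_id": connections[0]["id"]}
    if len(connections) > 1:
        # Ambiguous: never auto-select, the tenant stays blocked.
        return {"action": "leave_blocked"}
    # No connection at all: create one, copying legacy credentials if present.
    return {
        "action": "create_default",
        "copy_legacy_credentials": has_legacy_credentials,
    }
```

Idempotency follows from the first branch: once a plan has been applied and a default exists, every later run returns `leave_unchanged` and creates nothing.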

UX Requirements (minimal changes)

  • UX-081-001 (Blocked state guidance): When blocked due to missing default provider connection, the UI MUST clearly state the blocked reason and provide a primary remediation link to manage provider connections.
  • UX-081-002 (Single management surface): Tenants MUST NOT have a second credential edit surface; provider connection management is the only supported place to manage provider credentials.
  • UX-081-003 (Link-only next steps): Verification “next steps” MUST be navigation-only (links), not server-side “fix-it” actions.

Scope Boundaries

  • NG-081-001: This spec does not introduce new credential types (e.g., certificates) or redesign token caching.
  • NG-081-002: This spec does not change the canonical, tenantless operation run URL structure.
  • NG-081-003: This spec does not drop legacy tenant credential columns; removal is deferred to a follow-up spec once cutover is proven stable.

Key Entities (include if feature involves data)

  • Managed Tenant: A workspace-owned tenant target used for provider operations.
  • Provider Connection: A managed integration asset bound to a managed tenant and provider.
  • Provider Credential: An encrypted credential payload owned by a provider connection.
  • Default Provider Connection: The single connection designated as default for a managed tenant + provider.
  • Operation Run: A canonical record representing a provider-backed operation's identity, state, and outcome.
  • Audit Event: An immutable record of credential changes and other sensitive actions.
  • Verification Report: Stored results of readiness checks with stable reason codes and link-only next steps.

Success Criteria (mandatory)

Measurable Outcomes

  • SC-081-001 (Eliminate drift): Provider-backed operations for a managed tenant never rely on tenant-stored credential fields at runtime; the system consistently uses the default provider connection.
  • SC-081-002 (Blocked determinism): 100% of attempts to start provider-backed operations without a default provider connection are blocked with a stable reason code and a remediation link.
  • SC-081-003 (Audit coverage): 100% of credential mutations produce auditable events with no secret material included.
  • SC-081-004 (No secret leakage): Secrets appear in 0 verification reports, 0 audit payloads, and 0 operator-visible error messages.
  • SC-081-005 (Backfill completeness): After backfill, every managed tenant either has exactly one default provider connection for the Microsoft provider or is left in an explicit remediation-required state (provider_connection_missing / provider_connection_invalid) per FR-081-010 decision rules.

Appendix A — Reason Code Taxonomy (v1 baseline)

Purpose: Stable, machine-readable classification for provider/credential/auth/permission failures.

| Reason code | Category | Typical status | Meaning |
| --- | --- | --- | --- |
| provider_connection_missing | configuration | block | No default provider connection configured for this managed tenant/provider. |
| provider_connection_invalid | configuration | fail | Provider connection exists but is inconsistent/disabled/cannot be used (including multi-default corruption). |
| provider_credential_missing | credentials | block | Connection exists, but no provider credential (secret) is present. |
| provider_credential_invalid | credentials | fail | Credential exists but is unusable (bad secret, wrong app, expired, etc.). |
| provider_consent_missing | consent | block | Admin consent not granted (or not detected). |
| provider_auth_failed | auth | fail | Authentication/token exchange failed. |
| provider_permission_missing | permissions | block | Required application permissions are not granted. |
| provider_permission_denied | permissions | fail | Provider denied access for an attempted call. |
| provider_permission_refresh_failed | permissions | warn | Permission refresh did not run or failed; observed permissions may be stale. |
| tenant_target_mismatch | integrity | block | Connection/credential is bound to a different tenant than the managed tenant target. |
| network_unreachable | transport | fail | Network/DNS/timeout prevents reaching provider endpoints. |
| rate_limited | transport | warn | Provider throttling / rate limiting encountered. |
| unknown_error | fallback | fail | Unclassified failure. |

Extension Namespace (ext.*)

Extension codes MAY be added as secondary details without breaking consumers (e.g., provider-specific or error-code subtyping). Viewers MUST degrade gracefully for unknown codes.
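
A viewer that degrades gracefully might treat the baseline codes as the only ones it must understand and ignore extension details it does not recognise. The sketch below assumes a hypothetical `describe` helper and a hypothetical label for the one extension code mentioned in this spec; mapping an unrecognised primary code to `unknown_error` is one possible policy, not a requirement stated here.

```python
from typing import Optional

# Baseline codes from the v1 taxonomy.
BASELINE_CODES = {
    "provider_connection_missing",
    "provider_connection_invalid",
    "provider_credential_missing",
    "provider_credential_invalid",
    "provider_consent_missing",
    "provider_auth_failed",
    "provider_permission_missing",
    "provider_permission_denied",
    "provider_permission_refresh_failed",
    "tenant_target_mismatch",
    "network_unreachable",
    "rate_limited",
    "unknown_error",
}

# Extension details this viewer happens to know; the label is an assumption.
KNOWN_EXT_DETAILS = {
    "ext.multiple_defaults_detected": "Multiple default connections detected",
}


def describe(reason_code: str, ext_code: Optional[str] = None) -> dict:
    """Return what a viewer should display for a primary/extension pair."""
    primary = reason_code if reason_code in BASELINE_CODES else "unknown_error"
    # Unknown extension codes are dropped; the primary code still renders.
    detail = KNOWN_EXT_DETAILS.get(ext_code) if ext_code else None
    return {"reason_code": primary, "detail": detail}
```

With this shape, adding a new `ext.*` code on the producer side changes nothing for older viewers beyond a missing detail line.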

Purpose: Make remediation links consistent across onboarding, verification, and error screens.

Rule (v1): Next steps are navigation-only (links). They do not trigger server-side “fix” actions.

Default next steps (examples)

  • provider_connection_missing: Link to manage provider connections and set a default.
  • provider_credential_missing: Link to update credentials.
  • provider_permission_missing: Link to required permissions guidance.
  • provider_auth_failed: Link to connection review and troubleshooting documentation.
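
The defaults above can be kept in a single lookup so that onboarding, verification, and error screens all render the same links. This is a minimal sketch: every URL and label below is a hypothetical placeholder, since the spec does not define the actual routes.

```python
from typing import Dict, List, Tuple

# Hypothetical labels and paths; only the reason codes come from the spec.
NEXT_STEPS: Dict[str, List[Tuple[str, str]]] = {
    "provider_connection_missing": [
        ("Manage provider connections", "/provider-connections"),
    ],
    "provider_credential_missing": [
        ("Update credentials", "/provider-connections/credentials"),
    ],
    "provider_permission_missing": [
        ("Required permissions", "/docs/required-permissions"),
    ],
    "provider_auth_failed": [
        ("Review connection", "/provider-connections"),
        ("Troubleshooting", "/docs/troubleshooting"),
    ],
}


def next_steps_for(reason_code: str) -> List[dict]:
    """Return link-only next steps; unknown codes yield an empty list."""
    return [
        {"label": label, "url": url}
        for label, url in NEXT_STEPS.get(reason_code, [])
    ]
```

Each entry carries only a label and a URL, which keeps next steps navigation-only as required by the v1 rule.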