# Research: Provider Connection Full Cutover

**Feature**: [specs/081-provider-connection-cutover/spec.md](spec.md)

**Date**: 2026-02-07

## Goal

Resolve repo-specific unknowns for the full credential cutover, and document decisions with rationale and alternatives.

## Findings (Repo Reality)

### Existing ProviderConnection / ProviderCredential primitives

- `ProviderConnection` exists as a workspace-owned, tenant-scoped integration asset.
- Default invariant already exists at DB level via partial unique index:
  - `provider_connections_default_unique` on `(tenant_id, provider)` where `is_default = true`.
- `ProviderCredential` exists and stores encrypted payload in `payload` (`encrypted:array`) and is hidden from serialization.
- `ProviderGateway::graphOptions(ProviderConnection $connection)` builds Graph options using `CredentialManager`.

### Existing runtime provider call patterns

- Some jobs are already ProviderConnection-first:
  - Provider connection health check uses `ProviderGateway::graphOptions($connection)`.
  - Provider operation start gate (`ProviderOperationStartGate`) uses `provider_connection_id` in operation run context and dedupe.
- Legacy tenant credential reads still exist in high-impact services and UI:
  - Services: inventory sync, policy sync, policy snapshots/backups, restore, RBAC onboarding, scope tag resolver.
  - UI: tenant registration + tenant resource form exposes `app_client_id` / `app_client_secret`.

### Operations / observability primitives

- `OperationRun` has:
  - `status`: queued|running|completed
  - `outcome`: pending|succeeded|partially_succeeded|failed (+ reserved cancelled)
  - `context` JSON field used for identity and target scope.
- Provider operation start gate already writes `context.provider`, `context.provider_connection_id`, and `context.target_scope.entra_tenant_id`.
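
To make the run primitives above concrete, here is a minimal, language-neutral sketch in Python of the `status` / `outcome` values and the context keys the start gate writes. The enum classes and the `provider_run_context` helper are hypothetical names used only for illustration; the repo's actual models and types may differ.

```python
import json
from enum import Enum


class Status(str, Enum):
    """Run lifecycle values listed in the findings."""
    QUEUED = "queued"
    RUNNING = "running"
    COMPLETED = "completed"


class Outcome(str, Enum):
    """Run outcome values listed in the findings."""
    PENDING = "pending"
    SUCCEEDED = "succeeded"
    PARTIALLY_SUCCEEDED = "partially_succeeded"
    FAILED = "failed"
    CANCELLED = "cancelled"  # noted as reserved in the findings


def provider_run_context(provider: str, provider_connection_id: int, entra_tenant_id: str) -> dict:
    """Build the context keys the start gate is described as writing.

    Hypothetical helper: only the three key paths come from the findings.
    """
    return {
        "provider": provider,
        "provider_connection_id": provider_connection_id,
        "target_scope": {"entra_tenant_id": entra_tenant_id},
    }


if __name__ == "__main__":
    # Example values are placeholders, not real identifiers.
    context = provider_run_context("microsoft", 42, "00000000-0000-0000-0000-000000000000")
    print(json.dumps(context, indent=2))
```

Keeping the identity fields (`provider`, `provider_connection_id`) next to the target scope in one JSON document is what lets the start gate use the same data for dedupe and for display.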

## Decisions

### D1 — Single Source of Truth: ProviderConnection + ProviderCredential

**Decision**: All runtime provider calls use `ProviderConnection` + `ProviderCredential` via `ProviderGateway`.

**Rationale**: Eliminates drift between verification vs restore and makes the suite deterministic and auditable.

**Alternatives considered**:

- Continue dual-source (tenant fields + provider connections): rejected due to drift and security risk.
- Allow runtime fallback to tenant fields: rejected; violates “single read path” and creates non-determinism.

### D2 — Default enforcement applies to all providers; backfill creates Microsoft defaults only

**Decision**: The invariant “exactly one default per (tenant, provider)” is generic for all providers, but the one-time backfill only creates/repairs defaults for provider `microsoft`.

**Rationale**: Keeps the suite future-proof while delivering Microsoft-only cutover now.

**Alternatives considered**:

- Microsoft-only invariant: rejected; forces future migrations and special cases.

### D3 — Blocked starts still create an OperationRun

**Decision**: Starting a provider-backed operation without usable configuration still creates an `OperationRun` record to preserve observability.

**Rationale**: Operators need a canonical record for “what was attempted” and why it is blocked.

**Alternatives considered**:

- UI-only blocked banner without a run: rejected; loses auditability/observability.

### D4 — Represent “blocked” runs as a distinct OperationRun outcome

**Decision**: Introduce a `blocked` outcome on operation runs (keep status lifecycle unchanged: `completed`).

**Rationale**: The repo currently has no “blocked” status/outcome for runs; representing it explicitly prevents conflating blocked with failed.

**Alternatives considered**:

- Encode blocked as `outcome=failed` + reason_code: rejected; UI semantics become inconsistent and ambiguous.
- Add a new status value (`blocked`): rejected; affects active-run dedupe and status badge expectations more broadly.

### D5 — Backfill selection rule for existing connections without a default

**Decision**: If exactly one Microsoft provider connection exists, set it default. If multiple exist, do not auto-select (requires admin remediation).

**Rationale**: Avoids accidental selection of the wrong app registration.

**Alternatives considered**:

- Always pick the oldest: rejected; unsafe in enterprise environments.
- Always create a new connection: rejected; increases clutter and may violate tenant/provider/entra uniqueness.

### D6 — Legacy tenant credential reads allowed only in explicit backfill tooling

**Decision**: Legacy tenant fields (`tenants.app_*`) are forbidden in runtime and permitted only in backfill command/migration.

**Rationale**: Tightens the security posture and makes cutover verifiable via guard tests.

**Alternatives considered**:

- Runtime fallback: rejected.
- No backfill reads: rejected; forces manual secret re-entry for all tenants.

## Open Points (to be handled in implementation)

- Centralize “next steps” as link keys (the repo currently embeds Filament URLs directly in verification checks).
- Determine the final reason_code taxonomy mapping for common exceptions (credential missing, auth failure, tenant mismatch).
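
As an illustration of the D5 selection rule, the following Python sketch shows one way the per-tenant backfill decision could be expressed. The `Connection` dataclass, the `choose_backfill_default` function, and the action strings are hypothetical and not taken from the repo. The zero-connection case is not covered by D5, so the sketch only reports it and leaves the handling open.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Connection:
    """Hypothetical stand-in for one provider connection row of a tenant."""
    id: int
    provider: str
    is_default: bool = False


def choose_backfill_default(connections: Sequence[Connection]) -> Tuple[str, Optional[int]]:
    """Return (action, connection_id) for one tenant's connections.

    Follows D5: a single Microsoft connection becomes the default;
    several connections are never auto-selected.
    """
    microsoft = [c for c in connections if c.provider == "microsoft"]

    # A default already exists: nothing to backfill for this tenant.
    if any(c.is_default for c in microsoft):
        return ("keep_existing_default", None)

    # Exactly one candidate: safe to promote it.
    if len(microsoft) == 1:
        return ("set_default", microsoft[0].id)

    # Several candidates: picking one could choose the wrong app registration.
    if len(microsoft) > 1:
        return ("needs_admin_remediation", None)

    # No Microsoft connection at all: outside the D5 rule, reported as-is.
    return ("no_connections", None)
```

Returning an explicit action instead of silently skipping tenants would let a backfill command report how many tenants need manual remediation, which matches the intent of not auto-selecting among multiple connections.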