Implements Spec 081 provider-connection cutover.

Highlights:
- Adds provider connection resolution + gating for operations/verification.
- Adds provider credential observer wiring.
- Updates Filament tenant verify flow to block with next-steps when provider connection isn’t ready.
- Adds spec docs under specs/081-provider-connection-cutover/ and extensive Spec081 test coverage.

Tests:
- vendor/bin/sail artisan test --compact tests/Feature/Filament/TenantSetupTest.php
- Focused suites for ProviderConnections/Verification ran during implementation (see local logs).

Co-authored-by: Ahmed Darrazi <ahmeddarrazi@MacBookPro.fritz.box>
Reviewed-on: #98
Research: Provider Connection Full Cutover
Feature: specs/081-provider-connection-cutover/spec.md
Date: 2026-02-07
Goal
Resolve repo-specific unknowns for the full credential cutover, and document decisions with rationale and alternatives.
Findings (Repo Reality)
Existing ProviderConnection / ProviderCredential primitives
- ProviderConnection exists as a workspace-owned, tenant-scoped integration asset.
- Default invariant already exists at DB level via partial unique index: provider_connections_default_unique on (tenant_id, provider) where is_default = true.
- ProviderCredential exists and stores encrypted payload in payload (encrypted:array) and is hidden from serialization.
- ProviderGateway::graphOptions(ProviderConnection $connection) builds Graph options using CredentialManager.
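The partial unique index above can be illustrated with a small sketch. It uses Python and an in-memory SQLite database purely so it is self-contained; the repository's real schema, database engine, and migration code are not shown in this document, so the table layout and the helper `add_connection` are assumptions. Only the index name, its columns, and its `is_default` predicate come from the findings.

```python
import sqlite3

# Hypothetical minimal schema: only the columns the index needs.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE provider_connections (
        id INTEGER PRIMARY KEY,
        tenant_id INTEGER NOT NULL,
        provider TEXT NOT NULL,
        is_default INTEGER NOT NULL DEFAULT 0
    )
    """
)
# Partial unique index: uniqueness is enforced only for rows with is_default true.
conn.execute(
    """
    CREATE UNIQUE INDEX provider_connections_default_unique
    ON provider_connections (tenant_id, provider)
    WHERE is_default = 1
    """
)


def add_connection(tenant_id: int, provider: str, is_default: bool) -> bool:
    """Insert a row; return False if the default invariant rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO provider_connections (tenant_id, provider, is_default) "
                "VALUES (?, ?, ?)",
                (tenant_id, provider, int(is_default)),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Note that an index of this kind only guarantees at most one default per (tenant_id, provider). Ensuring that a default actually exists has to come from application logic or a backfill.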
Existing runtime provider call patterns
- Some jobs are already ProviderConnection-first:
  - Provider connection health check uses ProviderGateway::graphOptions($connection).
  - Provider operation start gate (ProviderOperationStartGate) uses provider_connection_id in operation run context and dedupe.
- Legacy tenant credential reads still exist in high-impact services and UI:
  - Services: inventory sync, policy sync, policy snapshots/backups, restore, RBAC onboarding, scope tag resolver.
  - UI: tenant registration + tenant resource form exposes app_client_id / app_client_secret.
Operations / observability primitives
- OperationRun has:
  - status: queued|running|completed
  - outcome: pending|succeeded|partially_succeeded|failed (+ reserved cancelled)
  - context JSON field used for identity and target scope.
- Provider operation start gate already writes context.provider, context.provider_connection_id, and context.target_scope.entra_tenant_id.
Decisions
D1 — Single Source of Truth: ProviderConnection + ProviderCredential
Decision: All runtime provider calls use ProviderConnection + ProviderCredential via ProviderGateway.
Rationale: Eliminates drift between verification and restore, and makes the suite deterministic and auditable.
Alternatives considered:
- Continue dual-source (tenant fields + provider connections): rejected due to drift and security risk.
- Allow runtime fallback to tenant fields: rejected; violates “single read path” and creates non-determinism.
D2 — Default enforcement applies to all providers; backfill creates Microsoft defaults only
Decision: The invariant “exactly one default per (tenant, provider)” is generic for all providers, but the one-time backfill only creates/repairs defaults for provider microsoft.
Rationale: Keeps the suite future-proof while delivering Microsoft-only cutover now.
Alternatives considered:
- Microsoft-only invariant: rejected; forces future migrations and special cases.
D3 — Blocked starts still create an OperationRun
Decision: Starting a provider-backed operation without usable configuration still creates an OperationRun record to preserve observability.
Rationale: Operators need a canonical record for “what was attempted” and why it is blocked.
Alternatives considered:
- UI-only blocked banner without a run: rejected; loses auditability/observability.
D4 — Represent “blocked” runs as a distinct OperationRun outcome
Decision: Introduce a blocked outcome on operation runs. The status lifecycle is unchanged: a blocked run ends with status completed.
Rationale: The repo currently has no “blocked” status/outcome for runs; representing it explicitly prevents conflating blocked with failed.
Alternatives considered:
- Encode blocked as outcome=failed + reason_code: rejected; UI semantics become inconsistent and ambiguous.
- Add a new status value (blocked): rejected; affects active-run dedupe and status badge expectations more broadly.
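Decisions D3 and D4 together can be sketched as follows. This is an illustration in Python, not the repository's PHP implementation. The names `start_operation` and `is_active`, the placeholder reason code, and storing `reason_code` in the run context are all assumptions. The sketch also assumes that active-run dedupe is keyed on status alone, which is one reading of why a new status value was rejected.

```python
from dataclasses import dataclass, field
from typing import Optional

STATUSES = {"queued", "running", "completed"}  # lifecycle stays unchanged
OUTCOMES = {"pending", "succeeded", "partially_succeeded", "failed", "blocked"}


@dataclass
class OperationRun:
    status: str
    outcome: str
    context: dict = field(default_factory=dict)


def start_operation(provider: str, connection_id: Optional[int], usable: bool) -> OperationRun:
    """Always return a run record; a blocked start is completed immediately."""
    context = {"provider": provider, "provider_connection_id": connection_id}
    if connection_id is None or not usable:
        # Placeholder value: the final reason_code taxonomy is still an open point.
        context["reason_code"] = "provider_connection_not_ready"
        return OperationRun(status="completed", outcome="blocked", context=context)
    return OperationRun(status="queued", outcome="pending", context=context)


def is_active(run: OperationRun) -> bool:
    """Assumed dedupe rule: only queued or running runs count as active."""
    return run.status in {"queued", "running"}
```

Under these assumptions a blocked run is recorded for auditability but never counts as an active run, so it does not interfere with dedupe for a later retry.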
D5 — Backfill selection rule for existing connections without a default
Decision: If exactly one Microsoft provider connection exists, set it default. If multiple exist, do not auto-select (requires admin remediation).
Rationale: Avoids accidental selection of the wrong app registration.
Alternatives considered:
- Always pick the oldest: rejected; unsafe in enterprise environments.
- Always create a new connection: rejected; increases clutter and may violate tenant/provider/entra uniqueness.
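The D5 selection rule is small enough to state as a function. The sketch below is in Python for illustration only; `choose_backfill_default` is a hypothetical name, and its input is assumed to be the IDs of a tenant's existing Microsoft provider connections when none of them is currently the default.

```python
from typing import Optional, Sequence


def choose_backfill_default(connection_ids: Sequence[int]) -> Optional[int]:
    """Return the connection to mark as default, or None if no safe choice exists."""
    if len(connection_ids) == 1:
        return connection_ids[0]
    # No connections: nothing to promote under this rule.
    # Several connections: ambiguous, so leave the choice to an admin.
    return None
```

Returning None for the multi-connection case keeps the backfill from guessing, which matches the stated rationale of not selecting the wrong app registration.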
D6 — Legacy tenant credential reads allowed only in explicit backfill tooling
Decision: Legacy tenant fields (tenants.app_*) are forbidden at runtime and permitted only in the backfill command/migration.
Rationale: Tightens the security posture and makes cutover verifiable via guard tests.
Alternatives considered:
- Runtime fallback: rejected.
- No backfill reads: rejected; forces manual secret re-entry for all tenants.
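One possible shape for the guard tests mentioned in the D6 rationale is a source scan with an allowlist. The Python sketch below is only an illustration of that idea. The allowlisted path prefixes are hypothetical, since the document does not name the backfill command or migration files. The pattern covers only the two field names listed in the findings (app_client_id and app_client_secret).

```python
import re
from typing import Dict, List

# Matches the two legacy field names this document mentions.
LEGACY_FIELD = re.compile(r"\bapp_client_(?:id|secret)\b")

# Hypothetical allowlist: the real backfill paths are not given in the source.
ALLOWED_PREFIXES = ("app/Console/Commands/Backfill", "database/migrations/")


def find_legacy_reads(files: Dict[str, str]) -> List[str]:
    """Return paths that mention legacy tenant credential fields outside the allowlist."""
    offenders = []
    for path, source in sorted(files.items()):
        if path.startswith(ALLOWED_PREFIXES):
            continue  # backfill tooling is allowed to read legacy fields
        if LEGACY_FIELD.search(source):
            offenders.append(path)
    return offenders
```

A guard test built this way would pass only when the returned list is empty, so any new runtime read of the legacy fields fails the suite.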
Open Points (to be handled in implementation)
- Centralize “next steps” as link keys (the repo currently embeds Filament URLs directly in verification checks).
- Determine the final reason_code taxonomy mapping for common exceptions (credential missing, auth failure, tenant mismatch).