ahmido 4db8030f2a Spec 081: Provider connection cutover (#98)

Implements Spec 081 provider-connection cutover.

Highlights:
- Adds provider connection resolution + gating for operations/verification.
- Adds provider credential observer wiring.
- Updates Filament tenant verify flow to block with next-steps when provider connection isn’t ready.
- Adds spec docs under specs/081-provider-connection-cutover/ and extensive Spec081 test coverage.

Tests:
- vendor/bin/sail artisan test --compact tests/Feature/Filament/TenantSetupTest.php
- Focused suites for ProviderConnections/Verification ran during implementation (see local logs).

Co-authored-by: Ahmed Darrazi <ahmeddarrazi@MacBookPro.fritz.box>
Reviewed-on: #98

2026-02-08 11:28:51 +00:00


Research: Provider Connection Full Cutover

Feature: specs/081-provider-connection-cutover/spec.md
Date: 2026-02-07

Goal

Resolve repo-specific unknowns for the full credential cutover, and document decisions with rationale and alternatives.

Findings (Repo Reality)

Existing ProviderConnection / ProviderCredential primitives

ProviderConnection exists as a workspace-owned, tenant-scoped integration asset.
Default invariant already exists at DB level via partial unique index:
- provider_connections_default_unique on (tenant_id, provider) where is_default = true.
ProviderCredential exists, stores its secrets encrypted in the payload attribute (cast encrypted:array), and is hidden from serialization.
ProviderGateway::graphOptions(ProviderConnection $connection) builds Graph options using CredentialManager.
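
To make the default invariant concrete, here is a minimal sketch of a partial unique index, using Python's built-in sqlite3 module. The index name and key columns come from the finding above; the remaining table layout, the helper names (make_db, add_connection), and the use of SQLite rather than the repo's actual database are assumptions for illustration. Note that an index of this kind guarantees at most one default per (tenant_id, provider); it does not by itself guarantee that a default exists.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory table with a partial unique index on defaults."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """
        CREATE TABLE provider_connections (
            id INTEGER PRIMARY KEY,
            tenant_id INTEGER NOT NULL,
            provider TEXT NOT NULL,
            is_default INTEGER NOT NULL DEFAULT 0
        )
        """
    )
    # Uniqueness is enforced only for rows where is_default is true.
    db.execute(
        """
        CREATE UNIQUE INDEX provider_connections_default_unique
        ON provider_connections (tenant_id, provider)
        WHERE is_default = 1
        """
    )
    return db


def add_connection(db: sqlite3.Connection, tenant_id: int, provider: str, is_default: bool) -> bool:
    """Insert a row; return False if the default invariant rejects it."""
    try:
        db.execute(
            "INSERT INTO provider_connections (tenant_id, provider, is_default) VALUES (?, ?, ?)",
            (tenant_id, provider, int(is_default)),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

A second default for the same tenant and provider is rejected, while any number of non-default rows is accepted.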

Existing runtime provider call patterns

Some jobs are already ProviderConnection-first:
- Provider connection health check uses ProviderGateway::graphOptions($connection).
- Provider operation start gate (ProviderOperationStartGate) uses provider_connection_id in operation run context and dedupe.
Legacy tenant credential reads still exist in high-impact services and UI:
- Services: inventory sync, policy sync, policy snapshots/backups, restore, RBAC onboarding, scope tag resolver.
- UI: tenant registration + tenant resource form exposes app_client_id / app_client_secret.

Operations / observability primitives

OperationRun has:
- status: queued|running|completed
- outcome: pending|succeeded|partially_succeeded|failed (+ reserved cancelled)
- context JSON field used for identity and target scope.
Provider operation start gate already writes context.provider, context.provider_connection_id, and context.target_scope.entra_tenant_id.
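
The run shape described above can be sketched as a small data model. The status, outcome, and context key names are taken from the findings; the Python class names, the provider_context helper, and the default values (queued / pending) are assumptions for illustration, not the repo's model.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any


class Status(str, Enum):
    QUEUED = "queued"
    RUNNING = "running"
    COMPLETED = "completed"


class Outcome(str, Enum):
    PENDING = "pending"
    SUCCEEDED = "succeeded"
    PARTIALLY_SUCCEEDED = "partially_succeeded"
    FAILED = "failed"
    CANCELLED = "cancelled"  # reserved, per the findings


@dataclass
class OperationRun:
    # Defaults are an assumption made for this sketch.
    status: Status = Status.QUEUED
    outcome: Outcome = Outcome.PENDING
    context: dict[str, Any] = field(default_factory=dict)


def provider_context(provider: str, connection_id: int, entra_tenant_id: str) -> dict[str, Any]:
    """Build the context keys the start gate is reported to write."""
    return {
        "provider": provider,
        "provider_connection_id": connection_id,
        "target_scope": {"entra_tenant_id": entra_tenant_id},
    }
```

In this model there is no value for a blocked run yet, which is the gap decision D4 addresses.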

Decisions

D1 — Single Source of Truth: ProviderConnection + ProviderCredential

Decision: All runtime provider calls use ProviderConnection + ProviderCredential via ProviderGateway.

Rationale: Eliminates drift between verification and restore, and makes the suite deterministic and auditable.

Alternatives considered:

Continue dual-source (tenant fields + provider connections): rejected due to drift and security risk.
Allow runtime fallback to tenant fields: rejected; violates “single read path” and creates non-determinism.
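
A minimal sketch of what a single read path means in practice: the options builder accepts only a connection, so there is no way for it to reach legacy tenant fields. The function is loosely modeled on ProviderGateway::graphOptions, but the payload keys (client_id, client_secret), the returned option names, and the CredentialMissing exception are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


class CredentialMissing(Exception):
    """Hypothetical error for a connection without a usable credential."""


@dataclass
class ProviderCredential:
    payload: dict  # decrypted payload; key names are assumptions


@dataclass
class ProviderConnection:
    id: int
    provider: str
    entra_tenant_id: str
    credential: Optional[ProviderCredential] = None


def graph_options(connection: ProviderConnection) -> dict:
    """Build call options from the connection's own credential only."""
    if connection.credential is None:
        # No fallback to tenant fields: fail loudly instead of guessing.
        raise CredentialMissing(f"connection {connection.id} has no credential")
    return {
        "tenant": connection.entra_tenant_id,
        "client_id": connection.credential.payload["client_id"],
        "client_secret": connection.credential.payload["client_secret"],
    }
```

Because the tenant record is never passed in, a runtime fallback cannot be reintroduced without changing the signature, which makes a regression easy to spot in review.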

D2 — Default enforcement applies to all providers; backfill creates Microsoft defaults only

Decision: The invariant “exactly one default per (tenant, provider)” is generic for all providers, but the one-time backfill only creates/repairs defaults for provider microsoft.

Rationale: Keeps the suite future-proof while delivering Microsoft-only cutover now.

Alternatives considered:

Microsoft-only invariant: rejected; forces future migrations and special cases.

D3 — Blocked starts still create an OperationRun

Decision: Starting a provider-backed operation without usable configuration still creates an OperationRun record to preserve observability.

Rationale: Operators need a canonical record for “what was attempted” and why it is blocked.

Alternatives considered:

UI-only blocked banner without a run: rejected; loses auditability/observability.

D4 — Represent “blocked” runs as a distinct OperationRun outcome

Decision: Introduce a blocked outcome on operation runs. The status lifecycle stays unchanged; a blocked run ends with status completed.

Rationale: The repo currently has no “blocked” status/outcome for runs; representing it explicitly prevents conflating blocked with failed.

Alternatives considered:

Encode blocked as outcome=failed + reason_code: rejected; UI semantics become inconsistent and ambiguous.
Add a new status value (blocked): rejected; affects active-run dedupe and status badge expectations more broadly.
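
Taken together, D3 and D4 can be sketched as a start gate that always records a run and marks it blocked when the configuration is unusable. The status and outcome strings follow the decisions above; the function names, the idea of storing reason_code inside context, and the reason code value itself are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional

# Statuses that count as "active" for dedupe purposes (assumption).
ACTIVE_STATUSES = {"queued", "running"}


@dataclass
class OperationRun:
    status: str
    outcome: str
    context: dict = field(default_factory=dict)


def start_operation(runs: list, connection_id: Optional[int], ready: bool) -> OperationRun:
    """Always record a run; mark it blocked when configuration is unusable."""
    context = {"provider": "microsoft", "provider_connection_id": connection_id}
    if connection_id is None or not ready:
        # Hypothetical reason code; the final taxonomy is an open point.
        context["reason_code"] = "provider_connection_not_ready"
        run = OperationRun(status="completed", outcome="blocked", context=context)
    else:
        run = OperationRun(status="queued", outcome="pending", context=context)
    runs.append(run)
    return run


def active_runs(runs: list) -> list:
    """Runs that would take part in active-run dedupe."""
    return [r for r in runs if r.status in ACTIVE_STATUSES]
```

Since a blocked run is already completed, it never shows up among active runs, so dedupe logic keyed on status needs no change under this model.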

D5 — Backfill selection rule for existing connections without a default

Decision: If exactly one Microsoft provider connection exists, set it default. If multiple exist, do not auto-select (requires admin remediation).

Rationale: Avoids accidental selection of the wrong app registration.

Alternatives considered:

Always pick the oldest: rejected; unsafe in enterprise environments.
Always create a new connection: rejected; increases clutter and may violate tenant/provider/entra uniqueness.
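
The selection rule is small enough to state as a function. This is a sketch under the assumption that connections for one tenant are already loaded; the Connection class and the function name are hypothetical. It covers only the choice among existing connections, not the creation of a connection for tenants that have none.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Connection:
    id: int
    provider: str
    is_default: bool = False


def pick_backfill_default(connections: list) -> Optional[Connection]:
    """Return the connection to mark as default, or None to leave things as-is."""
    microsoft = [c for c in connections if c.provider == "microsoft"]
    if any(c.is_default for c in microsoft):
        return None  # a default already exists; nothing to repair
    if len(microsoft) == 1:
        return microsoft[0]  # unambiguous: exactly one candidate
    # Several candidates require admin remediation; zero candidates are
    # outside this rule (creation is a separate backfill step).
    return None
```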

D6 — Legacy tenant credential reads allowed only in explicit backfill tooling

Decision: Reads of legacy tenant fields (tenants.app_*) are forbidden at runtime and permitted only in the backfill command/migration.

Rationale: Tightens the security posture and makes cutover verifiable via guard tests.

Alternatives considered:

Runtime fallback: rejected.
No backfill reads: rejected; forces manual secret re-entry for all tenants.
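
One way a guard test could verify this decision is a source scan that fails when legacy field names appear outside an allowlist. The sketch below is an assumption about how such a guard might look, not the repo's test: the function name, the allowlist paths, and the choice to match only app_client_id / app_client_secret (the two fields named in the findings) are all hypothetical.

```python
import re
from pathlib import Path

# Matches the two legacy field names mentioned in the findings.
LEGACY_FIELD = re.compile(r"\bapp_client_(id|secret)\b")


def legacy_reads(root: Path, allowed: set) -> list:
    """Return relative paths of PHP files that mention legacy tenant fields."""
    hits = []
    for path in sorted(root.rglob("*.php")):
        rel = path.relative_to(root).as_posix()
        if rel in allowed:
            continue  # explicit backfill tooling is exempt
        if LEGACY_FIELD.search(path.read_text(encoding="utf-8")):
            hits.append(rel)
    return hits
```

A guard test would then assert that the returned list is empty for the application source tree, with only the backfill command and migration on the allowlist.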

Open Points (to be handled in implementation)

Centralize “next steps” as link keys (the repo currently embeds Filament URLs directly in verification checks).
Determine the final reason_code taxonomy mapping for common exceptions (credential missing, auth failure, tenant mismatch).
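
Both open points could share one lookup: exceptions map to a reason_code, and each reason_code maps to a next-step link key that the UI resolves to a URL. The sketch below only illustrates that shape. Every exception class, reason code, and link key in it is a placeholder, since the taxonomy has not been decided.

```python
class CredentialMissing(Exception):
    """Placeholder exception for a missing credential."""


class AuthFailure(Exception):
    """Placeholder exception for a rejected credential."""


class TenantMismatch(Exception):
    """Placeholder exception for a wrong target tenant."""


# Placeholder taxonomy: exception type -> reason_code.
REASON_CODES = {
    CredentialMissing: "credential_missing",
    AuthFailure: "auth_failure",
    TenantMismatch: "tenant_mismatch",
}

# Placeholder link keys, resolved to real URLs elsewhere (e.g. in the UI layer).
NEXT_STEP_LINK_KEYS = {
    "credential_missing": "provider_connection.edit",
    "auth_failure": "provider_connection.edit",
    "tenant_mismatch": "provider_connection.index",
}


def classify(exc: Exception) -> tuple[str, str]:
    """Map an exception to (reason_code, next-step link key)."""
    for exc_type, code in REASON_CODES.items():
        if isinstance(exc, exc_type):
            return code, NEXT_STEP_LINK_KEYS[code]
    return "unknown_error", "provider_connection.index"
```

Keeping link keys rather than URLs in the checks means the verification code stays free of Filament routes.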
