TenantAtlas/specs/188-provider-connection-state-cleanup/research.md
ahmido 1655cc481e Spec 188: canonical provider connection state cleanup (#219)
## Summary
- migrate provider connections to the canonical three-dimension state model: lifecycle via `is_enabled`, consent via `consent_status`, and verification via `verification_status`
- remove legacy provider status and health badge paths, update admin and system directory surfaces, and align onboarding, consent callback, verification, resolver, and mutation flows with the new model
- add the Spec 188 artifact set, schema migrations, guard coverage, and expanded provider-state tests across admin, system, onboarding, verification, and rendering paths

## Verification
- `cd apps/platform && ./vendor/bin/sail bin pint --dirty --format agent`
- `cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Auth/SystemPanelAuthTest.php tests/Feature/Filament/TenantGlobalSearchLifecycleScopeTest.php tests/Feature/ProviderConnections/ProviderConnectionEnableDisableTest.php tests/Feature/ProviderConnections/ProviderConnectionTruthCleanupSpec179Test.php`
- integrated browser smoke: validated admin provider list/detail/edit, tenant provider summary, system directory tenant detail, provider-connection search exclusion, and cleaned up the temporary smoke record afterward

## Filament / implementation notes
- Livewire v4.0+ compliance: preserved; this change targets Filament v5 on Livewire v4 and does not introduce older APIs
- Provider registration location: unchanged; Laravel 11+ panel providers remain registered in `bootstrap/providers.php`
- Globally searchable resources: `ProviderConnectionResource` remains intentionally excluded from global search; tenant global search remains enabled and continues to resolve to view pages
- Destructive actions: no new destructive action surface was introduced without confirmation or authorization; existing capability checks continue to gate provider mutations
- Asset strategy: unchanged; no new Filament assets were added, so deploy behavior for `php artisan filament:assets` remains unchanged
- Testing plan covered: system auth, tenant global search, provider lifecycle enable/disable behavior, and provider truth cleanup cutover behavior

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #219
2026-04-10 11:22:56 +00:00

74 lines
8.7 KiB
Markdown

# Phase 0 Research: Canonical Provider Connection State Cleanup
## Decision: Persist lifecycle as a single `is_enabled` boolean and present it as the `Lifecycle` dimension
**Rationale**: The only provider-state behavior still carried uniquely by legacy `status` is the explicit enabled or disabled gate. Consent and verification already have canonical enums and should not absorb administrative disablement. A boolean `is_enabled` is the narrowest persisted truth that preserves the current `disabled` behavior without inventing a broader lifecycle framework or another enum class.
**Alternatives considered**:
- Add a new `lifecycle_status` enum class: rejected because current-release behavior is binary and a boolean is narrower.
- Keep deriving disablement from legacy `status`: rejected because it preserves the dual-truth model the spec is removing.
- Encode disablement inside `verification_status`: rejected because administrative pause is not a verification outcome.
## Decision: Keep `ProviderConsentStatus` and `ProviderVerificationStatus` as the only consent and verification truths
**Rationale**: The current schema already stores `consent_status` and `verification_status`, and the runtime already uses them in onboarding, consent, and health-check flows. The cleanup should remove the competing legacy layer, not introduce a new readiness or health family. Consent continues to answer permission state, verification continues to answer what the latest check proves, and lifecycle remains separate.
**Alternatives considered**:
- Introduce a new combined provider readiness status: rejected because it would collapse three distinct operator questions back into one ambiguous state.
- Preserve legacy `status` as a derived compatibility field: rejected because the spec calls for a hard cut with no compatibility tail.
## Decision: Remove legacy projection outputs and trim `ProviderConnectionStateProjector` to canonical verification-outcome logic only
**Rationale**: `ProviderConnectionStateProjector::project()` and `projectForConnection()` exist to emit `status` and `health_status`, preserve `disabled`, and keep legacy fields alive across multiple writers. That responsibility disappears after cutover. The only reusable logic worth retaining is the current `projectVerificationOutcome()` reasoning that translates a health-check result plus current consent state into canonical consent and verification outcomes and error metadata. Retaining that narrow derivation avoids duplicating reason-code handling while still eliminating legacy field projection.
**Alternatives considered**:
- Keep the projector and repoint it to emit a new canonical DTO: rejected because direct canonical persistence is simpler and a new DTO layer is unnecessary.
- Inline all verification-outcome logic into each writer: rejected because the consent and verification reason-code rules already exist in one place and are reused.
## Decision: Replace legacy `HealthResult` transport fields with canonical verification output
**Rationale**: `HealthResult` currently transports legacy `status` and `healthStatus`, and `MicrosoftProviderHealthCheck` manufactures those values purely so downstream code can keep writing the removed columns. The contract should instead expose canonical verification intent and diagnostics only: `verificationStatus`, `reasonCode`, `message`, and `meta`. That keeps health-check output aligned with the final storage model and removes a legacy vocabulary bridge from the contract layer.
**Alternatives considered**:
- Keep `status` and `healthStatus` in `HealthResult` until the final migration: rejected because that prolongs legacy semantics inside the runtime contract layer.
- Replace `HealthResult` with a new service or result hierarchy: rejected because the existing contract can be narrowed without new abstraction.
## Decision: Treat reader cutover as the first breaking-change phase and schema removal as the last
**Rationale**: The current runtime still reads legacy fields in `ProviderConnectionResolver`, `ScanEntraAdminRolesJob`, `ProviderConnectionResource` action visibility, tenant provider summaries, system directory queries, and shared Blade views. Dropping columns before those reads move would break provider resolution, action gating, and page rendering immediately. The correct order is lifecycle foundation plus the one-time disabled-state backfill first, canonical reader cutover across shared helpers, operator surfaces, and runtime gates second, canonical writer and transport cleanup third, badge and helper cleanup fourth, residual test-scaffolding cleanup fifth, and schema removal last.
**Alternatives considered**:
- Remove columns immediately and fix failures opportunistically: rejected because the affected reads span resolvers, background jobs, Livewire actions, and read-only system pages.
- Dual-read or dual-write for a compatibility period: rejected because the feature explicitly forbids a compatibility tail.
## Decision: Recompute system-directory health summaries from canonical verification and permission truth
**Rationale**: The system directory currently counts provider issues from `health_status` and renders provider rows from `status` and `health_status`. Those summaries must move to canonical verification plus existing permission signals so `/system` remains semantically aligned with `/admin`. Tenant-level list rollups count only the default Microsoft provider connection for each tenant; non-default connections may still appear on the read-only detail page but do not affect list rollup badges. The existing `SystemHealth` badge domain can stay; only its input contract changes.
**Alternatives considered**:
- Leave `/system` on legacy diagnostics while `/admin` becomes canonical: rejected because the same connection would tell two different stories depending on the plane.
- Introduce a new system-only provider-health projection: rejected because that would add another layer of derived truth.
## Decision: Reuse existing centralized badge infrastructure and retire legacy provider badge domains
**Rationale**: The repo already has `BadgeDomain::BooleanEnabled`, `ProviderConsentStatus`, and `ProviderVerificationStatus`. That is enough to render lifecycle, consent, and verification without inventing a provider-lifecycle badge domain. The cleanup should delete `BadgeDomain::ProviderConnectionStatus`, `BadgeDomain::ProviderConnectionHealth`, and their mapper classes, while reusing `BooleanEnabled` for lifecycle and the existing provider consent and verification mappings for the other two dimensions.
**Alternatives considered**:
- Add a new provider lifecycle badge domain: rejected because `BooleanEnabled` already matches the required enabled or disabled semantics.
- Keep legacy badge domains for diagnostics only: rejected because their continued existence makes reintroduction of legacy truth too easy.
## Decision: Shared provider-state helpers and views must emit only lifecycle, consent, and verification
**Rationale**: `TenantResource::providerConnectionState()`, the shared `provider-connection-state.blade.php` entry, and system directory detail views still carry `status` and `health_status` keys. Those helpers are the most likely place for accidental legacy fallback because many surfaces share them. Their contract should change directly to lifecycle, consent, verification, and diagnostics such as `last_health_check_at` and `last_error_reason_code`.
**Alternatives considered**:
- Leave legacy keys in helper arrays and stop rendering them: rejected because hidden compatibility keys would keep inviting silent legacy reads.
- Create a new presenter for canonical provider state: rejected because the existing helper contract can be narrowed without adding a new presentation layer.
## Decision: Update factories and Pest helpers before final schema removal and add contradiction and residual-guard coverage
**Rationale**: `ProviderConnectionFactory` and the shared Pest helper still force `status = connected` and `health_status = ok` when constructing default provider fixtures. Those helpers will recreate legacy coupling unless they change before the drop migration lands. The regression suite also needs explicit contradiction scenarios, such as disabled plus consent granted and disabled plus verification healthy, so lifecycle separation is proven after the cutover. A final residual guard should fail if provider-state runtime paths still reference removed columns.
**Alternatives considered**:
- Let individual tests update as they fail: rejected because the helper layer would keep reintroducing removed fields across unrelated suites.
- Rely only on migration failure to find remaining references: rejected because many references live in PHP helpers, Blade views, and query builders that can survive until runtime.