TenantAtlas/specs/239-canonical-operation-type-source-of-truth/research.md
ahmido fb32e9bfa5
feat: canonical operation type source of truth (#276)
## Summary
- implement the canonical operation type source-of-truth slice across operation writers, monitoring surfaces, onboarding flows, and supporting services
- add focused contract and regression coverage for canonical operation type handling
- include the generated spec 239 artifacts for the feature slice

## Validation
- browser smoke PASS for `/admin` -> workspace overview -> operations -> operation detail -> tenant-scoped operations drilldown
- spec/plan/tasks/quickstart artifact analysis cleaned up to a no-findings state
- automated test suite not run in this session

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #276
2026-04-25 18:11:23 +00:00

Research: Canonical Operation Type Source of Truth

Decision 1: Promote OperationCatalog from canonical read helper to sole normative contract

  • Decision: Treat the dotted definitions in App\Support\OperationCatalog as the single platform-owned operation_type contract for the first implementation slice, and converge write owners on those values directly.
  • Rationale: Repo reads show OperationCatalog already owns canonical dotted codes, labels, alias retirement metadata, and filter convergence, while OperationRunType, provider registry definitions, onboarding state, and lifecycle config still emit legacy aliases. Keeping canonicalCode() as a required second step preserves the dual-truth problem instead of removing it.
  • Alternatives considered: Keep the current dual-semantics model where enum or registry values remain legacy aliases and callers translate later via canonicalCode(). Rejected because it continues to teach the wrong contract and keeps every new caller responsible for remembering the translation step.
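
A minimal sketch of the single-truth model described above, written in Python rather than the repo's PHP so that it stays self-contained. The codes `backup_set.update` and `rbac.health_check` come from these notes; `inventory.sync`, the labels, and the `start_run` helper are assumptions for illustration and do not reflect the actual OperationCatalog API.

```python
# Hypothetical stand-in for the catalog: canonical dotted code -> label.
# "inventory.sync" and all labels are assumed, not taken from the repo.
CATALOG = {
    "inventory.sync": "Inventory sync",
    "backup_set.update": "Backup set update",
    "rbac.health_check": "RBAC health check",
}


def start_run(operation_type: str) -> dict:
    """Writer that accepts only canonical codes, with no translation step."""
    if operation_type not in CATALOG:
        raise ValueError(f"not a canonical operation type: {operation_type!r}")
    return {"type": operation_type, "label": CATALOG[operation_type]}


# The caller passes the canonical value directly; there is no
# "emit alias now, call canonicalCode() later" second step to forget.
run = start_run("inventory.sync")
```

In this shape a legacy alias such as `inventory_sync` fails at the write boundary instead of being quietly translated later.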

Decision 2: Keep compatibility explicitly bounded and read-side only

  • Decision: The only allowed compatibility seam is read-time alias resolution for historical operation_runs.type rows and persisted onboarding draft state already present during rollout.
  • Rationale: The spec explicitly requires historical readability, and repo truth shows legacy aliases still exist in stored rows and onboarding session state. Existing OperationCatalog::resolve() and onboarding normalization are the narrowest places to keep that seam without preserving aliases as current truth.
  • Alternatives considered:
    • Rewrite all historical rows and draft payloads as part of this slice. Rejected because it broadens the work into data backfill and migration cleanup outside the requested scope.
    • Add dual-write or fallback writers. Rejected by the spec and by LEAN-001 because it would make the drift permanent.
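
The bounded seam can be sketched as two separate functions: one tolerant reader and one strict writer. This is an illustrative Python sketch, not the repo's PHP; the alias names come from these notes, but the canonical targets (`inventory.sync`, `baseline.capture`) and both function names are assumptions.

```python
# Hypothetical alias table. Alias keys appear in the research notes;
# the canonical values they map to are assumed for illustration only.
LEGACY_ALIASES = {
    "inventory_sync": "inventory.sync",
    "baseline_capture": "baseline.capture",
}
CANONICAL = {"inventory.sync", "baseline.capture", "rbac.health_check"}


def resolve_for_read(stored_type: str) -> str:
    """Read-time seam: map a historical stored value to its canonical code."""
    if stored_type in CANONICAL:
        return stored_type
    if stored_type in LEGACY_ALIASES:
        return LEGACY_ALIASES[stored_type]
    raise KeyError(f"unknown operation type: {stored_type!r}")


def validate_for_write(operation_type: str) -> str:
    """Write path: canonical codes only, with no alias fallback."""
    if operation_type not in CANONICAL:
        raise ValueError(f"writers must emit canonical codes, got {operation_type!r}")
    return operation_type
```

Keeping the alias table reachable only from the read function is what stops compatibility from leaking back into writers.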

Decision 3: First-slice write owners are concrete and already visible in repo hotspots

  • Decision: The first implementation pass should converge these concrete write owners before widening into additional consumers: OperationRunType, ProviderOperationRegistry, ProviderOperationStartGate, ManagedTenantOnboardingWizard, and tenantpilot.operations.lifecycle.covered_types.
  • Rationale: Repo reads show each of these currently emits or persists raw aliases such as inventory_sync, baseline_capture, entra_group_sync, or backup_schedule_run. If they remain unchanged, every touched read model must keep absorbing that drift indefinitely.
  • Alternatives considered: Start by patching only list labels and filter options. Rejected because read-only cleanup would leave onboarding, provider dispatch, and lifecycle policy config still writing legacy values.

Decision 4: Raw type-specific branches and operator-adjacent metadata are part of the first pass

  • Decision: Replace or bound raw type comparisons and raw operation_type metadata copies in touched consumers such as OperationRunResource, OperationRunLinks, OperationRunService, OperationRunTriageService, and FindingsLifecycleBackfillRunbookService.
  • Rationale: The repo already resolves labels through OperationCatalog, but several branches still compare against raw aliases like baseline_compare, baseline_capture, and inventory_sync, and several metadata payloads still emit (string) $run->type directly. Leaving those sites untouched would keep a hidden second truth even after primary writes are fixed.
  • Alternatives considered: Limit scope to visible labels only. Rejected because audit-adjacent summaries, system triage metadata, and onboarding audit payloads are operator-facing evidence paths too.
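
To illustrate why a raw comparison is a hidden second truth, here is a small Python sketch (not the repo's PHP). The `baseline_compare` alias is named in these notes; the canonical target `baseline.compare` and the helper names are assumptions.

```python
# Assumed mapping for illustration; the real alias metadata lives in the catalog.
LEGACY_ALIASES = {"baseline_compare": "baseline.compare"}


def canonical(stored_type: str) -> str:
    """Resolve a stored value to its canonical code, passing canonical values through."""
    return LEGACY_ALIASES.get(stored_type, stored_type)


def is_baseline_compare_raw(run: dict) -> bool:
    # Raw branch: matches only rows that still carry the legacy alias.
    return run["type"] == "baseline_compare"


def is_baseline_compare(run: dict) -> bool:
    # Bounded branch: matches historical and newly written rows alike.
    return canonical(run["type"]) == "baseline.compare"


old_row = {"type": "baseline_compare"}
new_row = {"type": "baseline.compare"}
```

Once writers emit canonical values, the raw branch silently stops matching new rows, which is the kind of drift this decision is meant to prevent.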

Decision 5: Canonical dotted codes with underscore segments remain canonical and unchanged

  • Decision: Preserve existing dotted canonical codes that already contain underscore segments and explicitly treat them as in-bounds current-release truth, not cleanup debt for this spec.
  • Rationale: Repo truth in OperationCatalog shows current canonical entries such as backup_set.update, directory.role_definitions.sync, tenant.review_pack.generate, tenant.evidence.snapshot.generate, entra.admin_roles.scan, and rbac.health_check. The spec explicitly forbids widening this slice into cosmetic segment renaming.
  • Alternatives considered: Rename all embedded underscore segments while the contract is being hardened. Rejected because that turns one contract-hardening spec into a broader vocabulary rewrite.
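
A shape check that accepts underscore segments can make this decision concrete. The regular expression below is an assumed rule written for illustration in Python; it is not taken from the repo, and catalog membership, not shape, remains the actual contract.

```python
import re

# Assumed shape rule: two or more dot-separated segments, each starting with a
# lowercase letter and containing lowercase letters, digits, or underscores.
CANONICAL_SHAPE = re.compile(r"^[a-z][a-z0-9_]*(\.[a-z][a-z0-9_]*)+$")


def looks_canonical(code: str) -> bool:
    """Return True when the code has the dotted shape, underscores allowed."""
    return CANONICAL_SHAPE.fullmatch(code) is not None


# Canonical entries listed in the rationale above.
listed = [
    "backup_set.update",
    "directory.role_definitions.sync",
    "tenant.review_pack.generate",
    "tenant.evidence.snapshot.generate",
    "entra.admin_roles.scan",
    "rbac.health_check",
]
```

Under this rule every listed entry passes, while an undotted legacy alias such as `inventory_sync` does not.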

Decision 6: Non-UI exports and summaries that still surface raw type must be called out explicitly

  • Decision: Treat the following non-UI or audit-adjacent payload sites as in-scope planning targets for canonical operation_type emission: OperationRunService audit recorder metadata, OperationRunTriageService triage audit metadata, FindingsLifecycleBackfillRunbookService alert metadata, and onboarding audit metadata (operation_types and started_operation_type).
  • Rationale: The spec asked whether non-UI summaries still surface raw operation_runs.type. Repo reads confirm that they do, and these payloads influence operator and reviewer understanding even when they are not rendered in the primary table surface.
  • Alternatives considered: Leave those payloads out of scope as “not UI.” Rejected because the feature is about one platform contract across monitoring, onboarding, references, and audit-adjacent summaries, not just visible table labels.
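
For the onboarding audit payload, canonical emission might look like the following Python sketch (not the repo's PHP). The field names `operation_types` and `started_operation_type` come from these notes; the alias targets and the builder function are hypothetical.

```python
# Alias keys appear in the research notes; canonical targets are assumed.
LEGACY_ALIASES = {
    "inventory_sync": "inventory.sync",
    "baseline_capture": "baseline.capture",
}


def canonical(value: str) -> str:
    """Resolve a persisted value to its canonical code, passing canonical values through."""
    return LEGACY_ALIASES.get(value, value)


def onboarding_audit_metadata(selected: list, started: str) -> dict:
    """Build audit metadata that carries canonical codes in both fields."""
    return {
        "operation_types": [canonical(value) for value in selected],
        "started_operation_type": canonical(started),
    }


# A resumed draft may still hold a legacy alias next to a canonical code.
payload = onboarding_audit_metadata(
    ["inventory_sync", "rbac.health_check"], "inventory_sync"
)
```

The draft state is normalized once at the boundary, so reviewers reading the audit trail see the same vocabulary as the monitoring surfaces.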

Decision 7: Keep proof in focused unit, feature, Livewire, and architecture lanes

  • Decision: Use focused unit coverage for canonical resolution and registry truth, focused feature and Livewire coverage for onboarding and operations filters, and architecture guard coverage for vocabulary drift. Do not require browser coverage for proof.
  • Rationale: The repo hotspots and the spec show a shared-contract problem, not a browser-specific interaction problem. Existing tests such as OperationTypeResolutionTest, OperationRunListFiltersTest, ManagedTenantOnboardingWizardTest, ManagedTenantOnboardingProviderStartTest, and PlatformVocabularyBoundaryGuardTest already cover the key surfaces.
  • Alternatives considered:
    • Browser proof for onboarding resume. Rejected as unnecessary for the first proving lane.
    • Repo-wide grep bans for legacy aliases. Rejected because historical fixtures and read-side compatibility remain intentionally bounded and a blind string-ban would either fail valid cases or encourage exception lists.
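
The difference between an architecture guard and a blind string ban can be sketched as follows, in Python rather than the repo's PHP test stack. The guard inspects only what write owners declare, so historical fixtures and read-side alias tables are never scanned. The owner names come from these notes; the declared values and the `find_drift` helper are hypothetical.

```python
CANONICAL = {"inventory.sync", "backup_set.update", "rbac.health_check"}

# Hypothetical snapshot of the values each write owner declares.
WRITE_OWNER_VALUES = {
    "OperationRunType": ["inventory.sync", "backup_set.update"],
    "ProviderOperationRegistry": ["inventory.sync"],
    "lifecycle.covered_types": ["inventory_sync"],  # drift: legacy alias
}


def find_drift(declared: dict, canonical: set) -> dict:
    """Return, per write owner, the declared values that are not canonical."""
    drift = {}
    for owner, values in declared.items():
        bad = [value for value in values if value not in canonical]
        if bad:
            drift[owner] = bad
    return drift


drift = find_drift(WRITE_OWNER_VALUES, CANONICAL)
```

A guard test would assert that the returned mapping is empty; because it names the offending owner and value, no exception list is needed for intentionally retained read-side aliases.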