Implements Spec 096 ops polish bundle:

- Persist durable OperationRun.summary_counts for assignment fetch/restore (final attempt wins)
- Server-side dedupe for assignment jobs (15-minute cooldown + non-canonical skip)
- Track ReconcileAdapterRunsJob via workspace-scoped OperationRun + stable failure codes + overlap prevention
- Seed DX: ensure seeded tenants use UUID v4 external_id and seed satisfies workspace_id NOT NULL constraints

Verification (local / evidence-based):

- `vendor/bin/sail artisan test --compact tests/Feature/Operations/AssignmentRunSummaryCountsTest.php tests/Feature/Operations/AssignmentJobDedupeTest.php tests/Feature/Operations/ReconcileAdapterRunsJobTrackingTest.php tests/Feature/Seed/PoliciesSeederExternalIdTest.php`
- `vendor/bin/sail bin pint --dirty`

Spec artifacts included under `specs/096-ops-polish-assignment-dedupe-system-tracking/` (spec/plan/tasks/checklists).

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #115
# Phase 0 — Research (096 Ops Polish Bundle)

This feature is an operations / background-job hardening pass. The design intentionally reuses existing run observability and dedupe primitives already present in the codebase.

## Decision 1 — Use `OperationRunService` + DB unique indexes for dedupe

**Decision:** Use `OperationRunService::ensureRunWithIdentity(...)` (tenant-scoped) and `OperationRunService::ensureWorkspaceRunWithIdentity(...)` (workspace-scoped) as the canonical dedupe mechanism, backed by the existing partial unique indexes for active runs.

**Rationale:**
- The repo already enforces “active-run dedupe MUST be enforced at DB level” via partial unique indexes and the `ensureRun*` helpers.
- DB enforcement remains correct under concurrency and across multiple workers.
- Keeps the single source of truth in `operation_runs` (Monitoring → Operations) rather than adding a second dedupe store.

**Alternatives considered:**
- Laravel job uniqueness (e.g., `ShouldBeUnique` / cache lock) — rejected because it introduces a second dedupe primitive outside the canonical `OperationRun` ledger and may behave differently across environments.
- Scheduler-level overlap prevention only — rejected because the spec requires dedupe at execution time and must handle duplicate dispatch / redelivery.
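To illustrate why a partial unique index gives race-free active-run dedupe, here is a minimal sketch using Python's built-in `sqlite3`. It is not the repository's implementation (which is PHP/Laravel). The column names, the status values (`queued`, `running`, `completed`), the `assignments.fetch` type string, and the Python function `ensure_run_with_identity` are all assumptions chosen to mirror the names above.

```python
import sqlite3

# Hypothetical, simplified schema; the real operation_runs table has more columns.
SCHEMA = """
CREATE TABLE operation_runs (
    id            INTEGER PRIMARY KEY,
    tenant_id     INTEGER NOT NULL,
    type          TEXT    NOT NULL,
    identity_hash TEXT    NOT NULL,
    status        TEXT    NOT NULL DEFAULT 'queued'
);
-- Partial unique index: at most one *active* run per (tenant, identity).
CREATE UNIQUE INDEX operation_runs_active_identity
    ON operation_runs (tenant_id, identity_hash)
    WHERE status IN ('queued', 'running');
"""


def ensure_run_with_identity(conn, tenant_id, run_type, identity_hash):
    """Return the id of the active run for this identity, creating it if needed."""
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "INSERT INTO operation_runs (tenant_id, type, identity_hash) "
                "VALUES (?, ?, ?)",
                (tenant_id, run_type, identity_hash),
            )
            return cur.lastrowid
    except sqlite3.IntegrityError:
        # The index rejected the insert: an active run already exists, so reuse it.
        row = conn.execute(
            "SELECT id FROM operation_runs "
            "WHERE tenant_id = ? AND identity_hash = ? "
            "AND status IN ('queued', 'running')",
            (tenant_id, identity_hash),
        ).fetchone()
        return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

first = ensure_run_with_identity(conn, 1, "assignments.fetch", "abc")
second = ensure_run_with_identity(conn, 1, "assignments.fetch", "abc")  # deduped
conn.execute("UPDATE operation_runs SET status = 'completed' WHERE id = ?", (first,))
third = ensure_run_with_identity(conn, 1, "assignments.fetch", "abc")  # new run
```

Because the database rejects the second insert, two workers racing on the same identity cannot both create an active run, and a completed run no longer blocks a new one.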

## Decision 2 — Dedupe identity rule (per spec clarifications)

**Decision:** Derive job identity as:
- Prefer `operation_run_id` when available.
- Otherwise `tenant_id + job_type + stable input fingerprint`.

**Rationale:**
- `operation_run_id` is already stable and non-secret.
- Fallback fingerprint avoids secrets, stays deterministic, and is suitable for both logging and DB identity hashing.

**Alternatives considered:**
- Fingerprinting full payloads — rejected to avoid secrets/PII and to keep dedupe stable even if non-essential context changes.
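The identity rule above can be sketched as follows. This is an illustrative Python version under stated assumptions: the helper names (`stable_fingerprint`, `job_identity`), the allow-list parameter `keys`, the string layout of the identity, and the example input fields are hypothetical, and SHA-256 over canonical JSON is just one reasonable way to obtain a deterministic fingerprint.

```python
import hashlib
import json


def stable_fingerprint(inputs, keys):
    """Hash only allow-listed keys so secrets and volatile context never enter the hash."""
    selected = {k: inputs[k] for k in sorted(keys) if k in inputs}
    # sort_keys + fixed separators give a canonical, order-independent encoding.
    canonical = json.dumps(selected, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def job_identity(tenant_id, job_type, inputs, keys, operation_run_id=None):
    """Prefer operation_run_id; otherwise tenant_id + job_type + stable fingerprint."""
    if operation_run_id is not None:
        return f"run:{operation_run_id}"
    return f"tenant:{tenant_id}|{job_type}|{stable_fingerprint(inputs, keys)}"


# Two dispatches that differ only in secret or non-essential fields share an identity.
a = job_identity(7, "assignments.fetch",
                 {"policy_id": "p1", "access_token": "s3cret", "trace": "x"},
                 keys=["policy_id"])
b = job_identity(7, "assignments.fetch",
                 {"trace": "y", "policy_id": "p1"},
                 keys=["policy_id"])
c = job_identity(7, "assignments.fetch", {"policy_id": "p1"},
                 keys=["policy_id"], operation_run_id=42)
```

Allow-listing the hashed fields is what keeps the fingerprint free of secrets and stable when unrelated context changes, which is the reason full-payload hashing was rejected.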

## Decision 3 — Enforce dedupe at execute time (not just dispatch)

**Decision:** Add an execution-time guard so a job skips early when it is not the canonical active run for its identity.

**Rationale:**
- Covers duplicate job dispatch/redelivery even if a caller fails to reuse the same `OperationRun` at dispatch time.
- Aligns with spec FR-006 and the constitution’s “queued/scheduled ops use locks + idempotency”.

**Alternatives considered:**
- Rely on dispatch-only dedupe — rejected because duplicate jobs can still be enqueued and run concurrently.
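A minimal sketch of such an execution-time guard, in Python for illustration only. The `Run` record, the status values, the `handle` function, and the choice of "lowest id among active runs" as the canonical run are assumptions; the source does not specify how the canonical run is selected.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

ACTIVE = {"queued", "running"}  # assumed set of active statuses


@dataclass
class Run:
    id: int
    identity: str
    status: str


def canonical_active_run(runs: List[Run], identity: str) -> Optional[Run]:
    """Treat the oldest active run for an identity as canonical (an assumption)."""
    active = [r for r in runs if r.identity == identity and r.status in ACTIVE]
    return min(active, key=lambda r: r.id) if active else None


def handle(job_run_id: int, identity: str, runs: List[Run],
           work: Callable[[], None]) -> str:
    """Execute `work` only when this job belongs to the canonical active run."""
    canonical = canonical_active_run(runs, identity)
    if canonical is None or canonical.id != job_run_id:
        return "skipped"  # duplicate dispatch or redelivery: exit before doing work
    work()
    return "executed"


calls = []
runs = [Run(id=1, identity="k", status="running")]
result_canonical = handle(1, "k", runs, lambda: calls.append("ran"))
result_duplicate = handle(2, "k", runs, lambda: calls.append("ran"))
```

The guard runs inside the job itself, so it still applies when a caller enqueued a duplicate without going through the dedupe helper at dispatch time.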

## Decision 4 — Summary counters use “final attempt wins” semantics

**Decision:** Persist `OperationRun.summary_counts` at terminal completion by overwriting with normalized counts (final attempt reflects truth; retries do not double count).

**Rationale:**
- Matches clarified requirement (“final attempt”) and avoids needing cross-attempt reconciliation.
- Fits existing patterns in `OperationRunService::updateRun(...)` which sanitizes/normalizes summary keys.

**Alternatives considered:**
- Incremental counters (`incrementSummaryCounts`) across attempts — rejected because it risks double-counting under retries unless attempt IDs are tracked.
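The difference between overwriting and incrementing can be shown with a small Python sketch. The allowed key set and the helper names (`normalize_counts`, `complete_attempt`) are hypothetical; the real normalization lives in `OperationRunService::updateRun(...)` and may use different keys.

```python
ALLOWED_KEYS = {"total", "processed", "succeeded", "failed", "skipped"}  # assumed keys


def normalize_counts(raw):
    """Keep only known keys and coerce values to non-negative integers."""
    return {k: max(0, int(v)) for k, v in raw.items() if k in ALLOWED_KEYS}


def complete_attempt(run, attempt_counts):
    """Overwrite (never add to) the stored summary with this attempt's counts."""
    run["summary_counts"] = normalize_counts(attempt_counts)
    return run


run = {"summary_counts": {}}
# Attempt 1 fails partway through.
complete_attempt(run, {"total": 10, "succeeded": 4, "failed": 6})
# The retry succeeds; its counts replace the earlier ones, and unknown keys are dropped.
complete_attempt(run, {"total": 10, "succeeded": 10, "failed": 0, "debug": "x"})
```

With incremental counters the same two attempts would report `total = 20`, which is the double counting the decision avoids.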

## Decision 5 — Housekeeping job tracking is workspace-scoped

**Decision:** Track `ReconcileAdapterRunsJob` via a workspace-scoped `OperationRun` (`tenant_id = null`) using `type = ops.reconcile_adapter_runs`.

**Rationale:**
- The job is not tenant-specific and reconciles across runs; workspace-scoped runs are explicitly supported by the schema + service.

**Alternatives considered:**
- Create one run per tenant — rejected because it would misrepresent the job’s actual unit of work and inflate noise.
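One detail worth illustrating: SQL unique indexes treat `NULL` values as distinct, so an index keyed on `tenant_id` cannot dedupe rows where `tenant_id` is `NULL`. The Python/`sqlite3` sketch below assumes a separate partial index keyed on `workspace_id` for workspace-scoped runs. The schema, index name, status values, identity string, and the `start_reconcile_run` helper are hypothetical and may not match the repository's actual migrations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE operation_runs (
    id            INTEGER PRIMARY KEY,
    workspace_id  INTEGER NOT NULL,
    tenant_id     INTEGER,              -- NULL for workspace-scoped runs
    type          TEXT    NOT NULL,
    identity_hash TEXT    NOT NULL,
    status        TEXT    NOT NULL DEFAULT 'queued'
);
-- Assumed index: one active workspace-scoped run per (workspace, identity).
CREATE UNIQUE INDEX runs_active_workspace_identity
    ON operation_runs (workspace_id, identity_hash)
    WHERE tenant_id IS NULL AND status IN ('queued', 'running');
""")


def start_reconcile_run(conn, workspace_id):
    """Create the workspace-scoped run, or return None if one is already active."""
    try:
        with conn:
            cur = conn.execute(
                "INSERT INTO operation_runs "
                "(workspace_id, tenant_id, type, identity_hash) "
                "VALUES (?, NULL, 'ops.reconcile_adapter_runs', 'reconcile')",
                (workspace_id,),
            )
            return cur.lastrowid
    except sqlite3.IntegrityError:
        return None  # overlap prevented: an active run already exists


run_a = start_reconcile_run(conn, workspace_id=1)
overlap = start_reconcile_run(conn, workspace_id=1)      # blocked by the index
other_ws = start_reconcile_run(conn, workspace_id=2)     # separate workspace
```

A single run row per workspace keeps the Operations view aligned with the job's real unit of work, in contrast to the rejected per-tenant fan-out.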

## Decision 6 — Seed tenant external ID must be UUID v4

**Decision:** Ensure the seed tenant’s `external_id` is generated as UUID v4 regardless of `INTUNE_TENANT_ID`.

**Rationale:**
- Matches clarified requirement and avoids coupling a human-readable env value to a UUID-constrained field.

**Alternatives considered:**
- Reuse `INTUNE_TENANT_ID` for `external_id` — rejected because it is not guaranteed UUID formatted.
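As a sketch of the rule, the following Python snippet generates the identifier without consulting the environment and shows the kind of check a seeder test might make. The function names and the sample value `contoso-dev` are hypothetical; the actual seeder is PHP and is not shown in this document.

```python
import uuid


def seed_tenant_external_id():
    """Always generate a fresh UUID v4; INTUNE_TENANT_ID is deliberately not read."""
    return str(uuid.uuid4())


def is_uuid_v4(value):
    """True only for well-formed UUID strings whose version field is 4."""
    try:
        return uuid.UUID(value).version == 4
    except ValueError:
        return False


external_id = seed_tenant_external_id()
env_like_value = "contoso-dev"  # example of a human-readable, non-UUID env value
```

Here `is_uuid_v4(external_id)` holds for every generated value, while a value such as `env_like_value` fails the check, which is why reusing the env variable was rejected.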