TenantAtlas/specs/096-ops-polish-assignment-dedupe-system-tracking/plan.md
ahmido 03127a670b Spec 096: Ops polish (assignment summaries + dedupe + reconcile tracking + seed DX) (#115)
Implements Spec 096 ops polish bundle:

- Persist durable OperationRun.summary_counts for assignment fetch/restore (final attempt wins)
- Server-side dedupe for assignment jobs (15-minute cooldown + non-canonical skip)
- Track ReconcileAdapterRunsJob via workspace-scoped OperationRun + stable failure codes + overlap prevention
- Seed DX: ensure seeded tenants use UUID v4 external_id and seed satisfies workspace_id NOT NULL constraints

Verification (local / evidence-based):
- `vendor/bin/sail artisan test --compact tests/Feature/Operations/AssignmentRunSummaryCountsTest.php tests/Feature/Operations/AssignmentJobDedupeTest.php tests/Feature/Operations/ReconcileAdapterRunsJobTrackingTest.php tests/Feature/Seed/PoliciesSeederExternalIdTest.php`
- `vendor/bin/sail bin pint --dirty`

Spec artifacts included under `specs/096-ops-polish-assignment-dedupe-system-tracking/` (spec/plan/tasks/checklists).

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #115
2026-02-15 20:49:38 +00:00

Implementation Plan: 096 — Ops Polish Bundle (Assignment job summaries + job dedupe + system job tracking + seeder DX)

Branch: 096-ops-polish-assignment-dedupe-system-tracking | Date: 2026-02-15 | Spec: ./spec.md
Input: Feature specification from specs/096-ops-polish-assignment-dedupe-system-tracking/spec.md

Note: This template is filled in by the /speckit.plan command. See .specify/scripts/ for helper scripts.

Summary

Improve operational reliability and observability for assignment-related jobs and a housekeeping job by:

  • Persisting durable OperationRun.summary_counts for assignment fetch / restore runs (final-attempt semantics, no double counting across retries).
  • Enforcing server-side deduplication for assignment jobs using a stable identity and the existing DB-level active-run unique indexes.
  • Tracking ReconcileAdapterRunsJob as a workspace-scoped OperationRun (type = ops.reconcile_adapter_runs) with stable reason codes + sanitized errors.
  • Fixing seed DX so seeded tenants always have a UUID v4 external_id.
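
The "final-attempt semantics" in the first bullet can be illustrated with a minimal sketch. It assumes a hypothetical in-memory `RunRecord` standing in for an `operation_runs` row and hypothetical count keys (`total`, `succeeded`, `failed`); the real persistence goes through `OperationRunService`, which is not reproduced here. The point is only that each attempt replaces the stored counts rather than adding to them.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical in-memory stand-in for an operation_runs row.
 * Only the summary_counts column matters for this sketch.
 */
final class RunRecord
{
    /** @var array<string, int> */
    public array $summaryCounts = [];
}

/**
 * Final-attempt-wins: replace the stored counts with the counts of the
 * attempt that just finished, instead of incrementing what is there.
 *
 * @param array<string, int> $attemptCounts
 */
function persistFinalAttemptCounts(RunRecord $run, array $attemptCounts): void
{
    $run->summaryCounts = $attemptCounts;
}

$run = new RunRecord();

// Attempt 1 ends with 3 of 5 items processed successfully.
persistFinalAttemptCounts($run, ['total' => 5, 'succeeded' => 3, 'failed' => 2]);

// Attempt 2 (a retry of the same run) processes all 5 items.
persistFinalAttemptCounts($run, ['total' => 5, 'succeeded' => 5, 'failed' => 0]);

// The stored summary reflects only the last attempt, so nothing is counted twice.
echo json_encode($run->summaryCounts), PHP_EOL;
// prints {"total":5,"succeeded":5,"failed":0}
```

Had the counts been incremented instead, the run would report 10 items after the retry, which is the double counting this plan rules out.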

Technical Context

Language/Version: PHP 8.4.x, Laravel 12
Primary Dependencies: Filament v5, Livewire v4, Laravel Sail (dev), PostgreSQL (dev), Microsoft Graph abstraction (existing)
Storage: PostgreSQL (JSONB used for operation_runs.summary_counts, failure_summary, context)
Testing: Pest v4 (via vendor/bin/sail artisan test --compact)
Target Platform: Linux container runtime (Dokploy deploy); macOS for local dev
Project Type: Laravel monolith (web + workers)
Performance Goals: Deduplication checks must be O(1) per dispatch/execute; no extra remote calls added
Constraints: No secrets in dedupe fingerprints, logs, or failure summaries; queued jobs remain safe under concurrency
Scale/Scope: Background ops for multiple tenants; correctness > throughput
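
To make the "no secrets in failure summaries" constraint concrete, here is a rough sketch of pairing a stable reason code with a redacted message. The function name `sanitizedFailure`, the reason code `reconcile.failed`, and the two redaction patterns are assumptions made for illustration; the project's existing sanitizer is not shown in this plan and may work differently.

```php
<?php
declare(strict_types=1);

/**
 * Pair a stable, machine-readable reason code with a redacted message.
 * Hypothetical helper; not the project's real sanitizer.
 *
 * @return array{code: string, message: string}
 */
function sanitizedFailure(string $reasonCode, string $rawMessage): array
{
    // Redact bearer tokens such as "Bearer eyJ...".
    $message = preg_replace(
        '/Bearer\s+[A-Za-z0-9._~+\/=-]+/i',
        'Bearer [REDACTED]',
        $rawMessage
    );

    // Redact key=value pairs for a few common secret names.
    $message = preg_replace(
        '/\b(client_secret|password|access_token)=[^\s&]+/i',
        '$1=[REDACTED]',
        $message ?? ''
    );

    // If a regex error occurred, store a placeholder rather than the raw text.
    return ['code' => $reasonCode, 'message' => $message ?? '[message unavailable]'];
}

$failure = sanitizedFailure(
    'reconcile.failed',
    'HTTP 401 Authorization: Bearer abc.def-123 client_secret=s3cr3t&x=1'
);

echo $failure['code'], ': ', $failure['message'], PHP_EOL;
// prints reconcile.failed: HTTP 401 Authorization: Bearer [REDACTED] client_secret=[REDACTED]&x=1
```

Keeping the code separate from the message means alerts and tests can match on the code, while the message stays free-form and safe to display.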

Constitution Check

GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.

  • Inventory-first: PASS (no inventory semantics changed)
  • Read/write separation: PASS (no new “write flows”; internal counters + run tracking + seeding)
  • Graph contract path: PASS (no new Graph calls; no render-time external calls)
  • Deterministic capabilities: PASS (no capability logic changes)
  • RBAC-UX planes/isolation: PASS (no routes/pages added)
  • Workspace/tenant isolation: PASS (all OperationRun reads/writes remain scoped via existing services; workspace-scoped run used only for housekeeping)
  • Destructive confirmation standard: N/A (no Filament actions in scope)
  • Global search tenant safety: N/A (no global search changes)
  • Run observability standard: PASS (adds/strengthens OperationRun coverage + stable failures)
  • Automation locks + idempotency: PASS (dedupe enforced via existing active-run DB unique indexes + job-level skip)
  • Data minimization & safe logging: PASS (fingerprints are non-secret; failure messages sanitized)
  • BADGE-001: N/A (no badge domains changed)
  • Filament Action Surface Contract: N/A (no Filament UI changed)

Project Structure

Documentation (this feature)

specs/096-ops-polish-assignment-dedupe-system-tracking/
├── plan.md              # This file (/speckit.plan command output)
├── research.md          # Phase 0 output (/speckit.plan command)
├── data-model.md        # Phase 1 output (/speckit.plan command)
├── quickstart.md        # Phase 1 output (/speckit.plan command)
├── contracts/           # Phase 1 output (/speckit.plan command)
└── tasks.md             # Phase 2 output (/speckit.tasks command - NOT created by /speckit.plan)

Source Code (repository root)

app/
├── Jobs/
│   ├── FetchAssignmentsJob.php
│   ├── RestoreAssignmentsJob.php
│   └── ReconcileAdapterRunsJob.php
├── Jobs/Middleware/
│   └── TrackOperationRun.php
├── Services/
│   └── OperationRunService.php
└── Support/
    ├── OperationCatalog.php
    └── OperationRunType.php

database/seeders/
└── PoliciesSeeder.php

tests/Feature/
└── (new/updated Pest coverage for summary persistence, dedupe, workspace job tracking, seeding)

Structure Decision: Laravel monolith. Feature work touches queue jobs + run tracking services + seeders + Pest tests.

Complexity Tracking

Fill ONLY if Constitution Check has violations that must be justified

| Violation | Why Needed | Simpler Alternative Rejected Because |
| [e.g., 4th project] | [current need] | [why 3 projects insufficient] |
| [e.g., Repository pattern] | [specific problem] | [why direct DB access insufficient] |

Phase 0 — Research (output: research.md)

  1. Confirm current run tracking + dedupe primitives in repo (OperationRunService, active-run unique indexes).
  2. Confirm how assignment operations currently persist summaries (services vs jobs) and align on “final attempt wins”.
  3. Confirm workspace-scoped OperationRun support for tenantless scheduled jobs.

Phase 1 — Design (output: data-model.md, contracts/*, quickstart.md)

  1. Data model: no schema changes expected; document how to use existing operation_runs fields for this feature.
  2. Job dedupe design:
  • Identity rule: prefer operation_run_id; otherwise tenant_id + job_type + stable input fingerprint.
  • Enforcement:
    • Dispatch-time: OperationRunService::ensureRunWithIdentity(...) (tenant-scoped) reuses the same active run.
    • Execute-time: the job must detect that it is not the canonical run and skip, to avoid overlap.
  • Window: 15-minute cooldown (blocks re-runs for the same identity until the window elapses, in addition to active-run overlap prevention).
  3. Summary persistence design:
  • Persist summary_counts at terminal completion via OperationRunService::updateRun(...) so that a retry overwrites the counts from earlier attempts.
  4. Housekeeping job tracking:
  • Use OperationRunService::ensureWorkspaceRunWithIdentity(...) with type = ops.reconcile_adapter_runs.
  • Persist stable failure codes + sanitized messages via existing sanitizer patterns.
  5. Seeder DX:
  • Ensure tenants.external_id is always UUID v4 for seeded tenants (do not reuse the INTUNE_TENANT_ID string for external_id).
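
The identity rule, execute-time skip, and cooldown from the job dedupe design can be sketched in plain PHP. This is an illustration under stated assumptions, not the project's implementation: the function names, the whitelisted input keys (`policy_id`, `policy_type`), the job type string `assignments.fetch`, and the choice to measure the cooldown from the last completion time are all hypothetical. In the real design the active-run check is backed by the DB-level unique indexes rather than by values passed in.

```php
<?php
declare(strict_types=1);

/**
 * Build a stable, non-secret identity for an assignment job.
 *
 * @param array<string, mixed> $input
 */
function assignmentJobIdentity(?int $operationRunId, int $tenantId, string $jobType, array $input): string
{
    // Preferred identity: the run the job was dispatched for.
    if ($operationRunId !== null) {
        return 'run:' . $operationRunId;
    }

    // Fallback: tenant + job type + fingerprint of whitelisted input keys.
    // Whitelisting keeps tokens and other secrets out of the hash input.
    $allowed = array_intersect_key($input, array_flip(['policy_id', 'policy_type']));
    ksort($allowed); // key order must not change the fingerprint

    $fingerprint = hash('sha256', json_encode($allowed, JSON_THROW_ON_ERROR));

    return sprintf('tenant:%d|%s|%s', $tenantId, $jobType, $fingerprint);
}

/**
 * Decide at execute time whether the job should run or skip.
 *
 * @param int|null $canonicalRunId  active run currently holding this identity, if any
 * @param int|null $lastCompletedAt unix timestamp of the last completed run for this identity
 */
function shouldExecute(
    int $jobRunId,
    ?int $canonicalRunId,
    ?int $lastCompletedAt,
    int $now,
    int $cooldownSeconds = 900 // 15 minutes
): bool {
    // Non-canonical skip: another run owns this identity.
    if ($canonicalRunId !== null && $canonicalRunId !== $jobRunId) {
        return false;
    }

    // Cooldown: the same identity completed less than $cooldownSeconds ago.
    if ($lastCompletedAt !== null && ($now - $lastCompletedAt) < $cooldownSeconds) {
        return false;
    }

    return true;
}

// Same logical input, different key order and different token: same identity.
$a = assignmentJobIdentity(null, 7, 'assignments.fetch', [
    'policy_id' => 'p1', 'policy_type' => 'x', 'access_token' => 'secret-1',
]);
$b = assignmentJobIdentity(null, 7, 'assignments.fetch', [
    'policy_type' => 'x', 'policy_id' => 'p1', 'access_token' => 'secret-2',
]);

var_dump($a === $b);                          // bool(true)
var_dump(shouldExecute(1, 2, null, 1000));    // bool(false): run 2 is canonical
var_dump(shouldExecute(1, 1, 400, 1000));     // bool(false): 600 s since completion
var_dump(shouldExecute(1, 1, 100, 1000));     // bool(true): 900 s have elapsed
```

Both checks are constant-time lookups given the identity, which is in line with the O(1) performance goal stated in the Technical Context.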

Phase 2 — Implementation Planning (maps to tasks.md)

  1. Add/adjust helpers for assignment job identity derivation and job-level overlap prevention.
  2. Update assignment jobs to persist summary counts on completion (and on terminal failure) with “final attempt” semantics.
  3. Update ReconcileAdapterRunsJob to be OperationRun-tracked (workspace-scoped), with stable failure code(s).
  4. Fix PoliciesSeeder to generate UUID v4 external_id for the seed tenant.
  5. Add/adjust Pest tests covering:
  • summary persistence for fetch/restore runs
  • dedupe/overlap prevention within window
  • workspace-scoped tracking for reconcile job
  • seed workflow success + UUID external_id
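
For the seeder item and its test, the sketch below shows one way to generate and check a UUID v4 in framework-free PHP. The helper names `uuidV4` and `isUuidV4` are hypothetical; in a Laravel codebase the seeder would more likely call the framework's `Str::uuid()` helper, and the plan does not say which approach `PoliciesSeeder` takes.

```php
<?php
declare(strict_types=1);

/**
 * Generate a random (version 4) UUID from 16 random bytes.
 */
function uuidV4(): string
{
    $bytes = random_bytes(16);

    // Version 4: high nibble of byte 6 is 0100.
    $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40);
    // RFC 4122 variant: top two bits of byte 8 are 10.
    $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80);

    // 32 hex characters split into 8 groups of 4, joined as 8-4-4-4-12.
    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($bytes), 4));
}

/**
 * Check that a string has the UUID v4 shape (version nibble 4, variant 8/9/a/b).
 */
function isUuidV4(string $value): bool
{
    $pattern = '/\A[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}\z/i';

    return preg_match($pattern, $value) === 1;
}

$externalId = uuidV4();

var_dump(isUuidV4($externalId));               // bool(true)
var_dump(isUuidV4('contoso.onmicrosoft.com')); // bool(false): not a UUID at all
```

A check like `isUuidV4` is the kind of assertion the seeding test could make against `tenants.external_id` after the seeder has run.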