TenantAtlas/specs/096-ops-polish-assignment-dedupe-system-tracking/plan.md
ahmido 03127a670b Spec 096: Ops polish (assignment summaries + dedupe + reconcile tracking + seed DX) (#115)
Implements Spec 096 ops polish bundle:

- Persist durable OperationRun.summary_counts for assignment fetch/restore (final attempt wins)
- Server-side dedupe for assignment jobs (15-minute cooldown + non-canonical skip)
- Track ReconcileAdapterRunsJob via workspace-scoped OperationRun + stable failure codes + overlap prevention
- Seed DX: ensure seeded tenants use UUID v4 external_id and seed satisfies workspace_id NOT NULL constraints

Verification (local / evidence-based):
- `vendor/bin/sail artisan test --compact tests/Feature/Operations/AssignmentRunSummaryCountsTest.php tests/Feature/Operations/AssignmentJobDedupeTest.php tests/Feature/Operations/ReconcileAdapterRunsJobTrackingTest.php tests/Feature/Seed/PoliciesSeederExternalIdTest.php`
- `vendor/bin/sail bin pint --dirty`

Spec artifacts included under `specs/096-ops-polish-assignment-dedupe-system-tracking/` (spec/plan/tasks/checklists).

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #115
2026-02-15 20:49:38 +00:00

# Implementation Plan: 096 — Ops Polish Bundle (Assignment job summaries + job dedupe + system job tracking + seeder DX)
**Branch**: `096-ops-polish-assignment-dedupe-system-tracking` | **Date**: 2026-02-15 | **Spec**: ./spec.md
**Input**: Feature specification from `specs/096-ops-polish-assignment-dedupe-system-tracking/spec.md`
**Note**: This template is filled in by the `/speckit.plan` command. See `.specify/scripts/` for helper scripts.
## Summary
Improve operational reliability and observability for assignment-related jobs and a housekeeping job by:
- Persisting durable `OperationRun.summary_counts` for assignment fetch / restore runs (final-attempt semantics, no double counting across retries).
- Enforcing server-side deduplication for assignment jobs using a stable identity and the existing DB-level active-run unique indexes.
- Tracking `ReconcileAdapterRunsJob` as a workspace-scoped `OperationRun` (`type = ops.reconcile_adapter_runs`) with stable reason codes + sanitized errors.
- Fixing seed DX so seeded tenants always have a UUID v4 `external_id`.
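
The "final-attempt" semantics in the first bullet can be illustrated with a minimal, framework-free sketch. It assumes that each attempt produces a complete set of counts and that persistence replaces the stored value rather than adding to it. `RunRecord`, `persistSummaryCounts`, and the count keys are hypothetical names used only for illustration; they are not the project's `OperationRunService` API.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical in-memory stand-in for an operation_runs row.
 */
final class RunRecord
{
    /** @var array<string, int> */
    public array $summaryCounts = [];
}

/**
 * Final attempt wins: replace the stored counts with this attempt's counts.
 * Adding to the stored counts instead would double count across retries.
 *
 * @param array<string, int> $attemptCounts
 */
function persistSummaryCounts(RunRecord $run, array $attemptCounts): void
{
    $run->summaryCounts = $attemptCounts;
}

$run = new RunRecord();

// Attempt 1 ends in a partial failure.
persistSummaryCounts($run, ['total' => 10, 'succeeded' => 4, 'failed' => 6]);

// Attempt 2 (retry) reprocesses everything and succeeds.
persistSummaryCounts($run, ['total' => 10, 'succeeded' => 10, 'failed' => 0]);

echo json_encode($run->summaryCounts), PHP_EOL;
// prints {"total":10,"succeeded":10,"failed":0}
```

With an accumulate-on-write approach the same two attempts would report a total of 20, which is the double counting this plan aims to avoid.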
## Technical Context
**Language/Version**: PHP 8.4.x, Laravel 12
**Primary Dependencies**: Filament v5, Livewire v4, Laravel Sail (dev), PostgreSQL (dev), Microsoft Graph abstraction (existing)
**Storage**: PostgreSQL (JSONB used for `operation_runs.summary_counts`, `failure_summary`, `context`)
**Testing**: Pest v4 (via `vendor/bin/sail artisan test --compact`)
**Target Platform**: Linux container runtime (Dokploy deploy); macOS for local dev
**Project Type**: Laravel monolith (web + workers)
**Performance Goals**: Deduplication checks must be O(1) per dispatch/execute; no extra remote calls added
**Constraints**: No secrets in dedupe fingerprints, logs, or failure summaries; queued jobs remain safe under concurrency
**Scale/Scope**: Background ops for multiple tenants; correctness > throughput
## Constitution Check
*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*
- Inventory-first: PASS (no inventory semantics changed)
- Read/write separation: PASS (no new “write flows”; internal counters + run tracking + seeding)
- Graph contract path: PASS (no new Graph calls; no render-time external calls)
- Deterministic capabilities: PASS (no capability logic changes)
- RBAC-UX planes/isolation: PASS (no routes/pages added)
- Workspace/tenant isolation: PASS (all `OperationRun` reads/writes remain scoped via existing services; workspace-scoped run used only for housekeeping)
- Destructive confirmation standard: N/A (no Filament actions in scope)
- Global search tenant safety: N/A (no global search changes)
- Run observability standard: PASS (adds/strengthens `OperationRun` coverage + stable failures)
- Automation locks + idempotency: PASS (dedupe enforced via existing active-run DB unique indexes + job-level skip)
- Data minimization & safe logging: PASS (fingerprints are non-secret; failure messages sanitized)
- BADGE-001: N/A (no badge domains changed)
- Filament Action Surface Contract: N/A (no Filament UI changed)
## Project Structure
### Documentation (this feature)
```text
specs/096-ops-polish-assignment-dedupe-system-tracking/
├── plan.md # This file (/speckit.plan command output)
├── research.md # Phase 0 output (/speckit.plan command)
├── data-model.md # Phase 1 output (/speckit.plan command)
├── quickstart.md # Phase 1 output (/speckit.plan command)
├── contracts/ # Phase 1 output (/speckit.plan command)
└── tasks.md # Phase 2 output (/speckit.tasks command - NOT created by /speckit.plan)
```
### Source Code (repository root)
```text
app/
├── Jobs/
│   ├── FetchAssignmentsJob.php
│   ├── RestoreAssignmentsJob.php
│   └── ReconcileAdapterRunsJob.php
├── Jobs/Middleware/
│   └── TrackOperationRun.php
├── Services/
│   └── OperationRunService.php
└── Support/
    ├── OperationCatalog.php
    └── OperationRunType.php
database/seeders/
└── PoliciesSeeder.php
tests/Feature/
└── (new/updated Pest coverage for summary persistence, dedupe, workspace job tracking, seeding)
```
**Structure Decision**: Laravel monolith. Feature work touches queue jobs + run tracking services + seeders + Pest tests.
## Complexity Tracking
> **Fill ONLY if Constitution Check has violations that must be justified**
No Constitution Check violations were identified (every gate above is PASS or N/A), so no complexity justification is required.
## Phase 0 — Research (output: `research.md`)
1. Confirm current run tracking + dedupe primitives in repo (`OperationRunService`, active-run unique indexes).
2. Confirm how assignment operations currently persist summaries (services vs jobs) and align on “final attempt wins”.
3. Confirm workspace-scoped `OperationRun` support for tenantless scheduled jobs.
## Phase 1 — Design (output: `data-model.md`, `contracts/*`, `quickstart.md`)
1. Data model: no schema changes expected; document how to use existing `operation_runs` fields for this feature.
2. Job dedupe design:
   - Identity rule: prefer `operation_run_id`; otherwise `tenant_id + job_type + stable input fingerprint`.
   - Enforcement:
     - Dispatch-time: `OperationRunService::ensureRunWithIdentity(...)` (tenant-scoped) reuses the same active run.
     - Execute-time: the job must detect that it is not the canonical run and skip, to avoid overlap.
   - Window: 15-minute cooldown (blocks re-runs for the same identity until the window elapses; this applies in addition to active-run overlap prevention).
3. Summary persistence design:
   - Persist `summary_counts` at terminal completion via `OperationRunService::updateRun(...)` so that a retry overwrites earlier counts instead of adding to them.
4. Housekeeping job tracking:
   - Use `OperationRunService::ensureWorkspaceRunWithIdentity(...)` with `type = ops.reconcile_adapter_runs`.
   - Persist stable failure codes + sanitized messages via existing sanitizer patterns.
5. Seeder DX:
   - Ensure `tenants.external_id` is always UUID v4 for seeded tenants (do not reuse the `INTUNE_TENANT_ID` string for `external_id`).
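
The identity rule and cooldown window from item 2 can be sketched in plain PHP as follows. This is an illustrative sketch under stated assumptions, not the repository's implementation: `dedupeIdentity`, `isWithinCooldown`, the `assignments.fetch` job type, and the input keys are hypothetical, and the sketch assumes the fingerprint input contains only non-secret, top-level scalar fields.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: build a stable dedupe identity.
 * Prefers the operation run id; otherwise falls back to
 * tenant id + job type + a hash of the normalized input.
 *
 * @param array<string, scalar|null> $input non-secret fields only
 */
function dedupeIdentity(?int $operationRunId, int $tenantId, string $jobType, array $input): string
{
    if ($operationRunId !== null) {
        return 'run:' . $operationRunId;
    }

    // Sort top-level keys so that argument order does not change the fingerprint.
    // Nested arrays are not normalized in this sketch.
    ksort($input);
    $fingerprint = hash('sha256', json_encode($input, JSON_THROW_ON_ERROR));

    return sprintf('tenant:%d|%s|%s', $tenantId, $jobType, $fingerprint);
}

/**
 * Hypothetical helper: true while a previous run with the same identity
 * started less than $cooldownSeconds ago (default 900 s = 15 minutes).
 */
function isWithinCooldown(
    ?DateTimeImmutable $lastStartedAt,
    DateTimeImmutable $now,
    int $cooldownSeconds = 900
): bool {
    if ($lastStartedAt === null) {
        return false; // no earlier run, nothing to block
    }

    return ($now->getTimestamp() - $lastStartedAt->getTimestamp()) < $cooldownSeconds;
}

$a = dedupeIdentity(null, 7, 'assignments.fetch', ['policy_id' => 'p-1', 'scope' => 'all']);
$b = dedupeIdentity(null, 7, 'assignments.fetch', ['scope' => 'all', 'policy_id' => 'p-1']);
var_dump($a === $b); // bool(true): same identity regardless of key order

$start = new DateTimeImmutable('2026-02-15 20:00:00', new DateTimeZone('UTC'));
var_dump(isWithinCooldown($start, $start->modify('+14 minutes'))); // bool(true)
var_dump(isWithinCooldown($start, $start->modify('+15 minutes'))); // bool(false)
```

Both helpers are pure functions, so each check stays constant-time apart from the lookup of the previous run, which in the plan is delegated to the existing active-run unique indexes.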
## Phase 2 — Implementation Planning (maps to `tasks.md`)
1. Add/adjust helpers for assignment job identity derivation and job-level overlap prevention.
2. Update assignment jobs to persist summary counts on completion (and on terminal failure) with “final attempt” semantics.
3. Update `ReconcileAdapterRunsJob` to be `OperationRun`-tracked (workspace-scoped), with stable failure code(s).
4. Fix `PoliciesSeeder` to generate UUID v4 `external_id` for the seed tenant.
5. Add/adjust Pest tests covering:
   - summary persistence for fetch/restore runs
   - dedupe/overlap prevention within window
   - workspace-scoped tracking for reconcile job
   - seed workflow success + UUID `external_id`
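
For the seeder check in the last bullet, the property under test is that `external_id` is a well-formed version 4 UUID. The sketch below shows one way to generate and validate such a value in plain PHP. `uuidV4` and `isUuidV4` are hypothetical helper names; in a Laravel codebase `Illuminate\Support\Str::uuid()` would more likely be used for generation, and how `PoliciesSeeder` does it is not stated in this plan.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: generate a random (version 4) UUID.
 */
function uuidV4(): string
{
    $bytes = random_bytes(16);

    // Set the version nibble to 4 and the variant bits to 10xx.
    $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40);
    $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80);

    // 32 hex characters split into 8 groups of 4, joined as 8-4-4-4-12.
    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($bytes), 4));
}

/**
 * Hypothetical helper: check the canonical textual form of a v4 UUID.
 */
function isUuidV4(string $value): bool
{
    $pattern = '/^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i';

    return preg_match($pattern, $value) === 1;
}

$externalId = uuidV4();
var_dump(isUuidV4($externalId));         // bool(true)
var_dump(isUuidV4('contoso-tenant-id')); // bool(false): arbitrary strings are rejected
```

A Pest test for the seeder could apply the same pattern check to the seeded tenant's `external_id` after running the seed workflow.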