spec: refine 056 docs

This commit is contained in:
Ahmed Darrazi 2026-01-18 18:29:06 +01:00
parent af17875b9d
commit bcdeeb5525
10 changed files with 979 additions and 1 deletions

View File

@ -10,6 +10,7 @@ ## Active Technologies
- PostgreSQL (JSONB) (feat/042-inventory-dependencies-graph)
- PHP 8.4.x (Laravel 12) + Laravel 12, Filament v4, Livewire v3 (feat/047-inventory-foundations-nodes)
- PostgreSQL (JSONB for `InventoryItem.meta_jsonb`) (feat/047-inventory-foundations-nodes)
- PostgreSQL (JSONB in `operation_runs.context`, `operation_runs.summary_counts`) (056-remove-legacy-bulkops)
- PHP 8.4.15 (feat/005-bulk-operations)
@ -29,9 +30,9 @@ ## Code Style
PHP 8.4.15: Follow standard conventions
## Recent Changes
- 056-remove-legacy-bulkops: Added PHP 8.4.x + Laravel 12, Filament v4, Livewire v3
- feat/047-inventory-foundations-nodes: Added PHP 8.4.x (Laravel 12) + Laravel 12, Filament v4, Livewire v3
- feat/042-inventory-dependencies-graph: Added PHP 8.4.x + Laravel 12, Filament v4, Livewire v3
- feat/032-backup-scheduling-mvp: Added PHP 8.4.15 + Laravel 12, Filament v4, Livewire v3
<!-- MANUAL ADDITIONS START -->

View File

@ -0,0 +1,35 @@
# Specification Quality Checklist: Remove Legacy BulkOperationRun & Canonicalize Operations (v1.0)
**Purpose**: Validate specification completeness and quality before proceeding to planning
**Created**: 2026-01-18
**Feature**: [specs/056-remove-legacy-bulkops/spec.md](../spec.md)
## Content Quality
- [x] No implementation details (languages, frameworks, APIs)
- [x] Focused on user value and business needs
- [x] Written for non-technical stakeholders
- [x] All mandatory sections completed
## Requirement Completeness
- [x] No [NEEDS CLARIFICATION] markers remain
- [x] Requirements are testable and unambiguous
- [x] Success criteria are measurable
- [x] Success criteria are technology-agnostic (no implementation details)
- [x] All acceptance scenarios are defined
- [x] Edge cases are identified
- [x] Scope is clearly bounded
- [x] Dependencies and assumptions identified
## Feature Readiness
- [x] All functional requirements have clear acceptance criteria
- [x] User scenarios cover primary flows
- [x] Feature meets measurable outcomes defined in Success Criteria
- [x] No implementation details leak into specification
## Notes
- Validation pass: Spec describes the single-run model, action taxonomy, and MSP safety expectations in testable terms.
- Implementation specifics (service names, job design, exact key registries) are intentionally omitted; they belong in the plan/tasks phase.

View File

@ -0,0 +1,58 @@
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "operation-run-context.bulk.schema.json",
"title": "OperationRun Context — Bulk Operation",
"type": "object",
"additionalProperties": true,
"properties": {
"operation": {
"type": "object",
"additionalProperties": true,
"properties": {
"type": { "type": "string", "minLength": 1 }
},
"required": ["type"]
},
"target_scope": {
"type": "object",
"additionalProperties": true,
"properties": {
"entra_tenant_id": { "type": "string", "minLength": 1 },
"directory_context_id": { "type": "string", "minLength": 1 }
},
"anyOf": [
{ "required": ["entra_tenant_id"] },
{ "required": ["directory_context_id"] }
]
},
"selection": {
"type": "object",
"additionalProperties": true,
"properties": {
"kind": { "type": "string", "enum": ["ids", "query"] },
"ids_hash": { "type": "string", "minLength": 1 },
"query_hash": { "type": "string", "minLength": 1 }
},
"allOf": [
{
"if": { "properties": { "kind": { "const": "ids" } }, "required": ["kind"] },
"then": { "required": ["ids_hash"], "not": { "required": ["query_hash"] } }
},
{
"if": { "properties": { "kind": { "const": "query" } }, "required": ["kind"] },
"then": { "required": ["query_hash"], "not": { "required": ["ids_hash"] } }
}
],
"required": ["kind"]
},
"idempotency": {
"type": "object",
"additionalProperties": true,
"properties": {
"fingerprint": { "type": "string", "minLength": 1 }
},
"required": ["fingerprint"]
}
},
"required": ["operation", "selection", "idempotency"]
}

View File

@ -0,0 +1,66 @@
openapi: 3.0.3
info:
title: TenantAtlas Operations — Bulk Enqueue (Conceptual)
version: 0.1.0
description: |
Conceptual contract for enqueue-only bulk operations.
Notes:
- This contract describes the shape of inputs and outputs; it does not prescribe a specific Laravel route.
- Start surfaces are enqueue-only and must not perform remote work inline.
paths:
/operations/bulk/enqueue:
post:
summary: Enqueue a bulk operation (enqueue-only)
operationId: enqueueBulkOperation
requestBody:
required: true
content:
application/json:
schema:
type: object
additionalProperties: true
properties:
operation_type:
type: string
minLength: 1
target_scope:
type: object
additionalProperties: true
properties:
entra_tenant_id:
type: string
directory_context_id:
type: string
selection:
type: object
additionalProperties: true
properties:
kind:
type: string
enum: [ids, query]
ids:
type: array
items:
type: string
query:
type: object
additionalProperties: true
required: [operation_type, selection]
responses:
'202':
description: Enqueued or deduped to an existing active run
content:
application/json:
schema:
type: object
additionalProperties: true
properties:
operation_run_id:
type: integer
status:
type: string
enum: [queued, running]
view_run_url:
type: string
required: [operation_run_id, status]

View File

@ -0,0 +1,87 @@
# Phase 1 — Data Model: Remove Legacy BulkOperationRun & Canonicalize Operations (v1.0)
This document describes the domain entities and data contracts impacted by Feature 056.
## Entities
### OperationRun (canonical)
Represents a single observable operation execution. Tenant-scoped.
**Core fields (existing):**
- `tenant_id`: Tenant scope for isolation.
- `user_id`: Initiator user (nullable for system).
- `initiator_name`: Human name for display.
- `type`: Operation type (stable identifier).
- `status`: queued | running | completed.
- `outcome`: pending | succeeded | partial | failed (exact naming per existing app semantics).
- `run_identity_hash`: Deterministic identity used for dedupe of active runs.
- `context`: JSON object containing operation inputs and metadata.
- `summary_counts`: JSON object with canonical metrics keys.
- `failure_summary`: JSON array/object with stable reason codes + sanitized messages.
- `started_at`, `completed_at`: Timestamps.
**Invariants:**
- Active-run dedupe is tenant-wide.
- Monitoring renders from DB only.
- `summary_counts` keys are constrained to the canonical registry.
- Failure messages are sanitized and do not include secrets.
### Operation Type (logical)
A stable identifier used to categorize runs, label them for admins, and enforce consistent UX.
**Attributes:**
- `type` (string): e.g., `policy_version.prune`, `backup_set.delete`, etc.
- `label` (string): human readable name.
- `expected_duration_seconds` (optional): typical duration.
- `allowed_summary_keys`: canonical registry.
### Target Scope (logical)
A directory/remote tenant scope that an operation targets.
**Context fields (when applicable):**
- `entra_tenant_id`: Azure AD / Entra tenant GUID.
- `directory_context_id`: Internal directory context identifier.
At least one of the above must be present for directory-targeted operations.
### Bulk Selection Identity (logical)
Deterministic definition of “what the bulk action applies to”, required for idempotency.
**Decision**: Hybrid identity.
- Explicit selection: `selection.ids_hash`
- Query/filter selection (“select all”): `selection.query_hash`
**Properties:**
- `selection.kind`: `ids` | `query`
- `selection.ids_hash`: sha256 of stable, sorted IDs.
- `selection.query_hash`: sha256 of normalized filter/query payload.
### Bulk Idempotency Fingerprint (logical)
Deterministic fingerprint used to dedupe active runs and prevent duplicated work.
**Components:**
- operation `type`
- target scope (`entra_tenant_id` or `directory_context_id`)
- selection identity (hybrid)
## Removed Entity
### BulkOperationRun (legacy)
This entity is removed by Feature 056.
- Legacy model/service/table are deleted.
- No new writes to legacy tables after cutover.
- Historical import into OperationRun is not performed.

View File

@ -0,0 +1,163 @@
# Implementation Plan: Remove Legacy BulkOperationRun & Canonicalize Operations (v1.0)
**Branch**: `056-remove-legacy-bulkops` | **Date**: 2026-01-18 | **Spec**: [specs/056-remove-legacy-bulkops/spec.md](./spec.md)
**Input**: Feature specification from `/specs/056-remove-legacy-bulkops/spec.md`
**Note**: This template is filled in by the `/speckit.plan` command. See `.specify/scripts/` for helper scripts.
## Summary
Unify all bulk/operational work onto `OperationRun` (canonical run model + Monitoring surface) by migrating all legacy `BulkOperationRun` workflows to OperationRun-backed orchestration, removing the legacy stack (model/service/table/UI), and adding guardrails that prevent reintroduction.
## Technical Context
**Language/Version**: PHP 8.4.x
**Primary Dependencies**: Laravel 12, Filament v4, Livewire v3
**Storage**: PostgreSQL (JSONB in `operation_runs.context`, `operation_runs.summary_counts`)
**Testing**: Pest (PHPUnit 12)
**Target Platform**: Docker via Laravel Sail (local); Dokploy (staging/prod)
**Project Type**: Web application (Laravel monolith)
**Performance Goals**: Calm polling UX for Monitoring; bulk orchestration chunked and resilient to throttling; per-scope concurrency default=1
**Constraints**: Tenant isolation; no secrets in run failures/notifications; no remote work during UI render; Monitoring is DB-only
**Scale/Scope**: Bulk actions may target large selections; orchestration must remain idempotent and debounced by run identity
## Constitution Check
*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*
- Inventory-first: N/A (this feature is operations plumbing; no inventory semantics change)
- Read/write separation: PASS (bulk actions remain enqueue-only; write paths are job-backed and auditable)
- Graph contract path: PASS (no new render-side Graph calls; any remote work stays behind queue + existing Graph client boundary)
- Deterministic capabilities: PASS (no capability derivation changes)
- Tenant isolation: PASS (OperationRun is tenant-scoped; bulk dedupe is tenant-wide; selection identity is deterministic)
- Run observability: PASS (bulk is always OperationRun-backed; DB-only <2s actions remain audit-only)
- Automation: PASS (locks + idempotency required; per-target concurrency is config-driven default=1)
- Data minimization: PASS (failure summaries stay sanitized; no secrets in run records)
## Project Structure
### Documentation (this feature)
```text
specs/056-remove-legacy-bulkops/
├── plan.md # This file (/speckit.plan command output)
├── research.md # Phase 0 output (/speckit.plan command)
├── data-model.md # Phase 1 output (/speckit.plan command)
├── quickstart.md # Phase 1 output (/speckit.plan command)
├── contracts/ # Phase 1 output (/speckit.plan command)
└── tasks.md # Phase 2 output (/speckit.tasks command - NOT created by /speckit.plan)
```
### Source Code (repository root)
```text
app/
├── Filament/
│ ├── Resources/
│ └── Pages/
├── Jobs/
├── Models/
├── Notifications/
├── Services/
└── Support/
config/
├── tenantpilot.php
└── graph_contracts.php
database/
├── migrations/
└── factories/
resources/
└── views/
routes/
└── web.php
tests/
├── Feature/
└── Unit/
```
**Structure Decision**: Laravel web application (monolith). Feature changes are expected primarily under `app/` (runs, jobs, Filament actions), `database/migrations/` (dropping legacy tables), and `tests/` (Pest guardrails).
## Complexity Tracking
> **Fill ONLY if Constitution Check has violations that must be justified**
| Violation | Why Needed | Simpler Alternative Rejected Because |
|-----------|------------|-------------------------------------|
| None | N/A | N/A |
## Constitution Check (Post-Design)
*Re-check after Phase 1 outputs are complete.*
- Monitoring remains DB-only at render time.
- Start surfaces remain enqueue-only.
- Bulk work is always OperationRun-backed.
- Per-target scope concurrency is config-driven (default=1).
- Bulk idempotency uses hybrid selection identity.
## Phase 0 — Outline & Research (output: research.md)
### Discovery signals (repo sweep)
Legacy bulk-run usage exists and must be migrated/removed:
- Jobs using legacy run/service include:
- `app/Jobs/BulkPolicySyncJob.php`
- `app/Jobs/BulkPolicyDeleteJob.php`
- `app/Jobs/BulkBackupSetDeleteJob.php`
- `app/Jobs/BulkPolicyVersionPruneJob.php`
- `app/Jobs/BulkPolicyVersionForceDeleteJob.php`
- `app/Jobs/BulkRestoreRunDeleteJob.php`
- `app/Jobs/BulkTenantSyncJob.php`
- `app/Jobs/CapturePolicySnapshotJob.php`
- `app/Jobs/GenerateDriftFindingsJob.php` (mixed: legacy + OperationRun)
Legacy data layer present:
- Model: `app/Models/BulkOperationRun.php`
- Service: `app/Services/BulkOperationService.php`
- Migrations: `database/migrations/*_create_bulk_operation_runs_table.php`, `*_add_idempotency_key_to_bulk_operation_runs_table.php`, `*_increase_bulk_operation_runs_status_length.php`
- Factory/Seeder: `database/factories/BulkOperationRunFactory.php`, `database/seeders/BulkOperationsTestSeeder.php`
Existing canonical run patterns to reuse:
- `app/Services/OperationRunService.php` provides tenant-wide active-run dedupe and safe dispatch semantics.
- `app/Support/OperationCatalog.php` centralizes labels/durations and allowed summary keys.
- `app/Support/OpsUx/OperationSummaryKeys.php` + `SummaryCountsNormalizer` enforce summary_counts contract.
Concurrency/idempotency patterns already exist and should be adapted:
- `app/Services/Inventory/InventoryConcurrencyLimiter.php` uses per-scope cache locks with config-driven defaults.
- `app/Services/Inventory/InventorySyncService.php` uses selection hashing + selection locks to prevent duplicate work.
### Research outputs (decisions)
- Per-target scope concurrency is configuration-driven with default=1.
- Bulk selection identity is hybrid (IDs-hash when explicit IDs; query-hash when “select all via filter/query”).
- Legacy bulk-run history is not imported into OperationRun (default path).
## Phase 1 — Design & Contracts (outputs: data-model.md, contracts/*, quickstart.md)
### Design topics
- Define the canonical bulk orchestration shape around OperationRun (orchestrator + workers) while preserving the constitution feedback surfaces.
- Define the minimal run context contract for directory-targeted runs (target scope fields + idempotency fingerprint fields).
- Extend operation type catalog for newly migrated bulk operations.
### Contract artifacts
- JSON schema for `operation_runs.context` for bulk operations (target scope + selection identity + idempotency).
- OpenAPI sketch for internal operation-triggering endpoints (if applicable) as a stable contract for “enqueue-only” start surfaces.
## Phase 2 — Planning (implementation outline; detailed tasks live in tasks.md)
- Perform required discovery sweep and create a migration report.
- Migrate each legacy bulk workflow to OperationRun-backed orchestration.
- Remove legacy model/service/table/UI surfaces.
- Add hard guardrails (CI/test) to forbid legacy references and verify run-backed behavior.
- Add targeted Pest tests for bulk actions, per-scope throttling default=1, and summary key normalization.

View File

@ -0,0 +1,39 @@
# Quickstart: Feature 056 — Remove Legacy BulkOperationRun
## Prerequisites
- Local dev via Laravel Sail
- Database migrations up to date
## Common commands (Sail-first)
- Boot: `./vendor/bin/sail up -d`
- Run migrations: `./vendor/bin/sail artisan migrate`
- Run targeted tests: `./vendor/bin/sail artisan test tests/Feature`
- Format (required): `./vendor/bin/pint --dirty`
## What to build (high level)
- Replace all legacy bulk-run usage with the canonical OperationRun run model.
- Ensure all bulk actions are enqueue-only and visible in Monitoring → Operations.
- Enforce per-target scope concurrency limit (config-driven, default=1).
- Enforce bulk idempotency via deterministic fingerprinting.
- Remove legacy BulkOperationRun stack (model/service/table/UI).
- Add guardrails/tests to prevent reintroduction.
## Where to look first
- Legacy stack:
- `app/Models/BulkOperationRun.php`
- `app/Services/BulkOperationService.php`
- `database/migrations/*bulk_operation_runs*`
- Canonical run stack:
- `app/Models/OperationRun.php`
- `app/Services/OperationRunService.php`
- `app/Support/OperationCatalog.php`
- `app/Support/OpsUx/OperationSummaryKeys.php`
- Locking patterns to reuse:
- `app/Services/Inventory/InventoryConcurrencyLimiter.php`
- `app/Services/Inventory/InventorySyncService.php`

View File

@ -0,0 +1,89 @@
# Phase 0 — Research: Remove Legacy BulkOperationRun & Canonicalize Operations (v1.0)
**Branch**: 056-remove-legacy-bulkops
**Date**: 2026-01-18
## Goals for Research
- Resolve spec clarifications and translate them into concrete implementation constraints.
- Identify existing repo patterns for:
- run identity / dedupe
- summary_counts normalization
- locks / concurrency limiting
- idempotent selection hashing
- Enumerate known legacy usage locations to inform the discovery report.
## Findings & Decisions
### Decision 1 — Per-target scope concurrency
- **Decision**: Concurrency limits are configuration-driven **with default = 1** per target scope.
- **Rationale**: Default=1 is the safest choice against Graph throttling and reduces blast radius when many tenants/scope targets are active.
- **Alternatives considered**:
- Default=2: more throughput but higher throttling risk.
- Default=5: increases throttling/incident risk.
- Hardcoded values: hard to tune per environment.
- **Implementation constraint**: Limit is enforced per `entra_tenant_id` or `directory_context_id`.
### Decision 2 — Selection Identity (idempotency fingerprint)
- **Decision**: Hybrid selection identity.
- If the user explicitly selects IDs: fingerprint includes an **IDs-hash**.
- If the user selects via filter/query (“select all”): fingerprint includes a **query/filter hash**.
- **Rationale**: Supports both UX patterns and avoids duplicate runs while remaining deterministic.
- **Alternatives considered**:
- IDs-hash only: cannot represent “select all by filter”.
- Query-hash only: cannot safely represent explicit selections.
- Always store both: increases complexity without clear value.
### Decision 3 — Legacy history handling
- **Decision**: Do not import legacy bulk-run history into OperationRun.
- **Rationale**: Minimizes migration risk and avoids polluting the canonical run surface with “synthetic” imported data.
- **Alternatives considered**:
- One-time import: adds complexity, new semantics, and additional testing burden.
### Decision 4 — Canonical summary metrics
- **Decision**: Summary metrics are derived and rendered from `operation_runs.summary_counts` using the canonical key registry.
- **Rationale**: Ensures consistent Monitoring UX and prevents ad-hoc keys.
- **Alternatives considered**:
- Per-operation bespoke metrics: causes UX drift and breaks shared widgets.
### Decision 5 — Reuse existing repo patterns
- **Decision**: Reuse existing run, lock, and selection hashing patterns already present in the repository.
- **Rationale**: Aligns with constitution and avoids divergent implementations.
- **Evidence in repo**:
- `OperationRunService` provides tenant-wide active-run dedupe and safe dispatch failure handling.
- `OperationCatalog` centralizes labels/durations and allowed summary keys.
- `InventoryConcurrencyLimiter` shows slot-based lock acquisition with config-driven maxima.
- `InventorySyncService` shows deterministic selection hashing and selection-level locks.
## Legacy Usage Inventory (initial)
This is a starting list derived from a repo search; the Phase 2 Discovery Report must expand/confirm.
- Legacy bulk-run model: `app/Models/BulkOperationRun.php`
- Legacy bulk-run service: `app/Services/BulkOperationService.php`
- Legacy jobs using BulkOperationRun/Service:
- `app/Jobs/BulkPolicySyncJob.php`
- `app/Jobs/BulkPolicyDeleteJob.php`
- `app/Jobs/BulkBackupSetDeleteJob.php`
- `app/Jobs/BulkPolicyVersionPruneJob.php`
- `app/Jobs/BulkPolicyVersionForceDeleteJob.php`
- `app/Jobs/BulkRestoreRunDeleteJob.php`
- `app/Jobs/BulkTenantSyncJob.php`
- `app/Jobs/CapturePolicySnapshotJob.php`
- `app/Jobs/GenerateDriftFindingsJob.php` (mixed usage)
- Legacy DB artifacts:
- `database/migrations/2025_12_23_215901_create_bulk_operation_runs_table.php`
- `database/migrations/2026_01_11_120001_add_idempotency_key_to_bulk_operation_runs_table.php`
- `database/migrations/2025_12_24_005055_increase_bulk_operation_runs_status_length.php`
- Legacy test data:
- `database/factories/BulkOperationRunFactory.php`
- `database/seeders/BulkOperationsTestSeeder.php`
## Open Questions
None remaining for Phase 0 (spec clarifications resolved). Any additional unknowns found during discovery are to be added to the Phase 2 discovery report and/or tasks.

View File

@ -0,0 +1,190 @@
# Feature Specification: Remove Legacy BulkOperationRun & Canonicalize Operations (v1.0)
**Feature Branch**: `056-remove-legacy-bulkops`
**Created**: 2026-01-18
**Status**: Draft
**Input**: User description: "Feature 056 — Remove Legacy BulkOperationRun & Canonicalize Operations (v1.0)"
## Clarifications
### Session 2026-01-18
- Q: What should be the default max concurrency per target scope (entra_tenant_id / directory_context_id) for bulk operations? → A: Config-driven, default=1
- Q: How should Selection Identity be determined for idempotency fingerprinting? → A: Hybrid (IDs-hash for explicit selection; query-hash for “select all via filter/query”)
## User Scenarios & Testing *(mandatory)*
### User Story 1 - Run-backed bulk actions are always observable (Priority: P1)
An admin performs a bulk action (e.g., apply/ignore/restore/prune across many records). The system records a single canonical run that can be monitored end-to-end, including partial failures, and provides consistent user feedback.
**Why this priority**: Bulk changes are operationally significant and must be traceable, support partial outcomes, and have a consistent mental model for admins.
**Independent Test**: Trigger a representative bulk action and verify that a run record exists, appears in the Monitoring list, has a detail view, and emits the correct feedback surfaces.
**Acceptance Scenarios**:
1. **Given** an admin selects multiple items for a bulk action, **When** the action is confirmed and submitted, **Then** a canonical run record is created or reused and the UI confirms the enqueue/queued state via a toast.
2. **Given** a bulk run is queued or running, **When** the admin opens Monitoring → Operations, **Then** the run appears in the list and can be opened via a canonical “View run” link.
3. **Given** a bulk run completes with a mix of successes and failures, **When** the run reaches a terminal state, **Then** the initiator receives a terminal notification and the run detail shows a summary of outcomes.
---
### User Story 2 - Monitoring is the single source of run history (Priority: P2)
An admin (or operator) relies on Monitoring → Operations to see the full history of operational work (including bulk). There are no separate legacy run surfaces; links from anywhere in the app point to the canonical run detail.
**Why this priority**: Multiple run systems lead to missed incidents, inconsistent retention, and developer confusion. One canonical surface improves operational clarity and reduces support overhead.
**Independent Test**: Navigate from a bulk action result to “View run” and confirm it lands in Monitorings run detail; confirm there is no legacy “bulk runs” navigation or pages.
**Acceptance Scenarios**:
1. **Given** any UI element offers a “View run” link, **When** it is clicked, **Then** it opens the canonical Monitoring → Operations → Run Detail page for that run.
2. **Given** the app navigation, **When** an admin searches for legacy bulk-run screens, **Then** no legacy bulk-run navigation or pages exist.
---
### User Story 3 - Developers cant accidentally reintroduce legacy patterns (Priority: P3)
A developer adds or modifies an admin action. They can clearly determine whether it is an audit-only action or a run-backed operation, and the repository enforces the single-run model by preventing legacy references and UX drift.
**Why this priority**: Preventing regression is essential for suite readiness and long-term maintainability.
**Independent Test**: Introduce a legacy reference or a bulk action without a run-backed record and confirm CI/automated checks fail.
**Acceptance Scenarios**:
1. **Given** a change introduces any reference to the legacy bulk-run system, **When** tests/CI run, **Then** the pipeline fails with a clear message.
2. **Given** a security-relevant DB-only action that is eligible for audit-only classification, **When** the action runs, **Then** an audit log entry is recorded and no run record is created.
### Edge Cases
- Bulk selection is empty or resolves to zero items: the system does not start work and provides a clear non-destructive result.
- A bulk selection is very large: the system remains responsive and continues to show progress via run summary metrics.
- Target scope is required but missing: the system fails safely, records a terminal run with a stable reason code, and does not execute remote/bulk mutations.
- Remote calls experience throttling: the system applies bounded retries with jittered backoff and records failures without losing overall run visibility.
- Duplicate submissions (double click / retry / re-run): idempotency prevents duplicate processing and preserves a single canonical outcome per selection identity.
- Tenant isolation: no run, selection, summary, or notifications leak across tenants.
## Requirements *(mandatory)*
**Constitution alignment (required):** This feature consolidates operational work onto a single canonical run model and a single monitoring surface. It must preserve the defined user feedback surfaces (queued toast, active widget, terminal notification), ensure tenant-scoped observability, and maintain stable, sanitized messages and reason codes.
### Functional Requirements
- **FR-001 Single run model**: The system MUST use a single canonical run model (`OperationRun`) for all run-backed operations; the legacy bulk-run model MUST not exist after this feature.
- **FR-002 Bulk actions are run-backed**: Any bulk action (apply to N records, chunked work, mass ignore/restore/prune/delete) MUST create or reuse an `OperationRun` and MUST be visible in Monitoring → Operations.
- **FR-003 Action taxonomy**: Every admin action MUST be classified as exactly one of:
- **Audit-only DB action**: DB-only, no remote/external calls, no queued work, and bounded DB work; typically completes within ~2 seconds (guidance, not a hard rule). MUST write an audit log for security/ops-relevant state changes; MUST NOT create an `OperationRun`.
- **Run-backed operation**: queued/long-running/remote/bulk/scheduled or otherwise operationally significant; MUST create or reuse an `OperationRun`.
**Decision rule**: If classification is uncertain, default to **Run-backed operation**.
- **FR-004 Canonical UX surfaces**: For run-backed operations, the system MUST use only these feedback surfaces:
- **Queued**: toast-only
- **Active**: tenant-wide active widget
- **Terminal**: database-backed notification to the initiator only
- **FR-005 Canonical routing**: All “View run” links MUST route to Monitoring → Operations → Run Detail.
- **FR-006 Legacy removal**: The system MUST remove legacy bulk-run tables/models/services/routes/widgets/navigation and MUST prevent any new legacy writes.
- **FR-007 Canonical summary metrics**: The runs summary metrics MUST use a single canonical set of keys and MUST be presented consistently in the run detail view.
- **FR-008 Target scope recording**: For operations targeting a directory/remote tenant, the run context MUST record the target scope (directory identifier) and Monitoring/Run Detail MUST display it in a human-friendly way when available.
- **FR-009 Per-target throttling**: Bulk orchestration MUST enforce concurrency limits per target scope to reduce throttling risk and provide predictable execution; the limit MUST be configuration-driven with a default of 1 per target scope.
- **FR-010 Idempotency for bulk**: Bulk operations MUST be idempotent using a deterministic fingerprint that includes operation type, target scope, and selection identity; retries MUST NOT duplicate work.
- **FR-011 Discovery completeness**: The implementation MUST include a repo-wide discovery sweep of legacy references and bulk-like actions; findings MUST be recorded in a discovery report with classification and migration/deferral decisions.
- **FR-012 Regression guardrails**: Automated checks MUST fail if legacy bulk-run references reappear or if bulk actions bypass the canonical run-backed model.
### Non-Functional Requirements (NFR)
#### NFR-01 Monitoring is DB-only at render time (Constitution Gate)
All Monitoring → Operations pages (index and run detail) MUST be DB-only at render time:
- No Graph/remote calls during initial render or reactive renders.
- No side-effectful work triggered by view rendering.
**Verification**:
- Add a regression test/guard that mocks the Graph client (or equivalent remote client) and asserts it is not called during Monitoring renders.
- Add a regression test/guard that mocks the Graph client (or equivalent remote client) and asserts it is not called during Monitoring renders.
#### NFR-02 Failure reason codes and message sanitization
Run-backed operations MUST store failures as stable, machine-readable `reason_code` values plus a sanitized, user-facing message.
**Minimal required reason_code set (baseline)**:
| reason_code | Meaning |
|------------|---------|
| graph_throttled | Remote service throttled (e.g., rate limited) |
| graph_timeout | Remote call timed out |
| permission_denied | Missing/insufficient permissions |
| validation_error | Input/selection validation failure |
| conflict_detected | Conflict detected (concurrency/version/resource state) |
| unknown_error | Fallback when no specific code applies |
**Rules**:
- `reason_code` is stable over time and safe to use in programmatic filters/alerts.
- Failure messages are sanitized and bounded in length; failures/notifications MUST NOT persist secrets/tokens/PII or raw payload dumps.
#### NFR-03 Retry/backoff/jitter for remote throttling
When worker jobs perform remote calls, they MUST handle transient failures (including 429/503) via a shared policy:
- bounded retries
- exponential backoff with jitter
- no hand-rolled `sleep()` loops or ad-hoc random retry logic in feature code
### Implementation Shape (decision)
**Decision: standard orchestrator + item workers**
- 1 orchestrator job per run:
- resolves selection deterministically
- chunks work
- dispatches item worker jobs (idempotent per item)
- Worker jobs update `operation_runs.summary_counts` via canonical normalization.
- Finalization sets terminal status once.
### Target Scope (canonical keys)
**Canonical context keys**:
- `entra_tenant_id` (Azure AD tenant GUID)
- optional `entra_tenant_name` (human-friendly; if available)
- optional `directory_context_id` (internal directory context identifier, if/when introduced)
For operations targeting a directory/remote tenant, the run context MUST record target scope using the canonical keys above, and Monitoring/Run Detail MUST display the target scope (human-friendly name if available).
#### Assumptions
- Existing run status semantics remain unchanged (queued/running/succeeded/partial/failed).
- Existing monitoring experience is not redesigned; it is aligned so that all operational work is represented consistently.
#### Dependencies
- Prior consolidation work establishing `OperationRun` as the canonical run model and Monitoring → Operations as the canonical surface.
- Existing audit logging conventions for security/ops-relevant DB-only actions.
#### Legacy History Decision (recorded)
- Default path: legacy bulk-run history is not migrated into the canonical run model. The legacy tables are removed after cutover, relying on database backups/exports if historical investigation is needed.
### Key Entities *(include if feature involves data)*
- **OperationRun**: A tenant-scoped record of operational work with status, timestamps, sanitized user-facing message/reason code, summary metrics, and context.
- **Operation Type**: A stable identifier describing the kind of operation (used for categorization, labeling, and governance).
- **Target Scope**: The directory / remote tenant scope that the operation targets (when applicable).
- **Selection Identity**: The deterministic definition of “what the bulk action applies to” used for idempotency and traceability.
- **Audit Log Entry**: A record of security/ops-relevant state changes for audit-only DB actions.
## Success Criteria *(mandatory)*
### Measurable Outcomes
- **SC-001**: 100% of bulk actions in the admin UI create or reuse a canonical run record and appear in Monitoring → Operations.
- **SC-002**: Repository contains 0 references to the legacy bulk-run system after completion, enforced by automated checks.
- **SC-003**: For directory-targeted operations, 100% of run records display a target scope in Monitoring/Run Detail.
- **SC-004**: For bulk operations, duplicate submissions do not increase processed item count beyond one idempotent execution per selection identity.
- **SC-005**: Admins can locate a completed bulk run in Monitoring within 30 seconds using standard navigation and filters, without relying on legacy pages.

View File

@ -0,0 +1,250 @@
# Tasks: Remove Legacy BulkOperationRun & Canonicalize Operations (v1.0)
**Input**: Design documents from `/specs/056-remove-legacy-bulkops/`
**Prerequisites**: plan.md (required), spec.md (required), research.md, data-model.md, contracts/, quickstart.md
**Tests**: Required (Pest)
**Operations**: This feature consolidates queued/bulk work onto canonical `OperationRun` and removes the legacy `BulkOperationRun` system.
## Phase 1: Setup (Shared Infrastructure)
**Purpose**: Ensure feature docs/paths are ready for implementation and review
- [ ] T001 Review and update quickstart commands in specs/056-remove-legacy-bulkops/quickstart.md
- [ ] T002 [P] Create discovery report scaffold in specs/056-remove-legacy-bulkops/discovery.md
- [ ] T003 [P] Add a “legacy history decision” section to specs/056-remove-legacy-bulkops/discovery.md
---
## Phase 2: Foundational (Blocking Prerequisites)
**Purpose**: Shared primitives required for all bulk migrations and hardening
- [ ] T004 Populate the full repo-wide discovery sweep in specs/056-remove-legacy-bulkops/discovery.md (app/, resources/, database/, tests/)
- [ ] T005 [P] Add config keys for per-target scope concurrency (default=1) in config/tenantpilot.php
- [ ] T006 [P] Register new bulk operation types/labels/durations in app/Support/OperationCatalog.php
- [ ] T007 [P] Implement hybrid selection identity hasher in app/Services/Operations/BulkSelectionIdentity.php
- [ ] T008 [P] Implement idempotency fingerprint builder in app/Services/Operations/BulkIdempotencyFingerprint.php
- [ ] T009 [P] Implement per-target scope concurrency limiter (Cache locks) in app/Services/Operations/TargetScopeConcurrencyLimiter.php
- [ ] T010 Extend OperationRun identity inputs to include target scope + selection identity in app/Services/OperationRunService.php
- [ ] T011 Add a bulk enqueue helper that standardizes ensureRun + dispatchOrFail usage in app/Services/OperationRunService.php
**Checkpoint**: Shared primitives exist; bulk migrations can proceed consistently
---
## Phase 3: User Story 1 - Run-backed bulk actions are always observable (Priority: P1) 🎯 MVP
**Goal**: All bulk actions are OperationRun-backed and observable end-to-end
**Independent Test**: Trigger one migrated bulk action and verify OperationRun is created/reused, queued toast occurs, and Monitoring → Operations shows it.
### Tests for User Story 1
- [ ] T012 [P] [US1] Add bulk enqueue idempotency test in tests/Feature/OpsUx/BulkEnqueueIdempotencyTest.php
- [ ] T013 [P] [US1] Add per-target concurrency default=1 test in tests/Feature/OpsUx/TargetScopeConcurrencyLimiterTest.php
- [ ] T014 [P] [US1] Add hybrid selection identity hashing test in tests/Unit/Operations/BulkSelectionIdentityTest.php
### Implementation for User Story 1
- [ ] T015 [P] [US1] Add OperationRun context shape helpers for bulk runs in app/Support/OpsUx/BulkRunContext.php
- [ ] T016 [P] [US1] Implement orchestrator job skeleton for bulk runs in app/Jobs/Operations/BulkOperationOrchestratorJob.php
- [ ] T017 [P] [US1] Implement worker job skeleton for bulk items in app/Jobs/Operations/BulkOperationWorkerJob.php
- [ ] T018 [US1] Ensure worker jobs update summary_counts via canonical whitelist in app/Services/OperationRunService.php
**Decision (applies to all US1 migrations)**: Use the standard orchestrator + item worker pattern.
- Keep T016/T017 as the canonical implementation.
- Per-domain bulk logic MUST be implemented as item worker(s) invoked by the orchestrator.
- Avoid parallel legacy bulk job systems.
- [ ] T019 [US1] Migrate policy bulk delete start action to OperationRun-backed flow in app/Filament/Resources/PolicyResource.php
- [ ] T020 [US1] Refactor policy bulk delete execution into orchestrator/item worker pattern (replace legacy bulk job semantics) in app/Jobs/BulkPolicyDeleteJob.php
- [ ] T021 [US1] Migrate backup set bulk delete start action to OperationRun-backed flow in app/Filament/Resources/BackupSetResource.php
- [ ] T022 [US1] Refactor backup set bulk delete execution into orchestrator/item worker pattern (replace legacy bulk job semantics) in app/Jobs/BulkBackupSetDeleteJob.php
- [ ] T023 [US1] Migrate policy version prune start action to OperationRun-backed flow in app/Filament/Resources/PolicyVersionResource.php
- [ ] T024 [US1] Refactor policy version prune execution into orchestrator/item worker pattern (replace legacy bulk job semantics) in app/Jobs/BulkPolicyVersionPruneJob.php
- [ ] T025 [US1] Migrate policy version force delete start action to OperationRun-backed flow in app/Filament/Resources/PolicyVersionResource.php
- [ ] T026 [US1] Refactor policy version force delete execution into orchestrator/item worker pattern (replace legacy bulk job semantics) in app/Jobs/BulkPolicyVersionForceDeleteJob.php
- [ ] T027 [US1] Migrate restore run bulk delete start action to OperationRun-backed flow in app/Filament/Resources/RestoreRunResource.php
- [ ] T028 [US1] Refactor restore run bulk delete execution into orchestrator/item worker pattern (replace legacy bulk job semantics) in app/Jobs/BulkRestoreRunDeleteJob.php
- [ ] T029 [US1] Migrate tenant bulk sync start action to OperationRun-backed flow in app/Filament/Resources/TenantResource.php
- [ ] T030 [US1] Refactor tenant bulk sync execution into orchestrator/item worker pattern (replace legacy bulk job semantics) in app/Jobs/BulkTenantSyncJob.php
- [ ] T031 [US1] Migrate policy snapshot capture action to OperationRun-backed flow in app/Filament/Resources/PolicyResource/Pages/ViewPolicy.php
- [ ] T032 [US1] Refactor snapshot capture execution into orchestrator/item worker pattern (replace legacy bulk job semantics) in app/Jobs/CapturePolicySnapshotJob.php
- [ ] T033 [US1] Migrate drift generation flow to OperationRun-backed flow in app/Filament/Pages/DriftLanding.php
- [ ] T034 [US1] Remove legacy BulkOperationRun coupling from drift generator job in app/Jobs/GenerateDriftFindingsJob.php
- [ ] T035 [US1] Remove legacy “fallback link” usage and standardize view-run URLs to canonical OperationRun links in app/Jobs/AddPoliciesToBackupSetJob.php
**Checkpoint**: At this point, at least one representative bulk action is fully run-backed and visible in Monitoring
---
## Phase 4: User Story 2 - Monitoring is the single source of run history (Priority: P2)
**Goal**: No legacy “Bulk Operation Runs” surfaces exist; all view-run links route to Monitorings canonical run detail
**Independent Test**: From any bulk action, “View run” navigates to OperationRun detail and there is no legacy BulkOperationRun resource.
### Tests for User Story 2
- [ ] T036 [P] [US2] Add regression test that notifications do not link to BulkOperationRun resources in tests/Feature/OpsUx/NotificationViewRunLinkTest.php
### Implementation for User Story 2
- [ ] T037 [US2] Replace BulkOperationRun resource links with OperationRun links in app/Notifications/RunStatusChangedNotification.php
- [ ] T038 [US2] Replace BulkOperationRun resource links with OperationRun links in app/Filament/Resources/BackupSetResource.php
- [ ] T039 [US2] Replace BulkOperationRun resource links with OperationRun links in app/Filament/Resources/PolicyVersionResource.php
- [ ] T040 [US2] Replace BulkOperationRun resource links/redirects with OperationRun links in app/Filament/Resources/PolicyResource/Pages/ViewPolicy.php
- [ ] T041 [US2] Remove legacy resource and pages: app/Filament/Resources/BulkOperationRunResource.php
- [ ] T042 [US2] Remove legacy resource pages: app/Filament/Resources/BulkOperationRunResource/Pages/ListBulkOperationRuns.php
- [ ] T043 [US2] Remove legacy resource pages: app/Filament/Resources/BulkOperationRunResource/Pages/ViewBulkOperationRun.php
- [ ] T044 [US2] Remove legacy authorization policy in app/Policies/BulkOperationRunPolicy.php
- [ ] T045 [US2] Remove BulkOperationRun gate registration in app/Providers/AppServiceProvider.php
**Checkpoint**: No navigation/link surfaces reference legacy bulk runs; Monitoring is the sole run history surface
---
## Phase 5: User Story 3 - Developers cant accidentally reintroduce legacy patterns (Priority: P3)
**Goal**: Guardrails enforce single-run model and prevent UX drift / legacy reintroduction
**Independent Test**: Introduce a forbidden legacy reference or bulk start surface without OperationRun and confirm automated tests fail.
### Tests for User Story 3
- [ ] T046 [P] [US3] Add “no legacy references” test guard in tests/Feature/Guards/NoLegacyBulkOperationsTest.php
- [ ] T047 [P] [US3] Add tenant isolation guard for bulk enqueue inputs in tests/Feature/OpsUx/BulkTenantIsolationTest.php
- [ ] T048 [P] [US3] Extend summary_counts whitelist coverage for bulk updates in tests/Feature/OpsUx/SummaryCountsWhitelistTest.php
### Implementation for User Story 3
- [ ] T049 [US3] Remove legacy BulkOperationRun unit tests in tests/Unit/BulkOperationRunStatusBucketTest.php
- [ ] T050 [US3] Remove legacy BulkOperationRun unit tests in tests/Unit/BulkOperationRunProgressTest.php
- [ ] T051 [US3] Remove legacy factory and update any dependent tests in database/factories/BulkOperationRunFactory.php
- [ ] T052 [US3] Remove legacy test seeder and update any dependent docs/tests in database/seeders/BulkOperationsTestSeeder.php
**Checkpoint**: Guardrails prevent reintroduction; test suite enforces taxonomy and canonical run surfaces
---
## Phase 6: Polish & Cross-Cutting Concerns
**Purpose**: Final removal of legacy DB artifacts and cleanup
- [ ] T053 Create migration to drop legacy bulk_operation_runs table in database/migrations/2026_01_18_000001_drop_bulk_operation_runs_table.php
- [ ] T054 Do NOT delete historical migrations; add forward drop migrations only
- Keep old migrations to support fresh installs and CI rebuilds
- Add a new forward migration: drop legacy bulk tables after cutover
- Document the cutover precondition (no new legacy writes)
- [ ] T055 (optional) If schema history cleanup is required, use a documented squash/snapshot process
- Only via explicit procedure (not ad-hoc deletes)
- Must keep a reproducible schema for new environments
- [ ] T056 Validate feature via targeted test run list and update notes in specs/056-remove-legacy-bulkops/quickstart.md
---
## Monitoring DB-only render guard (NFR-01)
- [ ] T061 Add a regression test ensuring Monitoring → Operations pages do not invoke Graph/remote calls during render
- Approach:
- Mock/spy Graph client (or equivalent remote client)
- Render Operations index and OperationRun detail pages
- Assert no remote calls were made
- DoD: test fails if any Graph client is called from Monitoring render paths
## Legacy removal (FR-006)
- [ ] T062 Remove BulkOperationRun model + related database artifacts (after cutover)
- Delete app/Models/BulkOperationRun.php (and any related models)
- Ensure no runtime references remain
- [ ] T063 Remove BulkOperationService and migrate all call sites to OperationRunService patterns
- Replace all uses of BulkOperationService::createRun(...) / dispatch flows
- Ensure all bulk actions create OperationRun and dispatch orchestrator/worker jobs
- [ ] T064 Add CI guard to prevent reintroduction of BulkOperationRun/BulkOperationService
- Grep/arch test: fail if repo contains BulkOperationRun or BulkOperationService
## Target scope display (FR-008)
- [ ] T065 Update OperationRun run detail view to display target scope consistently
- Show entra_tenant_name if present, else show entra_tenant_id
- If directory_context_id exists, optionally show it as secondary info
- Ensure this is visible in Monitoring → Operations → Run Detail
- DoD: reviewers can start a run for a specific Entra tenant and see the target clearly on Run Detail
## Failure semantics hardening (NFR-02)
- [ ] T066 Define/standardize reason codes for migrated bulk operations and enforce message sanitization bounds
- Baseline reason_code set: graph_throttled, graph_timeout, permission_denied, validation_error, conflict_detected, unknown_error
- Ensure reason_code is stable and machine-readable
- Ensure failure message is sanitized + bounded (no secrets/tokens/PII/raw payload dumps)
- DoD: for each new/migrated bulk operation type, expected reason_code usage is clear and consistent
- [ ] T067 Add a regression test asserting failures/notifications never persist secrets/PII
- Approach: create a run failure with sensitive-looking strings and assert persisted failures/notifications are sanitized
- DoD: test fails if sensitive patterns appear in stored failures/notifications
## Remote retry/backoff/jitter policy (NFR-03)
- [ ] T068 Ensure migrated remote calls use the shared retry/backoff policy (429/503) and forbid ad-hoc retry loops
- Use bounded retries + exponential backoff with jitter
- DoD: no hand-rolled sleep/random retry logic in bulk workers; one test or assertion proves shared policy is used
## Canonical “View run” sweep and guard (FR-005)
- [ ] T069 Perform a repo-wide sweep to ensure all “View run” links route to Monitoring → Operations → Run Detail
- Grep (or ripgrep if available) for legacy routes/resource URLs and legacy BulkOperationRun links
- Ensure links go through a canonical OperationRun URL helper (or equivalent single source)
- Optional: add a CI grep/guard forbidding known legacy route names/URLs
- DoD: documented check/list shows no legacy “View run” links remain
---
## Dependencies & Execution Order
### User Story Dependencies
- **US1 (P1)**: Foundation for cutover; must complete before deleting legacy UI/DB.
- **US2 (P2)**: Depends on US1 (cutover) so links can be fully canonicalized.
- **US3 (P3)**: Can start after Foundational and run in parallel with US2, but should be finalized after US1 cutover.
### Parallel Opportunities
- Tasks marked **[P]** are safe to do in parallel (new files or isolated edits).
- Within US1, jobs and Filament start-surface migrations can be split by resource (Policies vs BackupSets vs PolicyVersions vs RestoreRuns).
---
## Parallel Example: User Story 1
- Task: T012 [US1] tests/Feature/OpsUx/BulkEnqueueIdempotencyTest.php
- Task: T013 [US1] tests/Feature/OpsUx/TargetScopeConcurrencyLimiterTest.php
- Task: T014 [US1] tests/Unit/Operations/BulkSelectionIdentityTest.php
---
## Implementation Strategy
### MVP First (User Story 1 Only)
1. Complete Setup + Foundational
2. Complete US1 migrations for at least one representative bulk action (end-to-end)
3. Validate Monitoring visibility + queued toast + terminal notification
### Incremental Delivery
- Migrate bulk workflows in small, independently testable slices (one resource at a time) while keeping Monitoring canonical.
- Remove legacy surfaces only after all start surfaces are migrated.