# Phase 0 — Research: Remove Legacy BulkOperationRun & Canonicalize Operations (v1.0)

**Branch**: 056-remove-legacy-bulkops

**Date**: 2026-01-18

## Goals for Research

- Resolve spec clarifications and translate them into concrete implementation constraints.
- Identify existing repo patterns for:
  - run identity / dedupe
  - summary_counts normalization
  - locks / concurrency limiting
  - idempotent selection hashing
- Enumerate known legacy usage locations to inform the discovery report.

## Findings & Decisions

### Decision 1 — Per-target scope concurrency

- **Decision**: Concurrency limits are configuration-driven **with default = 1** per target scope.
- **Rationale**: Default=1 is the safest choice against Graph throttling and reduces blast radius when many tenants/scope targets are active.
- **Alternatives considered**:
  - Default=2: more throughput but higher throttling risk.
  - Default=5: increases throttling/incident risk.
  - Hardcoded values: hard to tune per environment.
- **Implementation constraint**: Limit is enforced per `entra_tenant_id` or `directory_context_id`.

### Decision 2 — Selection Identity (idempotency fingerprint)

- **Decision**: Hybrid selection identity.
  - If the user explicitly selects IDs: fingerprint includes an **IDs-hash**.
  - If the user selects via filter/query (“select all”): fingerprint includes a **query/filter hash**.
- **Rationale**: Supports both UX patterns and avoids duplicate runs while remaining deterministic.
- **Alternatives considered**:
  - IDs-hash only: cannot represent “select all by filter”.
  - Query-hash only: cannot safely represent explicit selections.
  - Always store both: increases complexity without clear value.

### Decision 3 — Legacy history handling

- **Decision**: Do not import legacy bulk-run history into OperationRun.
- **Rationale**: Minimizes migration risk and avoids polluting the canonical run surface with “synthetic” imported data.
- **Alternatives considered**:
  - One-time import: adds complexity, new semantics, and additional testing burden.

### Decision 4 — Canonical summary metrics

- **Decision**: Summary metrics are derived and rendered from `operation_runs.summary_counts` using the canonical key registry.
- **Rationale**: Ensures consistent Monitoring UX and prevents ad-hoc keys.
- **Alternatives considered**:
  - Per-operation bespoke metrics: causes UX drift and breaks shared widgets.

### Decision 5 — Reuse existing repo patterns

- **Decision**: Reuse existing run, lock, and selection hashing patterns already present in the repository.
- **Rationale**: Aligns with constitution and avoids divergent implementations.
- **Evidence in repo**:
  - `OperationRunService` provides tenant-wide active-run dedupe and safe dispatch failure handling.
  - `OperationCatalog` centralizes labels/durations and allowed summary keys.
  - `InventoryConcurrencyLimiter` shows slot-based lock acquisition with config-driven maxima.
  - `InventorySyncService` shows deterministic selection hashing and selection-level locks.

## Legacy Usage Inventory (initial)

This is a starting list derived from a repo search; the Phase 2 Discovery Report must expand/confirm.

- Legacy bulk-run model: `app/Models/BulkOperationRun.php`
- Legacy bulk-run service: `app/Services/BulkOperationService.php`
- Legacy jobs using BulkOperationRun/Service:
  - `app/Jobs/BulkPolicySyncJob.php`
  - `app/Jobs/BulkPolicyDeleteJob.php`
  - `app/Jobs/BulkBackupSetDeleteJob.php`
  - `app/Jobs/BulkPolicyVersionPruneJob.php`
  - `app/Jobs/BulkPolicyVersionForceDeleteJob.php`
  - `app/Jobs/BulkRestoreRunDeleteJob.php`
  - `app/Jobs/BulkTenantSyncJob.php`
  - `app/Jobs/CapturePolicySnapshotJob.php`
  - `app/Jobs/GenerateDriftFindingsJob.php` (mixed usage)
- Legacy DB artifacts:
  - `database/migrations/2025_12_23_215901_create_bulk_operation_runs_table.php`
  - `database/migrations/2026_01_11_120001_add_idempotency_key_to_bulk_operation_runs_table.php`
  - `database/migrations/2025_12_24_005055_increase_bulk_operation_runs_status_length.php`
- Legacy test data:
  - `database/factories/BulkOperationRunFactory.php`
  - `database/seeders/BulkOperationsTestSeeder.php`

## Open Questions

None remaining for Phase 0 (spec clarifications resolved). Any additional unknowns found during discovery are to be added to the Phase 2 discovery report and/or tasks.
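
## Appendix: Selection identity sketch (illustrative)

The hybrid selection identity from Decision 2 can be illustrated with a small, language-neutral sketch. It is written in Python for brevity and is not the repository's implementation: the function name `selection_fingerprint`, its parameters, the payload layout, and the choice of SHA-256 are all assumptions made for illustration. Including the operation type and target scope in the fingerprint is also an assumption; the decision above only states that the fingerprint includes either an IDs-hash or a query/filter hash.

```python
import hashlib
import json
from typing import Any, Mapping, Optional, Sequence


def selection_fingerprint(
    operation_type: str,
    target_scope: str,
    selected_ids: Optional[Sequence[str]] = None,
    query_filter: Optional[Mapping[str, Any]] = None,
) -> str:
    """Return a deterministic SHA-256 hex digest for one selection (hypothetical helper)."""
    # Exactly one selection mode must be supplied (hybrid identity, not "store both").
    if (selected_ids is None) == (query_filter is None):
        raise ValueError("provide exactly one of selected_ids or query_filter")

    if selected_ids is not None:
        # Explicit selection: ordering and duplicates must not change the identity.
        selection = {"kind": "ids", "ids": sorted(set(selected_ids))}
    else:
        # "Select all" by filter: key order is canonicalized by sort_keys below.
        selection = {"kind": "query", "filter": dict(query_filter)}

    payload = {
        "operation": operation_type,  # assumed to be part of the identity
        "scope": target_scope,        # assumed to be part of the identity
        "selection": selection,
    }
    # Canonical JSON: sorted keys and fixed separators give a stable byte string.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Under these assumptions, two requests that select the same IDs in a different order, or pass the same filter with keys in a different order, produce the same fingerprint and can be deduplicated. Tagging the payload with `kind` keeps an explicit selection from colliding with a filter-based one.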