TenantAtlas/specs/056-remove-legacy-bulkops/spec.md
ahmido a97beefda3 056-remove-legacy-bulkops (#65)
Summary

Hides the Rerun row action for archived (soft-deleted) RestoreRuns, preventing faulty restarts from the archive; adds a regression test.
Changes

Code: RestoreRunResource.php: visibility of the rerun action now checks ! $record->trashed(), plus a defensive abort check in the action handler.
Tests: RestoreRunRerunTest.php: new test "rerun action is hidden for archived restore runs".
Why

Archived RestoreRuns were not allowed to be restarted, but the UI still showed the option. This led to confusing behavior and possible errors during enqueueing.
Verification / QA

Unit/Feature:
./vendor/bin/sail artisan test tests/Feature/RestoreRunRerunTest.php
Style/format:
./vendor/bin/pint --dirty
Manual (UI):
As a tenant admin, open Filament → Restore Runs.
Enable the Archived filter (or select the Trashed filter).
Confirm that the Rerun action is not visible for archived entries.
On an active (non-archived) run, check that Rerun remains visible and creates a new RestoreRun as expected.
Important notes

No DB migration required.
This PR contains only the UI/Filament fix; the operational fixes made earlier for queue/adapter reconciliation also remain on the branch (e.g., earlier commits from the debugging session).
T055 (schema squash) was deliberately deferred and is not part of this PR.
Merge checklist

 Tests run locally (RestoreRunRerunTest is green)
 Pint runs without unpatched errors
 Branch pushed: 056-remove-legacy-bulkops (PR URL: https://git.cloudarix.de/ahmido/TenantAtlas/compare/dev...056-remove-legacy-bulkops)

Co-authored-by: Ahmed Darrazi <ahmeddarrazi@adsmac.local>
Reviewed-on: #65
2026-01-19 23:27:52 +00:00

# Feature Specification: Remove Legacy BulkOperationRun & Canonicalize Operations (v1.0)
**Feature Branch**: `056-remove-legacy-bulkops`
**Created**: 2026-01-18
**Status**: Draft
**Input**: User description: "Feature 056 — Remove Legacy BulkOperationRun & Canonicalize Operations (v1.0)"
## Clarifications
### Session 2026-01-18
- Q: What should be the default max concurrency per target scope (entra_tenant_id / directory_context_id) for bulk operations? → A: Config-driven, default=1
- Q: How should Selection Identity be determined for idempotency fingerprinting? → A: Hybrid (IDs-hash for explicit selection; query-hash for “select all via filter/query”)
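
The hybrid selection identity from the second answer can be sketched as follows. This is a minimal illustration in Python rather than in the project's own stack; the function names `selection_identity` and `run_fingerprint`, the JSON canonicalization, and the choice of SHA-256 are assumptions and are not prescribed by this spec.

```python
import hashlib
import json


def selection_identity(ids=None, query=None):
    """Return a stable hash describing what the bulk action applies to."""
    if ids is not None:
        # Explicit selection: order and duplicates must not change the hash.
        payload = {"kind": "ids", "ids": sorted({str(i) for i in ids})}
    elif query is not None:
        # "Select all via filter/query": hash the filter definition itself.
        payload = {"kind": "query", "query": query}
    else:
        raise ValueError("either ids or query is required")
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def run_fingerprint(operation_type, entra_tenant_id, selection_hash):
    """Combine operation type, target scope, and selection identity."""
    canonical = json.dumps(
        {"op": operation_type, "scope": entra_tenant_id, "selection": selection_hash},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Sorting and de-duplicating the IDs means that two submissions of the same explicit selection in a different click order produce the same fingerprint, which is what lets a retry reuse the existing run.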
## User Scenarios & Testing *(mandatory)*
### User Story 1 - Run-backed bulk actions are always observable (Priority: P1)
An admin performs a bulk action (e.g., apply/ignore/restore/prune across many records). The system records a single canonical run that can be monitored end-to-end, including partial failures, and provides consistent user feedback.
**Why this priority**: Bulk changes are operationally significant and must be traceable, support partial outcomes, and have a consistent mental model for admins.
**Independent Test**: Trigger a representative bulk action and verify that a run record exists, appears in the Monitoring list, has a detail view, and emits the correct feedback surfaces.
**Acceptance Scenarios**:
1. **Given** an admin selects multiple items for a bulk action, **When** the action is confirmed and submitted, **Then** a canonical run record is created or reused and the UI confirms the enqueue/queued state via a toast.
2. **Given** a bulk run is queued or running, **When** the admin opens Monitoring → Operations, **Then** the run appears in the list and can be opened via a canonical “View run” link.
3. **Given** a bulk run completes with a mix of successes and failures, **When** the run reaches a terminal state, **Then** the initiator receives a terminal notification and the run detail shows a summary of outcomes.
---
### User Story 2 - Monitoring is the single source of run history (Priority: P2)
An admin (or operator) relies on Monitoring → Operations to see the full history of operational work (including bulk). There are no separate legacy run surfaces; links from anywhere in the app point to the canonical run detail.
**Why this priority**: Multiple run systems lead to missed incidents, inconsistent retention, and developer confusion. One canonical surface improves operational clarity and reduces support overhead.
**Independent Test**: Navigate from a bulk action result to “View run” and confirm it lands in Monitoring's run detail; confirm there is no legacy “bulk runs” navigation or pages.
**Acceptance Scenarios**:
1. **Given** any UI element offers a “View run” link, **When** it is clicked, **Then** it opens the canonical Monitoring → Operations → Run Detail page for that run.
2. **Given** the app navigation, **When** an admin searches for legacy bulk-run screens, **Then** no legacy bulk-run navigation or pages exist.
---
### User Story 3 - Developers can't accidentally reintroduce legacy patterns (Priority: P3)
A developer adds or modifies an admin action. They can clearly determine whether it is an audit-only action or a run-backed operation, and the repository enforces the single-run model by preventing legacy references and UX drift.
**Why this priority**: Preventing regression is essential for suite readiness and long-term maintainability.
**Independent Test**: Introduce a legacy reference or a bulk action without a run-backed record and confirm CI/automated checks fail.
**Acceptance Scenarios**:
1. **Given** a change introduces any reference to the legacy bulk-run system, **When** tests/CI run, **Then** the pipeline fails with a clear message.
2. **Given** a security-relevant DB-only action that is eligible for audit-only classification, **When** the action runs, **Then** an audit log entry is recorded and no run record is created.
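
A guard for the first scenario could be as simple as a repository scan that fails when a legacy identifier reappears. The sketch below is a Python illustration under assumptions: the pattern list contains only `BulkOperationRun` (the one legacy name this spec mentions), and the helper name `find_legacy_references` and the `.php` suffix filter are hypothetical.

```python
import re
from pathlib import Path

# Assumed pattern list; extend it with whatever the discovery sweep finds.
LEGACY_PATTERNS = [re.compile(r"\bBulkOperationRun\b")]


def find_legacy_references(root, suffixes=(".php",), patterns=LEGACY_PATTERNS):
    """Return (path, line_number, line) for every legacy reference under root."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            if any(pattern.search(line) for pattern in patterns):
                hits.append((str(path), number, line.strip()))
    return hits
```

A CI test would call this on the application source directory and fail with the list of hits as its message, which gives the "clear message" the scenario asks for.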
### Edge Cases
- Bulk selection is empty or resolves to zero items: the system does not start work and provides a clear non-destructive result.
- A bulk selection is very large: the system remains responsive and continues to show progress via run summary metrics.
- Target scope is required but missing: the system fails safely, records a terminal run with a stable reason code, and does not execute remote/bulk mutations.
- Remote calls experience throttling: the system applies bounded retries with jittered backoff and records failures without losing overall run visibility.
- Duplicate submissions (double click / retry / re-run): idempotency prevents duplicate processing and preserves a single canonical outcome per selection identity.
- Tenant isolation: no run, selection, summary, or notifications leak across tenants.
## Requirements *(mandatory)*
**Constitution alignment (required):** This feature consolidates operational work onto a single canonical run model and a single monitoring surface. It must preserve the defined user feedback surfaces (queued toast, active widget, terminal notification), ensure tenant-scoped observability, and maintain stable, sanitized messages and reason codes.
### Functional Requirements
- **FR-001 Single run model**: The system MUST use a single canonical run model (`OperationRun`) for all run-backed operations; the legacy bulk-run model MUST NOT exist after this feature.
- **FR-002 Bulk actions are run-backed**: Any bulk action (apply to N records, chunked work, mass ignore/restore/prune/delete) MUST create or reuse an `OperationRun` and MUST be visible in Monitoring → Operations.
- **FR-003 Action taxonomy**: Every admin action MUST be classified as exactly one of:
  - **Audit-only DB action**: DB-only, no remote/external calls, no queued work, and bounded DB work; typically completes within ~2 seconds (guidance, not a hard rule). MUST write an audit log for security/ops-relevant state changes; MUST NOT create an `OperationRun`.
  - **Run-backed operation**: queued/long-running/remote/bulk/scheduled or otherwise operationally significant; MUST create or reuse an `OperationRun`.

  **Decision rule**: If classification is uncertain, default to **Run-backed operation**.
- **FR-004 Canonical UX surfaces**: For run-backed operations, the system MUST use only these feedback surfaces:
  - **Queued**: toast-only
  - **Active**: tenant-wide active widget
  - **Terminal**: database-backed notification to the initiator only
- **FR-005 Canonical routing**: All “View run” links MUST route to Monitoring → Operations → Run Detail.
- **FR-006 Legacy removal**: The system MUST remove legacy bulk-run tables/models/services/routes/widgets/navigation and MUST prevent any new legacy writes.
- **FR-007 Canonical summary metrics**: The run's summary metrics MUST use a single canonical set of keys and MUST be presented consistently in the run detail view.
- **FR-008 Target scope recording**: For operations targeting a directory/remote tenant, the run context MUST record the target scope (directory identifier) and Monitoring/Run Detail MUST display it in a human-friendly way when available.
- **FR-009 Per-target throttling**: Bulk orchestration MUST enforce concurrency limits per target scope to reduce throttling risk and provide predictable execution; the limit MUST be configuration-driven with a default of 1 per target scope.
- **FR-010 Idempotency for bulk**: Bulk operations MUST be idempotent using a deterministic fingerprint that includes operation type, target scope, and selection identity; retries MUST NOT duplicate work.
- **FR-011 Discovery completeness**: The implementation MUST include a repo-wide discovery sweep of legacy references and bulk-like actions; findings MUST be recorded in a discovery report with classification and migration/deferral decisions.
- **FR-012 Regression guardrails**: Automated checks MUST fail if legacy bulk-run references reappear or if bulk actions bypass the canonical run-backed model.
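
The FR-003 taxonomy and its decision rule can be expressed as a small classifier. The following Python sketch is illustrative only: `ActionTraits`, its field names, and `classify` are hypothetical, and an unknown trait (`None`) is treated as "uncertain", so the result defaults to run-backed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ActionClass(Enum):
    AUDIT_ONLY = "audit_only"
    RUN_BACKED = "run_backed"


@dataclass(frozen=True)
class ActionTraits:
    # None means "not yet determined" and triggers the default rule.
    makes_remote_calls: Optional[bool] = None
    queues_work: Optional[bool] = None
    is_bulk_or_scheduled: Optional[bool] = None
    db_work_is_bounded: Optional[bool] = None


def classify(traits: ActionTraits) -> ActionClass:
    """Audit-only requires every criterion to be known and satisfied."""
    audit_only = (
        traits.makes_remote_calls is False
        and traits.queues_work is False
        and traits.is_bulk_or_scheduled is False
        and traits.db_work_is_bounded is True
    )
    return ActionClass.AUDIT_ONLY if audit_only else ActionClass.RUN_BACKED
```

Writing the audit-only branch as the narrow case keeps the default safe: any missing or ambiguous information yields a run-backed operation, never the reverse.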
### Non-Functional Requirements (NFR)
#### NFR-01 Monitoring is DB-only at render time (Constitution Gate)
All Monitoring → Operations pages (index and run detail) MUST be DB-only at render time:
- No Graph/remote calls during initial render or reactive renders.
- No side-effectful work triggered by view rendering.
**Verification**:
- Add a regression test/guard that mocks the Graph client (or equivalent remote client) and asserts it is not called during Monitoring renders.
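
The shape of such a guard is sketched below in Python using `unittest.mock`. The render function, the repository method `list_recent`, and the helper `assert_render_is_db_only` are hypothetical stand-ins; the real test would wrap the actual Monitoring pages and the actual Graph client binding.

```python
from unittest.mock import Mock


def render_operations_index(run_repository, graph_client):
    """Hypothetical render function: reads runs from the database only."""
    runs = run_repository.list_recent()
    return [f"{run['type']}: {run['status']}" for run in runs]


def assert_render_is_db_only(render, run_repository):
    """Fail if the render calls the remote client in any way."""
    graph_client = Mock(name="graph_client")
    output = render(run_repository, graph_client)
    # mock_calls records every method call made on the mock, including nested ones.
    assert graph_client.mock_calls == [], (
        f"remote calls during render: {graph_client.mock_calls}"
    )
    return output
```

Asserting on the full call list rather than on a single method means the guard still trips if a future change reaches the remote client through a method the test author did not anticipate.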
#### NFR-02 Failure reason codes and message sanitization
Run-backed operations MUST store failures as stable, machine-readable `reason_code` values plus a sanitized, user-facing message.
**Minimal required reason_code set (baseline)**:
| reason_code | Meaning |
|------------|---------|
| graph_throttled | Remote service throttled (e.g., rate limited) |
| graph_timeout | Remote call timed out |
| permission_denied | Missing/insufficient permissions |
| validation_error | Input/selection validation failure |
| conflict_detected | Conflict detected (concurrency/version/resource state) |
| unknown_error | Fallback when no specific code applies |
**Rules**:
- `reason_code` is stable over time and safe to use in programmatic filters/alerts.
- Failure messages are sanitized and bounded in length; failures/notifications MUST NOT persist secrets/tokens/PII or raw payload dumps.
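
A minimal Python sketch of these rules follows. The enum values come from the table above; everything else is an assumption made for illustration, including the 200-character bound, the two redaction patterns, and the helper names `normalize_reason_code` and `sanitize_message`.

```python
import re
from enum import Enum


class ReasonCode(str, Enum):
    GRAPH_THROTTLED = "graph_throttled"
    GRAPH_TIMEOUT = "graph_timeout"
    PERMISSION_DENIED = "permission_denied"
    VALIDATION_ERROR = "validation_error"
    CONFLICT_DETECTED = "conflict_detected"
    UNKNOWN_ERROR = "unknown_error"


MAX_MESSAGE_LENGTH = 200  # assumed bound; the spec does not fix a number

# Assumed redaction patterns: bearer tokens and key=value style secrets.
_REDACTIONS = [
    re.compile(r"(?i)bearer\s+[A-Za-z0-9._~+/=-]+"),
    re.compile(r"(?i)\b(token|secret|password|client_secret)=\S+"),
]


def normalize_reason_code(value):
    """Map any unrecognized value to the unknown_error fallback."""
    try:
        return ReasonCode(value)
    except ValueError:
        return ReasonCode.UNKNOWN_ERROR


def sanitize_message(message):
    """Redact secret-looking fragments, collapse whitespace, bound the length."""
    for pattern in _REDACTIONS:
        message = pattern.sub("[redacted]", message)
    message = " ".join(message.split())
    if len(message) > MAX_MESSAGE_LENGTH:
        message = message[: MAX_MESSAGE_LENGTH - 3] + "..."
    return message
```

Pattern-based redaction is only a backstop; the stronger guarantee comes from never passing raw payloads or exception dumps into the message in the first place.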
#### NFR-03 Retry/backoff/jitter for remote throttling
When worker jobs perform remote calls, they MUST handle transient failures (including 429/503) via a shared policy:
- bounded retries
- exponential backoff with jitter
- no hand-rolled `sleep()` loops or ad-hoc random retry logic in feature code
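
One possible shape for the shared policy is sketched below in Python. It uses the "full jitter" variant of exponential backoff, which is a choice made here and not something this spec mandates; `TransientRemoteError`, `backoff_delay`, `call_with_retry`, and all default values are hypothetical.

```python
import random
import time


class TransientRemoteError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed remote call."""

    def __init__(self, status):
        super().__init__(f"transient remote failure (HTTP {status})")
        self.status = status


RETRYABLE_STATUSES = {429, 503}


def backoff_delay(attempt, base=1.0, cap=30.0, rng=random.random):
    """Full-jitter delay for a 0-based attempt: rng() * min(cap, base * 2**attempt)."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_retry(operation, max_attempts=4, sleep=time.sleep, rng=random.random):
    """Run operation, retrying retryable failures up to max_attempts total tries."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientRemoteError as error:
            last_try = attempt == max_attempts - 1
            if error.status not in RETRYABLE_STATUSES or last_try:
                raise
            sleep(backoff_delay(attempt, rng=rng))
```

Injecting `sleep` and `rng` keeps the policy deterministic under test, and keeping it in one place is what allows feature code to avoid its own `sleep()` loops.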
### Implementation Shape (decision)
**Decision: standard orchestrator + item workers**
- 1 orchestrator job per run:
  - resolves selection deterministically
  - chunks work
  - dispatches item worker jobs (idempotent per item)
- Worker jobs update `operation_runs.summary_counts` via canonical normalization.
- Finalization sets terminal status once.
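
The decision above can be illustrated with an in-memory Python sketch. It runs the "worker jobs" inline instead of on a queue, and the `Run` class, the summary keys `total`/`succeeded`/`failed`, the chunk size, and the rejection of empty selections via `ValueError` are all assumptions made for the example; only the status names are taken from this spec.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    status: str = "queued"
    summary_counts: dict = field(
        default_factory=lambda: {"total": 0, "succeeded": 0, "failed": 0}
    )
    processed_items: set = field(default_factory=set)


def chunked(items, size):
    for start in range(0, len(items), size):
        yield items[start:start + size]


def process_item(run, item_id, handler):
    """Item worker: idempotent per item, so a redelivered job is a no-op."""
    if item_id in run.processed_items:
        return
    try:
        handler(item_id)
        run.summary_counts["succeeded"] += 1
    except Exception:
        run.summary_counts["failed"] += 1
    run.processed_items.add(item_id)


def finalize(run):
    """Set the terminal status exactly once."""
    if run.status in ("succeeded", "partial", "failed"):
        return
    counts = run.summary_counts
    if counts["failed"] == 0:
        run.status = "succeeded"
    elif counts["succeeded"] == 0:
        run.status = "failed"
    else:
        run.status = "partial"


def orchestrate(run, selected_ids, handler, chunk_size=2):
    """Orchestrator: resolve deterministically, chunk, dispatch, finalize."""
    items = sorted(set(selected_ids))
    if not items:
        raise ValueError("empty selection: no work started")
    run.status = "running"
    run.summary_counts["total"] = len(items)
    for chunk in chunked(items, chunk_size):
        for item_id in chunk:  # stands in for dispatching a queued worker job
            process_item(run, item_id, handler)
    finalize(run)
    return run
```

In a real queue the workers run concurrently, so the counter updates and the "finalize once" check would need to be atomic at the database level; the sketch only shows the division of responsibilities.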
### Target Scope (canonical keys)
**Canonical context keys**:
- `entra_tenant_id` (Azure AD tenant GUID)
- optional `entra_tenant_name` (human-friendly; if available)
- optional `directory_context_id` (internal directory context identifier, if/when introduced)
For operations targeting a directory/remote tenant, the run context MUST record target scope using the canonical keys above, and Monitoring/Run Detail MUST display the target scope (human-friendly name if available).
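
As a small illustration of the canonical keys, the Python sketch below builds the target-scope fragment of a run context and derives a display label from it. The key names are the ones listed above; the helper names, the `ValueError` on a missing tenant ID, and the label format are assumptions.

```python
def build_target_scope(entra_tenant_id, entra_tenant_name=None, directory_context_id=None):
    """Return the run-context fragment, omitting optional keys that are unset."""
    if not entra_tenant_id:
        raise ValueError("entra_tenant_id is required for directory-targeted operations")
    scope = {"entra_tenant_id": entra_tenant_id}
    if entra_tenant_name:
        scope["entra_tenant_name"] = entra_tenant_name
    if directory_context_id:
        scope["directory_context_id"] = directory_context_id
    return scope


def target_scope_label(context):
    """Prefer the human-friendly name and fall back to the GUID."""
    name = context.get("entra_tenant_name")
    tenant_id = context.get("entra_tenant_id")
    if name and tenant_id:
        return f"{name} ({tenant_id})"
    return name or tenant_id or "Unknown target"
```

Showing the GUID next to the name keeps the label unambiguous when two directories share a similar display name.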
#### Assumptions
- Existing run status semantics remain unchanged (queued/running/succeeded/partial/failed).
- Existing monitoring experience is not redesigned; it is aligned so that all operational work is represented consistently.
#### Dependencies
- Prior consolidation work establishing `OperationRun` as the canonical run model and Monitoring → Operations as the canonical surface.
- Existing audit logging conventions for security/ops-relevant DB-only actions.
#### Legacy History Decision (recorded)
- Default path: legacy bulk-run history is not migrated into the canonical run model. The legacy tables are removed after cutover, relying on database backups/exports if historical investigation is needed.
### Key Entities *(include if feature involves data)*
- **OperationRun**: A tenant-scoped record of operational work with status, timestamps, sanitized user-facing message/reason code, summary metrics, and context.
- **Operation Type**: A stable identifier describing the kind of operation (used for categorization, labeling, and governance).
- **Target Scope**: The directory / remote tenant scope that the operation targets (when applicable).
- **Selection Identity**: The deterministic definition of “what the bulk action applies to” used for idempotency and traceability.
- **Audit Log Entry**: A record of security/ops-relevant state changes for audit-only DB actions.
## Success Criteria *(mandatory)*
### Measurable Outcomes
- **SC-001**: 100% of bulk actions in the admin UI create or reuse a canonical run record and appear in Monitoring → Operations.
- **SC-002**: Repository contains 0 references to the legacy bulk-run system after completion, enforced by automated checks.
- **SC-003**: For directory-targeted operations, 100% of run records display a target scope in Monitoring/Run Detail.
- **SC-004**: For bulk operations, duplicate submissions do not increase processed item count beyond one idempotent execution per selection identity.
- **SC-005**: Admins can locate a completed bulk run in Monitoring within 30 seconds using standard navigation and filters, without relying on legacy pages.