Summary
Hides the Rerun row action for archived (soft-deleted) RestoreRuns, which prevents faulty restarts from the archive; adds a regression test.
Changes
- Code: RestoreRunResource.php: visibility of the rerun action now checks ! $record->trashed(), plus a defensive abort check in the action handler.
- Tests: RestoreRunRerunTest.php: new test "rerun action is hidden for archived restore runs".
Why
Archived RestoreRuns were never meant to be restarted, yet the UI still offered the option. This caused confusing behavior and possible errors during enqueueing.
Verification / QA
- Unit/Feature: ./vendor/bin/sail artisan test tests/Feature/RestoreRunRerunTest.php
- Style/format: ./vendor/bin/pint --dirty
- Manual (UI): As a tenant admin, open Filament → Restore Runs. Enable the Archived filter (or select the Trashed filter). Confirm that the Rerun action is not visible for archived entries. On an active (non-archived) run, check that Rerun stays visible and creates a new RestoreRun as expected.
Important notes
- No DB migration required.
- This PR contains only the UI/Filament fix; the earlier operational fixes for queue/adapter reconciliation also remain on the branch (e.g., earlier commits from the debugging session).
- T055 (Schema squash) was deliberately deferred and is not part of this PR.
Merge checklist
- Tests pass locally (RestoreRunRerunTest is green)
- Pint runs without unfixed errors
- Branch pushed: 056-remove-legacy-bulkops (PR URL: https://git.cloudarix.de/ahmido/TenantAtlas/compare/dev...056-remove-legacy-bulkops)
Co-authored-by: Ahmed Darrazi <ahmeddarrazi@adsmac.local>
Reviewed-on: #65
Feature Specification: Remove Legacy BulkOperationRun & Canonicalize Operations (v1.0)
Feature Branch: 056-remove-legacy-bulkops
Created: 2026-01-18
Status: Draft
Input: User description: "Feature 056 — Remove Legacy BulkOperationRun & Canonicalize Operations (v1.0)"
Clarifications
Session 2026-01-18
- Q: What should be the default max concurrency per target scope (entra_tenant_id / directory_context_id) for bulk operations? → A: Config-driven, default=1
- Q: How should Selection Identity be determined for idempotency fingerprinting? → A: Hybrid (IDs-hash for explicit selection; query-hash for “select all via filter/query”)
User Scenarios & Testing (mandatory)
User Story 1 - Run-backed bulk actions are always observable (Priority: P1)
An admin performs a bulk action (e.g., apply/ignore/restore/prune across many records). The system records a single canonical run that can be monitored end-to-end, including partial failures, and provides consistent user feedback.
Why this priority: Bulk changes are operationally significant and must be traceable, support partial outcomes, and have a consistent mental model for admins.
Independent Test: Trigger a representative bulk action and verify that a run record exists, appears in the Monitoring list, has a detail view, and emits the correct feedback surfaces.
Acceptance Scenarios:
- Given an admin selects multiple items for a bulk action, When the action is confirmed and submitted, Then a canonical run record is created or reused and the UI confirms the enqueue/queued state via a toast.
- Given a bulk run is queued or running, When the admin opens Monitoring → Operations, Then the run appears in the list and can be opened via a canonical “View run” link.
- Given a bulk run completes with a mix of successes and failures, When the run reaches a terminal state, Then the initiator receives a terminal notification and the run detail shows a summary of outcomes.
User Story 2 - Monitoring is the single source of run history (Priority: P2)
An admin (or operator) relies on Monitoring → Operations to see the full history of operational work (including bulk). There are no separate legacy run surfaces; links from anywhere in the app point to the canonical run detail.
Why this priority: Multiple run systems lead to missed incidents, inconsistent retention, and developer confusion. One canonical surface improves operational clarity and reduces support overhead.
Independent Test: Navigate from a bulk action result to “View run” and confirm it lands in Monitoring’s run detail; confirm there is no legacy “bulk runs” navigation or pages.
Acceptance Scenarios:
- Given any UI element offers a “View run” link, When it is clicked, Then it opens the canonical Monitoring → Operations → Run Detail page for that run.
- Given the app navigation, When an admin searches for legacy bulk-run screens, Then no legacy bulk-run navigation or pages exist.
User Story 3 - Developers can’t accidentally reintroduce legacy patterns (Priority: P3)
A developer adds or modifies an admin action. They can clearly determine whether it is an audit-only action or a run-backed operation, and the repository enforces the single-run model by preventing legacy references and UX drift.
Why this priority: Preventing regression is essential for suite readiness and long-term maintainability.
Independent Test: Introduce a legacy reference or a bulk action without a run-backed record and confirm CI/automated checks fail.
Acceptance Scenarios:
- Given a change introduces any reference to the legacy bulk-run system, When tests/CI run, Then the pipeline fails with a clear message.
- Given a security-relevant DB-only action that is eligible for audit-only classification, When the action runs, Then an audit log entry is recorded and no run record is created.
Edge Cases
- Bulk selection is empty or resolves to zero items: the system does not start work and provides a clear non-destructive result.
- A bulk selection is very large: the system remains responsive and continues to show progress via run summary metrics.
- Target scope is required but missing: the system fails safely, records a terminal run with a stable reason code, and does not execute remote/bulk mutations.
- Remote calls experience throttling: the system applies bounded retries with jittered backoff and records failures without losing overall run visibility.
- Duplicate submissions (double click / retry / re-run): idempotency prevents duplicate processing and preserves a single canonical outcome per selection identity.
- Tenant isolation: no run, selection, summary, or notifications leak across tenants.
Requirements (mandatory)
Constitution alignment (required): This feature consolidates operational work onto a single canonical run model and a single monitoring surface. It must preserve the defined user feedback surfaces (queued toast, active widget, terminal notification), ensure tenant-scoped observability, and maintain stable, sanitized messages and reason codes.
Functional Requirements
- FR-001 Single run model: The system MUST use a single canonical run model (OperationRun) for all run-backed operations; the legacy bulk-run model MUST NOT exist after this feature.
- FR-002 Bulk actions are run-backed: Any bulk action (apply to N records, chunked work, mass ignore/restore/prune/delete) MUST create or reuse an OperationRun and MUST be visible in Monitoring → Operations.
- FR-003 Action taxonomy: Every admin action MUST be classified as exactly one of:
- Audit-only DB action: DB-only, no remote/external calls, no queued work, and bounded DB work; typically completes within ~2 seconds (guidance, not a hard rule). MUST write an audit log for security/ops-relevant state changes; MUST NOT create an OperationRun.
- Run-backed operation: queued/long-running/remote/bulk/scheduled or otherwise operationally significant; MUST create or reuse an OperationRun.
Decision rule: If classification is uncertain, default to Run-backed operation.
- FR-004 Canonical UX surfaces: For run-backed operations, the system MUST use only these feedback surfaces:
- Queued: toast-only
- Active: tenant-wide active widget
- Terminal: database-backed notification to the initiator only
- FR-005 Canonical routing: All “View run” links MUST route to Monitoring → Operations → Run Detail.
- FR-006 Legacy removal: The system MUST remove legacy bulk-run tables/models/services/routes/widgets/navigation and MUST prevent any new legacy writes.
- FR-007 Canonical summary metrics: The run’s summary metrics MUST use a single canonical set of keys and MUST be presented consistently in the run detail view.
- FR-008 Target scope recording: For operations targeting a directory/remote tenant, the run context MUST record the target scope (directory identifier) and Monitoring/Run Detail MUST display it in a human-friendly way when available.
- FR-009 Per-target throttling: Bulk orchestration MUST enforce concurrency limits per target scope to reduce throttling risk and provide predictable execution; the limit MUST be configuration-driven with a default of 1 per target scope.
- FR-010 Idempotency for bulk: Bulk operations MUST be idempotent using a deterministic fingerprint that includes operation type, target scope, and selection identity; retries MUST NOT duplicate work.
- FR-011 Discovery completeness: The implementation MUST include a repo-wide discovery sweep of legacy references and bulk-like actions; findings MUST be recorded in a discovery report with classification and migration/deferral decisions.
- FR-012 Regression guardrails: Automated checks MUST fail if legacy bulk-run references reappear or if bulk actions bypass the canonical run-backed model.
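To make FR-010 and the hybrid Selection Identity clarification more concrete, here is a minimal sketch of a deterministic fingerprint. It is written in Python purely for illustration; the function names, the SHA-256 hash, and the JSON canonicalization are assumptions and not part of this spec or of the project's code.

```python
import hashlib
import json
from typing import Any, Iterable, Mapping, Optional


def _digest(payload: Mapping[str, Any]) -> str:
    # Canonical JSON (sorted keys, no whitespace) so equal inputs hash equally.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def selection_identity(ids: Optional[Iterable[int]] = None,
                       query: Optional[Mapping[str, Any]] = None) -> str:
    """Hybrid identity: IDs-hash for explicit selection, query-hash for select-all."""
    if ids is not None:
        # Sort and de-duplicate so click order does not change the identity.
        return _digest({"kind": "ids", "ids": sorted(set(ids))})
    if query is not None:
        return _digest({"kind": "query", "query": dict(query)})
    raise ValueError("either ids or query is required")


def run_fingerprint(operation_type: str, target_scope: str, selection: str) -> str:
    """Fingerprint over operation type, target scope, and selection identity (FR-010)."""
    return _digest({
        "operation_type": operation_type,
        "target_scope": target_scope,
        "selection": selection,
    })
```

Under these assumptions, a double click or retry with the same selection yields the same fingerprint, so the existing run can be reused instead of creating a second one.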
Non-Functional Requirements (NFR)
NFR-01 Monitoring is DB-only at render time (Constitution Gate)
All Monitoring → Operations pages (index and run detail) MUST be DB-only at render time:
- No Graph/remote calls during initial render or reactive renders.
- No side-effectful work triggered by view rendering.
Verification:
- Add a regression test/guard that mocks the Graph client (or equivalent remote client) and asserts it is not called during Monitoring renders.
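The verification idea for NFR-01 can be sketched as follows. This is an illustrative Python example, not the project's test: OperationsPage is a hypothetical stand-in for a Monitoring page, and the mocked client represents the Graph client (or equivalent).

```python
from unittest.mock import MagicMock


class OperationsPage:
    """Hypothetical stand-in for a Monitoring page that renders from stored rows only."""

    def __init__(self, runs, graph_client):
        self.runs = runs
        self.graph_client = graph_client  # injected, but must stay unused during render

    def render(self) -> str:
        return "\n".join(f"{run['id']}: {run['status']}" for run in self.runs)


def assert_render_is_db_only(page_factory) -> str:
    """Render a page with a mocked remote client and fail if the client was called."""
    graph = MagicMock(name="graph_client")
    output = page_factory(graph).render()
    # MagicMock records every call made on it (including child attributes).
    assert graph.mock_calls == [], f"remote calls during render: {graph.mock_calls}"
    return output
```

The guard passes for pages that only read stored rows and raises an AssertionError as soon as any method on the mocked client is invoked during render.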
NFR-02 Failure reason codes and message sanitization
Run-backed operations MUST store failures as stable, machine-readable reason_code values plus a sanitized, user-facing message.
Minimal required reason_code set (baseline):
| reason_code | Meaning |
|---|---|
| graph_throttled | Remote service throttled (e.g., rate limited) |
| graph_timeout | Remote call timed out |
| permission_denied | Missing/insufficient permissions |
| validation_error | Input/selection validation failure |
| conflict_detected | Conflict detected (concurrency/version/resource state) |
| unknown_error | Fallback when no specific code applies |
Rules:
- reason_code is stable over time and safe to use in programmatic filters/alerts.
- Failure messages are sanitized and bounded in length; failures/notifications MUST NOT persist secrets/tokens/PII or raw payload dumps.
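As a rough illustration of NFR-02, the baseline codes can be modeled as an enumeration with a fallback, paired with a sanitizer. The sketch is in Python for illustration only; the 200-character bound and the redaction patterns are assumptions, since the spec only requires messages to be "bounded" and free of secrets.

```python
import re
from enum import Enum

MAX_MESSAGE_LENGTH = 200  # assumed bound; the spec does not fix a number


class ReasonCode(str, Enum):
    GRAPH_THROTTLED = "graph_throttled"
    GRAPH_TIMEOUT = "graph_timeout"
    PERMISSION_DENIED = "permission_denied"
    VALIDATION_ERROR = "validation_error"
    CONFLICT_DETECTED = "conflict_detected"
    UNKNOWN_ERROR = "unknown_error"


# Assumed redaction patterns; a real implementation would likely cover more cases.
_BEARER = re.compile(r"(?i)bearer\s+[A-Za-z0-9._~+/=-]+")
_SECRET_KV = re.compile(r"(?i)\b(password|secret|token|api[_-]?key)\s*[=:]\s*\S+")


def to_reason_code(value: str) -> ReasonCode:
    """Map an arbitrary string to a baseline code, falling back to unknown_error."""
    try:
        return ReasonCode(value)
    except ValueError:
        return ReasonCode.UNKNOWN_ERROR


def sanitize_message(raw: str) -> str:
    """Redact obvious secrets, collapse whitespace, and bound the length."""
    text = _BEARER.sub("Bearer [redacted]", raw)
    text = _SECRET_KV.sub(lambda m: f"{m.group(1)}=[redacted]", text)
    text = " ".join(text.split())
    if len(text) > MAX_MESSAGE_LENGTH:
        text = text[: MAX_MESSAGE_LENGTH - 3] + "..."
    return text
```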
NFR-03 Retry/backoff/jitter for remote throttling
When worker jobs perform remote calls, they MUST handle transient failures (including 429/503) via a shared policy:
- bounded retries
- exponential backoff with jitter
- no hand-rolled sleep() loops or ad-hoc random retry logic in feature code
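A shared policy of the kind NFR-03 describes could look roughly like the sketch below: bounded attempts and exponential backoff with "full jitter". It is illustrative Python; RemoteError, the base delay of 0.5 s, the 30 s cap, and the default of 4 attempts are assumptions, not values defined by this spec.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")
TRANSIENT_STATUSES = {429, 503}


class RemoteError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed remote call."""

    def __init__(self, status: int):
        super().__init__(f"remote call failed with status {status}")
        self.status = status


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0,
                  rng: Callable[[], float] = random.random) -> float:
    """Full jitter: a random fraction of min(cap, base * 2**attempt)."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_retry(fn: Callable[[], T], max_attempts: int = 4,
                    sleep: Callable[[float], None] = time.sleep,
                    rng: Callable[[], float] = random.random) -> T:
    """Call fn, retrying transient failures up to max_attempts times in total."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RemoteError as exc:
            is_last = attempt == max_attempts - 1
            if exc.status not in TRANSIENT_STATUSES or is_last:
                raise  # non-transient, or the retry budget is exhausted
            sleep(backoff_delay(attempt, rng=rng))
    raise RuntimeError("max_attempts must be at least 1")
```

Keeping the sleep call inside one shared helper is what lets feature code stay free of its own retry loops; injecting sleep and rng also makes the policy testable without real waiting.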
Implementation Shape (decision)
Decision: standard orchestrator + item workers
- 1 orchestrator job per run:
- resolves selection deterministically
- chunks work
- dispatches item worker jobs (idempotent per item)
- Worker jobs update operation_runs.summary_counts via canonical normalization.
- Finalization sets terminal status once.
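The orchestrator/worker shape above can be sketched in a few lines. This Python example runs everything in-process for illustration; in the real system each item would be a queued worker job. The summary keys (total, succeeded, failed) and all names are assumptions, and handling of empty selections is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Set

TERMINAL_STATUSES = {"succeeded", "partial", "failed"}


@dataclass
class Run:
    status: str = "queued"
    summary_counts: Dict[str, int] = field(
        default_factory=lambda: {"total": 0, "succeeded": 0, "failed": 0})
    processed: Set[int] = field(default_factory=set)


def chunked(items: List[int], size: int) -> Iterable[List[int]]:
    for start in range(0, len(items), size):
        yield items[start:start + size]


def process_item(run: Run, item_id: int, handler: Callable[[int], None]) -> None:
    """Item worker: idempotent per item, so a re-delivered job is a no-op."""
    if item_id in run.processed:
        return
    run.processed.add(item_id)
    try:
        handler(item_id)
        run.summary_counts["succeeded"] += 1
    except Exception:
        run.summary_counts["failed"] += 1


def finalize(run: Run) -> None:
    """Set the terminal status exactly once, derived from the summary counts."""
    if run.status in TERMINAL_STATUSES:
        return
    counts = run.summary_counts
    if counts["failed"] == 0:
        run.status = "succeeded"
    elif counts["succeeded"] == 0:
        run.status = "failed"
    else:
        run.status = "partial"


def orchestrate(run: Run, selection: Iterable[int],
                handler: Callable[[int], None], chunk_size: int = 2) -> None:
    item_ids = sorted(set(selection))  # deterministic resolution of the selection
    run.status = "running"
    run.summary_counts["total"] = len(item_ids)
    for chunk in chunked(item_ids, chunk_size):
        for item_id in chunk:
            process_item(run, item_id, handler)
    finalize(run)
```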
Target Scope (canonical keys)
Canonical context keys:
- entra_tenant_id (Azure AD tenant GUID)
- optional entra_tenant_name (human-friendly; if available)
- optional directory_context_id (internal directory context identifier, if/when introduced)
For operations targeting a directory/remote tenant, the run context MUST record target scope using the canonical keys above, and Monitoring/Run Detail MUST display the target scope (human-friendly name if available).
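One way to read the canonical keys is sketched below in illustrative Python. The helper names are hypothetical, and treating entra_tenant_id as the only required key is an interpretation of the list above. Raising an exception here is a simplification; the Edge Cases section expects a terminal run with a stable reason code when the scope is missing.

```python
from typing import Dict, Mapping, Optional


def build_target_scope(entra_tenant_id: Optional[str],
                       entra_tenant_name: Optional[str] = None,
                       directory_context_id: Optional[str] = None) -> Dict[str, str]:
    """Build the run-context fragment using only the canonical keys."""
    if not entra_tenant_id:
        raise ValueError("entra_tenant_id is required for directory-targeted operations")
    scope = {"entra_tenant_id": entra_tenant_id}
    if entra_tenant_name:
        scope["entra_tenant_name"] = entra_tenant_name
    if directory_context_id:
        scope["directory_context_id"] = directory_context_id
    return scope


def display_target_scope(context: Mapping[str, str]) -> str:
    """Prefer the human-friendly name and fall back to the tenant GUID."""
    return (context.get("entra_tenant_name")
            or context.get("entra_tenant_id")
            or "unknown target")
```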
Assumptions
- Existing run status semantics remain unchanged (queued/running/succeeded/partial/failed).
- Existing monitoring experience is not redesigned; it is aligned so that all operational work is represented consistently.
Dependencies
- Prior consolidation work establishing OperationRun as the canonical run model and Monitoring → Operations as the canonical surface.
- Existing audit logging conventions for security/ops-relevant DB-only actions.
Legacy History Decision (recorded)
- Default path: legacy bulk-run history is not migrated into the canonical run model. The legacy tables are removed after cutover, relying on database backups/exports if historical investigation is needed.
Key Entities (include if feature involves data)
- OperationRun: A tenant-scoped record of operational work with status, timestamps, sanitized user-facing message/reason code, summary metrics, and context.
- Operation Type: A stable identifier describing the kind of operation (used for categorization, labeling, and governance).
- Target Scope: The directory / remote tenant scope that the operation targets (when applicable).
- Selection Identity: The deterministic definition of “what the bulk action applies to” used for idempotency and traceability.
- Audit Log Entry: A record of security/ops-relevant state changes for audit-only DB actions.
Success Criteria (mandatory)
Measurable Outcomes
- SC-001: 100% of bulk actions in the admin UI create or reuse a canonical run record and appear in Monitoring → Operations.
- SC-002: Repository contains 0 references to the legacy bulk-run system after completion, enforced by automated checks.
- SC-003: For directory-targeted operations, 100% of run records display a target scope in Monitoring/Run Detail.
- SC-004: For bulk operations, duplicate submissions do not increase processed item count beyond one idempotent execution per selection identity.
- SC-005: Admins can locate a completed bulk run in Monitoring within 30 seconds using standard navigation and filters, without relying on legacy pages.
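The automated check behind SC-002 (and FR-012) could be as simple as a repository scan that fails when legacy identifiers reappear. The Python sketch below is illustrative: BulkOperationRun comes from this feature's title, while the bulk_operation_runs table name and the restriction to .php files are assumptions.

```python
import re
from pathlib import Path
from typing import List, Sequence, Tuple

# "bulk_operation_runs" is an assumed table name, not confirmed by this spec.
LEGACY_PATTERN = re.compile(r"BulkOperationRun|bulk_operation_runs")


def find_legacy_references(root: Path,
                           suffixes: Sequence[str] = (".php",)) -> List[Tuple[str, int, str]]:
    """Return (relative path, line number, line) for every legacy reference under root."""
    hits: List[Tuple[str, int, str]] = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if LEGACY_PATTERN.search(line):
                hits.append((path.relative_to(root).as_posix(), lineno, line.strip()))
    return hits
```

A CI step would call this against the repository root and fail with the list of hits as its message whenever the result is non-empty.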