TenantAtlas/specs/005-bulk-operations/spec.md
Ahmed Darrazi 1fa15b4db2 spec: Feature 004 & 005 with production-tested Graph API strategies
Feature 004 (Assignments & Scope Tags):
- Use fallback strategy for assignments read (direct + $expand)
- Use POST /directoryObjects/getByIds for stable group resolution
- POST /assign only (not PATCH) for assignments write
- Handle 204 No Content responses

Feature 005 (Bulk Operations):
- Policies: Local delete only (ignored_at flag, no Graph DELETE)
- Policy Versions: Eligibility checks + retention policy
- BulkOperationRun model for progress tracking
- Livewire polling for UI updates (not automatic)
- Chunked processing + circuit breaker (abort >50% fail)
- array $ids in Job constructor (not Collection)
2025-12-22 01:07:03 +01:00

# Feature 005: Bulk Operations for Resource Management
## Overview
Enable efficient bulk operations across TenantPilot's main resources (Policies, Policy Versions, Backup Sets, Restore Runs) to improve admin productivity and reduce repetitive actions.
## Problem Statement
Currently, admins must perform actions one-by-one on individual resources:
- Deleting 20 old Policy Versions = 20 clicks + confirmations
- Exporting 50 Policies to a Backup = 50 manual selections
- Cleaning up 30 failed Restore Runs = 30 delete actions
**This is tedious, error-prone, and time-consuming.**
**With bulk operations:**
- Select multiple items → single action → confirm → done
- Clear audit trail (one bulk action = one audit event + per-item outcomes)
- Progress notifications for long-running operations
- Consistent UX across all resources
## Goals
- **Primary**: Implement bulk delete, bulk export, and bulk restore (of soft-deleted items) for main resources
- **Secondary**: Safety gates (confirmation dialogs, type-to-confirm for destructive ops)
- **Tertiary**: Queue-based processing for large batches with progress tracking
- **Non-Goal**: Bulk edit/update (too complex, deferred to future feature)
---
## User Stories
### User Story 1 - Bulk Delete Policies (Priority: P1)
**As an admin**, I want to soft-delete multiple policies **locally in TenantPilot** at once, so I can clean up outdated or test policies efficiently.
**Important**: This action marks policies as deleted locally; it does NOT delete them in Intune. Policies are flagged with an `ignored_at` timestamp to prevent re-sync.
**Acceptance Criteria:**
1. **Given** I select 15 policies in the Policies table,
**When** I click "Delete (Local)" in the bulk actions menu,
**Then** a confirmation dialog appears: "Delete 15 policies locally? They will be hidden from listings and ignored in sync."
2. **Given** I confirm the bulk delete,
**When** the operation completes,
**Then**:
- All 15 policies are flagged (`ignored_at` timestamp set, optionally `deleted_at`)
- A success notification shows: "Deleted 15 policies locally"
- An audit log entry `policies.bulk_deleted_local` is created with policy IDs
- Policies remain in Intune (unchanged)
3. **Given** I bulk-delete 50 policies,
**When** the operation runs,
**Then** it processes asynchronously via queue (job) with progress notification
4. **Given** I lack `policies.delete` permission,
**When** I try to bulk-delete,
**Then** the bulk action is disabled/hidden (same permission model as single delete)
---
### User Story 2 - Bulk Export Policies to Backup (Priority: P1)
**As an admin**, I want to export multiple policies to a new Backup Set in one action, so I can quickly snapshot a subset of policies.
**Acceptance Criteria:**
1. **Given** I select 25 policies,
**When** I click "Export to Backup",
**Then** a dialog prompts: "Backup Set Name" + "Include Assignments?" checkbox
2. **Given** I confirm the export,
**When** the backup job runs,
**Then**:
- A new Backup Set is created
- 25 Backup Items are captured (one per policy)
- Progress notification: "Backing up... 10/25"
- Final notification: "Backup Set 'Production Snapshot' created (25 items)"
3. **Given** 3 of 25 policies fail to backup (Graph error),
**When** the job completes,
**Then**:
- 22 items succeed, 3 fail
- Notification: "Backup completed: 22 succeeded, 3 failed"
- Audit log records per-item outcomes
---
### User Story 3 - Bulk Delete Policy Versions (Priority: P2)
**As an admin**, I want to bulk-delete old policy versions to free up database space, respecting retention policies.
**Important**: Policy Versions are immutable snapshots. Deletion only allowed if version is NOT referenced (no active Backup Items, Restore Runs, or audit trails) and meets retention threshold (e.g., >90 days old).
**Acceptance Criteria:**
1. **Given** I select 30 policy versions older than 90 days,
**When** I click "Delete",
**Then** confirmation dialog: "Delete 30 policy versions? This is permanent and cannot be undone."
2. **Given** I confirm,
**When** the operation completes,
**Then**:
- System checks each version: is_current=false + not referenced + age >90 days
- Eligible versions are hard-deleted
- Ineligible versions are skipped with reason (e.g., "Referenced by Backup Set ID 5")
- Success notification: "Deleted 28 policy versions (2 skipped)"
- Audit log: `policy_versions.bulk_pruned` with version IDs + skip reasons
3. **Given** I lack `policy_versions.prune` permission,
**When** I try to bulk-delete,
**Then** the bulk action is hidden
---
### User Story 4 - Bulk Delete Restore Runs (Priority: P2)
**As an admin**, I want to bulk-delete completed or failed Restore Runs to declutter the history.
**Acceptance Criteria:**
1. **Given** I select 20 restore runs (status: completed/failed),
**When** I click "Delete",
**Then** confirmation: "Delete 20 restore runs? Historical data will be removed."
2. **Given** I confirm,
**When** the operation completes,
**Then**:
- 20 restore runs are soft-deleted
- Notification: "Deleted 20 restore runs"
- Audit log: `restore_runs.bulk_deleted`
3. **Given** I select restore runs with mixed statuses (running + completed),
**When** I attempt bulk delete,
**Then** only completed/failed runs are deleted (running ones skipped with warning)
---
### User Story 5 - Bulk Delete with Type-to-Confirm (Priority: P1)
**As an admin**, I want extra confirmation for large destructive operations, so I don't accidentally delete important data.
**Acceptance Criteria:**
1. **Given** I bulk-delete ≥20 items,
**When** the confirmation dialog appears,
**Then** I must type "DELETE" in a text field to enable the confirm button
2. **Given** I type an incorrect word (e.g., "delete" lowercase),
**When** I try to confirm,
**Then** the button remains disabled with error: "Type DELETE to confirm"
3. **Given** I type "DELETE" correctly,
**When** I click confirm,
**Then** the bulk operation proceeds
---
### User Story 6 - Bulk Operation Progress Tracking (Priority: P2)
**As an admin**, I want to see real-time progress for bulk operations, so I know the system is working.
**Acceptance Criteria:**
1. **Given** I bulk-delete 100 policies,
**When** the job starts,
**Then** a Filament notification shows: "Deleting policies... 0/100"
2. **Given** the job processes items,
**When** progress updates,
**Then** the notification updates every 5 seconds: "Deleting... 45/100"
3. **Given** the job completes,
**When** all items are processed,
**Then**:
- Final notification: "Deleted 98 policies (2 failed)"
- Clickable link: "View details" → opens audit log entry
---
## Functional Requirements
### General Bulk Operations
**FR-005.1**: System MUST provide bulk action checkboxes on table rows for:
- Policies
- Policy Versions
- Backup Sets
- Restore Runs
**FR-005.2**: Bulk actions menu MUST appear when ≥1 item is selected, showing:
- Action name (e.g., "Delete")
- Count badge (e.g., "3 selected")
- Disabled state if user lacks permission
**FR-005.3**: System MUST enforce same permissions for bulk actions as single actions (e.g., `policies.delete` for bulk delete).
**FR-005.4**: Bulk operations processing ≥20 items MUST run via Laravel Queue (async job) using Bus::batch() or chunked processing (batches of 10-20 items).
**FR-005.4a**: System MUST create a `bulk_operation_runs` table to track progress:
```php
Schema::create('bulk_operation_runs', function (Blueprint $table) {
    $table->id();
    $table->foreignId('tenant_id')->constrained();
    $table->foreignId('user_id')->constrained();
    $table->string('resource'); // 'policies', 'policy_versions', etc.
    $table->string('action'); // 'delete', 'export', etc.
    $table->string('status'); // 'running', 'completed', 'failed', 'aborted'
    $table->integer('total_items');
    $table->integer('processed_items')->default(0);
    $table->integer('succeeded')->default(0);
    $table->integer('failed')->default(0);
    $table->integer('skipped')->default(0);
    $table->json('item_ids'); // array of IDs
    $table->json('failures')->nullable(); // [{id, reason}, ...]
    $table->foreignId('audit_log_id')->nullable()->constrained();
    $table->timestamps();
});
```
**FR-005.5**: Bulk operations <20 items MAY run synchronously (immediate feedback).
### Confirmation Dialogs
**FR-005.6**: Confirmation dialog MUST show:
- Action description: "Delete 15 policies?"
- Impact warning: "This moves them to trash." or "This is permanent."
- Item count badge
- Cancel/Confirm buttons
**FR-005.7**: For destructive operations with ≥20 items, dialog MUST require typing "DELETE" (case-sensitive) to enable confirm button.
**FR-005.8**: For non-destructive operations (export, restore), typing confirmation is NOT required.
### Audit Logging
**FR-005.9**: System MUST create one audit log entry per bulk operation with:
- Event type: `{resource}.bulk_{action}` (e.g., `policies.bulk_deleted`)
- Actor (user ID/email)
- Metadata: `{ item_count: 15, item_ids: [...], outcomes: {...} }`
**FR-005.10**: Audit log MUST record per-item outcomes:
```json
{
  "item_count": 15,
  "succeeded": 13,
  "failed": 2,
  "skipped": 0,
  "failures": [
    {"id": "abc-123", "reason": "Graph API error: 503"},
    {"id": "def-456", "reason": "Policy not found"}
  ]
}
```
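As a rough illustration of how a job could assemble this metadata, the sketch below folds a list of per-item results into the shape shown above. The function name `buildBulkAuditMetadata` and the per-item result layout (`id`, `status`, `reason`) are assumptions for the example, not part of the spec.

```php
<?php
// Hypothetical helper: summarises per-item results into the FR-005.10 metadata shape.
// Assumed input: each result is ['id' => string, 'status' => 'succeeded'|'failed'|'skipped', 'reason' => ?string].
function buildBulkAuditMetadata(array $results): array
{
    $metadata = [
        'item_count' => count($results),
        'succeeded'  => 0,
        'failed'     => 0,
        'skipped'    => 0,
        'failures'   => [],
    ];

    foreach ($results as $result) {
        // Increment the counter that matches this item's status.
        $metadata[$result['status']]++;

        // Only failed items are listed individually, with their reason.
        if ($result['status'] === 'failed') {
            $metadata['failures'][] = [
                'id'     => $result['id'],
                'reason' => $result['reason'] ?? 'Unknown error',
            ];
        }
    }

    return $metadata;
}

$metadata = buildBulkAuditMetadata([
    ['id' => 'abc-123', 'status' => 'failed', 'reason' => 'Graph API error: 503'],
    ['id' => 'def-456', 'status' => 'succeeded'],
    ['id' => 'ghi-789', 'status' => 'skipped', 'reason' => 'Already ignored'],
]);

echo json_encode($metadata, JSON_PRETTY_PRINT), PHP_EOL;
```

Keeping the summary in one pure function would let the same payload feed both the audit log entry and the `failures` column of `bulk_operation_runs`.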
### Progress Tracking
**FR-005.11**: For queued bulk jobs (≥20 items), system MUST emit progress via:
- `BulkOperationRun` model (status, processed_items updated after each batch)
- Livewire polling on UI (every 3-5 seconds) to fetch updated progress
- Filament notification with progress bar:
- Initial: "Processing... 0/{count}"
- Periodic: "Processing... {done}/{count}"
- Final: "Completed: {succeeded} succeeded, {failed} failed"
**FR-005.11a**: UI MUST poll `BulkOperationRun` status endpoint (e.g., `/api/bulk-operations/{id}/status`) or use Livewire wire:poll to refresh progress.
**FR-005.12**: Final notification MUST include link to audit log entry for details.
**FR-005.13**: If job fails catastrophically (exception), notification MUST show: "Bulk operation failed. Contact support."
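The notification texts in this section can be derived from the fields a `BulkOperationRun` already tracks. The following is a minimal sketch, assuming a hypothetical `formatBulkProgress` helper that receives the run's status and counters as plain values; the message strings are taken from FR-005.11, FR-005.13, and FR-005.16.

```php
<?php
// Hypothetical formatter: maps a run's status and counters to the notification text.
// Requires PHP 8.0+ for the match expression.
function formatBulkProgress(string $status, int $total, int $processed, int $succeeded, int $failed): string
{
    return match ($status) {
        'running'   => "Processing... {$processed}/{$total}",
        'completed' => "Completed: {$succeeded} succeeded, {$failed} failed",
        'aborted'   => "Bulk operation aborted: {$failed}/{$total} failures exceeded threshold",
        // Any other status (including 'failed') is treated as a catastrophic failure.
        default     => 'Bulk operation failed. Contact support.',
    };
}

echo formatBulkProgress('running', 100, 45, 43, 2), PHP_EOL;
echo formatBulkProgress('completed', 100, 100, 98, 2), PHP_EOL;
```

A Livewire component polling the run could call a helper like this on each refresh, so the wording stays identical across resources.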
### Error Handling
**FR-005.14**: System MUST continue processing remaining items if one fails (fail-soft, not fail-fast).
**FR-005.15**: System MUST collect all failures and report them in final notification + audit log.
**FR-005.16**: If >50% of items fail, system MUST:
- Abort processing remaining items (status = `aborted`)
- Final notification: "Bulk operation aborted: {failed}/{total} failures exceeded threshold"
- Admin can manually trigger "Retry Failed Items" from BulkOperationRun detail view (future enhancement)
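The fail-soft and abort rules above can be sketched independently of Laravel. The example below is an illustration only: `processInChunks` and its parameters are hypothetical names, and `$handler` stands in for whatever per-item work the real job performs. Failures are collected per item, and the >50% threshold is checked after each chunk.

```php
<?php
// Sketch: fail-soft processing with a circuit breaker checked after every chunk.
// $handler is any callable that throws on failure (a stand-in for the real per-item work).
function processInChunks(array $ids, callable $handler, int $chunkSize = 10, float $maxFailureRatio = 0.5): array
{
    $total = count($ids);
    $outcome = ['status' => 'completed', 'processed' => 0, 'succeeded' => 0, 'failed' => 0, 'failures' => []];

    foreach (array_chunk($ids, $chunkSize) as $chunk) {
        foreach ($chunk as $id) {
            try {
                $handler($id);
                $outcome['succeeded']++;
            } catch (\Throwable $e) {
                // Fail-soft: record the failure and keep going with the next item.
                $outcome['failed']++;
                $outcome['failures'][] = ['id' => $id, 'reason' => $e->getMessage()];
            }
            $outcome['processed']++;
        }

        // Circuit breaker: stop once failures exceed the allowed share of all items.
        if ($outcome['failed'] > $total * $maxFailureRatio) {
            $outcome['status'] = 'aborted';
            break;
        }
    }

    return $outcome;
}

// Usage: 40 items where the first 30 fail, so the run aborts after the third chunk.
$result = processInChunks(range(1, 40), function (int $id): void {
    if ($id <= 30) {
        throw new RuntimeException('Graph API error: 503');
    }
});

echo $result['status'], ' after ', $result['processed'], ' items', PHP_EOL;
```

Returning an outcome instead of throwing means the caller can still write the audit entry and final notification for an aborted run.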
---
## Bulk Actions by Resource
### Policies Resource
| Action | Priority | Destructive | Scope | Threshold for Queue | Type-to-Confirm |
|--------|----------|-------------|-------|---------------------|-----------------|
| Delete (local) | P1 | Yes (local only) | TenantPilot DB | ≥20 | ≥20 |
| Export to Backup | P1 | No | TenantPilot DB | ≥20 | No |
| Force Delete | P3 | Yes (local) | TenantPilot DB | ≥10 | Always |
| Restore (untrash) | P3 | No | TenantPilot DB | ≥50 | No |
| Sync (re-fetch) | P4 | No | Graph read | ≥50 | No |
**FR-005.17**: Bulk Delete for Policies MUST set `ignored_at` timestamp (prevents re-sync) + optionally `deleted_at` (soft delete). Does NOT call Graph DELETE.
**FR-005.17a**: Sync Job MUST skip policies where `ignored_at IS NOT NULL`.
**FR-005.18**: Bulk Export to Backup MUST prompt for:
- Backup Set name (auto-generated default: "Bulk Export {date}")
- "Include Assignments" checkbox (if Feature 004 implemented)
**FR-005.19**: Bulk Sync MUST queue a SyncPoliciesJob for each selected policy.
### Policy Versions Resource
| Action | Priority | Destructive | Threshold for Queue | Type-to-Confirm |
|--------|----------|-------------|---------------------|-----------------|
| Delete | P2 | Yes | ≥20 | ≥20 |
| Export to Backup | P3 | No | ≥20 | No |
**FR-005.20**: Bulk Delete for Policy Versions MUST:
- Check eligibility: `is_current = false` AND `created_at < NOW() - 90 days` AND NOT referenced
- Referenced = exists in `backup_items.policy_version_id` OR `restore_runs.metadata` OR critical audit logs
- Hard-delete eligible versions
- Skip ineligible with reason: "Referenced", "Too recent", "Current version"
**FR-005.21**: System MUST require `policy_versions.prune` permission (separate from `policy_versions.delete`).
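The eligibility rules in FR-005.20 could be expressed as a single function that returns a skip reason or `null`. This is a sketch under assumptions: the array keys `is_current`, `created_at`, and `is_referenced` are illustrative, and `is_referenced` stands in for the lookups against backup items, restore runs, and audit logs.

```php
<?php
// Sketch of the FR-005.20 eligibility check.
// Returns the skip reason, or null when the version may be hard-deleted.
// Assumed input keys: is_current (bool), created_at (date string), is_referenced (bool).
function policyVersionSkipReason(array $version, DateTimeImmutable $now, int $retentionDays = 90): ?string
{
    if ($version['is_current']) {
        return 'Current version';
    }

    // Versions created on or after the cutoff are still inside the retention window.
    $cutoff = $now->modify("-{$retentionDays} days");
    if (new DateTimeImmutable($version['created_at']) >= $cutoff) {
        return 'Too recent';
    }

    if ($version['is_referenced']) {
        return 'Referenced';
    }

    return null;
}

$now = new DateTimeImmutable('2025-12-22 00:00:00');
$old = ['is_current' => false, 'created_at' => '2025-01-01 00:00:00', 'is_referenced' => false];
$recent = ['is_current' => false, 'created_at' => '2025-12-01 00:00:00', 'is_referenced' => false];

var_dump(policyVersionSkipReason($old, $now));    // eligible
var_dump(policyVersionSkipReason($recent, $now)); // skipped
```

Returning the reason rather than a boolean makes it straightforward to populate the skip reasons that the audit log entry is expected to carry.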
### Backup Sets Resource
| Action | Priority | Destructive | Threshold for Queue | Type-to-Confirm |
|--------|----------|-------------|---------------------|-----------------|
| Delete | P2 | Yes | ≥10 | ≥10 |
| Archive (flag) | P3 | No | N/A | No |
**FR-005.22**: Bulk Delete for Backup Sets MUST cascade-delete related Backup Items.
**FR-005.23**: Bulk Archive MUST set `archived_at` timestamp (soft flag, keeps data).
### Restore Runs Resource
| Action | Priority | Destructive | Threshold for Queue | Type-to-Confirm |
|--------|----------|-------------|---------------------|-----------------|
| Delete | P2 | Yes | ≥20 | ≥20 |
| Rerun | P3 | No | N/A | No |
| Cancel (abort) | P3 | No | N/A | No |
**FR-005.24**: Bulk Delete for Restore Runs MUST soft-delete.
**FR-005.25**: Bulk Delete MUST skip runs with status `running` (show warning in results).
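One way to apply FR-005.25 is to split the selection before deleting anything. The sketch below assumes each run is an array with `id` and `status` keys and uses an allow-list of `completed` and `failed`, matching User Story 4; the function name is hypothetical.

```php
<?php
// Sketch: separate deletable restore runs from those that must be skipped.
// Assumed input: each run is ['id' => int, 'status' => string].
function partitionDeletableRestoreRuns(array $runs): array
{
    $deletable = [];
    $skipped = [];

    foreach ($runs as $run) {
        if (in_array($run['status'], ['completed', 'failed'], true)) {
            $deletable[] = $run['id'];
        } else {
            // Anything still in progress is reported back as a warning, not deleted.
            $skipped[] = ['id' => $run['id'], 'reason' => "Status is {$run['status']}"];
        }
    }

    return ['deletable' => $deletable, 'skipped' => $skipped];
}

$partition = partitionDeletableRestoreRuns([
    ['id' => 1, 'status' => 'completed'],
    ['id' => 2, 'status' => 'running'],
    ['id' => 3, 'status' => 'failed'],
]);

echo count($partition['deletable']), ' deletable, ', count($partition['skipped']), ' skipped', PHP_EOL;
```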
**FR-005.26**: Bulk Rerun (if T156 implemented) MUST create new RestoreRun for each selected run.
---
## Non-Functional Requirements
**NFR-005.1**: Bulk operations MUST handle up to 500 items per operation without timeout.
**NFR-005.2**: Queue jobs MUST process items in batches of 10-20 (configurable) to avoid memory issues.
**NFR-005.3**: Progress notifications MUST refresh at least once every 10 seconds, but no more often than the 3-5 second polling interval, to avoid notification spam.
**NFR-005.4**: UI MUST remain responsive during bulk operations (no blocking spinner).
**NFR-005.5**: Bulk operations MUST respect tenant isolation (only act on current tenant's data).
---
## Technical Implementation
### Filament Bulk Actions Setup
```php
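// Example: PolicyResource.php
// Assumed imports: Filament\Facades\Filament, Filament\Forms, Filament\Tables, Filament\Tables\Table,
// Illuminate\Database\Eloquent\Collection, App\Models\BulkOperationRun, and the two job classes.
public static function table(Table $table): Table
{
    return $table
        ->columns([...])
        ->bulkActions([
            Tables\Actions\BulkActionGroup::make([
                Tables\Actions\DeleteBulkAction::make()
                    ->requiresConfirmation()
                    ->modalHeading(fn (Collection $records) => "Delete {$records->count()} policies?")
                    ->modalDescription('This moves them to trash.')
                    ->action(function (Collection $records): void {
                        $ids = $records->pluck('id')->all(); // plain array: the job takes array, not Collection
                        // Assumption: the tenant is resolved via Filament tenancy; substitute the app's own resolver.
                        $tenantId = Filament::getTenant()->id;
                        // Assumption: BulkOperationRun allows mass assignment and casts item_ids to array.
                        $run = BulkOperationRun::create([
                            'tenant_id' => $tenantId,
                            'user_id' => auth()->id(),
                            'resource' => 'policies',
                            'action' => 'delete',
                            'status' => 'running',
                            'total_items' => count($ids),
                            'item_ids' => $ids,
                        ]);
                        // Argument order matches the BulkPolicyDeleteJob constructor.
                        BulkPolicyDeleteJob::dispatch($ids, $tenantId, auth()->id(), $run->id);
                    }),
                Tables\Actions\BulkAction::make('export_to_backup')
                    ->label('Export to Backup')
                    ->icon('heroicon-o-arrow-down-tray')
                    ->form([
                        Forms\Components\TextInput::make('backup_name')
                            ->default('Bulk Export ' . now()->format('Y-m-d')),
                        Forms\Components\Checkbox::make('include_assignments')
                            ->label('Include Assignments & Scope Tags'),
                    ])
                    ->action(fn (Collection $records, array $data) =>
                        BulkPolicyExportJob::dispatch($records->pluck('id')->all(), $data)
                    ),
            ]),
        ]);
}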
```
### Queue Job Structure
```php
// app/Jobs/BulkPolicyDeleteJob.php
namespace App\Jobs;

use App\Models\BulkOperationRun;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
// AuditLogger and PolicyRepository imports omitted: their namespaces are not fixed by this spec.

class BulkPolicyDeleteJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public function __construct(
        public array $policyIds,        // array, NOT Collection (serialization)
        public int $tenantId,           // explicit tenant isolation
        public int $actorId,            // user ID, not just email
        public int $bulkOperationRunId  // FK to bulk_operation_runs table
    ) {}

    public function handle(AuditLogger $audit, PolicyRepository $policies): void
    {
        $run = BulkOperationRun::findOrFail($this->bulkOperationRunId);
        $run->update(['status' => 'running']);

        $total = count($this->policyIds);
        $results = ['succeeded' => 0, 'failed' => 0, 'skipped' => 0, 'failures' => []];
        $aborted = false;

        // Process in chunks for memory efficiency
        foreach (array_chunk($this->policyIds, 10) as $chunk) {
            foreach ($chunk as $id) {
                try {
                    // The repository is expected to scope the lookup to $this->tenantId (NFR-005.5).
                    $policies->markIgnored($id); // set ignored_at
                    $results['succeeded']++;
                } catch (\Throwable $e) {
                    // Fail-soft: record the failure and continue with the next item.
                    $results['failed']++;
                    $results['failures'][] = ['id' => $id, 'reason' => $e->getMessage()];
                }
            }

            // Update progress after each chunk
            $run->update([
                'processed_items' => $results['succeeded'] + $results['failed'],
                'succeeded' => $results['succeeded'],
                'failed' => $results['failed'],
                'failures' => $results['failures'],
            ]);

            // Circuit breaker: stop if >50% failed. Break instead of throwing so the
            // audit entry below is still written and the queue does not retry the job.
            if ($results['failed'] > $total * 0.5) {
                $aborted = true;
                break;
            }
        }

        $auditLogId = $audit->log('policies.bulk_deleted_local', [
            'actor_id' => $this->actorId,
            'item_count' => $total,
            'outcomes' => $results,
            'bulk_operation_run_id' => $this->bulkOperationRunId,
        ]);

        $run->update([
            'status' => $aborted ? 'aborted' : 'completed',
            'audit_log_id' => $auditLogId,
        ]);
    }
}
```
### Type-to-Confirm Modal
```php
Tables\Actions\DeleteBulkAction::make()
    ->requiresConfirmation()
    ->modalHeading(fn (Collection $records) =>
        $records->count() >= 20
            ? "⚠️ Delete {$records->count()} policies?"
            : "Delete {$records->count()} policies?"
    )
    ->form(fn (Collection $records) =>
        $records->count() >= 20
            ? [
                Forms\Components\TextInput::make('confirm_delete')
                    ->label('Type DELETE to confirm')
                    ->rule('in:DELETE')
                    ->required()
                    ->helperText('This action cannot be undone.')
            ]
            : []
    )
```
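The decisions behind this modal (when to queue, when to require typing, and what input is accepted) can also be kept in plain PHP so they are easy to unit-test. The sketch below is an assumption-laden illustration: the `POLICY_BULK_RULES` layout and the three function names are hypothetical, and the thresholds are copied from the Policies Resource table.

```php
<?php
// Thresholds copied from the Policies Resource table; the array layout and key names are illustrative.
const POLICY_BULK_RULES = [
    'delete'       => ['queue_at' => 20, 'confirm_at' => 20],
    'export'       => ['queue_at' => 20, 'confirm_at' => null], // non-destructive: never typed
    'force_delete' => ['queue_at' => 10, 'confirm_at' => 1],    // "Always"
];

// True when the selection is large enough to run as a queued job.
function shouldQueue(string $action, int $count): bool
{
    return $count >= POLICY_BULK_RULES[$action]['queue_at'];
}

// True when the dialog must show the "Type DELETE to confirm" field.
function requiresTypedConfirmation(string $action, int $count): bool
{
    $threshold = POLICY_BULK_RULES[$action]['confirm_at'];

    return $threshold !== null && $count >= $threshold;
}

// True when the confirm button may be enabled for the given input.
function isConfirmationAccepted(string $action, int $count, ?string $typed): bool
{
    if (!requiresTypedConfirmation($action, $count)) {
        return true;
    }

    return $typed === 'DELETE'; // case-sensitive, per FR-005.7
}

var_dump(isConfirmationAccepted('delete', 25, 'delete')); // lowercase input is rejected
var_dump(isConfirmationAccepted('delete', 25, 'DELETE'));
```

Keeping the thresholds in one table-like structure would make the per-resource differences (for example ≥10 for Backup Sets) a data change rather than a code change.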
---
## UI/UX Patterns
### Bulk Action Menu
```
┌────────────────────────────────────────────┐
│ ☑ Select All (50 items) │
│ │
│ 15 selected │
│ [Delete] [Export to Backup] [More ▾] │
└────────────────────────────────────────────┘
```
### Confirmation Dialog (≥20 items)
```
⚠️ Delete 25 policies?
This moves them to trash. You can restore them later.
Type DELETE to confirm:
[________________]
[Cancel] [Confirm] (disabled until typed)
```
### Progress Notification
```
🔄 Deleting policies...
████████████░░░░░░░░ 45 / 100
[View Details]
```
### Final Notification
```
✅ Deleted 98 policies
2 items failed (click for details)
[View Audit Log] [Dismiss]
```
---
## Testing Strategy
### Unit Tests
- `BulkPolicyDeleteJobTest`: Mock policy repo, test outcomes
- `BulkActionPermissionTest`: Verify permission checks
- `ConfirmationDialogTest`: Test type-to-confirm logic
### Feature Tests
- `BulkDeletePoliciesTest`: E2E flow (select → confirm → verify soft delete)
- `BulkExportToBackupTest`: E2E export with job queue
- `BulkProgressNotificationTest`: Verify progress events emitted
### Load Tests
- 500 items bulk delete (should complete in <5 minutes)
- 1000 items bulk export (queue + batch processing)
### Manual QA
- Select 30 policies → bulk delete → verify trash
- Export 50 policies → verify backup set created
- Test type-to-confirm with correct/incorrect input
- Force job failure → verify error handling
---
## Rollout Plan
### Phase 1: Foundation (P1 Actions)
- Policies: Bulk Delete, Bulk Export
- Confirmation dialogs + type-to-confirm
- **Duration**: ~8-12 hours
### Phase 2: Queue + Progress (P1 Features)
- Queue jobs for ≥20 items
- Progress notifications
- Audit logging
- **Duration**: ~8-10 hours
### Phase 3: Additional Resources (P2 Actions)
- Policy Versions: Bulk Delete
- Restore Runs: Bulk Delete
- Backup Sets: Bulk Delete
- **Duration**: ~6-8 hours
### Phase 4: Advanced Actions (P3 Optional)
- Bulk Force Delete
- Bulk Restore (untrash)
- Bulk Rerun (depends on T156)
- **Duration**: ~4-6 hours per action
---
## Dependencies
- Laravel Queue (✅ configured)
- Filament Bulk Actions (✅ built-in)
- Feature 001: Audit Logger (✅ complete)
## Risks & Mitigations
| Risk | Mitigation |
|------|------------|
| Large batches cause timeout | Queue jobs + chunked processing (10-20 items/batch) + Bus::batch() |
| User accidentally deletes 500 items | Type-to-confirm for ≥20 items + `ignored_at` flag (restorable) |
| Job fails mid-process | Fail-soft, log failures in `bulk_operation_runs`, abort if >50% fail |
| UI becomes unresponsive | Async jobs + Livewire polling for progress |
| Policy Versions deleted while referenced | Eligibility check: not referenced in backups/restores/audits |
| Sync re-adds "deleted" policies | `ignored_at` flag prevents re-sync |
| Progress notifications don't update | `BulkOperationRun` model + polling required (not automatic Filament feature) |
---
## Success Criteria
1. Bulk delete 100 policies in <2 minutes (queued)
2. Type-to-confirm prevents accidental deletes
3. Progress notifications update every 5-10s
4. Audit log captures per-item outcomes
5. 95%+ success rate for bulk operations
6. Tests cover all P1/P2 actions
---
## Open Questions
1. Should we add bulk "Tag" (apply labels/categories)?
2. Bulk "Clone" for policies (create duplicates)?
3. Max items per bulk operation (hard limit)?
4. Retry failed items in bulk operation?
---
**Status**: Draft for Review
**Created**: 2025-12-22
**Author**: AI + Ahmed
**Next Steps**: Review → Plan → Tasks