Commit Graph

3 Commits

Author SHA1 Message Date
3030dd9af2 054-unify-runs-suitewide (#63)
Summary

Kurz: Implementiert Feature 054 — canonical OperationRun-flow, Monitoring UI, dispatch-safety, notifications, dedupe, plus small UX safety clarifications (RBAC group search delegated; Restore group mapping DB-only).
What Changed

Core service: OperationRun lifecycle, dedupe and dispatch helpers — OperationRunService.php.
Model + migration: OperationRun model and migration — OperationRun.php, 2026_01_16_180642_create_operation_runs_table.php.
Notifications: queued + terminal DB notifications (initiator-only) — OperationRunQueued.php, OperationRunCompleted.php.
Monitoring UI: Filament list/detail + Livewire pieces (DB-only render) — OperationRunResource.php and related pages/views.
Start surfaces / Jobs: instrumented start surfaces, job middleware, and job updates to use canonical runs — multiple app/Jobs/* and app/Filament/* updates (see tests for full coverage).
RBAC + Restore UX clarifications: RBAC group search is delegated-Graph-based and disabled without delegated token; Restore group mapping remains DB-only (directory cache) and helper text always visible — TenantResource.php, RestoreRunResource.php.
Specs / Constitution: updated spec & quickstart and added one-line constitution guideline about Graph usage:
spec.md
quickstart.md
constitution.md
Tests & Verification

Unit / Feature tests added/updated for run lifecycle, notifications, idempotency, and UI guards: see tests/Feature/* (notably OperationRunServiceTest, MonitoringOperationsTest, OperationRunNotificationTest, and various Filament feature tests).
Full test run locally: ./vendor/bin/sail artisan test → 587 passed, 5 skipped.
Migrations

Adds create_operation_runs_table migration; run php artisan migrate in staging after review.
Notes / Rationale

Monitoring pages are explicitly DB-only at render time (no Graph calls). Start surfaces enqueue work only and return a “View run” link.
Delegated Graph access is used only for explicit user actions (RBAC group search); restore mapping intentionally uses cached DB data only to avoid render-time Graph calls.
Dispatch wrapper marks runs failed immediately if background dispatch throws synchronously to avoid misleading “queued” states.
Upgrade / Deploy Considerations

Run migrations: ./vendor/bin/sail artisan migrate.
Background workers should be running to process queued jobs (recommended to monitor queue health during rollout).
No secret or token persistence changes.
PR checklist

 Tests updated/added for changed behavior
 Specs updated: 054-unify-runs-suitewide docs + quickstart
 Constitution note added (.specify)
 Pint formatting applied

Co-authored-by: Ahmed Darrazi <ahmeddarrazi@adsmac.local>
Reviewed-on: #63
2026-01-17 22:25:00 +00:00
30ad57baab feat/053-unify-runs-monitoring (#60)
Summary

This PR introduces Unified Operations Runs + Monitoring Hub (053).

Goal: Standardize how long-running operations are tracked and monitored using the existing tenant-scoped run record (BulkOperationRun) as the canonical “operation run”, and surface it in a single Monitoring → Operations hub (view-only, tenant-scoped, role-aware).

Phase 1 adoption scope (per spec):
	•	Drift generation (drift.generate)
	•	Backup Set “Add Policies” (backup_set.add_policies)

Note: This PR does not convert every run type yet (e.g. GroupSyncRuns / InventorySyncRuns remain separate for now). This is intentionally incremental.

⸻

What changed

Monitoring / Operations hub
	•	Moved/organized run monitoring under Monitoring → Operations
	•	Added:
	•	status buckets (queued / running / succeeded / partially succeeded / failed)
	•	filters (run type, status bucket, time range)
	•	run detail “Related” links (e.g. Drift findings, Backup Set context)
	•	All hub pages are DB-only and view-only (no rerun/cancel/delete actions)

Canonical run semantics
	•	Added canonical helpers on BulkOperationRun:
	•	runType() (resource.action)
	•	statusBucket() derived from status + counts (testable semantics)

Drift integration (Phase 1)
	•	Drift generation start behavior now:
	•	creates/reuses a BulkOperationRun with drift context payload (scope_key + baseline/current run ids)
	•	dispatches generation job
	•	emits DB notifications including “View run” link
	•	On generation failure: stores sanitized failure entries + sends failure notification

Permissions / tenant isolation
	•	Monitoring run list/view is tenant-scoped and returns 403 for cross-tenant access
	•	Readonly can view runs but cannot start drift generation

⸻

Tests

Added/updated Pest coverage:
	•	BulkOperationRunStatusBucketTest.php
	•	DriftGenerationDispatchTest.php
	•	GenerateDriftFindingsJobNotificationTest.php
	•	RunAuthorizationTenantIsolationTest.php

Validation run locally:
	•	./vendor/bin/pint --dirty
	•	targeted tests from feature quickstart / drift monitoring tests

⸻

Manual QA
	1.	Go to Monitoring → Operations
	•	verify filters (run type / status / time range)
	•	verify run detail shows counts + sanitized failures + “Related” links
	2.	Open Drift Landing
	•	with >=2 successful inventory runs for scope: should queue drift generation + show notification with “View run”
	•	as readonly: should not start generation
	3.	Run detail
	•	drift.generate runs show “Drift findings” related link
	•	failure entries are sanitized (no secrets/tokens/raw payload dumps)

⸻

Notes / Ops
	•	Queue workers must be restarted after deploy so they load the new code:
	•	php artisan queue:restart (or Sail equivalent)
	•	This PR standardizes monitoring for Phase 1 producers only; follow-ups will migrate additional run types into the unified pattern.

⸻

Spec / Docs
	•	SpecKit artifacts added under specs/053-unify-runs-monitoring/
	•	Checklists are complete:
	•	requirements checklist PASS
	•	writing checklist PASS

Co-authored-by: Ahmed Darrazi <ahmeddarrazi@adsmac.local>
Reviewed-on: #60
2026-01-16 15:10:31 +00:00
bcf4996a1e feat/049-backup-restore-job-orchestration (#56)
Summary

This PR implements Spec 049 – Backup/Restore Job Orchestration: all critical Backup/Restore execution paths are job-only, idempotent, tenant-scoped, and observable via run records + DB notifications (Phase 1). The UI no longer performs heavy Graph work inside request/Filament actions for these flows.

Why

We want predictable UX and operations at MSP scale:
	•	no timeouts / long-running requests
	•	reproducible run state + per-item results
	•	safe error persistence (no secrets / no token leakage)
	•	strict tenant isolation + auditability for write paths

What changed

Foundational (Runs + Idempotency + Observability)
	•	Added a shared RunIdempotency helper (dedupe while queued/running).
	•	Added a read-only BulkOperationRuns surface (list + view) for status/progress.
	•	Added DB notifications for run status changes (with “View run” link).

US1 – Policy “Capture snapshot” is job-only
	•	Policy detail “Capture snapshot” now:
	•	creates/reuses a run (dedupe key: tenant + policy.capture_snapshot + policy DB id)
	•	dispatches a queued job
	•	returns immediately with notification + link to run detail
	•	Graph capture work moved fully into the job; request path stays Graph-free.

US3 – Restore runs orchestration is job-only + safe
	•	Live restore execution is queued and updates RestoreRun status/progress.
	•	Per-item outcomes are persisted deterministically (per internal DB record).
	•	Audit logging is written for live restore.
	•	Preview/dry-run is enforced as read-only (no writes).

Tenant isolation / authorization (non-negotiable)
	•	Run list/view/start are tenant-scoped and policy-guarded (cross-tenant access => 403, not 404).
	•	Explicit Pest tests cover cross-tenant denial and start authorization.

Tests / Verification
	•	./vendor/bin/pint --dirty
	•	Targeted suite (examples):
	•	policy capture snapshot queued + idempotency tests
	•	restore orchestration + audit logging + preview read-only tests
	•	run authorization / tenant isolation tests

Notes / Scope boundaries
	•	Phase 1 UX = DB notifications + run detail page. A global “progress widget” is tracked as Phase 2 and not required for merge.
	•	Resilience/backoff is tracked in tasks but can be iterated further after merge.

Review focus
	•	Dedupe behavior for queued/running runs (reuse vs create-new)
	•	Tenant scoping & policy gates for all run surfaces
	•	Restore safety: audit event + preview no-writes

Co-authored-by: Ahmed Darrazi <ahmeddarrazi@adsmac.local>
Reviewed-on: #56
2026-01-11 15:59:06 +00:00