TenantAtlas

ahmido/TenantAtlas

Commit Graph

Author

SHA1

Message

Date

ahmido

3030dd9af2

054-unify-runs-suitewide (#63)

Summary

In short: implements Feature 054 (canonical OperationRun flow, Monitoring UI, dispatch safety, notifications, dedupe), plus small UX safety clarifications (RBAC group search is delegated; Restore group mapping is DB-only).
What Changed

Core service: OperationRun lifecycle, dedupe and dispatch helpers — OperationRunService.php.
Model + migration: OperationRun model and migration — OperationRun.php, 2026_01_16_180642_create_operation_runs_table.php.
Notifications: queued + terminal DB notifications (initiator-only) — OperationRunQueued.php, OperationRunCompleted.php.
Monitoring UI: Filament list/detail + Livewire pieces (DB-only render) — OperationRunResource.php and related pages/views.
Start surfaces / Jobs: instrumented start surfaces, job middleware, and job updates to use canonical runs — multiple app/Jobs/* and app/Filament/* updates (see tests for full coverage).
RBAC + Restore UX clarifications: RBAC group search is delegated-Graph-based and disabled without delegated token; Restore group mapping remains DB-only (directory cache) and helper text always visible — TenantResource.php, RestoreRunResource.php.
Specs / Constitution: updated spec and quickstart, and added a one-line constitution guideline about Graph usage (spec.md, quickstart.md, constitution.md).
Tests & Verification

Unit / Feature tests added/updated for run lifecycle, notifications, idempotency, and UI guards: see tests/Feature/* (notably OperationRunServiceTest, MonitoringOperationsTest, OperationRunNotificationTest, and various Filament feature tests).
Full test run locally: ./vendor/bin/sail artisan test → 587 passed, 5 skipped.
Migrations

Adds create_operation_runs_table migration; run php artisan migrate in staging after review.
Notes / Rationale

Monitoring pages are explicitly DB-only at render time (no Graph calls). Start surfaces enqueue work only and return a “View run” link.
Delegated Graph access is used only for explicit user actions (RBAC group search); restore mapping intentionally uses cached DB data only to avoid render-time Graph calls.
Dispatch wrapper marks runs failed immediately if background dispatch throws synchronously to avoid misleading “queued” states.
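
The dispatch-safety note above can be illustrated with a minimal, language-neutral sketch in Python (the project itself is PHP/Laravel). The names `Run`, `dispatch_or_fail`, and the status strings are hypothetical and are not taken from OperationRunService.php; the sketch only shows the idea of failing a run right away when the hand-off to the queue throws.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Run:
    """Hypothetical stand-in for an operation run record."""
    id: int
    status: str = "queued"
    failure_summary: Optional[str] = None


def dispatch_or_fail(run: Run, dispatch: Callable[[Run], None]) -> Run:
    """Hand the run to the queue; if dispatch itself raises, mark the run failed."""
    try:
        dispatch(run)
    except Exception as exc:
        # Store only the exception type, not its message, so no details leak.
        run.status = "failed"
        run.failure_summary = f"dispatch failed: {type(exc).__name__}"
    return run


def broken_queue(run: Run) -> None:
    """Simulates a queue backend that fails synchronously."""
    raise ConnectionError("queue backend unreachable")


ok = dispatch_or_fail(Run(id=1), lambda run: None)
bad = dispatch_or_fail(Run(id=2), broken_queue)
print(ok.status, bad.status)
```

Without the `except` branch, run 2 would stay "queued" even though no job was ever enqueued.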
Upgrade / Deploy Considerations

Run migrations: ./vendor/bin/sail artisan migrate.
Background workers must be running to process queued jobs; monitoring queue health during rollout is recommended.
No secret or token persistence changes.
PR checklist

Tests updated/added for changed behavior
Specs updated: 054-unify-runs-suitewide docs + quickstart
Constitution note added (.specify)
Pint formatting applied

Co-authored-by: Ahmed Darrazi <ahmeddarrazi@adsmac.local>
Reviewed-on: #63

2026-01-17 22:25:00 +00:00

ahmido

30ad57baab

feat/053-unify-runs-monitoring (#60)

Summary

This PR introduces Unified Operations Runs + Monitoring Hub (053).

Goal: Standardize how long-running operations are tracked and monitored using the existing tenant-scoped run record (BulkOperationRun) as the canonical “operation run”, and surface it in a single Monitoring → Operations hub (view-only, tenant-scoped, role-aware).

Phase 1 adoption scope (per spec):
	•	Drift generation (drift.generate)
	•	Backup Set “Add Policies” (backup_set.add_policies)

Note: This PR does not convert every run type yet (e.g. GroupSyncRuns / InventorySyncRuns remain separate for now). This is intentionally incremental.

⸻

What changed

Monitoring / Operations hub
	•	Moved/organized run monitoring under Monitoring → Operations
	•	Added:
		•	status buckets (queued / running / succeeded / partially succeeded / failed)
		•	filters (run type, status bucket, time range)
		•	run detail “Related” links (e.g. Drift findings, Backup Set context)
	•	All hub pages are DB-only and view-only (no rerun/cancel/delete actions)

Canonical run semantics
	•	Added canonical helpers on BulkOperationRun:
		•	runType() (resource.action)
		•	statusBucket() derived from status + counts (testable semantics)
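
As a rough illustration of the two helpers above, here is a language-neutral Python sketch (the real helpers live on the PHP model). The field names, the raw status values, and the exact rule for deriving a bucket from the counts are assumptions; the PR only states that the bucket is derived from status plus counts.

```python
from dataclasses import dataclass


@dataclass
class RunRecord:
    """Hypothetical run record; field names are assumptions for illustration."""
    resource: str
    action: str
    status: str  # assumed raw values: "pending", "running", "completed", "failed"
    succeeded: int = 0
    failed: int = 0

    def run_type(self) -> str:
        # "resource.action", e.g. "drift.generate"
        return f"{self.resource}.{self.action}"

    def status_bucket(self) -> str:
        if self.status == "pending":
            return "queued"
        if self.status == "running":
            return "running"
        if self.status == "failed":
            return "failed"
        # Finished run: decide from the per-item counts (assumed rule).
        if self.failed == 0:
            return "succeeded"
        if self.succeeded == 0:
            return "failed"
        return "partially succeeded"


run = RunRecord("drift", "generate", "completed", succeeded=3, failed=1)
print(run.run_type(), "->", run.status_bucket())
```

Keeping the bucket a pure function of stored fields is what makes the semantics testable without touching the queue or Graph.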

Drift integration (Phase 1)
	•	Drift generation start behavior now:
		•	creates/reuses a BulkOperationRun with drift context payload (scope_key + baseline/current run ids)
		•	dispatches generation job
		•	emits DB notifications including “View run” link
	•	On generation failure: stores sanitized failure entries + sends failure notification
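
What "sanitized failure entries" could look like is sketched below in Python as a language-neutral illustration. The function name, the redaction patterns, and the length limit are all assumptions; the PR does not describe the actual sanitization rules, only that secrets, tokens, and raw payload dumps must not be stored.

```python
import re

# Assumed patterns: bearer tokens and a few secret-looking key=value pairs.
_BEARER = re.compile(r"(?i)bearer\s+[A-Za-z0-9._~+/=-]+")
_KEY_VALUE = re.compile(r"(?i)\b(client_secret|access_token|password)=[^\s&]+")


def sanitize_failure(message: str, max_len: int = 200) -> str:
    """Redact token-like values and cap the length before persisting."""
    message = _BEARER.sub("Bearer [redacted]", message)
    message = _KEY_VALUE.sub(r"\1=[redacted]", message)
    if len(message) > max_len:
        # Truncate so that large payload dumps never reach the database.
        message = message[: max_len - 3] + "..."
    return message


print(sanitize_failure("GET failed: Authorization: Bearer abc.def-123 rejected"))
```

A pattern list like this is only a safety net; storing a short error code plus a generic message is the more robust choice where it is possible.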

Permissions / tenant isolation
	•	Monitoring run list/view is tenant-scoped and returns 403 for cross-tenant access
	•	Readonly can view runs but cannot start drift generation

⸻

Tests

Added/updated Pest coverage:
	•	BulkOperationRunStatusBucketTest.php
	•	DriftGenerationDispatchTest.php
	•	GenerateDriftFindingsJobNotificationTest.php
	•	RunAuthorizationTenantIsolationTest.php

Validation run locally:
	•	./vendor/bin/pint --dirty
	•	targeted tests from feature quickstart / drift monitoring tests

⸻

Manual QA
	1.	Go to Monitoring → Operations
		•	verify filters (run type / status / time range)
		•	verify run detail shows counts + sanitized failures + “Related” links
	2.	Open Drift Landing
		•	with >=2 successful inventory runs for scope: should queue drift generation + show notification with “View run”
		•	as readonly: should not start generation
	3.	Run detail
		•	drift.generate runs show “Drift findings” related link
		•	failure entries are sanitized (no secrets/tokens/raw payload dumps)

⸻

Notes / Ops
	•	Queue workers must be restarted after deploy so they load the new code:
		•	php artisan queue:restart (or Sail equivalent)
	•	This PR standardizes monitoring for Phase 1 producers only; follow-ups will migrate additional run types into the unified pattern.

⸻

Spec / Docs
	•	SpecKit artifacts added under specs/053-unify-runs-monitoring/
	•	Checklists are complete:
		•	requirements checklist PASS
		•	writing checklist PASS

Co-authored-by: Ahmed Darrazi <ahmeddarrazi@adsmac.local>
Reviewed-on: #60

2026-01-16 15:10:31 +00:00

ahmido

bcf4996a1e

feat/049-backup-restore-job-orchestration (#56)

Summary

This PR implements Spec 049 – Backup/Restore Job Orchestration: all critical Backup/Restore execution paths are job-only, idempotent, tenant-scoped, and observable via run records + DB notifications (Phase 1). The UI no longer performs heavy Graph work inside request/Filament actions for these flows.

Why

We want predictable UX and operations at MSP scale:
• no timeouts / long-running requests
• reproducible run state + per-item results
• safe error persistence (no secrets / no token leakage)
• strict tenant isolation + auditability for write paths

What changed

Foundational (Runs + Idempotency + Observability)
• Added a shared RunIdempotency helper (dedupe while queued/running).
• Added a read-only BulkOperationRuns surface (list + view) for status/progress.
• Added DB notifications for run status changes (with “View run” link).

US1 – Policy “Capture snapshot” is job-only
• Policy detail “Capture snapshot” now:
	• creates/reuses a run (dedupe key: tenant + policy.capture_snapshot + policy DB id)
	• dispatches a queued job
	• returns immediately with notification + link to run detail
• Graph capture work moved fully into the job; request path stays Graph-free.
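
The create-or-reuse behavior described above can be sketched as follows. This is a language-neutral Python illustration with an in-memory store; `RunStore`, `find_or_create`, and the status names are hypothetical and do not reflect the RunIdempotency helper's real interface. Only the dedupe key (tenant + run type + policy DB id) and the "reuse while queued/running" rule come from the PR text.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, Tuple

ACTIVE = {"queued", "running"}  # statuses in which an existing run is reused


@dataclass
class Run:
    id: int
    key: Tuple[int, str, int]  # (tenant id, run type, target DB id)
    status: str = "queued"


@dataclass
class RunStore:
    """In-memory stand-in for the runs table (hypothetical)."""
    runs: Dict[int, Run] = field(default_factory=dict)
    _ids: count = field(default_factory=lambda: count(1))

    def find_or_create(self, tenant_id: int, run_type: str, target_id: int) -> Tuple[Run, bool]:
        """Return (run, created). Reuse an active run with the same dedupe key."""
        key = (tenant_id, run_type, target_id)
        for run in self.runs.values():
            if run.key == key and run.status in ACTIVE:
                return run, False
        run = Run(id=next(self._ids), key=key)
        self.runs[run.id] = run
        return run, True


store = RunStore()
first, created_first = store.find_or_create(7, "policy.capture_snapshot", 42)
second, created_second = store.find_or_create(7, "policy.capture_snapshot", 42)
first.status = "succeeded"  # a finished run no longer blocks a new one
third, created_third = store.find_or_create(7, "policy.capture_snapshot", 42)
print(first.id, second.id, third.id)
```

In a real database the lookup and the insert would need to be atomic (for example through a unique constraint or a lock), which this single-process sketch does not model.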

US3 – Restore runs orchestration is job-only + safe
• Live restore execution is queued and updates RestoreRun status/progress.
• Per-item outcomes are persisted deterministically (per internal DB record).
• Audit logging is written for live restore.
• Preview/dry-run is enforced as read-only (no writes).

Tenant isolation / authorization (non-negotiable)
• Run list/view/start are tenant-scoped and policy-guarded (cross-tenant access => 403, not 404).
• Explicit Pest tests cover cross-tenant denial and start authorization.
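
The "403, not 404" rule can be pictured with the small Python sketch below (language-neutral; the project enforces this through tenant scoping and policies in PHP). The function name and the in-memory lookup are hypothetical, and the 404 branch for a run that does not exist at all is an assumption added for contrast.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Run:
    id: int
    tenant_id: int


def run_view_status(runs: Dict[int, Run], run_id: int, current_tenant_id: int) -> int:
    """Return the HTTP status a 'view run' request would get."""
    run: Optional[Run] = runs.get(run_id)
    if run is None:
        return 404  # assumed: no such run anywhere
    if run.tenant_id != current_tenant_id:
        return 403  # run exists but belongs to another tenant
    return 200


runs = {1: Run(id=1, tenant_id=10), 2: Run(id=2, tenant_id=20)}
print(run_view_status(runs, 1, 10), run_view_status(runs, 2, 10), run_view_status(runs, 3, 10))
```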

Tests / Verification
• ./vendor/bin/pint --dirty
• Targeted suite (examples):
	• policy capture snapshot queued + idempotency tests
	• restore orchestration + audit logging + preview read-only tests
	• run authorization / tenant isolation tests

Notes / Scope boundaries
• Phase 1 UX = DB notifications + run detail page. A global “progress widget” is tracked as Phase 2 and not required for merge.
• Resilience/backoff is tracked in tasks but can be iterated further after merge.

Review focus
• Dedupe behavior for queued/running runs (reuse vs create-new)
• Tenant scoping & policy gates for all run surfaces
• Restore safety: audit event + preview no-writes

Co-authored-by: Ahmed Darrazi <ahmeddarrazi@adsmac.local>
Reviewed-on: #56

2026-01-11 15:59:06 +00:00

3 Commits