Commit Graph

6 Commits

Author SHA1 Message Date
Ahmed Darrazi
8b9ab52138 feat(ops-ux): constitution rollout v1.3.0 2026-01-18 14:44:16 +01:00
Ahmed Darrazi
9f980ce80e feat(054): finalize docs — RBAC delegated group search + Restore DB-only mapping; constitution note 2026-01-17 23:14:20 +01:00
30ad57baab feat/053-unify-runs-monitoring (#60)
Summary

This PR introduces Unified Operations Runs + Monitoring Hub (053).

Goal: Standardize how long-running operations are tracked and monitored using the existing tenant-scoped run record (BulkOperationRun) as the canonical “operation run”, and surface it in a single Monitoring → Operations hub (view-only, tenant-scoped, role-aware).

Phase 1 adoption scope (per spec):
	•	Drift generation (drift.generate)
	•	Backup Set “Add Policies” (backup_set.add_policies)

Note: This PR does not convert every run type yet (e.g. GroupSyncRuns / InventorySyncRuns remain separate for now). This is intentionally incremental.

⸻

What changed

Monitoring / Operations hub
	•	Moved/organized run monitoring under Monitoring → Operations
	•	Added:
	•	status buckets (queued / running / succeeded / partially succeeded / failed)
	•	filters (run type, status bucket, time range)
	•	run detail “Related” links (e.g. Drift findings, Backup Set context)
	•	All hub pages are DB-only and view-only (no rerun/cancel/delete actions)

Canonical run semantics
	•	Added canonical helpers on BulkOperationRun:
	•	runType() (resource.action)
	•	statusBucket() derived from status + counts (testable semantics)

Drift integration (Phase 1)
	•	Drift generation start behavior now:
	•	creates/reuses a BulkOperationRun with drift context payload (scope_key + baseline/current run ids)
	•	dispatches generation job
	•	emits DB notifications including “View run” link
	•	On generation failure: stores sanitized failure entries + sends failure notification

Permissions / tenant isolation
	•	Monitoring run list/view is tenant-scoped and returns 403 for cross-tenant access
	•	Readonly can view runs but cannot start drift generation

⸻

Tests

Added/updated Pest coverage:
	•	BulkOperationRunStatusBucketTest.php
	•	DriftGenerationDispatchTest.php
	•	GenerateDriftFindingsJobNotificationTest.php
	•	RunAuthorizationTenantIsolationTest.php

Validation run locally:
	•	./vendor/bin/pint --dirty
	•	targeted tests from feature quickstart / drift monitoring tests

⸻

Manual QA
	1.	Go to Monitoring → Operations
	•	verify filters (run type / status / time range)
	•	verify run detail shows counts + sanitized failures + “Related” links
	2.	Open Drift Landing
	•	with >=2 successful inventory runs for scope: should queue drift generation + show notification with “View run”
	•	as readonly: should not start generation
	3.	Run detail
	•	drift.generate runs show “Drift findings” related link
	•	failure entries are sanitized (no secrets/tokens/raw payload dumps)

⸻

Notes / Ops
	•	Queue workers must be restarted after deploy so they load the new code:
	•	php artisan queue:restart (or Sail equivalent)
	•	This PR standardizes monitoring for Phase 1 producers only; follow-ups will migrate additional run types into the unified pattern.

⸻

Spec / Docs
	•	SpecKit artifacts added under specs/053-unify-runs-monitoring/
	•	Checklists are complete:
	•	requirements checklist PASS
	•	writing checklist PASS

Co-authored-by: Ahmed Darrazi <ahmeddarrazi@adsmac.local>
Reviewed-on: #60
2026-01-16 15:10:31 +00:00
bc846d7c5c 051-entra-group-directory-cache (#57)
Summary

Adds a tenant-scoped Entra Groups “Directory Cache” to enable DB-only group name resolution across the app (no render-time Graph calls), plus sync runs + observability.

What’s included
	•	Entra Groups cache
	•	New entra_groups storage (tenant-scoped) for group metadata (no memberships).
	•	Retention semantics: groups become stale / retained per spec (no hard delete on first miss).
	•	Group Sync Runs
	•	New “Group Sync Runs” UI (list + detail) with tenant isolation (403 on cross-tenant access).
	•	Manual “Sync Groups” action: creates/reuses a run, dispatches job, DB notification with “View run” link.
	•	Scheduled dispatcher command wired in console.php.
	•	DB-only label resolution (US3)
	•	Shared EntraGroupLabelResolver with safe fallback Unresolved (…last8) and UUID guarding.
	•	Refactors to prefer cached names (no typeahead / no live Graph) in:
	•	Tenant RBAC group selects
	•	Policy version assignments widget
	•	Restore results + restore wizard group mapping labels

Safety / Guardrails
	•	No render-time Graph calls: fail-hard guard test verifies UI paths don’t call GraphClientInterface during page render.
	•	Tenant isolation & authorization: policies + scoped queries enforced (cross-tenant access returns 403, not 404).
	•	Data minimization: only group metadata is cached (no membership/owners).

Tests / Verification
	•	Added/updated tests under tests/Feature/DirectoryGroups and tests/Unit/DirectoryGroups:
	•	Start sync → run record + job dispatch + upserts
	•	Retention purge semantics
	•	Scheduled dispatch wiring
	•	Render-time Graph guard
	•	UI/resource access isolation
	•	Ran:
	•	./vendor/bin/pint --dirty
	•	./vendor/bin/sail artisan test tests/Feature/DirectoryGroups
	•	./vendor/bin/sail artisan test tests/Unit/DirectoryGroups

Notes / Follow-ups
	•	UI polish remains (picker/lookup UX, consistent progress widget/toasts across modules, navigation grouping).
	•	pr-gate checklist still has non-blocking open items (mostly UX/ops polish); requirements gate is green.

Co-authored-by: Ahmed Darrazi <ahmeddarrazi@adsmac.local>
Reviewed-on: #57
2026-01-11 23:24:12 +00:00
bcf4996a1e feat/049-backup-restore-job-orchestration (#56)
Summary

This PR implements Spec 049 – Backup/Restore Job Orchestration: all critical Backup/Restore execution paths are job-only, idempotent, tenant-scoped, and observable via run records + DB notifications (Phase 1). The UI no longer performs heavy Graph work inside request/Filament actions for these flows.

Why

We want predictable UX and operations at MSP scale:
	•	no timeouts / long-running requests
	•	reproducible run state + per-item results
	•	safe error persistence (no secrets / no token leakage)
	•	strict tenant isolation + auditability for write paths

What changed

Foundational (Runs + Idempotency + Observability)
	•	Added a shared RunIdempotency helper (dedupe while queued/running).
	•	Added a read-only BulkOperationRuns surface (list + view) for status/progress.
	•	Added DB notifications for run status changes (with “View run” link).

US1 – Policy “Capture snapshot” is job-only
	•	Policy detail “Capture snapshot” now:
	•	creates/reuses a run (dedupe key: tenant + policy.capture_snapshot + policy DB id)
	•	dispatches a queued job
	•	returns immediately with notification + link to run detail
	•	Graph capture work moved fully into the job; request path stays Graph-free.

US3 – Restore runs orchestration is job-only + safe
	•	Live restore execution is queued and updates RestoreRun status/progress.
	•	Per-item outcomes are persisted deterministically (per internal DB record).
	•	Audit logging is written for live restore.
	•	Preview/dry-run is enforced as read-only (no writes).

Tenant isolation / authorization (non-negotiable)
	•	Run list/view/start are tenant-scoped and policy-guarded (cross-tenant access => 403, not 404).
	•	Explicit Pest tests cover cross-tenant denial and start authorization.

Tests / Verification
	•	./vendor/bin/pint --dirty
	•	Targeted suite (examples):
	•	policy capture snapshot queued + idempotency tests
	•	restore orchestration + audit logging + preview read-only tests
	•	run authorization / tenant isolation tests

Notes / Scope boundaries
	•	Phase 1 UX = DB notifications + run detail page. A global “progress widget” is tracked as Phase 2 and not required for merge.
	•	Resilience/backoff is tracked in tasks but can be iterated further after merge.

Review focus
	•	Dedupe behavior for queued/running runs (reuse vs create-new)
	•	Tenant scoping & policy gates for all run surfaces
	•	Restore safety: audit event + preview no-writes

Co-authored-by: Ahmed Darrazi <ahmeddarrazi@adsmac.local>
Reviewed-on: #56
2026-01-11 15:59:06 +00:00
4d3fcd28a9 feat/032-backup-scheduling-mvp (#34)
What
Implements tenant-scoped backup scheduling end-to-end: schedules CRUD, minute-based dispatch, queued execution, run history, manual “Run now/Retry”, retention (keep last N), and auditability.

Key changes

Filament UI: Backup Schedules resource with tenant scoping + SEC-002 role gating.
Scheduler + queue: tenantpilot:schedules:dispatch command wired in scheduler (runs every minute), creates idempotent BackupScheduleRun records and dispatches jobs.
Execution: RunBackupScheduleJob syncs policies, creates immutable backup sets, updates run status, writes audit logs, applies retry/backoff mapping, and triggers retention.
Run history: Relation manager + “View” modal rendering run details.
UX polish: row actions grouped; bulk actions grouped (run now / retry / delete). Bulk dispatch writes DB notifications (shows in notifications panel).
Validation: policy type hard-validation on save; unknown policy types handled safely at runtime (skipped/partial).
Tests: comprehensive Pest coverage for CRUD/scoping/validation, idempotency, job outcomes, error mapping, retention, view modal, run-now/retry notifications, bulk delete (incl. operator forbidden).
Files / Areas

Filament: BackupScheduleResource.php and app/Filament/Resources/BackupScheduleResource/*
Scheduling/Jobs: app/Console/Commands/TenantpilotDispatchBackupSchedules.php, app/Jobs/RunBackupScheduleJob.php, app/Jobs/ApplyBackupScheduleRetentionJob.php, console.php
Models/Migrations: app/Models/BackupSchedule.php, app/Models/BackupScheduleRun.php, database/migrations/backup_schedules, backup_schedule_runs
Notifications: BackupScheduleRunDispatchedNotification.php
Specs: specs/032-backup-scheduling-mvp/* (tasks/checklist/quickstart updates)
How to test (Sail)

Run tests: ./vendor/bin/sail artisan test tests/Feature/BackupScheduling
Run formatter: ./vendor/bin/sail php ./vendor/bin/pint --dirty
Apply migrations: ./vendor/bin/sail artisan migrate
Manual dispatch: ./vendor/bin/sail artisan tenantpilot:schedules:dispatch
Notes

Uses DB notifications for queued UI actions to ensure they appear in the notifications panel even under queue fakes in tests.
Checklist gate for 032 is PASS; tasks updated accordingly.

Co-authored-by: Ahmed Darrazi <ahmeddarrazi@adsmac.local>
Reviewed-on: #34
2026-01-05 04:22:13 +00:00