TenantAtlas/specs/053-unify-runs-monitoring/tasks.md
ahmido 30ad57baab feat/053-unify-runs-monitoring (#60)
Summary

This PR introduces Unified Operations Runs + Monitoring Hub (053).

Goal: Standardize how long-running operations are tracked and monitored using the existing tenant-scoped run record (BulkOperationRun) as the canonical “operation run”, and surface it in a single Monitoring → Operations hub (view-only, tenant-scoped, role-aware).

Phase 1 adoption scope (per spec):
	•	Drift generation (drift.generate)
	•	Backup Set “Add Policies” (backup_set.add_policies)

Note: This PR does not convert every run type yet (e.g. GroupSyncRuns / InventorySyncRuns remain separate for now). This is intentionally incremental.

⸻

What changed

Monitoring / Operations hub
	•	Moved/organized run monitoring under Monitoring → Operations
	•	Added:
		•	status buckets (queued / running / succeeded / partially succeeded / failed)
		•	filters (run type, status bucket, time range)
		•	run detail “Related” links (e.g. Drift findings, Backup Set context)
	•	All hub pages are DB-only and view-only (no rerun/cancel/delete actions)

Canonical run semantics
	•	Added canonical helpers on BulkOperationRun:
		•	runType() (resource.action)
		•	statusBucket() derived from status + counts (testable semantics)

Drift integration (Phase 1)
	•	Drift generation start behavior now:
		•	creates/reuses a BulkOperationRun with drift context payload (scope_key + baseline/current run ids)
		•	dispatches generation job
		•	emits DB notifications including “View run” link
	•	On generation failure: stores sanitized failure entries + sends failure notification

Permissions / tenant isolation
	•	Monitoring run list/view is tenant-scoped and returns 403 for cross-tenant access
	•	Readonly can view runs but cannot start drift generation

⸻

Tests

Added/updated Pest coverage:
	•	BulkOperationRunStatusBucketTest.php
	•	DriftGenerationDispatchTest.php
	•	GenerateDriftFindingsJobNotificationTest.php
	•	RunAuthorizationTenantIsolationTest.php

Validation run locally:
	•	./vendor/bin/pint --dirty
	•	targeted tests from feature quickstart / drift monitoring tests

⸻

Manual QA
	1.	Go to Monitoring → Operations
		•	verify filters (run type / status / time range)
		•	verify run detail shows counts + sanitized failures + “Related” links
	2.	Open Drift Landing
		•	with >=2 successful inventory runs for scope: should queue drift generation + show notification with “View run”
		•	as readonly: should not start generation
	3.	Run detail
		•	drift.generate runs show “Drift findings” related link
		•	failure entries are sanitized (no secrets/tokens/raw payload dumps)

⸻

Notes / Ops
	•	Queue workers must be restarted after deploy so they load the new code:
		•	php artisan queue:restart (or Sail equivalent)
	•	This PR standardizes monitoring for Phase 1 producers only; follow-ups will migrate additional run types into the unified pattern.

⸻

Spec / Docs
	•	SpecKit artifacts added under specs/053-unify-runs-monitoring/
	•	Checklists are complete:
		•	requirements checklist PASS
		•	writing checklist PASS

Co-authored-by: Ahmed Darrazi <ahmeddarrazi@adsmac.local>
Reviewed-on: #60
2026-01-16 15:10:31 +00:00

---
description: "Task list for implementing Unified Operations Runs + Monitoring Hub (053)"
---
# Tasks: Unified Operations Runs + Monitoring Hub (053)
**Input**: Design documents from `specs/053-unify-runs-monitoring/`
**Prerequisites**: plan.md (required), spec.md (required), research.md, data-model.md, contracts/, quickstart.md
**Tests**: Not explicitly requested in spec.md. Add/adjust Pest tests as needed during implementation; validate with the existing test suite.
**Organization**: Tasks are grouped by user story so each story can be implemented and validated independently.
## Format: `- [ ] T### [P?] [US#?] Description with file path`
- **[P]**: Can run in parallel (different files, no dependencies)
- **[US#]**: User story mapping (US1/US2/US3). Setup/Foundational/Polish tasks have no story label.
## Path Conventions (Laravel)
- App code: `app/`
- Filament admin: `app/Filament/`
- Livewire: `app/Livewire/`
- Jobs: `app/Jobs/`
- DB: `database/migrations/`
- Views: `resources/views/`
- Tests (Pest): `tests/Feature/`, `tests/Unit/`
---
## Phase 1: Setup (Shared Infrastructure)
**Purpose**: Confirm baseline assumptions and align documentation artifacts with the codebase.
- [x] T001 [P] Confirm “Monitoring/Operations hub = evolve BulkOperationRunResource” decision remains correct and update notes if needed in specs/053-unify-runs-monitoring/research.md
- [x] T002 [P] Verify Filament URLs match contracts (index/view) and update specs/053-unify-runs-monitoring/contracts/admin-pages.openapi.yaml if paths differ
---
## Phase 2: Foundational (Blocking Prerequisites)
**Purpose**: Shared building blocks required by all user stories.
**⚠️ CRITICAL**: No user story work should begin until this phase is complete.
- [x] T003 Add `runType()` and `statusBucket()` accessors (queued/running/succeeded/partial/failed) to app/Models/BulkOperationRun.php
- [x] T004 [P] Confirm `Readonly` users can view run list/detail tenant-scoped (and only view) by reviewing/updating app/Policies/BulkOperationRunPolicy.php
**Checkpoint**: Foundation ready — Monitoring UI and run producers can reuse consistent status semantics.
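
The accessors from T003 could be sketched roughly as follows. This is a framework-free illustration, not the project's code: the raw status strings (`pending`, `queued`, `running`, `completed`, `failed`, `aborted`), the count parameters, and the free-function form are assumptions. Only the five buckets and the `resource.action` run type format come from this task list. Keeping the derivation a pure function of status and counts is what makes the semantics easy to unit test.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-ins for BulkOperationRun::runType() and ::statusBucket().
// Raw status values and count names are assumed, not taken from the codebase.

function runType(string $resource, string $action): string
{
    // Run types use the "resource.action" form, e.g. "drift.generate".
    return $resource . '.' . $action;
}

function statusBucket(string $status, int $succeeded, int $failed): string
{
    // Unfinished runs map straight to their bucket, whatever the counts say.
    if ($status === 'pending' || $status === 'queued') {
        return 'queued';
    }
    if ($status === 'running') {
        return 'running';
    }

    // Finished runs: a mix of successes and failures is a partial result.
    if ($failed > 0 && $succeeded > 0) {
        return 'partial';
    }

    // Any failures without successes, or a failed/aborted status, is a failure.
    if ($failed > 0 || $status === 'failed' || $status === 'aborted') {
        return 'failed';
    }

    return 'succeeded';
}
```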
---
## Phase 3: User Story 1 - Monitor operations in one place (Priority: P1) 🎯 MVP
**Goal**: Provide a single Monitoring/Operations area to list and drill into tenant runs with consistent status semantics and safe failure visibility.
**Independent Test**: Visit Monitoring → Operations for a tenant with runs; filter by type/status; open a run and confirm counts + sanitized failures are visible; verify `Readonly` sees view-only UI.
### Implementation
- [x] T005 [US1] Move Operations runs into “Monitoring” navigation and label it “Operations” in app/Filament/Resources/BulkOperationRunResource.php
- [x] T006 [US1] Render status badges using `statusBucket()` (not raw status) in app/Filament/Resources/BulkOperationRunResource.php
- [x] T007 [US1] Add filters for run type (`resource.action`) and status bucket in app/Filament/Resources/BulkOperationRunResource.php
- [x] T008 [US1] Add time range filter (created_at from/to) in app/Filament/Resources/BulkOperationRunResource.php
- [x] T009 [US1] Add a “Related” section on the run detail view linking to the relevant feature context (e.g., Backup Set for `backup_set.add_policies`) in app/Filament/Resources/BulkOperationRunResource.php
**Checkpoint**: US1 complete — operators can monitor and drill into runs in one place.
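
The three filters from T007 and T008 combine as independent, optional conditions. The sketch below applies them to in-memory rows purely to illustrate that logic. The function name and the row shape (`type`, `bucket`, `created_at`) are hypothetical, and the real hub would presumably express the same conditions as Filament table filters over a database query.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical filter over in-memory run rows (illustration only).
 *
 * @param list<array{type: string, bucket: string, created_at: string}> $runs
 * @return list<array{type: string, bucket: string, created_at: string}>
 */
function filterRuns(
    array $runs,
    ?string $type = null,
    ?string $bucket = null,
    ?string $from = null,
    ?string $to = null
): array {
    // Parse the optional time range once; null means "no bound".
    $fromAt = $from !== null ? new DateTimeImmutable($from) : null;
    $toAt = $to !== null ? new DateTimeImmutable($to) : null;

    $matches = function (array $run) use ($type, $bucket, $fromAt, $toAt): bool {
        $createdAt = new DateTimeImmutable($run['created_at']);

        // Every filter that is set must match; unset filters are skipped.
        return ($type === null || $run['type'] === $type)
            && ($bucket === null || $run['bucket'] === $bucket)
            && ($fromAt === null || $createdAt >= $fromAt)
            && ($toAt === null || $createdAt <= $toAt);
    };

    // Reindex so callers get a plain list.
    return array_values(array_filter($runs, $matches));
}
```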
---
## Phase 4: User Story 2 - Start long-running actions without waiting (Priority: P2)
**Goal**: Starting a supported long-running operation is non-blocking and provides immediate confirmation + “View run” link; unauthorized users cannot start.
**Independent Test**: Trigger Drift generation and Backup Set “Add Policies”; confirm immediate feedback with “View run” link; confirm `Readonly` cannot start drift generation and no run is created.
### Implementation
- [x] T010 [US2] Prevent drift generation from being started by `Readonly` users (blocked state + message) in app/Filament/Pages/DriftLanding.php
- [x] T011 [US2] Emit a queued DB notification with “View run” link when Drift generation is queued in app/Filament/Pages/DriftLanding.php
- [x] T012 [P] [US2] Emit Drift completion and failure DB notifications with “View run” link in app/Jobs/GenerateDriftFindingsJob.php
**Checkpoint**: US2 complete — start UX is consistent and permission-gated.
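
The start behaviour described by T010 and T011 can be illustrated with a small sketch: the permission check happens before any run is created, and a successful start returns immediately with a link to the run. Everything named here is hypothetical, including the function, the non-`Readonly` role name used in the example, the array shapes, and the URL pattern. The `$runs` array merely stands in for the run table.

```php
<?php
declare(strict_types=1);

// Hypothetical start handler; names, shapes and the URL are placeholders.
function startDriftGeneration(string $role, int $tenantId, array &$runs): array
{
    if ($role === 'Readonly') {
        // Blocked before any run exists, so nothing shows up in Monitoring.
        return [
            'started' => false,
            'message' => 'You do not have permission to start drift generation.',
        ];
    }

    // Record the run as queued; a real implementation would dispatch the job here.
    $runId = count($runs) + 1;
    $runs[$runId] = [
        'tenant_id' => $tenantId,
        'type' => 'drift.generate',
        'status' => 'queued',
    ];

    // Return at once with a "View run" target instead of waiting for the job.
    return [
        'started' => true,
        'message' => 'Drift generation queued.',
        'view_run_url' => "/example/tenants/{$tenantId}/runs/{$runId}", // placeholder path
    ];
}
```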
---
## Phase 5: User Story 3 - Drift generation is observable like other operations (Priority: P3)
**Goal**: Drift generation creates/reuses a run, surfaces safe failure details, and links operators to results.
**Independent Test**: Trigger Drift generation; observe it in Monitoring → Operations; open the run and follow a link to Drift findings; simulate failure and confirm safe failure reason is visible on the run.
### Implementation
- [x] T013 [US3] Store Drift context (scope_key, baseline_run_id, current_run_id) inside the run payload so Monitoring can link to results in app/Filament/Pages/DriftLanding.php
- [x] T014 [P] [US3] Record a sanitized failure entry (reason_code + short message) into `BulkOperationRun.failures` when Drift generation fails in app/Jobs/GenerateDriftFindingsJob.php
- [x] T015 [US3] Add a “Drift findings” link for `drift.generate` runs in the run detail “Related” section in app/Filament/Resources/BulkOperationRunResource.php
**Checkpoint**: US3 complete — drift runs are actionable and consistent with other operations.
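
A sanitized failure entry as described in T014 (reason_code plus a short message) might be built along these lines. This is a sketch under stated assumptions: the helper name, the redaction patterns, the 200-byte limit, and the example reason code used in testing are all hypothetical, and a real implementation would need patterns matched to the secrets that can actually appear in its error messages.

```php
<?php
declare(strict_types=1);

// Hypothetical helper; patterns and length limit are illustrative assumptions.
function sanitizedFailureEntry(string $reasonCode, string $rawMessage, int $maxLength = 200): array
{
    // Mask bearer tokens and key=value style secrets before storing anything.
    $message = (string) preg_replace('/Bearer\s+\S+/i', 'Bearer [redacted]', $rawMessage);
    $message = (string) preg_replace('/\b(token|secret|password)=\S+/i', '$1=[redacted]', $message);

    // Collapse whitespace so multi-line dumps become a single short line.
    $message = trim((string) preg_replace('/\s+/', ' ', $message));

    // Truncate by bytes; a multibyte-aware variant would use mb_substr instead.
    if (strlen($message) > $maxLength) {
        $message = substr($message, 0, $maxLength - 3) . '...';
    }

    return ['reason_code' => $reasonCode, 'message' => $message];
}
```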
---
## Phase 6: Polish & Cross-Cutting Concerns
**Purpose**: Final alignment, validation, and guardrails.
- [x] T016 [P] Update operator-facing notes and validation commands in specs/053-unify-runs-monitoring/quickstart.md (only if implementation changes)
- [x] T017 [P] Update docs to match implementation if needed: specs/053-unify-runs-monitoring/spec.md and specs/053-unify-runs-monitoring/data-model.md
- [x] T018 Run formatting on changed PHP files with `./vendor/bin/pint --dirty` (reference: specs/053-unify-runs-monitoring/quickstart.md)
- [x] T019 Run targeted validation commands from specs/053-unify-runs-monitoring/quickstart.md (queue worker optional; run relevant Pest tests)
- [x] T020 [P] Re-verify contracts match real URLs and access behavior in specs/053-unify-runs-monitoring/contracts/admin-pages.openapi.yaml
---
## Dependencies & Execution Order
### Dependency Graph (User Stories)
```text
Phase 1 (Setup) ─┬─> Phase 2 (Foundational) ─┬─> US1 (P1) ─┬─> Polish
                 │                           ├─> US2 (P2)  │
                 │                           └─> US3 (P3)  ┘
                 └────────────────────────────────────────────
```
### User Story Dependencies
- **US1** depends on Phase 2 (Foundational); independent of US2/US3.
- **US2** depends on Phase 2 (Foundational); independent of US1/US3.
- **US3** depends on Phase 2 (Foundational) and benefits from US1 (Monitoring visibility) but can be implemented independently.
---
## Parallel Execution Examples
### US1 (Monitoring UI)
```text
After Phase 2 is complete, one developer can focus on:
- app/Filament/Resources/BulkOperationRunResource.php (T005-T009)
```
### US2 (Start UX / Notifications)
```text
These can be done in parallel after Phase 2:
- app/Filament/Pages/DriftLanding.php (T010-T011)
- app/Jobs/GenerateDriftFindingsJob.php (T012)
```
### US3 (Drift observability)
```text
These can be done in parallel after Phase 2:
- app/Filament/Pages/DriftLanding.php (T013)
- app/Jobs/GenerateDriftFindingsJob.php (T014)
```
---
## Implementation Strategy
### MVP First (US1 Only)
1. Complete Phase 1 + Phase 2
2. Complete US1 (Phase 3) and validate Monitoring/Operations end-to-end
3. Ship/demonstrate Monitoring value before expanding run producer behavior
### Incremental Delivery
1. US1 (Monitoring hub) → validates visibility/auditability
2. US2 (start guardrails + notifications) → standardizes operator feedback
3. US3 (drift linking + safe failure detail) → makes drift runs fully actionable