TenantAtlas/specs/094-assignment-ops-observability-hardening/plan.md
ahmido bda1d90fc4 Spec 094: Assignment ops observability hardening (#113)
Implements spec 094 (assignment fetch/restore observability hardening):

- Adds OperationRun tracking for assignment fetch (during backup) and assignment restore (during restore execution)
- Normalizes failure codes/reason_code and sanitizes failure messages
- Ensures exactly one audit log entry per assignment restore execution
- Enforces correct guard/membership vs capability semantics on affected admin surfaces
- Switches assignment Graph services to depend on GraphClientInterface

Also includes Postgres-only FK defense-in-depth check and a discoverable `composer test:pgsql` runner (scoped to the FK constraint test).

Tests:
- `vendor/bin/sail artisan test --compact` (passed)
- `vendor/bin/sail composer test:pgsql` (passed)

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #113
2026-02-15 14:08:14 +00:00


# Implementation Plan: 094 Assignment Operations Observability Hardening
**Branch**: `094-assignment-ops-observability-hardening` | **Date**: 2026-02-15 | **Spec**: [spec.md](spec.md)
**Input**: Feature specification from [spec.md](spec.md)
## Summary
This work hardens ship-safety by making assignment fetch/restore executions fully observable in Monitoring (via `OperationRun`), improving Graph testability by using the Graph client interface, and closing a small set of authorization inconsistencies (cross-plane guard leak, policy bypass, and 404/403 ordering).
## Technical Context
**Language/Version**: PHP 8.4 (Laravel 12)
**Primary Dependencies**: Filament v5, Livewire v4, Laravel Sail, Microsoft Graph integration
**Storage**: PostgreSQL (Sail)
**Testing**: Pest v4 (`./vendor/bin/sail artisan test`)
**Target Platform**: Linux container runtime (Sail/Dokploy)
**Project Type**: Laravel web application (admin UI via Filament)
**Performance Goals**: Monitoring pages must remain DB-only and render quickly (no external calls during render).
**Constraints**: No secrets/tokens in logs or run failures; RBAC-UX semantics must match constitution (404 vs 403).
**Scale/Scope**: Small hardening change set; no new domain entities; focused on two jobs + a few UI/policy surfaces.
## Constitution Check
*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*
- Inventory-first: PASS (no inventory semantics changed).
- Read/write separation: PASS (restore remains a user-triggered operation; change is observability/auditability).
- Graph contract path: PASS-BY-DESIGN (plan includes removing concrete client injections in assignment-related services).
- Deterministic capabilities: PASS (no new capability model; micro-fixes ensure enforcement uses canonical patterns).
- RBAC-UX planes: PASS (plan includes closing remaining cross-plane guard leak; non-member 404, member missing capability 403).
- Workspace isolation: PASS (operations remain tenant-bound; Monitoring remains DB-only).
- Destructive confirmations: PASS (no new destructive actions added; enforcement fixes must preserve confirmations).
- Global search safety: N/A (no global search changes).
- Tenant isolation: PASS (no cross-tenant views added; monitoring surfaces already entitlement-checked).
- Run observability: PASS-BY-DESIGN (this spec exists to bring remaining assignment jobs under `OperationRun`).
- Automation: PASS (dedupe identity clarified; existing partial unique index patterns used).
- Data minimization: PASS (failures must be sanitized; no raw payloads stored).
- Badge semantics: N/A (no badge mapping changes).
- Filament UI Action Surface Contract: PASS (no new resources; micro-fixes must not remove inspect affordances).
## Project Structure
### Documentation (this feature)
```text
specs/094-assignment-ops-observability-hardening/
├── plan.md
├── spec.md
├── tasks.md
├── research.md
├── data-model.md
├── quickstart.md
├── contracts/
│   └── assignment-ops.openapi.yaml
└── checklists/
    └── requirements.md
```
### Source Code (repository root)
```text
app/
├── Jobs/
│   ├── FetchAssignmentsJob.php
│   ├── RestoreAssignmentsJob.php
│   └── Middleware/TrackOperationRun.php
├── Services/
│   ├── OperationRunService.php
│   ├── AssignmentBackupService.php
│   ├── AssignmentRestoreService.php
│   └── Graph/
│       ├── GraphClientInterface.php
│       ├── MicrosoftGraphClient.php
│       ├── NullGraphClient.php
│       ├── AssignmentFetcher.php
│       ├── AssignmentFilterResolver.php
│       └── GroupResolver.php
├── Filament/
│   └── Resources/...
└── Http/
    └── Middleware/EnsureCorrectGuard.php
routes/
└── web.php
tests/
├── Feature/
└── Unit/
```
**Structure Decision**: Use the existing Laravel structure; changes are limited to the job layer, Graph service DI, a few Filament surfaces, and targeted Pest tests.
## Phase 0 — Outline & Research
Artifacts:
- [research.md](research.md)
Research outcomes (resolved decisions):
- Operation run identity & dedupe: tenant + type + target/scope.
- Counter semantics: `total` = attempted, `processed` = succeeded, `failed` = failed.
- Failure convention: operation-specific `code` + normalized `reason_code`.
- Audit log granularity: exactly one entry per restore execution.
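
The counter and failure conventions above can be sketched in plain PHP. This is a minimal illustration, not the project's implementation: the function names, the `assignments.fetch` namespace, and the specific `reason_code` values are assumptions introduced here.

```php
<?php

/**
 * Map a low-level cause (here: an HTTP status) to a normalized reason_code.
 * The reason codes below are hypothetical examples, not values from the spec.
 */
function normalizeReasonCode(?int $httpStatus): string
{
    return match (true) {
        $httpStatus === 401, $httpStatus === 403 => 'permission_denied',
        $httpStatus === 404 => 'not_found',
        $httpStatus === 429 => 'throttled',
        $httpStatus !== null && $httpStatus >= 500 => 'upstream_unavailable',
        default => 'unknown',
    };
}

/**
 * Build a failure entry: operation-specific `code` plus normalized `reason_code`.
 *
 * @return array{code: string, reason_code: string}
 */
function buildFailure(string $operation, string $step, ?int $httpStatus): array
{
    return [
        'code' => $operation . '.' . $step,
        'reason_code' => normalizeReasonCode($httpStatus),
    ];
}

/**
 * Summarize per-item outcomes into the agreed counters.
 *
 * @param list<bool> $outcomes true = item succeeded, false = item failed
 * @return array{total: int, processed: int, failed: int}
 */
function summarizeCounters(array $outcomes): array
{
    // array_filter without a callback keeps only truthy (succeeded) entries.
    $succeeded = count(array_filter($outcomes));

    return [
        'total' => count($outcomes),
        'processed' => $succeeded,
        'failed' => count($outcomes) - $succeeded,
    ];
}
```

With these definitions, `total` always equals `processed + failed`, which is a cheap invariant to assert in regression tests.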
## Phase 1 — Design & Contracts
Artifacts:
- [data-model.md](data-model.md)
- [contracts/assignment-ops.openapi.yaml](contracts/assignment-ops.openapi.yaml)
- [quickstart.md](quickstart.md)
Design notes:
- No new persistent entities.
- Operation run tracking must use existing dedupe/index patterns.
- Monitoring surfaces remain DB-only.
## Phase 1 — Agent Context Update
Run:
- `.specify/scripts/bash/update-agent-context.sh copilot`
## Phase 1 — Constitution Check (re-evaluation)
Re-check result: PASS expected once implementation removes concrete Graph client injections and adds OperationRun tracking for the two remaining assignment jobs.
## Phase 2 — Implementation Plan
### Step 1 — OperationRun tracking for assignment fetch/restore
- Locate job dispatch/start surfaces.
- Ensure each execution creates/reuses an `OperationRun`:
  - Dedupe identity:
    - Fetch: tenant + type + backup item (or equivalent policy-version identifier).
    - Restore: tenant + type + restore run identifier.
- Ensure the `TrackOperationRun` middleware is applied and the job exposes a run handle per existing conventions.
- Ensure failure details:
  - `code`: operation-specific namespace.
  - `reason_code`: normalized cause.
  - message: sanitized.
- Ensure counters match the agreed semantics.
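
As a rough sketch of the dedupe identity and message sanitization described in this step, the following plain PHP uses an in-memory registry as a stand-in for the database-backed run lookup. All names (`operationRunIdentity`, `InMemoryRunRegistry`, `sanitizeFailureMessage`), the key format, and the redaction patterns are assumptions made for illustration. The real code would rely on `OperationRunService` and the existing partial unique index.

```php
<?php

/**
 * Build a deterministic identity from tenant + type + target.
 * The hashed, pipe-joined format is an assumption of this sketch.
 */
function operationRunIdentity(int $tenantId, string $type, string $targetId): string
{
    return hash('sha256', implode('|', [$tenantId, $type, $targetId]));
}

/**
 * In-memory stand-in for "create or reuse an active run".
 * A real implementation would let the database's unique index enforce this.
 */
final class InMemoryRunRegistry
{
    /** @var array<string, int> identity => run id */
    private array $activeRuns = [];

    private int $nextId = 1;

    public function ensureRun(int $tenantId, string $type, string $targetId): int
    {
        $identity = operationRunIdentity($tenantId, $type, $targetId);

        // ??= only evaluates the right-hand side when no run exists yet.
        return $this->activeRuns[$identity] ??= $this->nextId++;
    }
}

/**
 * Redact token-like values before a message is stored on a run failure.
 * The patterns are illustrative and deliberately not exhaustive.
 */
function sanitizeFailureMessage(string $message): string
{
    // Redact bearer tokens such as "Bearer eyJ...".
    $message = preg_replace('/Bearer\s+[A-Za-z0-9\-._~+\/]+=*/i', 'Bearer [REDACTED]', $message) ?? '';

    // Redact key=value style secrets.
    return preg_replace(
        '/\b(client_secret|access_token|refresh_token|password)=[^\s&]+/i',
        '$1=[REDACTED]',
        $message
    ) ?? '';
}
```

Calling `ensureRun()` twice with the same tenant, type, and target returns the same run id, while a different target produces a new one.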
### Step 2 — Audit log for assignment restore execution
- Ensure exactly one audit log entry is written per assignment restore execution.
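
One way to keep the "exactly one entry" guarantee under job retries is to key the audit write on the restore run identifier. The sketch below shows that idea with a hypothetical in-memory log. The class name, method names, and the `assignments.restore` action string are assumptions, and the real audit logger may enforce this differently.

```php
<?php

/**
 * Hypothetical in-memory audit log keyed by restore run id,
 * so a retried job cannot add a second entry for the same execution.
 */
final class InMemoryAuditLog
{
    /** @var array<string, array{action: string, status: string}> */
    private array $entries = [];

    /** Returns true when the entry was written, false when one already existed. */
    public function recordOnce(string $restoreRunId, string $action, string $status): bool
    {
        if (isset($this->entries[$restoreRunId])) {
            return false;
        }

        $this->entries[$restoreRunId] = ['action' => $action, 'status' => $status];

        return true;
    }

    /** Number of audit entries currently stored. */
    public function count(): int
    {
        return count($this->entries);
    }
}

// Usage: the second call simulates a retry of the same restore execution.
$auditLog = new InMemoryAuditLog();
$auditLog->recordOnce('restore-run-42', 'assignments.restore', 'succeeded');
$auditLog->recordOnce('restore-run-42', 'assignments.restore', 'succeeded');
```

A regression test can then assert on the entry count for a given restore run rather than on how many times the writer was called.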
### Step 3 — Graph client interface enforcement
- Update assignment-related services to accept the Graph client interface.
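
The testability benefit of depending on an interface can be illustrated as follows. This is a sketch under stated assumptions: `GraphClient` stands in for the project's `GraphClientInterface`, whose real method signatures are not shown in this plan, and the `get()` method, the request path, and the class names are hypothetical.

```php
<?php

/** Stand-in for the project's GraphClientInterface; the real methods may differ. */
interface GraphClient
{
    /** @return array<string, mixed> decoded JSON response */
    public function get(string $path): array;
}

/** Test double that returns canned responses instead of calling Microsoft Graph. */
final class FakeGraphClient implements GraphClient
{
    /** @param array<string, array<string, mixed>> $responses path => response */
    public function __construct(private array $responses)
    {
    }

    public function get(string $path): array
    {
        // Unknown paths yield an empty collection response.
        return $this->responses[$path] ?? ['value' => []];
    }
}

/** Simplified fetcher that depends on the interface, not a concrete client. */
final class AssignmentFetcherSketch
{
    public function __construct(private GraphClient $graph)
    {
    }

    /** @return list<array<string, mixed>> */
    public function fetch(string $policyId): array
    {
        // Hypothetical path, used only to look up the canned response.
        $response = $this->graph->get("/policies/{$policyId}/assignments");

        return $response['value'] ?? [];
    }
}
```

Because the fetcher only type-hints the interface, a test can pass `FakeGraphClient` directly, and the container binding decides which concrete client is used at runtime.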
### Step 4 — Authorization micro-fixes
- Close cross-plane guard leak on workspace-scoped admin routes.
- Remove any policy bypasses on Provider Connections list surfaces.
- Fix membership (404) vs capability (403) ordering for backup item surfaces.
- Ensure legacy UI enforcement helpers are not used where the canonical helper exists.
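
The membership-before-capability ordering can be reduced to a small decision function. The sketch below is illustrative only: `resolveAccessStatus` and its boolean inputs are hypothetical, and the real checks live in policies and the canonical UI enforcement helper.

```php
<?php

/**
 * Decide the HTTP status for a tenant-scoped surface.
 * Membership is checked first so non-members never learn the resource exists.
 */
function resolveAccessStatus(bool $isTenantMember, bool $hasCapability): int
{
    if (! $isTenantMember) {
        return 404; // non-member: deny as not found
    }

    if (! $hasCapability) {
        return 403; // member without the capability: forbidden
    }

    return 200;
}
```

The case worth covering in a regression test is a non-member who also lacks the capability: the result must be 404, not 403.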
### Step 5 — Tests + formatting
- Add targeted Pest regression coverage for:
  - operation run tracking (success/failure)
  - guard leak
  - policy enforcement
  - 404/403 ordering
  - Graph interface mockability
- Run `./vendor/bin/sail bin pint --dirty`.
- Run the targeted tests first, then widen to the full suite as needed.
## Complexity Tracking
This feature requires no constitution violations.