Implements spec 111 (Findings workflow + SLA) and fixes Workspace findings SLA settings UX/validation.

Key changes:
- Findings workflow service + SLA policy and alerting.
- Workspace settings: allow partial SLA overrides without auto-filling unset severities in the UI; effective values still resolve via defaults.
- New migrations, jobs, command, UI/resource updates, and comprehensive test coverage.

Tests:
- `vendor/bin/sail artisan test --compact` (1779 passed, 8 skipped).

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #135

# Implementation Plan: Findings Workflow V2 + SLA

**Branch**: `111-findings-workflow-sla` | **Date**: 2026-02-24 | **Spec**: `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/111-findings-workflow-sla/spec.md`
**Input**: Feature specification from `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/111-findings-workflow-sla/spec.md`

**Note**: This template is filled in by the `/speckit.plan` command. See `.specify/scripts/` for helper scripts.

## Summary

Standardize the Findings lifecycle across all finding types (drift, permission posture, Entra admin roles) by introducing a v2 workflow (`new → triaged → in_progress → resolved/closed/risk_accepted`, plus `reopened`), ownership/assignment, recurrence tracking, and due-date (SLA) behavior.

Drift findings will no longer create a new row on every re-drift: a stable recurrence identity is used, and the canonical record is reopened when a resolved issue reappears. SLA due alerting will be re-enabled by implementing a producer that emits a single tenant-level SLA due event (summarizing overdue counts) whenever newly overdue open findings exist, at most once per tenant per evaluation window; the event type is re-added to AlertRule configuration once the producer exists. Review Pack “open findings” selection and fingerprinting will be updated to use the v2 open-status set.

A one-time, OperationRun-backed backfill/consolidation operation upgrades legacy findings: `acknowledged` becomes `triaged`, lifecycle fields are populated, due dates are assigned as backfill time + SLA days, and drift duplicates are consolidated.

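To make the v2 status model concrete, here is a minimal sketch in plain PHP. The enum name, the choice of which statuses count as "open", and the transition map are assumptions made for illustration; the authoritative transition matrix is defined in the spec, not here. Only the status values themselves and the `acknowledged` to `triaged` legacy mapping come from the summary above.

```php
<?php

declare(strict_types=1);

// Hypothetical enum; case values follow the v2 statuses listed in the summary.
enum FindingStatus: string
{
    case New = 'new';
    case Triaged = 'triaged';
    case InProgress = 'in_progress';
    case Resolved = 'resolved';
    case Closed = 'closed';
    case RiskAccepted = 'risk_accepted';
    case Reopened = 'reopened';

    /** Assumption: every non-terminal status counts as "open". */
    public function isOpen(): bool
    {
        return in_array($this, [self::New, self::Triaged, self::InProgress, self::Reopened], true);
    }

    /** Assumed transition map, for illustration only. */
    public function canTransitionTo(self $next): bool
    {
        $terminal = [self::Resolved, self::Closed, self::RiskAccepted];

        $allowed = match ($this) {
            self::New => [self::Triaged, ...$terminal],
            self::Triaged, self::Reopened => [self::InProgress, ...$terminal],
            self::InProgress => $terminal,
            self::Resolved, self::Closed, self::RiskAccepted => [self::Reopened],
        };

        return in_array($next, $allowed, true);
    }

    /** Backfill mapping: legacy `acknowledged` becomes `triaged`. */
    public static function fromLegacy(string $value): self
    {
        return $value === 'acknowledged' ? self::Triaged : self::from($value);
    }
}
```

Keeping the map in one place means list filters, Review Pack selection, and server-side transition checks can all derive from the same definition rather than from scattered string comparisons.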
## Technical Context

**Language/Version**: PHP 8.4.15 (Laravel 12.52.0)
**Primary Dependencies**: Filament v5.2.1, Livewire v4.1.4
**Storage**: PostgreSQL (JSONB used for evidence and settings values)
**Testing**: Pest v4.3.1 (PHPUnit 12.5.4)
**Target Platform**: Docker via Laravel Sail (local); Dokploy (staging/production)
**Project Type**: Web application (Laravel monolith with Filament admin panel)
**Performance Goals**: Findings list remains performant at 10k+ rows/tenant by relying on pagination and index-backed filters (status/severity/due date/assignee); bulk workflow actions handle 100 records per action; scheduled alert evaluation relies on index-backed queries over `(workspace_id, tenant_id, status, due_at)` and avoids full-table scans
**Constraints**: No new external API calls; server-side enforcement for workflow transitions and RBAC; all long-running backfill/consolidation runs are OperationRun-backed with OPS-UX 3-surface feedback; Monitoring/Alerts evaluation remains DB-only at render time
**Scale/Scope**: Multi-workspace, multi-tenant; findings volumes can grow continuously (recurrence reduces drift row churn); SLA due event is tenant-level (one per tenant per evaluation window)

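The "one SLA due event per tenant per evaluation window" rule can be sketched as a pure function over in-memory rows. This is an illustration only: the function name, the row shape, the event fields, and the assumed open-status set are hypothetical, and the actual producer is expected to run an index-backed database query rather than iterate over arrays.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch: build at most one sla_due event per tenant, and only
 * for tenants with a finding that became overdue in ($windowStart, $now].
 *
 * @param list<array{tenant_id:int, status:string, due_at:?DateTimeImmutable}> $findings
 * @return array<int, array{tenant_id:int, overdue_total:int, newly_overdue:int}> keyed by tenant id
 */
function buildSlaDueEvents(array $findings, DateTimeImmutable $windowStart, DateTimeImmutable $now): array
{
    $openStatuses = ['new', 'triaged', 'in_progress', 'reopened']; // assumed open set
    $perTenant = [];

    foreach ($findings as $f) {
        if (!in_array($f['status'], $openStatuses, true) || $f['due_at'] === null || $f['due_at'] > $now) {
            continue; // not open, no due date, or not overdue yet
        }

        $t = $f['tenant_id'];
        $perTenant[$t] ??= ['tenant_id' => $t, 'overdue_total' => 0, 'newly_overdue' => 0];
        $perTenant[$t]['overdue_total']++;

        if ($f['due_at'] > $windowStart) {
            $perTenant[$t]['newly_overdue']++; // crossed its due date inside this window
        }
    }

    // Tenants whose overdue findings all predate the window emit nothing.
    return array_filter($perTenant, fn (array $e): bool => $e['newly_overdue'] > 0);
}
```

Filtering on "newly overdue" rather than "overdue" is what keeps a long-standing overdue finding from re-triggering the alert on every evaluation run, while the summary count still reflects the tenant's full overdue backlog.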
## Constitution Check

*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*

| Rule | Status | Notes |
|------|--------|-------|
| **Inventory-first** | PASS | No change to inventory semantics. Findings remain operational artifacts derived from inventory-backed runs. |
| **Read/write separation** | PASS | Workflow mutations (triage/assign/resolve/close/risk accept/reopen) are explicit user actions with confirmation (where required), server-side authorization, audit logs, and tests. |
| **Graph contract path** | PASS | No new Graph calls. All changes are DB + UI + queued jobs. |
| **Deterministic capabilities** | PASS | New capabilities are added to the canonical registry (`app/Support/Auth/Capabilities.php`) and mapped in `RoleCapabilityMap`. Legacy `TENANT_FINDINGS_ACKNOWLEDGE` remains as a deprecated alias for v2 triage permission. |
| **RBAC-UX: two planes** | PASS | Only `/admin` plane involved. Tenant-context findings remain under `/admin/t/{tenant}/...`. |
| **RBAC-UX: non-member = 404** | PASS | UI uses `UiEnforcement`; server-side policies/guards return deny-as-not-found for non-members. |
| **RBAC-UX: member missing capability = 403** | PASS | UI disables with tooltips; server-side policies/guards deny with 403 for members missing capability. |
| **RBAC-UX: destructive confirmation** | PASS | Resolve/Close/Risk accept/Reopen require confirmation; bulk actions use confirmation and typed confirmation when large. |
| **RBAC-UX: global search** | N/A | Findings are tenant-context and already have a View page; no new global-search surfaces are introduced in this feature. |
| **Workspace isolation** | PASS | Findings queries remain workspace+tenant safe; overdue evaluation operates on findings scoped by `workspace_id` and `tenant_id`. |
| **Tenant isolation** | PASS | All finding reads/writes are tenant-scoped; bulk actions and backfill are tenant-context operations. |
| **Run observability** | PASS | Backfill/consolidation is OperationRun-backed. Alerts evaluation already uses OperationRun. |
| **Ops-UX 3-surface feedback** | PASS | Backfill uses `OperationUxPresenter` queued/dedupe toasts, standard progress surfaces, and `OperationRunCompleted` terminal DB notification (initiator-only). |
| **Ops-UX lifecycle/service-owned** | PASS | Any new run transitions use `OperationRunService` (no direct status/outcome writes). |
| **Ops-UX summary counts contract** | PASS | Backfill and generator updates use only keys allowed by `OperationSummaryKeys::all()` and numeric-only values. |
| **Ops-UX system runs** | PASS | Scheduled alert evaluation remains system-run (no initiator DB notification). Any tenant-wide escalation uses Alerts, not OperationRun notifications. |
| **Automation / idempotency** | PASS | Backfill runs are deduped per tenant + scope identity; queued jobs are idempotent and lock-protected where needed. |
| **Data minimization** | PASS | Evidence remains sanitized; audit logs store before/after at a metadata level and avoid secrets/tokens. |
| **BADGE-001** | PASS | Finding status badge mapping is extended to include all v2 statuses with tests. |
| **Filament Action Surface Contract** | PASS | Findings Resource already declares ActionSurface; will be updated to cover v2 actions with “More” grouping + bulk groups and empty-state exemption retained. |
| **UX-001** | PASS | Findings View already uses Infolist. Workflow actions are surfaced via header/row/bulk actions with safety and grouping conventions. |

**Post-design re-evaluation**: All checks PASS. No constitution violations expected after Phase 1 outputs below.

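The two RBAC-UX rows (non-member = 404, member missing capability = 403) describe an ordering of checks that can be sketched as below. The helper name and the capability strings are hypothetical; the real enforcement lives in the policies and guards named in the table, not in a standalone function like this one.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: the HTTP status a findings action should yield.
 *
 * @param list<string> $memberCapabilities capabilities granted by the member's role
 */
function authorizeFindingAction(bool $isTenantMember, array $memberCapabilities, string $required): int
{
    // Membership is checked first, so a non-member never learns whether the
    // tenant or the finding exists (deny-as-not-found).
    if (!$isTenantMember) {
        return 404;
    }

    // A member without the capability gets an explicit "forbidden".
    if (!in_array($required, $memberCapabilities, true)) {
        return 403;
    }

    return 200;
}
```

The order matters: checking the capability before membership would leak a 403 to outsiders and reveal that the resource exists.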
## Project Structure

### Documentation (this feature)

```text
/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/111-findings-workflow-sla/
├── plan.md              # This file (/speckit.plan)
├── spec.md              # Feature spec (/speckit.specify + /speckit.clarify)
├── checklists/
│   └── requirements.md  # Spec quality checklist
├── research.md          # Phase 0 output (/speckit.plan)
├── data-model.md        # Phase 1 output (/speckit.plan)
├── quickstart.md        # Phase 1 output (/speckit.plan)
├── contracts/           # Phase 1 output (/speckit.plan)
│   └── api-contracts.md
└── tasks.md             # Phase 2 output (/speckit.tasks - NOT created by /speckit.plan)
```

### Source Code (repository root)

```text
app/
├── Console/Commands/
│   └── TenantpilotBackfillFindingLifecycle.php    # New: OperationRun-backed backfill entrypoint
├── Filament/Pages/Settings/
│   └── WorkspaceSettings.php                      # Updated: Findings SLA policy setting UI
├── Filament/Resources/
│   ├── FindingResource.php                        # Updated: defaults + actions + filters + workflow actions
│   └── AlertRuleResource.php                      # Updated: re-enable sla_due event type (after producer exists)
├── Jobs/
│   ├── BackfillFindingLifecycleJob.php            # New: queued worker for backfill + consolidation
│   └── Alerts/EvaluateAlertsJob.php               # Updated: add SLA due producer
├── Models/Finding.php                             # Updated: v2 statuses + workflow helpers + relationships
├── Policies/FindingPolicy.php                     # Updated: per-action capabilities
├── Services/
│   ├── Findings/
│   │   ├── FindingSlaPolicy.php                   # New: SLA policy resolver
│   │   └── FindingWorkflowService.php             # New: workflow transitions + audit
│   ├── Drift/DriftFindingGenerator.php            # Updated: recurrence_key + lifecycle fields + auto-resolve stale
│   ├── PermissionPosture/PermissionPostureFindingGenerator.php  # Updated: lifecycle fields + due_at semantics
│   └── EntraAdminRoles/EntraAdminRolesFindingGenerator.php      # Updated: lifecycle fields + due_at semantics
└── Support/
    ├── Auth/Capabilities.php                      # Updated: new tenant findings capabilities
    ├── Badges/Domains/FindingStatusBadge.php      # Updated: v2 status mapping
    ├── OperationCatalog.php                       # Updated: label + expected duration for backfill run type
    ├── OpsUx/OperationSummaryKeys.php             # Possibly updated if new summary keys are required
    └── Settings/SettingsRegistry.php              # Updated: findings.sla_days registry + validation

database/migrations/
├── 2026_02_24_160000_add_finding_lifecycle_v2_fields_to_findings_table.php
├── 2026_02_24_160001_add_finding_recurrence_key_and_sla_indexes_to_findings_table.php
└── 2026_02_24_160002_enforce_not_null_on_finding_seen_fields.php

tests/
├── Feature/Findings/
│   ├── FindingsListDefaultsTest.php               # Default list + open statuses across all types
│   ├── FindingsListFiltersTest.php                # Quick filters (Open/Overdue/High severity/My assigned)
│   ├── FindingWorkflowRowActionsTest.php          # Row workflow actions + confirmations
│   ├── FindingWorkflowViewActionsTest.php         # View header workflow actions
│   ├── FindingRbacTest.php                        # 404/403 matrix per capability
│   ├── FindingAuditLogTest.php                    # Audit before/after + reasons + actor
│   ├── FindingRecurrenceTest.php                  # drift recurrence_key + reopen + concurrency gating
│   ├── DriftStaleAutoResolveTest.php              # drift stale auto-resolve reason
│   ├── FindingBackfillTest.php                    # backfill + consolidation behavior
│   └── FindingBulkActionsTest.php                 # bulk actions + audit
├── Feature/Alerts/
│   └── SlaDueAlertTest.php                        # tenant-level sla_due event producer + rule selection
└── Unit/
    ├── Findings/FindingWorkflowServiceTest.php    # transition enforcement + due_at semantics
    └── Settings/FindingsSlaDaysSettingTest.php    # validation + normalization
```

**Structure Decision**: Standard Laravel monolith. Changes are concentrated in the Findings model/generators, Alerts evaluation, Filament resources, migrations, and tests.

## Complexity Tracking

> **Fill ONLY if Constitution Check has violations that must be justified**

| Violation | Why Needed | Simpler Alternative Rejected Because |
|-----------|------------|-------------------------------------|
| None | N/A | N/A |

## Filament v5 Agent Output Contract

1. **Livewire v4.0+ compliance**: Yes (Filament v5 requires Livewire v4; project is on Livewire v4.1.4).
2. **Provider registration**: No new panel required. Existing panel providers remain registered under `bootstrap/providers.php`.
3. **Global search**: FindingResource already has a View page. No new globally-searchable resources are introduced by this feature.
4. **Destructive actions**: Resolve/Close/Risk accept/Reopen and bulk equivalents use `->action(...)` + `->requiresConfirmation()` and server-side authorization.
5. **Asset strategy**: No new frontend asset pipeline requirements; standard Filament components only.
6. **Testing plan**: Pest coverage for workflow transitions, RBAC 404/403 semantics, drift recurrence_key behavior (reopen + stale auto-resolve), SLA due event producer, and backfill/consolidation.

## Phase 0 — Outline & Research (output: research.md)

Phase 0 resolves the key design decisions needed for implementation consistency:

- v2 status model + timestamp semantics
- SLA policy storage and defaults
- drift recurrence_key strategy (stable identity)
- SLA due event contract (tenant-level, throttling-friendly)
- backfill/consolidation approach (OperationRun-backed, idempotent)

Outputs:

- `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/111-findings-workflow-sla/research.md`

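One way to picture the "stable identity" decision for drift `recurrence_key` is a hash over fields that do not change when the same drift reappears, combined with an upsert that reopens a resolved record instead of adding a row. Everything below is a sketch under assumptions: the function names, the fields fed into the hash, and the `times_seen` counter are hypothetical, and the real field choice is a Phase 0 research output.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical recurrence key: a hash over fields that stay the same when the
 * same drift reappears. The field choice is an assumption, not the spec's.
 */
function driftRecurrenceKey(int $tenantId, string $subjectType, string $subjectId, string $changeKind): string
{
    // JSON-encoding a list gives an unambiguous, order-preserving payload.
    $payload = json_encode([$tenantId, $subjectType, $subjectId, $changeKind], JSON_THROW_ON_ERROR);

    return hash('sha256', $payload);
}

/**
 * Upsert by recurrence key: reopen a resolved record instead of adding a row.
 *
 * @param array<string, array{status:string, times_seen:int}> $findingsByKey in-memory stand-in for the table
 */
function recordDrift(array &$findingsByKey, string $key): void
{
    if (!isset($findingsByKey[$key])) {
        $findingsByKey[$key] = ['status' => 'new', 'times_seen' => 1];

        return;
    }

    $findingsByKey[$key]['times_seen']++;

    if ($findingsByKey[$key]['status'] === 'resolved') {
        $findingsByKey[$key]['status'] = 'reopened';
    }
}
```

Excluding run-specific values (run IDs, timestamps, evidence payloads) from the hash input is what keeps the key stable across re-drifts.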
## Phase 1 — Design & Contracts (outputs: data-model.md, contracts/*, quickstart.md)

Design deliverables:

- Data model changes to `findings` and supporting settings/capabilities/badges
- Contracts for workflow actions and SLA due events (alerts)
- Implementation quickstart to validate locally with Sail

Outputs:

- `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/111-findings-workflow-sla/data-model.md`
- `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/111-findings-workflow-sla/contracts/api-contracts.md`
- `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/111-findings-workflow-sla/quickstart.md`

## Phase 2 — Planning (implementation outline; detailed tasks live in tasks.md)

- Migrations: add lifecycle + workflow columns + recurrence_key + indexes (two-phase: nullable → backfill → enforce not-null where appropriate).
- Workflow enforcement: server-side transition validation + timestamps + reasons; audit logs for every user-initiated workflow mutation.
- SLA policy: add `findings.sla_days` to SettingsRegistry + Workspace Settings UI; due_at set on create + reset on reopen.
- Generator updates:
  - Drift: stable recurrence_key upsert + reopen semantics + stale auto-resolve reason.
  - Permission posture + Entra roles: lifecycle fields + due_at semantics, keep reopen/auto-resolve behavior.
- Alerts: implement SLA due producer in EvaluateAlertsJob; re-enable `sla_due` option in AlertRuleResource event types.
- Filament UI: remove drift-only default filters; default to Open across all types; quick filters: Open, Overdue, High severity, My assigned; row + bulk actions for v2 workflow and assignment.
- Backfill operation: tenant-scoped backfill entrypoint and job with OperationRun observability + dedupe; consolidate drift duplicates into a canonical recurrence_key record and mark old duplicates terminal.
- Tests: workflow transition matrix + RBAC 404/403 behavior; recurrence + stale resolve; SLA due alert producer contract; backfill/consolidation correctness and idempotency.

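The SLA policy item above (`findings.sla_days`, due_at set on create and reset on reopen) can be sketched as a small resolver that merges partial workspace overrides over defaults. The severity names and default day counts below are placeholders, since this plan does not state them, and the function names are hypothetical rather than the actual `FindingSlaPolicy` API.

```php
<?php

declare(strict_types=1);

// Placeholder defaults: the real severities and day counts are not stated in this plan.
const DEFAULT_SLA_DAYS = ['critical' => 3, 'high' => 7, 'medium' => 14, 'low' => 30];

/**
 * Merge partial workspace overrides over the defaults. Unset (null), invalid,
 * or unknown severities keep the default value.
 *
 * @param array<string, int|null> $overrides stored `findings.sla_days` value
 * @return array<string, int>
 */
function effectiveSlaDays(array $overrides): array
{
    $effective = DEFAULT_SLA_DAYS;

    foreach ($overrides as $severity => $days) {
        if (isset($effective[$severity]) && is_int($days) && $days > 0) {
            $effective[$severity] = $days;
        }
    }

    return $effective;
}

/**
 * due_at = reference time + SLA days. The reference is the creation time,
 * the reopen time, or (for legacy rows) the backfill time.
 *
 * @param array<string, int|null> $overrides
 */
function dueAt(string $severity, DateTimeImmutable $reference, array $overrides = []): DateTimeImmutable
{
    $days = effectiveSlaDays($overrides)[$severity]
        ?? throw new InvalidArgumentException("Unknown severity: {$severity}");

    return $reference->add(new DateInterval("P{$days}D"));
}
```

For example, a workspace that overrides only `high` keeps the defaults for every other severity, which matches the intent that the settings UI leaves unset severities blank while effective values still resolve.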