TenantAtlas/specs/111-findings-workflow-sla/plan.md
2026-02-25 02:45:20 +01:00

Implementation Plan: Findings Workflow V2 + SLA

Branch: 111-findings-workflow-sla | Date: 2026-02-24 | Spec: /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/111-findings-workflow-sla/spec.md
Input: Feature specification from /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/111-findings-workflow-sla/spec.md

Note: This template is filled in by the /speckit.plan command. See .specify/scripts/ for helper scripts.

Summary

Standardize the Findings lifecycle across all finding types (drift, permission posture, Entra admin roles) by introducing a v2 workflow (new → triaged → in_progress → resolved/closed/risk_accepted, plus reopened), ownership/assignment, recurrence tracking, and due-date (SLA) behavior. Drift findings will stop creating “new row per re-drift” noise by using a stable recurrence identity and reopening the canonical record when a resolved issue reappears. SLA due alerting will be re-enabled by implementing a producer that emits a single tenant-level SLA due event (summarizing overdue counts) when newly-overdue open findings exist (at most one per tenant per evaluation window), and by re-adding the event type to AlertRule configuration once the producer exists. Review Pack “open findings” selection and fingerprinting will be updated to use the v2 open-status set. A one-time OperationRun-backed backfill/consolidation operation upgrades legacy findings (acknowledged → triaged, lifecycle fields populated, due dates assigned from backfill time + SLA days, drift duplicates consolidated).
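
The lifecycle described above can be pictured as a small state machine. The following is a minimal sketch in plain PHP, not the project's Finding model: the `FindingStatus` enum name, the membership of the open set, and the exact transition map are assumptions made for illustration, since the summary names the statuses but the authoritative rules live in spec.md.

```php
<?php

// Hypothetical enum; status values mirror the v2 workflow named in the summary.
enum FindingStatus: string
{
    case New = 'new';
    case Triaged = 'triaged';
    case InProgress = 'in_progress';
    case Reopened = 'reopened';
    case Resolved = 'resolved';
    case Closed = 'closed';
    case RiskAccepted = 'risk_accepted';

    /**
     * Assumed v2 open-status set (used for list defaults, Review Pack selection, SLA checks).
     *
     * @return list<self>
     */
    public static function open(): array
    {
        return [self::New, self::Triaged, self::InProgress, self::Reopened];
    }

    public function isOpen(): bool
    {
        return in_array($this, self::open(), true);
    }

    /**
     * Assumed transition map: open statuses may move forward or to a terminal
     * status; terminal statuses may only be reopened.
     *
     * @return list<self>
     */
    public function allowedNext(): array
    {
        return match ($this) {
            self::New, self::Reopened => [self::Triaged, self::InProgress, self::Resolved, self::Closed, self::RiskAccepted],
            self::Triaged => [self::InProgress, self::Resolved, self::Closed, self::RiskAccepted],
            self::InProgress => [self::Triaged, self::Resolved, self::Closed, self::RiskAccepted],
            self::Resolved, self::Closed, self::RiskAccepted => [self::Reopened],
        };
    }

    public function canTransitionTo(self $next): bool
    {
        return in_array($next, $this->allowedNext(), true);
    }
}
```

Keeping the open set and the transition map in one place means the list defaults, the Review Pack selection, and server-side transition validation can all read from the same definition.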

Technical Context

Language/Version: PHP 8.4.15 (Laravel 12.52.0)
Primary Dependencies: Filament v5.2.1, Livewire v4.1.4
Storage: PostgreSQL (JSONB used for evidence and settings values)
Testing: Pest v4.3.1 (PHPUnit 12.5.4)
Target Platform: Docker via Laravel Sail (local); Dokploy (staging/production)
Project Type: Web application (Laravel monolith with Filament admin panel)
Performance Goals: Findings list remains performant at 10k+ rows/tenant by relying on pagination and index-backed filters (status/severity/due date/assignee); bulk workflow actions handle 100 records per action; scheduled alert evaluation relies on index-backed queries over (workspace_id, tenant_id, status, due_at) and avoids full-table scans
Constraints: No new external API calls; server-side enforcement for workflow transitions and RBAC; all long-running backfill/consolidation runs are OperationRun-backed with OPS-UX 3-surface feedback; Monitoring/Alerts evaluation remains DB-only at render time
Scale/Scope: Multi-workspace, multi-tenant; findings volumes can grow continuously (recurrence reduces drift row churn); SLA due event is tenant-level (one per tenant per evaluation window)
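
To make the "one SLA due event per tenant per evaluation window" rule concrete, here is a hedged sketch in plain PHP over in-memory rows. The function name `buildSlaDueEvents`, the row shape, the payload keys (`newly_overdue`, `overdue_total`), and the open-status list are all assumptions; the real producer would run index-backed queries inside EvaluateAlertsJob rather than iterate arrays.

```php
<?php

/**
 * Build at most one SLA due event per tenant. A tenant gets an event only if
 * at least one open finding became overdue inside (windowStart, windowEnd].
 *
 * @param list<array{tenant_id:int,status:string,due_at:?string}> $findings
 * @return list<array{tenant_id:int,newly_overdue:int,overdue_total:int}>
 */
function buildSlaDueEvents(array $findings, DateTimeImmutable $windowStart, DateTimeImmutable $windowEnd): array
{
    $openStatuses = ['new', 'triaged', 'in_progress', 'reopened']; // assumed open set
    $perTenant = [];

    foreach ($findings as $finding) {
        if ($finding['due_at'] === null || !in_array($finding['status'], $openStatuses, true)) {
            continue; // no due date, or not an open finding
        }

        $dueAt = new DateTimeImmutable($finding['due_at']);
        if ($dueAt > $windowEnd) {
            continue; // not overdue yet
        }

        $tenantId = $finding['tenant_id'];
        $perTenant[$tenantId] ??= ['tenant_id' => $tenantId, 'newly_overdue' => 0, 'overdue_total' => 0];
        $perTenant[$tenantId]['overdue_total']++;

        if ($dueAt > $windowStart) {
            $perTenant[$tenantId]['newly_overdue']++; // crossed its due date during this window
        }
    }

    // Tenants whose overdue findings were all overdue before the window stay silent.
    return array_values(array_filter(
        $perTenant,
        fn (array $event): bool => $event['newly_overdue'] > 0,
    ));
}
```

Gating on newly-overdue findings while still reporting the total keeps the event throttling-friendly: a tenant with a long-standing backlog does not re-alert on every evaluation, yet the summary counts remain useful when an event does fire.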

Constitution Check

GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.

| Rule | Status | Notes |
| --- | --- | --- |
| Inventory-first | PASS | No change to inventory semantics. Findings remain operational artifacts derived from inventory-backed runs. |
| Read/write separation | PASS | Workflow mutations (triage/assign/resolve/close/risk accept/reopen) are explicit user actions with confirmation (where required), server-side authorization, audit logs, and tests. |
| Graph contract path | PASS | No new Graph calls. All changes are DB + UI + queued jobs. |
| Deterministic capabilities | PASS | New capabilities are added to the canonical registry (app/Support/Auth/Capabilities.php) and mapped in RoleCapabilityMap. Legacy TENANT_FINDINGS_ACKNOWLEDGE remains as a deprecated alias for v2 triage permission. |
| RBAC-UX: two planes | PASS | Only /admin plane involved. Tenant-context findings remain under /admin/t/{tenant}/.... |
| RBAC-UX: non-member = 404 | PASS | UI uses UiEnforcement; server-side policies/guards return deny-as-not-found for non-members. |
| RBAC-UX: member missing capability = 403 | PASS | UI disables with tooltips; server-side policies/guards deny with 403 for members missing capability. |
| RBAC-UX: destructive confirmation | PASS | Resolve/Close/Risk accept/Reopen require confirmation; bulk actions use confirmation and typed confirmation when large. |
| RBAC-UX: global search | N/A | Findings are tenant-context and already have a View page; no new global-search surfaces are introduced in this feature. |
| Workspace isolation | PASS | Findings queries remain workspace+tenant safe; overdue evaluation operates on findings scoped by workspace_id and tenant_id. |
| Tenant isolation | PASS | All finding reads/writes are tenant-scoped; bulk actions and backfill are tenant-context operations. |
| Run observability | PASS | Backfill/consolidation is OperationRun-backed. Alerts evaluation already uses OperationRun. |
| Ops-UX 3-surface feedback | PASS | Backfill uses OperationUxPresenter queued/dedupe toasts, standard progress surfaces, and OperationRunCompleted terminal DB notification (initiator-only). |
| Ops-UX lifecycle/service-owned | PASS | Any new run transitions use OperationRunService (no direct status/outcome writes). |
| Ops-UX summary counts contract | PASS | Backfill and generator updates use only keys allowed by OperationSummaryKeys::all() and numeric-only values. |
| Ops-UX system runs | PASS | Scheduled alert evaluation remains system-run (no initiator DB notification). Any tenant-wide escalation uses Alerts, not OperationRun notifications. |
| Automation / idempotency | PASS | Backfill runs are deduped per tenant + scope identity; queued jobs are idempotent and lock-protected where needed. |
| Data minimization | PASS | Evidence remains sanitized; audit logs store before/after at a metadata level and avoid secrets/tokens. |
| BADGE-001 | PASS | Finding status badge mapping is extended to include all v2 statuses with tests. |
| Filament Action Surface Contract | PASS | Findings Resource already declares ActionSurface; will be updated to cover v2 actions with “More” grouping + bulk groups and empty-state exemption retained. |
| UX-001 | PASS | Findings View already uses Infolist. Workflow actions are surfaced via header/row/bulk actions with safety and grouping conventions. |

Post-design re-evaluation: All checks PASS. No constitution violations expected after Phase 1 outputs below.
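
The two RBAC-UX rules in the check (non-member = 404, member missing capability = 403) come down to the order in which the checks run. A minimal sketch, assuming a hypothetical helper `findingActionStatus` that stands in for the real policy/guard layer:

```php
<?php

/**
 * Map membership and capability checks to an HTTP status.
 * Hypothetical helper for illustration; the real checks live in policies/guards.
 */
function findingActionStatus(bool $isTenantMember, bool $hasCapability): int
{
    if (!$isTenantMember) {
        // Deny-as-not-found: a non-member must not learn that the finding exists.
        return 404;
    }

    if (!$hasCapability) {
        // Members may know the finding exists but lack the per-action capability.
        return 403;
    }

    return 200;
}
```

Checking membership first matters: if the capability check ran first, a non-member could receive a 403 and infer that the tenant or finding exists.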

Project Structure

Documentation (this feature)

/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/111-findings-workflow-sla/
├── plan.md                  # This file (/speckit.plan)
├── spec.md                  # Feature spec (/speckit.specify + /speckit.clarify)
├── checklists/
│   └── requirements.md      # Spec quality checklist
├── research.md              # Phase 0 output (/speckit.plan)
├── data-model.md            # Phase 1 output (/speckit.plan)
├── quickstart.md            # Phase 1 output (/speckit.plan)
├── contracts/               # Phase 1 output (/speckit.plan)
│   └── api-contracts.md
└── tasks.md                 # Phase 2 output (/speckit.tasks - NOT created by /speckit.plan)

Source Code (repository root)

app/
├── Console/Commands/
│   └── TenantpilotBackfillFindingLifecycle.php    # New: OperationRun-backed backfill entrypoint
├── Filament/Pages/Settings/
│   └── WorkspaceSettings.php                      # Updated: Findings SLA policy setting UI
├── Filament/Resources/
│   ├── FindingResource.php                        # Updated: defaults + actions + filters + workflow actions
│   └── AlertRuleResource.php                      # Updated: re-enable sla_due event type (after producer exists)
├── Jobs/
│   ├── BackfillFindingLifecycleJob.php            # New: queued worker for backfill + consolidation
│   └── Alerts/EvaluateAlertsJob.php               # Updated: add SLA due producer
├── Models/Finding.php                             # Updated: v2 statuses + workflow helpers + relationships
├── Policies/FindingPolicy.php                     # Updated: per-action capabilities
├── Services/
│   ├── Findings/
│   │   ├── FindingSlaPolicy.php                   # New: SLA policy resolver
│   │   └── FindingWorkflowService.php             # New: workflow transitions + audit
│   ├── Drift/DriftFindingGenerator.php            # Updated: recurrence_key + lifecycle fields + auto-resolve stale
│   ├── PermissionPosture/PermissionPostureFindingGenerator.php  # Updated: lifecycle fields + due_at semantics
│   └── EntraAdminRoles/EntraAdminRolesFindingGenerator.php      # Updated: lifecycle fields + due_at semantics
└── Support/
    ├── Auth/Capabilities.php                      # Updated: new tenant findings capabilities
    ├── Badges/Domains/FindingStatusBadge.php      # Updated: v2 status mapping
    ├── OperationCatalog.php                       # Updated: label + expected duration for backfill run type
    ├── OpsUx/OperationSummaryKeys.php             # Possibly updated if new summary keys are required
    └── Settings/SettingsRegistry.php              # Updated: findings.sla_days registry + validation

database/migrations/
├── 2026_02_24_160000_add_finding_lifecycle_v2_fields_to_findings_table.php
├── 2026_02_24_160001_add_finding_recurrence_key_and_sla_indexes_to_findings_table.php
└── 2026_02_24_160002_enforce_not_null_on_finding_seen_fields.php

tests/
├── Feature/Findings/
│   ├── FindingsListDefaultsTest.php               # Default list + open statuses across all types
│   ├── FindingsListFiltersTest.php                # Quick filters (Open/Overdue/High severity/My assigned)
│   ├── FindingWorkflowRowActionsTest.php          # Row workflow actions + confirmations
│   ├── FindingWorkflowViewActionsTest.php         # View header workflow actions
│   ├── FindingRbacTest.php                        # 404/403 matrix per capability
│   ├── FindingAuditLogTest.php                    # Audit before/after + reasons + actor
│   ├── FindingRecurrenceTest.php                  # drift recurrence_key + reopen + concurrency gating
│   ├── DriftStaleAutoResolveTest.php              # drift stale auto-resolve reason
│   ├── FindingBackfillTest.php                    # backfill + consolidation behavior
│   └── FindingBulkActionsTest.php                 # bulk actions + audit
├── Feature/Alerts/
│   └── SlaDueAlertTest.php                        # tenant-level sla_due event producer + rule selection
└── Unit/
    ├── Findings/FindingWorkflowServiceTest.php    # transition enforcement + due_at semantics
    └── Settings/FindingsSlaDaysSettingTest.php    # validation + normalization

Structure Decision: Standard Laravel monolith. Changes are concentrated in the Findings model/generators, Alerts evaluation, Filament resources, migrations, and tests.

Complexity Tracking

Fill ONLY if Constitution Check has violations that must be justified

| Violation | Why Needed | Simpler Alternative Rejected Because |
| --- | --- | --- |
| None | N/A | N/A |

Filament v5 Agent Output Contract

  1. Livewire v4.0+ compliance: Yes (Filament v5 requires Livewire v4; project is on Livewire v4.1.4).
  2. Provider registration: No new panel required. Existing panel providers remain registered under bootstrap/providers.php.
  3. Global search: FindingResource already has a View page. No new globally-searchable resources are introduced by this feature.
  4. Destructive actions: Resolve/Close/Risk accept/Reopen and bulk equivalents use ->action(...) + ->requiresConfirmation() and server-side authorization.
  5. Asset strategy: No new frontend asset pipeline requirements; standard Filament components only.
  6. Testing plan: Pest coverage for workflow transitions, RBAC 404/403 semantics, drift recurrence_key behavior (reopen + stale auto-resolve), SLA due event producer, and backfill/consolidation.

Phase 0 — Outline & Research (output: research.md)

Phase 0 resolves the key design decisions needed for implementation consistency:

  • v2 status model + timestamp semantics
  • SLA policy storage and defaults
  • drift recurrence_key strategy (stable identity)
  • SLA due event contract (tenant-level, throttling-friendly)
  • backfill/consolidation approach (OperationRun-backed, idempotent)

Outputs:

  • /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/111-findings-workflow-sla/research.md
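
As an illustration of what a stable drift recurrence_key could look like, the sketch below hashes a fixed, normalized set of identity fields. The plan does not specify which fields make up the identity, so the field names (`subject_type`, `subject_external_id`, `dimension`), the normalization, and the function name `driftRecurrenceKey` are assumptions; research.md holds the actual strategy.

```php
<?php

/**
 * Derive a deterministic key from the fields that identify "the same drift",
 * deliberately excluding anything run-specific (run ids, timestamps, diff payloads).
 *
 * @param array{tenant_id:int,finding_type:string,subject_type:string,subject_external_id:string,dimension:string} $identity
 */
function driftRecurrenceKey(array $identity): string
{
    // Fixed field order so the same identity always serializes identically.
    $parts = [
        (string) $identity['tenant_id'],
        $identity['finding_type'],
        $identity['subject_type'],
        $identity['subject_external_id'],
        $identity['dimension'],
    ];

    // Assumed normalization: trim and lowercase so cosmetic differences do not fork the key.
    $normalized = array_map(
        fn (string $value): string => strtolower(trim($value)),
        $parts,
    );

    // JSON encoding avoids delimiter collisions between adjacent fields.
    return hash('sha256', json_encode($normalized, JSON_THROW_ON_ERROR));
}
```

With a key like this, a generator can look up an existing finding by (tenant, recurrence_key) and reopen it when a resolved issue reappears, instead of inserting a new row per re-drift.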

Phase 1 — Design & Contracts (outputs: data-model.md, contracts/*, quickstart.md)

Design deliverables:

  • Data model changes to findings and supporting settings/capabilities/badges
  • Contracts for workflow actions and SLA due events (alerts)
  • Implementation quickstart to validate locally with Sail

Outputs:

  • /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/111-findings-workflow-sla/data-model.md
  • /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/111-findings-workflow-sla/contracts/api-contracts.md
  • /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/111-findings-workflow-sla/quickstart.md

Phase 2 — Planning (implementation outline; detailed tasks live in tasks.md)

  • Migrations: add lifecycle + workflow columns + recurrence_key + indexes (two-phase: nullable → backfill → enforce not-null where appropriate).
  • Workflow enforcement: server-side transition validation + timestamps + reasons; audit logs for every user-initiated workflow mutation.
  • SLA policy: add findings.sla_days to SettingsRegistry + Workspace Settings UI; due_at set on create + reset on reopen.
  • Generator updates:
    • Drift: stable recurrence_key upsert + reopen semantics + stale auto-resolve reason.
    • Permission posture + Entra roles: lifecycle fields + due_at semantics, keep reopen/auto-resolve behavior.
  • Alerts: implement SLA due producer in EvaluateAlertsJob; re-enable sla_due option in AlertRuleResource event types.
  • Filament UI: remove drift-only default filters; default to Open across all types; quick filters: Open, Overdue, High severity, My assigned; row + bulk actions for v2 workflow and assignment.
  • Backfill operation: tenant-scoped backfill entrypoint and job with OperationRun observability + dedupe; consolidate drift duplicates into a canonical recurrence_key record and mark old duplicates terminal.
  • Tests: workflow transition matrix + RBAC 404/403 behavior; recurrence + stale resolve; SLA due alert producer contract; backfill/consolidation correctness and idempotency.
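
The SLA and backfill items above can be sketched together: due_at is a start time plus a number of SLA days, and the backfill maps the legacy acknowledged status to triaged and fills in a missing due date from the backfill time. This is a hedged illustration in plain PHP. Keying findings.sla_days by severity, the default day counts in `ASSUMED_SLA_DAYS`, and the function names are assumptions, not values taken from the spec.

```php
<?php

// Hypothetical defaults; the real values come from the findings.sla_days setting.
const ASSUMED_SLA_DAYS = ['critical' => 3, 'high' => 7, 'medium' => 14, 'low' => 30];

/**
 * Compute a due date as "start time + SLA days for the severity".
 *
 * @param array<string,int> $slaDays
 */
function computeDueAt(DateTimeImmutable $from, string $severity, array $slaDays = ASSUMED_SLA_DAYS): DateTimeImmutable
{
    if (!isset($slaDays[$severity])) {
        throw new InvalidArgumentException("No SLA days configured for severity: {$severity}");
    }

    return $from->modify(sprintf('+%d days', $slaDays[$severity]));
}

/**
 * Upgrade one legacy finding row. Only missing values are filled in, so
 * running the backfill a second time leaves the row unchanged.
 *
 * @param array{status:string,severity:string,due_at:?string} $finding
 * @param array<string,int> $slaDays
 * @return array{status:string,severity:string,due_at:?string}
 */
function backfillLegacyFinding(array $finding, DateTimeImmutable $backfillTime, array $slaDays = ASSUMED_SLA_DAYS): array
{
    if ($finding['status'] === 'acknowledged') {
        $finding['status'] = 'triaged'; // legacy status mapped to its v2 equivalent
    }

    if ($finding['due_at'] === null) {
        // Due date is measured from the backfill time, not the original creation time.
        $finding['due_at'] = computeDueAt($backfillTime, $finding['severity'], $slaDays)->format(DATE_ATOM);
    }

    return $finding;
}
```

The same `computeDueAt` helper would cover "due_at set on create + reset on reopen" by passing the creation or reopen time as the start. Filling only missing values is one simple way to keep the backfill idempotent.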