TenantAtlas/specs/115-baseline-operability-alerts/plan.md
2026-03-01 03:23:39 +01:00

Implementation Plan: Baseline Operability & Alert Integration (Spec 115)

Branch: 115-baseline-operability-alerts | Date: 2026-02-28 | Spec: specs/115-baseline-operability-alerts/spec.md

Input: Feature specification from specs/115-baseline-operability-alerts/spec.md

Note: This template is filled in by the /speckit.plan command. See .specify/scripts/ for helper scripts.

Summary

  • Implement safe baseline finding auto-close after fully successful baseline compares.
  • Add baseline-specific alert events (baseline_high_drift, baseline_compare_failed) with precise new/reopened-only semantics and existing cooldown handling.
  • Introduce workspace settings for baseline severity mapping, minimum alert severity, and an auto-close kill-switch.
  • Normalize baseline run types via the canonical run type registry.

Technical Context

  • Language/Version: PHP 8.4.x (Laravel 12)
  • Primary Dependencies: Filament v5, Livewire v4, Laravel Sail
  • Storage: PostgreSQL (Sail) + JSONB
  • Testing: Pest v4 (vendor/bin/sail artisan test)
  • Target Platform: Web application
  • Project Type: Laravel monolith
  • Performance Goals: N/A (ops correctness + low-noise alerting)
  • Constraints: Strict Ops-UX + RBAC-UX compliance; no extra Graph calls; Monitoring render is DB-only
  • Scale/Scope: Workspace-scoped settings + tenant-scoped findings/runs

Constitution Check

GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.

  • Inventory-first: PASS (baseline compare reads inventory as “last observed”, no snapshot semantics changed).
  • Read/write separation: PASS (auto-close is internal lifecycle management; no Graph writes introduced).
  • Graph contract path: PASS (no new Graph calls).
  • Deterministic capabilities: PASS (uses existing capability registry for any UI settings mutations).
  • RBAC-UX: PASS (workspace settings mutations already enforce membership 404 + capability 403 in SettingsWriter; tenant-context compare surfaces remain tenant-scoped).
  • Workspace & tenant isolation: PASS (all findings and runs remain workspace+tenant scoped; alert dispatch validates tenant belongs to workspace).
  • Run observability: PASS (baseline compare/capture already run via OperationRunService; alerts evaluation is an OperationRun).
  • Ops-UX 3-surface feedback: PASS (no new notification surfaces; uses existing Ops UX patterns).
  • Ops-UX lifecycle + summary counts + guards: PASS (all run transitions via OperationRunService; summary keys remain canonical).
  • Filament Action Surface Contract / UX-001: PASS (only adds fields/actions to existing pages; no new resources required).

Project Structure

Documentation (this feature)

specs/115-baseline-operability-alerts/
├── plan.md              # This file (/speckit.plan command output)
├── research.md          # Phase 0 output (/speckit.plan command)
├── data-model.md        # Phase 1 output (/speckit.plan command)
├── quickstart.md        # Phase 1 output (/speckit.plan command)
├── contracts/           # Phase 1 output (/speckit.plan command)
└── tasks.md             # Phase 2 output (/speckit.tasks command - NOT created by /speckit.plan)

Source Code (repository root)

app/
├── Filament/
│   └── Pages/
│       ├── Settings/
│       │   └── WorkspaceSettings.php
│       └── BaselineCompareLanding.php
├── Jobs/
│   ├── CompareBaselineToTenantJob.php
│   └── Alerts/
│       └── EvaluateAlertsJob.php
├── Models/
│   ├── Finding.php
│   ├── AlertRule.php
│   └── WorkspaceSetting.php
├── Services/
│   ├── Baselines/
│   │   ├── BaselineCompareService.php
│   │   └── BaselineCaptureService.php
│   ├── Alerts/
│   │   ├── AlertDispatchService.php
│   │   └── AlertFingerprintService.php
│   └── Settings/
│       ├── SettingsResolver.php
│       └── SettingsWriter.php
└── Support/
    ├── OperationRunType.php
    └── Settings/
        └── SettingsRegistry.php

tests/
└── Feature/
    ├── Alerts/
    └── Baselines/

Structure Decision: Laravel monolith with Filament admin UI. This feature touches Jobs, Services, Settings infrastructure, and adds/updates Pest feature tests.

Phase 0 — Outline & Research (Complete)

Outputs:

  • specs/115-baseline-operability-alerts/research.md

Unknowns resolved:

  • Which summary counters can be used for completeness gating (reuse total/processed/failed).
  • How to implement reopen and resolve-stale semantics without breaking the existing workflow status.
  • How alert event dedupe/cooldown works and what keys are used.
  • How workspace settings are stored/validated and how effective values are resolved.

Phase 1 — Design & Contracts (Complete)

Outputs:

  • specs/115-baseline-operability-alerts/data-model.md
  • specs/115-baseline-operability-alerts/contracts/baseline-alert-events.openapi.yaml
  • specs/115-baseline-operability-alerts/quickstart.md

Design highlights:

  • Baseline findings are a filtered subset of drift findings (finding_type=drift, source=baseline.compare).
  • Auto-close resolves stale baseline findings only when the compare run is complete and safe.
  • Baseline alert events are produced only for new/reopened baseline findings within the evaluation window.

Constitution Re-check (Post-Design)

Result: PASS. No Graph calls added, no new authorization planes, and all OperationRun transitions remain service-owned.

Phase 2 — Implementation Plan

  1. Settings registry + UI
  • Add baseline.severity_mapping, baseline.alert_min_severity, baseline.auto_close_enabled to SettingsRegistry with strict validation.
  • Extend WorkspaceSettings Filament page to render and persist these settings using the existing SettingsWriter.
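As an illustration of what "strict validation" could mean for the three new keys, here is a minimal sketch in plain PHP. The function name `validateBaselineSetting`, the severity list, and the error messages are assumptions made for this example; the actual SettingsRegistry API and the supported severity levels are not described in this plan.

```php
<?php

// Assumption: illustrative severity levels; the real list comes from the existing codebase.
const BASELINE_SEVERITIES = ['low', 'medium', 'high', 'critical'];

/**
 * Hypothetical validator for the three baseline settings.
 * Returns a list of error messages; an empty list means the value is valid.
 */
function validateBaselineSetting(string $key, mixed $value): array
{
    switch ($key) {
        case 'baseline.auto_close_enabled':
            // Strict: only real booleans, no "1" or "true" strings.
            return is_bool($value) ? [] : ['must be a boolean'];

        case 'baseline.alert_min_severity':
            return in_array($value, BASELINE_SEVERITIES, true)
                ? []
                : ['must be one of: ' . implode(', ', BASELINE_SEVERITIES)];

        case 'baseline.severity_mapping':
            if (!is_array($value) || $value === []) {
                return ['must be a non-empty map of change_type => severity'];
            }
            $errors = [];
            foreach ($value as $changeType => $severity) {
                if (!is_string($changeType) || $changeType === '') {
                    $errors[] = 'change_type keys must be non-empty strings';
                } elseif (!in_array($severity, BASELINE_SEVERITIES, true)) {
                    $errors[] = "unknown severity for {$changeType}";
                }
            }
            return $errors;

        default:
            return ["unknown setting: {$key}"];
    }
}
```

Rejecting loosely typed input (for example the string "1" for the kill-switch) keeps the stored JSONB values unambiguous when they are later resolved.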
  2. Canonical run types
  • Add baseline_capture and baseline_compare to OperationRunType enum and replace ad-hoc literals where touched in this feature.
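A minimal sketch of the two new cases, assuming OperationRunType is a string-backed enum. The case names and the `isBaseline()` helper are hypothetical, and the enum's existing cases are omitted because this plan does not list them.

```php
<?php

// Hypothetical sketch: the real OperationRunType has other cases not shown here.
enum OperationRunType: string
{
    case BaselineCapture = 'baseline_capture';
    case BaselineCompare = 'baseline_compare';

    /** True for run types that belong to the baseline feature (illustrative helper). */
    public function isBaseline(): bool
    {
        return in_array($this, [self::BaselineCapture, self::BaselineCompare], true);
    }
}

// Use the canonical value instead of an ad-hoc string literal.
$type = OperationRunType::BaselineCompare->value;
```

Going through the enum means a mistyped literal such as `baseline-compare` is caught by `tryFrom()` returning null rather than silently creating a new run type.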
  3. Baseline compare finding lifecycle
  • Update CompareBaselineToTenantJob to:
    • apply baseline severity mapping by change_type.
    • preserve existing open finding workflow status.
    • mark previously resolved findings as reopened and set reopened_at.
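The three lifecycle rules above can be sketched as one pure function. This is an illustration only: the function name, the array shape of a finding, the status values other than resolved and reopened, and the fallback severity for unmapped change types are all assumptions, not taken from the job's actual code.

```php
<?php

/**
 * Decide the next state of a baseline finding that was seen during a compare run.
 *
 * @param array|null $existing        Existing finding row, or null if none exists yet.
 * @param string     $changeType      Change type reported by the compare.
 * @param array      $severityMapping change_type => severity (baseline.severity_mapping).
 * @param string     $now             Timestamp to record when a finding is reopened.
 */
function nextBaselineFindingState(?array $existing, string $changeType, array $severityMapping, string $now): array
{
    // Assumption: unmapped change types fall back to "medium".
    $severity = $severityMapping[$changeType] ?? 'medium';

    if ($existing === null) {
        return ['status' => 'new', 'severity' => $severity, 'reopened_at' => null];
    }

    if ($existing['status'] === 'resolved') {
        // Previously resolved and drifting again: reopen and stamp reopened_at.
        return ['status' => 'reopened', 'severity' => $severity, 'reopened_at' => $now];
    }

    // Still open: keep whatever workflow status the finding already has.
    return [
        'status' => $existing['status'],
        'severity' => $severity,
        'reopened_at' => $existing['reopened_at'] ?? null,
    ];
}
```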
  4. Safe auto-close
  • At the end of CompareBaselineToTenantJob, resolve stale open baseline findings (those not in the “seen set”) with reason no_longer_drifting, but only if all of the following hold:
    • run outcome is succeeded,
    • summary_counts.processed == summary_counts.total,
    • summary_counts.failed == 0, and
    • baseline.auto_close_enabled == true.
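The safe gate and the stale-set computation can be sketched as two small functions. The function names and the plain-array inputs are hypothetical; treating missing or non-integer counters as unsafe is an extra assumption of this sketch rather than something the plan states.

```php
<?php

/**
 * Returns true only when every auto-close precondition holds.
 * Assumption: missing or non-integer counters are treated as "not safe".
 */
function canAutoCloseBaselineFindings(string $outcome, array $summaryCounts, bool $autoCloseEnabled): bool
{
    $total = $summaryCounts['total'] ?? null;
    $processed = $summaryCounts['processed'] ?? null;
    $failed = $summaryCounts['failed'] ?? null;

    if (!is_int($total) || !is_int($processed) || !is_int($failed)) {
        return false;
    }

    return $autoCloseEnabled
        && $outcome === 'succeeded'
        && $processed === $total
        && $failed === 0;
}

/**
 * Open baseline findings that the compare did not see in this run.
 * These are the candidates to resolve with reason no_longer_drifting.
 */
function staleBaselineFindingIds(array $openFindingIds, array $seenFindingIds): array
{
    return array_values(array_diff($openFindingIds, $seenFindingIds));
}
```

Keeping the gate as a single boolean makes the negative cases (partially_succeeded, failed, incomplete, kill-switch off) straightforward to cover one by one in tests.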
  5. Alerts integration
  • Extend EvaluateAlertsJob to produce:
    • baseline_high_drift (baseline findings only; new/reopened only; respects baseline.alert_min_severity).
    • baseline_compare_failed (baseline compare runs with outcome failed or partially_succeeded; deduped by run id; cooldown via existing rules).
  • Register new event types in AlertRule and surface them in Filament AlertRuleResource.
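One way to picture the two event filters is shown below. Everything outside the quoted identifiers from this plan is an assumption: the function names, the array shapes, integer Unix timestamps for the evaluation window, the severity ranking, and the `baseline_compare_failed:{run id}` key format are illustrative and do not describe the existing AlertFingerprintService.

```php
<?php

// Assumption: illustrative severity order, lowest to highest.
const ALERT_SEVERITY_RANK = ['low' => 1, 'medium' => 2, 'high' => 3, 'critical' => 4];

/** Should this finding produce a baseline_high_drift event in the current window? */
function isBaselineHighDriftCandidate(array $finding, string $minSeverity, int $windowStart): bool
{
    // Baseline findings are drift findings produced by the baseline compare.
    if ($finding['finding_type'] !== 'drift' || $finding['source'] !== 'baseline.compare') {
        return false;
    }

    // Only new or reopened findings count, and only if that happened inside the window.
    $eventAt = match ($finding['status']) {
        'new' => $finding['created_at'],
        'reopened' => $finding['reopened_at'],
        default => null,
    };
    if ($eventAt === null || $eventAt < $windowStart) {
        return false;
    }

    return ALERT_SEVERITY_RANK[$finding['severity']] >= ALERT_SEVERITY_RANK[$minSeverity];
}

/** Dedupe key for baseline_compare_failed, or null if the run does not qualify. */
function baselineCompareFailedKey(array $run): ?string
{
    if ($run['type'] !== 'baseline_compare') {
        return null;
    }
    if (!in_array($run['outcome'], ['failed', 'partially_succeeded'], true)) {
        return null;
    }

    // Keyed by run id so the same run cannot raise the event twice.
    return 'baseline_compare_failed:' . $run['id'];
}
```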
  6. Tests (Pest)
  • Add/extend Feature tests to cover:
    • auto-close executes only under the safe gate.
    • auto-close does not run on partially_succeeded, failed, or incomplete compares.
    • previously resolved findings that drift again are marked reopened and trigger a baseline drift alert exactly once.
    • baseline drift alerts do not trigger repeatedly for the same open finding.
    • baseline compare failed alerts trigger and respect the existing dedupe and cooldown handling.

Complexity Tracking

None. The Constitution Check passed on every gate, so there are no violations to justify.