# Implementation Plan: Baseline Operability & Alert Integration (Spec 115)
**Branch**: `115-baseline-operability-alerts` | **Date**: 2026-02-28 | **Spec**: `specs/115-baseline-operability-alerts/spec.md`
**Input**: Feature specification from `specs/115-baseline-operability-alerts/spec.md`
**Note**: This template is filled in by the `/speckit.plan` command. See `.specify/scripts/` for helper scripts.
## Summary
- Implement safe baseline finding auto-close after fully successful baseline compares.
- Add baseline-specific alert events (`baseline_high_drift`, `baseline_compare_failed`) with precise new/reopened-only semantics and existing cooldown handling.
- Introduce workspace settings for baseline severity mapping, minimum alert severity, and an auto-close kill-switch.
- Normalize baseline run types via the canonical run type registry.
## Technical Context
**Language/Version**: PHP 8.4.x (Laravel 12)
**Primary Dependencies**: Filament v5, Livewire v4, Laravel Sail
**Storage**: PostgreSQL (Sail) + JSONB
**Testing**: Pest v4 (`vendor/bin/sail artisan test`)
**Target Platform**: Web application
**Project Type**: Laravel monolith
**Performance Goals**: N/A (ops correctness + low-noise alerting)
**Constraints**: Strict Ops-UX + RBAC-UX compliance; no extra Graph calls; Monitoring render is DB-only
**Scale/Scope**: Workspace-scoped settings + tenant-scoped findings/runs
## Constitution Check
*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*
- Inventory-first: PASS (baseline compare reads inventory as “last observed”, no snapshot semantics changed).
- Read/write separation: PASS (auto-close is internal lifecycle management; no Graph writes introduced).
- Graph contract path: PASS (no new Graph calls).
- Deterministic capabilities: PASS (uses existing capability registry for any UI settings mutations).
- RBAC-UX: PASS (workspace settings mutations already enforce membership 404 + capability 403 in `SettingsWriter`; tenant-context compare surfaces remain tenant-scoped).
- Workspace & tenant isolation: PASS (all findings and runs remain workspace+tenant scoped; alert dispatch validates tenant belongs to workspace).
- Run observability: PASS (baseline compare/capture already run via `OperationRunService`; alerts evaluation is an `OperationRun`).
- Ops-UX 3-surface feedback: PASS (no new notification surfaces; uses existing Ops UX patterns).
- Ops-UX lifecycle + summary counts + guards: PASS (all run transitions via `OperationRunService`; summary keys remain canonical).
- Filament Action Surface Contract / UX-001: PASS (only adds fields/actions to existing pages; no new resources required).
## Project Structure
### Documentation (this feature)
```text
specs/115-baseline-operability-alerts/
├── plan.md          # This file (/speckit.plan command output)
├── research.md      # Phase 0 output (/speckit.plan command)
├── data-model.md    # Phase 1 output (/speckit.plan command)
├── quickstart.md    # Phase 1 output (/speckit.plan command)
├── contracts/       # Phase 1 output (/speckit.plan command)
└── tasks.md         # Phase 2 output (/speckit.tasks command - NOT created by /speckit.plan)
```
### Source Code (repository root)
```text
app/
├── Filament/
│   └── Pages/
│       ├── Settings/
│       │   └── WorkspaceSettings.php
│       └── BaselineCompareLanding.php
├── Jobs/
│   ├── CompareBaselineToTenantJob.php
│   └── Alerts/
│       └── EvaluateAlertsJob.php
├── Models/
│   ├── Finding.php
│   ├── AlertRule.php
│   └── WorkspaceSetting.php
├── Services/
│   ├── Baselines/
│   │   ├── BaselineCompareService.php
│   │   └── BaselineCaptureService.php
│   ├── Alerts/
│   │   ├── AlertDispatchService.php
│   │   └── AlertFingerprintService.php
│   └── Settings/
│       ├── SettingsResolver.php
│       └── SettingsWriter.php
└── Support/
    ├── OperationRunType.php
    └── Settings/
        └── SettingsRegistry.php

tests/
└── Feature/
    ├── Alerts/
    └── Baselines/
```
**Structure Decision**: Laravel monolith with Filament admin UI. This feature touches Jobs, Services, Settings infrastructure, and adds/updates Pest feature tests.
## Phase 0 — Outline & Research (Complete)
Outputs:
- `specs/115-baseline-operability-alerts/research.md`
Unknowns resolved:
- Which summary counters can be used for completeness gating (reuse `total/processed/failed`).
- How to implement reopen and stale-resolution semantics without breaking workflow status.
- How alert event dedupe/cooldown works and what keys are used.
- How workspace settings are stored/validated and how effective values are resolved.
## Phase 1 — Design & Contracts (Complete)
Outputs:
- `specs/115-baseline-operability-alerts/data-model.md`
- `specs/115-baseline-operability-alerts/contracts/baseline-alert-events.openapi.yaml`
- `specs/115-baseline-operability-alerts/quickstart.md`
Design highlights:
- Baseline findings are a filtered subset of drift findings (`finding_type=drift`, `source=baseline.compare`).
- Auto-close resolves stale baseline findings only when the compare run succeeded, processed every item with zero failures, and `baseline.auto_close_enabled` is on.
- Baseline alert events are produced only for new/reopened baseline findings within the evaluation window.
## Constitution Re-check (Post-Design)
Result: PASS. No Graph calls added, no new authorization planes, and all `OperationRun` transitions remain service-owned.
## Phase 2 — Implementation Plan
1) Settings registry + UI
   - Add `baseline.severity_mapping`, `baseline.alert_min_severity`, `baseline.auto_close_enabled` to `SettingsRegistry` with strict validation.
   - Extend `WorkspaceSettings` Filament page to render and persist these settings using the existing `SettingsWriter`.
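
As a rough illustration of what "strict validation" for the three settings could look like, here is a framework-free sketch. The function name `validateBaselineSetting` and the severity scale are assumptions made for the example; the real rules live in `SettingsRegistry` and may differ.

```php
<?php

// Hypothetical severity scale; the plan does not enumerate the allowed values.
const BASELINE_SEVERITIES = ['low', 'medium', 'high', 'critical'];

/**
 * Validate one baseline setting and return a list of error messages.
 * An empty list means the value is acceptable.
 *
 * @return list<string>
 */
function validateBaselineSetting(string $key, mixed $value): array
{
    switch ($key) {
        case 'baseline.auto_close_enabled':
            // Strict: only real booleans, no "1"/"yes" coercion.
            return is_bool($value) ? [] : ['must be a boolean'];

        case 'baseline.alert_min_severity':
            return in_array($value, BASELINE_SEVERITIES, true)
                ? []
                : ['must be one of: ' . implode(', ', BASELINE_SEVERITIES)];

        case 'baseline.severity_mapping':
            if (!is_array($value) || $value === []) {
                return ['must be a non-empty map of change_type => severity'];
            }
            $errors = [];
            foreach ($value as $changeType => $severity) {
                if (!is_string($changeType) || $changeType === '') {
                    $errors[] = 'change_type keys must be non-empty strings';
                } elseif (!in_array($severity, BASELINE_SEVERITIES, true)) {
                    $errors[] = "unknown severity for {$changeType}";
                }
            }
            return $errors;

        default:
            return ['unknown setting key'];
    }
}
```

Returning a list of messages rather than throwing keeps the sketch easy to call from a form handler; whether the registry reports errors this way is not stated in the plan.
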
2) Canonical run types
   - Add `baseline_capture` and `baseline_compare` to `OperationRunType` enum and replace ad-hoc literals where touched in this feature.
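
A minimal sketch of the idea, assuming `OperationRunType` is a string-backed enum. Only the two values come from this plan; the case names and the helper `isBaselineCompareRun` are hypothetical, and the real enum has other cases not shown.

```php
<?php

// Sketch only: assumes a string-backed enum; existing cases are omitted.
enum OperationRunType: string
{
    case BaselineCapture = 'baseline_capture';
    case BaselineCompare = 'baseline_compare';
}

// Hypothetical call site: compare against the enum case instead of a string literal.
function isBaselineCompareRun(string $runType): bool
{
    // tryFrom() returns null for unknown values, so typos never match.
    return OperationRunType::tryFrom($runType) === OperationRunType::BaselineCompare;
}
```

Routing comparisons through the enum means a misspelled literal such as `baseline-compare` fails to match instead of silently creating a second run type.
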
3) Baseline compare finding lifecycle
   - Update `CompareBaselineToTenantJob` to:
     - apply baseline severity mapping by `change_type`.
     - preserve existing open finding workflow status.
     - mark previously resolved findings as `reopened` and set `reopened_at`.
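
The three lifecycle rules above can be sketched as two small pure functions. This is an illustration under assumptions: findings are plain arrays here rather than `Finding` models, the function names are made up, the status names other than `resolved` and `reopened` are hypothetical, and the `medium` fallback for unmapped change types is not something the plan specifies.

```php
<?php

/**
 * Decide the status fields for a finding observed by the current compare run.
 * $existing is null for a brand-new finding, otherwise an array with a 'status' key.
 *
 * @return array{status: string, reopened_at: ?string, event: ?string}
 */
function nextFindingState(?array $existing, string $nowIso): array
{
    if ($existing === null) {
        // 'new' is a hypothetical status name for first-time findings.
        return ['status' => 'new', 'reopened_at' => null, 'event' => 'new'];
    }

    if ($existing['status'] === 'resolved') {
        // Drift came back: reopen and stamp the time.
        return ['status' => 'reopened', 'reopened_at' => $nowIso, 'event' => 'reopened'];
    }

    // Still open: keep whatever workflow status is already set.
    return [
        'status' => $existing['status'],
        'reopened_at' => $existing['reopened_at'] ?? null,
        'event' => null,
    ];
}

/** Apply the workspace severity mapping by change_type (fallback is an assumption). */
function mapSeverity(array $mapping, string $changeType, string $fallback = 'medium'): string
{
    return $mapping[$changeType] ?? $fallback;
}
```

The `event` field is only a convenience for this sketch: it marks which findings are new or reopened so that later alert evaluation can ignore findings that were merely seen again.
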
4) Safe auto-close
   - At the end of `CompareBaselineToTenantJob`, if:
     - run outcome is `succeeded`, and
     - `summary_counts.processed == summary_counts.total`, and
     - `summary_counts.failed == 0`, and
     - `baseline.auto_close_enabled == true`

     then resolve stale open baseline findings (not in “seen set”) with reason `no_longer_drifting`.
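
The gate above is a conjunction of four checks, which can be sketched as a pure function plus a set difference for the stale findings. The function names are hypothetical, identifying findings by a fingerprint string is an assumption, and treating missing counters as "not safe" is a conservative choice made for this example rather than something the plan states.

```php
<?php

/**
 * Safe gate: every condition must hold before any finding is auto-closed.
 * $summaryCounts mirrors summary_counts with total/processed/failed keys.
 */
function shouldAutoClose(string $outcome, array $summaryCounts, bool $autoCloseEnabled): bool
{
    $total = $summaryCounts['total'] ?? null;
    $processed = $summaryCounts['processed'] ?? null;
    $failed = $summaryCounts['failed'] ?? null;

    // Assumption: missing or non-integer counters mean "not safe", never zero.
    if (!is_int($total) || !is_int($processed) || !is_int($failed)) {
        return false;
    }

    return $autoCloseEnabled
        && $outcome === 'succeeded'
        && $processed === $total
        && $failed === 0;
}

/**
 * Stale findings are open baseline findings that this run did not see.
 *
 * @param list<string> $openFingerprints
 * @param list<string> $seenFingerprints
 * @return list<string>
 */
function staleFingerprints(array $openFingerprints, array $seenFingerprints): array
{
    // array_diff keeps the original keys, so reindex the result.
    return array_values(array_diff($openFingerprints, $seenFingerprints));
}
```

Only when `shouldAutoClose()` returns true would the job resolve the findings returned by `staleFingerprints()` with reason `no_longer_drifting`. An incomplete run has an incomplete seen set, so closing on it would resolve findings that may still be drifting.
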
5) Alerts integration
   - Extend `EvaluateAlertsJob` to produce:
     - `baseline_high_drift` (baseline findings only; new/reopened only; respects `baseline.alert_min_severity`).
     - `baseline_compare_failed` (baseline compare runs that failed or ended as `partially_succeeded`; dedupe by run id; cooldown via existing rules).
   - Register new event types in `AlertRule` and surface them in Filament `AlertRuleResource`.
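
To make the two event rules concrete, here is a sketch of how they might be derived from plain arrays. Everything beyond the two event type names is an assumption: the function names, the severity ranking, the `event_at` field, the `new` status name, and the dedupe key format are illustrative and do not describe what `AlertFingerprintService` or `EvaluateAlertsJob` actually do.

```php
<?php

// Hypothetical severity ranking; the plan does not list the severity scale.
const SEVERITY_RANK = ['low' => 1, 'medium' => 2, 'high' => 3, 'critical' => 4];

/**
 * Build baseline_high_drift events for findings that became new or reopened
 * inside the evaluation window and meet the minimum severity.
 * Each finding is an array with id, severity, status, and event_at (unix time).
 */
function baselineHighDriftEvents(array $findings, string $minSeverity, int $windowStart): array
{
    // An unknown minimum severity matches nothing rather than everything.
    $threshold = SEVERITY_RANK[$minSeverity] ?? PHP_INT_MAX;
    $events = [];

    foreach ($findings as $finding) {
        $isFresh = in_array($finding['status'], ['new', 'reopened'], true)
            && $finding['event_at'] >= $windowStart;
        $meetsSeverity = (SEVERITY_RANK[$finding['severity']] ?? 0) >= $threshold;

        if ($isFresh && $meetsSeverity) {
            $events[] = [
                'type' => 'baseline_high_drift',
                'dedupe_key' => 'baseline_high_drift:finding:' . $finding['id'],
            ];
        }
    }

    return $events;
}

/** One baseline_compare_failed event per failed or partially succeeded run, keyed by run id. */
function baselineCompareFailedEvent(int $runId, string $outcome): ?array
{
    if (!in_array($outcome, ['failed', 'partially_succeeded'], true)) {
        return null;
    }

    return [
        'type' => 'baseline_compare_failed',
        'dedupe_key' => 'baseline_compare_failed:run:' . $runId,
    ];
}
```

Restricting the first rule to new or reopened findings inside the window means a finding that simply stays open produces no further events, and keying the second rule by run id means re-evaluating the same failed run yields the same dedupe key.
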
6) Tests (Pest)
   - Add/extend Feature tests to cover:
     - auto-close executes only under the safe gate.
     - auto-close does not run on `partially_succeeded`/failed/incomplete compares.
     - reopened findings become `reopened` and trigger baseline drift alerts once.
     - baseline drift alerts do not trigger repeatedly for the same open finding.
     - `baseline_compare_failed` alerts trigger and respect dedupe/cooldown.
## Complexity Tracking
> **Fill ONLY if Constitution Check has violations that must be justified**
| Violation | Why Needed | Simpler Alternative Rejected Because |
|-----------|------------|-------------------------------------|
| [e.g., 4th project] | [current need] | [why 3 projects insufficient] |
| [e.g., Repository pattern] | [specific problem] | [why direct DB access insufficient] |