TenantAtlas/specs/114-system-console-control-tower/plan.md
ahmido 0cf612826f feat(114): system console control tower (merged) (#139)
Feature branch PR for Spec 114.

This branch contains the merged agent session work (see merge commit on branch).

Tests
- `vendor/bin/sail artisan test --compact tests/Feature/System/Spec114/`

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #139
2026-02-28 00:15:31 +00:00

Implementation Plan: System Console Control Tower (Spec 114)

Branch: 114-system-console-control-tower | Date: 2026-02-27

Summary

Implement a platform-only /system Control Tower that provides:

  • Global health KPIs + top offenders (windowed)
  • Cross-workspace Directory (workspaces + tenants) with health signals
  • Global Operations triage (runs + failures + stuck) with canonical run detail
  • Minimal Access Logs (platform auth + break-glass)

Approach: extend the existing Filament System panel and reuse existing read models (OperationRun, AuditLog, Tenant, Workspace) with DB-only queries and strict data minimization/sanitization.

Technical Context

  • Language/Version: PHP 8.4 (Laravel 12)
  • Primary Dependencies: Filament v5 (Livewire v4), Pest v4, Laravel Sail
  • Storage: PostgreSQL
  • Testing: Pest v4
  • Target Platform: Web (Filament/Livewire)
  • Project Type: web
  • Performance Goals: p95 < 1.0s for /system list/index pages at typical volumes
  • Constraints: DB-only at render time; strict data minimization; no cross-plane session bridging
  • Scale/Scope: cross-workspace platform operator views; growing operation_runs volumes

Non-negotiables

  • /system is a separate plane from /admin.
  • Wrong plane / unauthenticated: behave as “not found” (404).
  • Platform user missing capability: forbidden (403).
  • DB-only at render time for /system pages (no Microsoft Graph calls while rendering).
  • Data minimization: no secrets/tokens; failures and audit context are sanitized.
  • Mutating actions are confirmed + audited.
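
The 404 vs 403 rules above can be read as a small decision function. The following is a minimal sketch in plain PHP, not the project's middleware or policy code; the function name `systemAccessStatus` and its boolean inputs are hypothetical and only illustrate the order of the checks.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: maps an access situation to the HTTP status
 * that a /system page should answer with.
 */
function systemAccessStatus(bool $isPlatformUser, bool $hasCapability): int
{
    // Wrong plane or unauthenticated: do not reveal that the page exists.
    if (! $isPlatformUser) {
        return 404;
    }

    // Authenticated platform user who lacks the required capability.
    if (! $hasCapability) {
        return 403;
    }

    return 200;
}
```

The plane check comes first on purpose: a caller from the wrong plane gets 404 even if a capability flag happens to be set, so capability state never leaks across planes.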

Spec source: specs/114-system-console-control-tower/spec.md

Constitution Check (Pre-design)

PASS.

  • Inventory-first + read/write separation: this feature is read-first; the v1 manage actions on ops are capability-gated, confirmed, and audited.
  • Graph contract isolation: no render-time Graph calls; any future sync work goes through existing Graph client contracts.
  • Deterministic capabilities: capability checks use a registry (no raw strings).
  • RBAC-UX semantics: 404 vs 403 behavior preserved.
  • Ops observability: reuse OperationRun lifecycle via OperationRunService.
  • Data minimization: RunFailureSanitizer + AuditContextSanitizer are the contract.
  • Filament action safety: destructive/mutating actions require confirmation.
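
To illustrate the data-minimization item above, here is a minimal sketch of key-based redaction in plain PHP. It is not the project's RunFailureSanitizer or AuditContextSanitizer, whose behavior this plan does not describe; the function name `redactSensitiveContext`, the deny-list pattern, and the `[REDACTED]` placeholder are all assumptions.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-in for a context sanitizer (NOT AuditContextSanitizer).
 * Replaces values whose key looks sensitive and recurses into nested arrays.
 *
 * @param array<array-key, mixed> $context
 * @return array<array-key, mixed>
 */
function redactSensitiveContext(array $context): array
{
    // Assumed deny-list of key fragments; the real contract may differ.
    $pattern = '/token|secret|password|authorization/i';
    $clean = [];

    foreach ($context as $key => $value) {
        if (is_string($key) && preg_match($pattern, $key) === 1) {
            $clean[$key] = '[REDACTED]';
            continue;
        }

        // Nested arrays are sanitized recursively; scalars pass through.
        $clean[$key] = is_array($value) ? redactSensitiveContext($value) : $value;
    }

    return $clean;
}
```

A deny-list is the simplest form to show here; an allow-list of known-safe keys would be stricter and may fit the "strict data minimization" constraint better.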

Project Structure

Documentation (this feature)

specs/114-system-console-control-tower/
├── spec.md
├── plan.md
├── research.md
├── data-model.md
├── quickstart.md
└── contracts/
    └── system-console-control-tower.openapi.yaml

Source Code (repository root)

app/
├── Filament/
│   └── System/
│       └── Pages/
├── Models/
├── Services/
└── Support/

config/
database/
routes/
tests/

Structure Decision: Single Laravel web application. System Console features live as Filament Pages under app/Filament/System/Pages using existing Eloquent models.

Phase 0 — Research (Complete)

Output artifact:

  • specs/114-system-console-control-tower/research.md

Resolved items:

  • System panel already exists and is isolated by guard + session cookie middleware.
  • Existing audit stream already captures platform auth and break-glass events.
  • Existing ops primitives (OperationRun, sanitizers, links) are sufficient and should be reused.

Phase 1 — Design & Contracts (Complete)

Output artifacts:

  • specs/114-system-console-control-tower/data-model.md
  • specs/114-system-console-control-tower/contracts/system-console-control-tower.openapi.yaml
  • specs/114-system-console-control-tower/quickstart.md

Post-design Constitution Check:

  • PASS (design remains DB-only, keeps plane separation, uses sanitization contracts, and Spec 114 documents UX-001 empty-state CTA expectations + v1 drilldown scope).

Phase 2 — Implementation Planning (for tasks.md later)

This section outlines the implementation chunks and acceptance criteria that will become tasks.md.

2.1 RBAC + capabilities

  • Extend App\Support\Auth\PlatformCapabilities to include Spec 114 capabilities.
  • Ensure all new /system pages check capabilities via the registry (no raw strings).
  • Keep 404/403 semantics aligned with the spec decisions.
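
The "registry, no raw strings" rule above could look roughly like the sketch below. The class `PlatformCapabilitiesSketch` and the helper `userCan` are hypothetical and stand in for App\Support\Auth\PlatformCapabilities, whose actual contents are not shown in this plan. Only `platform.operations.manage` is taken from section 2.3; the second constant is an invented example.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical capability registry, for illustration only.
 */
final class PlatformCapabilitiesSketch
{
    // Named in section 2.3 of this plan.
    public const OPERATIONS_MANAGE = 'platform.operations.manage';

    // Assumed capability name; not taken from the spec.
    public const OPERATIONS_VIEW = 'platform.operations.view';

    /** @return list<string> */
    public static function all(): array
    {
        return [self::OPERATIONS_MANAGE, self::OPERATIONS_VIEW];
    }

    public static function isKnown(string $capability): bool
    {
        return in_array($capability, self::all(), true);
    }
}

/**
 * @param list<string> $granted capabilities held by the platform user
 */
function userCan(array $granted, string $capability): bool
{
    // Strings outside the registry are rejected so typos fail loudly.
    if (! PlatformCapabilitiesSketch::isKnown($capability)) {
        throw new InvalidArgumentException("Unknown capability: {$capability}");
    }

    return in_array($capability, $granted, true);
}
```

Pages would then reference `PlatformCapabilitiesSketch::OPERATIONS_MANAGE` rather than the literal string, which keeps checks greppable and refactor-safe.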

2.2 Information architecture (/system routes)

  • Dashboard (KPIs): global aggregated view, windowed.
  • Directory:
    • Workspaces index + workspace detail.
    • Tenants index + tenant detail.
  • Ops:
    • Runs list.
    • Failures list (prefiltered/saved view).
    • Stuck list (queued + running thresholds).
    • Canonical run detail: remove current runbook-only scoping so it can show any OperationRun (still authorization-checked).
  • Security:
    • Access logs list (platform login + break-glass only for v1).
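
The "Stuck list" item above depends on per-status age thresholds. The following is a minimal sketch of that classification in plain PHP; the function name `isStuck`, the status strings `queued` and `running`, and the meaning of `$since` (the moment the run entered its current status) are assumptions, since the OperationRun schema is not shown in this plan.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical check: a run is "stuck" when it has been queued or running
 * for at least the configured number of minutes.
 */
function isStuck(
    string $status,
    DateTimeImmutable $since, // assumed: when the run entered its current status
    DateTimeImmutable $now,
    int $queuedThresholdMinutes,
    int $runningThresholdMinutes,
): bool {
    $thresholds = [
        'queued' => $queuedThresholdMinutes,
        'running' => $runningThresholdMinutes,
    ];

    // Any other status (for example a finished run) is never stuck.
    if (! isset($thresholds[$status])) {
        return false;
    }

    $elapsedMinutes = intdiv($now->getTimestamp() - $since->getTimestamp(), 60);

    return $elapsedMinutes >= $thresholds[$status];
}
```

In the real list page the same condition would more likely be expressed as a database query, in keeping with the DB-only constraint; the function form only makes the rule explicit.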

2.3 Ops triage actions (v1 manage)

  • Implement manage actions with capability gating (platform.operations.manage).
  • Actions:
    • Retry run: only when retryable.
    • Cancel run: only when cancelable.
    • Mark investigated: requires reason.
  • All actions:
    • Execute via Filament Action::make(...)->action(...).
    • Include ->requiresConfirmation().
    • Produce an AuditLog entry with stable action IDs and sanitized context.
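
The preconditions for the three actions above can be sketched as a guard that runs before the action body. This is plain PHP rather than Filament code, and it leaves out confirmation and auditing. The function name `guardManageAction`, the action keys, and the status-to-action mapping (only `failed` runs are retryable, only `queued` or `running` runs are cancelable) are assumptions; the spec defines the real rules.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical precondition check for the v1 manage actions.
 *
 * @return array{allowed: bool, message: string}
 */
function guardManageAction(string $action, string $runStatus, ?string $reason = null): array
{
    $deny = static fn (string $message): array => ['allowed' => false, 'message' => $message];

    // Assumed mapping of action to the statuses in which it is permitted.
    $allowedStatuses = [
        'retry' => ['failed'],
        'cancel' => ['queued', 'running'],
    ];

    if ($action === 'mark_investigated') {
        // "Mark investigated" requires a non-empty reason.
        return trim((string) $reason) === ''
            ? $deny('A reason is required.')
            : ['allowed' => true, 'message' => 'ok'];
    }

    if (! isset($allowedStatuses[$action])) {
        return $deny("Unknown action: {$action}");
    }

    if (! in_array($runStatus, $allowedStatuses[$action], true)) {
        return $deny("A run with status '{$runStatus}' cannot be handled by '{$action}'.");
    }

    return ['allowed' => true, 'message' => 'ok'];
}
```

Returning a message alongside the boolean gives the UI something to show when a non-retryable or non-cancelable run blocks the action.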

2.4 Configuration

  • Add config keys for “stuck” thresholds (queued minutes, running minutes).
  • Ensure defaults are safe and can be overridden per environment.
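
A config fragment for these thresholds might look like the sketch below. The file name, array keys, environment variable names, and default values are all hypothetical; only Laravel's `env()` helper is existing API.

```php
<?php

// config/system_console.php (hypothetical file name and keys)

return [
    'stuck' => [
        // Minutes a run may stay queued before it is listed as stuck (assumed default).
        'queued_minutes' => (int) env('SYSTEM_CONSOLE_STUCK_QUEUED_MINUTES', 15),

        // Minutes a run may stay running before it is listed as stuck (assumed default).
        'running_minutes' => (int) env('SYSTEM_CONSOLE_STUCK_RUNNING_MINUTES', 60),
    ],
];
```

Reading the values through `env()` with explicit defaults keeps the thresholds overridable per environment while the application still works when no variable is set.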

2.5 Testing (Pest)

  • New page access tests:
    • non-platform users get 404.
    • platform users without capability get 403.
  • System auth/security regression verification:
    • /system login is rate-limited and failed attempts are audited via platform.auth.login (existing coverage in tests/Feature/System/Spec113/SystemLoginThrottleTest.php).
    • break-glass mode renders a persistent banner and audits transitions (platform.break_glass.*) (existing coverage in tests/Feature/Auth/BreakGlassModeTest.php).
  • Access logs surface tests:
    • platform.auth.login and platform.break_glass.* appear.
  • Manage action tests:
    • capability required.
    • audit entries written.
    • non-retryable/non-cancelable runs block with clear feedback.

2.6 Formatting

  • Run vendor/bin/sail bin pint --dirty --format agent before finalizing implementation.