TenantAtlas/docs/performance-guidelines.md

# TenantPilot Performance Guidelines

Status: 2026-05-15
Applies to: Laravel 12, Filament 5, Livewire 4, PostgreSQL 16, Microsoft Graph.

## Performance Target

TenantPilot should keep interactive admin requests short and move remote, large, retryable, or long-running work into queued operations with visible `OperationRun` state.

## Current Performance Risks

| Risk | Evidence | Priority | Mitigation |
|---|---|---:|---|
| Queryable payloads still in `json` | policy versions, backup items, restore runs, audit logs | P1 | Convert to JSONB where queried; add targeted GIN/expression indexes. |
| Large Filament pages/resources | 1,000-5,700 LOC classes | P1 | Extract tables/actions and review N+1 risks per surface. |
| Database queue for all work | `.env.example` and queue config | P2 | Move high-volume Graph/restore work to a Redis queue when load grows. |
| Dashboard/widget query cost | multiple KPI/list widgets | P2 | Cache or precompute expensive aggregate metrics. |
| Graph throttling | Microsoft Graph 429/503 behavior | P1 | Honor `Retry-After`, use exponential backoff with jitter, avoid polling. |
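
The Graph throttling mitigation in the table can be sketched as a small delay calculator. This is a language-neutral illustration in Python, not TenantPilot code: the function name, the `base` and `cap` defaults, and the choice of "full jitter" are all assumptions. It treats a numeric `Retry-After` value as a lower bound on the wait and ignores the HTTP-date form.

```python
import random
from typing import Optional


def retry_delay_seconds(
    attempt: int,
    retry_after: Optional[str] = None,
    base: float = 1.0,   # assumed first-retry ceiling, in seconds
    cap: float = 60.0,   # assumed upper limit for the exponential part
    rng: Optional[random.Random] = None,
) -> float:
    """Return how long to wait before retry number `attempt` (1-based)."""
    rng = rng or random.Random()
    # Exponential ceiling: base, 2*base, 4*base, ... limited by cap.
    ceiling = min(cap, base * (2 ** (attempt - 1)))
    # Full jitter: pick uniformly in [0, ceiling] so workers do not retry in lockstep.
    delay = rng.uniform(0.0, ceiling)
    # A numeric Retry-After from the server is honored as a minimum wait.
    if retry_after is not None:
        try:
            delay = max(delay, float(retry_after))
        except ValueError:
            pass  # HTTP-date form is not handled in this sketch
    return delay
```

In a queued job the returned value would typically become the release delay for the next attempt, with a separate limit on total attempts so that retry exhaustion is visible.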

## Synchronous vs Asynchronous

Keep synchronous:

- Rendering Filament pages.
- Validating form/action input.
- Creating operation intent records.
- Small DB-only state transitions.
- Showing preview summaries from already persisted data.

Move to asynchronous (queued) execution:

- Microsoft Graph reads/writes.
- Backup set item capture.
- Restore execution.
- Bulk export/import.
- Compliance/evidence snapshots.
- Long report generation.
- Notification delivery retries.
- Any workflow likely to exceed 2-5 seconds.

## Filament Table Rules

- Always define a default sort.
- Eager-load relationships used by visible columns.
- Use `withCount()`/aggregate subqueries instead of per-row counts.
- Hide technical columns by default.
- Persist table state (filters, search, sort) in the session only on investigative resources.
- Avoid computed columns that perform per-row service calls.
- Avoid Graph calls during table render.

## Database Rules

- Prefer `jsonb` for raw Graph snapshots, backup payloads, restore previews/results, evidence summaries, and audit metadata that must be queried.
- Add GIN indexes only when a query path exists; prefer expression indexes for common JSON paths.
- Add composite indexes for workspace/tenant/time/status list filters.
- Add partial unique indexes for active run/idempotency constraints.
- Keep migrations incremental and reversible where practical.
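
The partial unique index rule for active runs can be illustrated with a runnable sketch. It uses Python's built-in `sqlite3` as a stand-in for PostgreSQL (both support `CREATE UNIQUE INDEX ... WHERE`); the table name, column names, and status values are hypothetical and not taken from the TenantPilot schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE operation_runs (
        id INTEGER PRIMARY KEY,
        tenant_id INTEGER NOT NULL,
        type TEXT NOT NULL,
        status TEXT NOT NULL
    );
    -- Only rows that are still active take part in the uniqueness check.
    CREATE UNIQUE INDEX operation_runs_one_active
        ON operation_runs (tenant_id, type)
        WHERE status = 'queued' OR status = 'running';
    """
)


def start_run(tenant_id: int, run_type: str) -> bool:
    """Try to record a new active run; return False if one already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO operation_runs (tenant_id, type, status) "
                "VALUES (?, ?, 'queued')",
                (tenant_id, run_type),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def finish_run(tenant_id: int, run_type: str) -> None:
    """Mark the active run as completed, which removes it from the index."""
    with conn:
        conn.execute(
            "UPDATE operation_runs SET status = 'completed' "
            "WHERE tenant_id = ? AND type = ? "
            "AND (status = 'queued' OR status = 'running')",
            (tenant_id, run_type),
        )
```

Because the database enforces the constraint, two concurrent requests cannot both create an active run for the same tenant and type, and completed runs never block new ones.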

## Queue Strategy

MVP:

- Database queue is acceptable for local and low-volume staging.
- Jobs must be idempotent and observable.
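- The worker timeout must be several seconds lower than the queue connection's `retry_after`; otherwise a job can be released and picked up again while the first attempt is still running.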
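
The timeout rule lends itself to an automated check, for example in CI or a deploy script. The following is a minimal sketch in Python rather than Laravel code; the function name and the 5-second safety margin are assumptions.

```python
from typing import List


def check_queue_timing(worker_timeout: int, retry_after: int, margin: int = 5) -> List[str]:
    """Return a list of problems with the timeout/retry_after pair (empty if fine)."""
    problems: List[str] = []
    if worker_timeout >= retry_after:
        # The queue would hand the job to a second worker before the first is killed.
        problems.append(
            f"worker timeout ({worker_timeout}s) must be lower than retry_after ({retry_after}s)"
        )
    elif retry_after - worker_timeout < margin:
        # Technically valid, but leaves little room for shutdown and clock skew.
        problems.append(
            f"gap between timeout and retry_after is under {margin}s"
        )
    return problems
```

For example, `check_queue_timing(60, 90)` returns an empty list, while `check_queue_timing(90, 90)` reports a violation.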

Scale-up:

- Move production queues to Redis.
- Split queues: `high`, `default`, `graph`, `restore`, `reports`, `notifications`.
- Run separate worker counts per queue.
- Use process supervision in the Dokploy/container runtime.
- Restart or reload workers on every deploy so they pick up new code (for example with `php artisan queue:restart`).

## Caching Strategy

- Cache stable config-derived capability maps.
- Cache dashboard aggregates only when invalidation is clear.
- Do not cache tenant authorization decisions across membership changes unless invalidation is proven.
- Avoid caching raw Graph secrets or token payloads.
- Use Redis for locks and cache in production when queue/scheduler scale increases.

## Monitoring Metrics

- HTTP p50/p95/p99 response time by route/panel.
- Livewire request duration and error rate.
- DB query count and slow queries by page/action.
- Queue depth, job latency, failures, retries, max runtime.
- Scheduler last-success timestamp per scheduled command.
- Graph 429/503 count, `Retry-After` seconds, retry exhaustion.
- `OperationRun` created/running/failed/partial counts.
- Audit log write failures.
- Backup/restore duration and item failure rate.
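
For the p50/p95/p99 metrics above, a monitoring backend would normally do the aggregation. Where durations are only available as raw samples (for example from logs during a load test), a nearest-rank percentile is enough for a first look. This is a sketch with hypothetical function names, shown in Python for illustration.

```python
import math
from typing import Dict, Sequence


def percentile(samples: Sequence[float], p: int) -> float:
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("percentile of empty sample set is undefined")
    ordered = sorted(samples)
    # Rank is 1-based; p * n is computed before dividing to avoid rounding surprises.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def latency_summary(samples_ms: Sequence[float]) -> Dict[str, float]:
    """Summarize request durations (milliseconds) as p50/p95/p99."""
    return {
        "p50": percentile(samples_ms, 50),
        "p95": percentile(samples_ms, 95),
        "p99": percentile(samples_ms, 99),
    }
```

Grouping the samples by route or panel before calling `latency_summary` gives the per-surface breakdown the first metric asks for.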

## Load Test Recommendations

- List 10k policies and 100k policy versions per workspace.
- Render backup and restore tables with 50k backup items.
- Simulate concurrent backup schedule runs for multiple tenants.
- Simulate Graph 429/503 responses and verify retry/backoff budgets.
- Exercise dashboard widgets with realistic operation/finding history.