# TenantPilot Performance Guidelines

Status: 2026-05-15

Applies to: Laravel 12, Filament 5, Livewire 4, PostgreSQL 16, Microsoft Graph.

## Performance Target

TenantPilot should keep interactive admin requests short and move remote, large, retryable, or long-running work into queued operations with visible `OperationRun` state.

## Current Performance Risks

| Risk | Evidence | Priority | Mitigation |
|---|---|---:|---|
| Queryable payloads still in `json` | policy versions, backup items, restore runs, audit logs | P1 | Convert to JSONB where queried; add targeted GIN/expression indexes. |
| Large Filament pages/resources | 1,000-5,700 LOC classes | P1 | Extract tables/actions and review N+1 risks per surface. |
| Database queue for all work | `.env.example` and queue config | P2 | Move high-volume Graph/restore work to Redis queue when load grows. |
| Dashboard/widget query cost | multiple KPI/list widgets | P2 | Cache or precompute expensive aggregate metrics. |
| Graph throttling | Microsoft Graph 429/503 behavior | P1 | Honor `Retry-After`, use exponential backoff with jitter, avoid polling. |

## Synchronous vs Asynchronous

Keep synchronous:

- Rendering Filament pages.
- Validating form/action input.
- Creating operation intent records.
- Small DB-only state transitions.
- Showing preview summaries from already persisted data.

Move asynchronous:

- Microsoft Graph reads/writes.
- Backup set item capture.
- Restore execution.
- Bulk export/import.
- Compliance/evidence snapshots.
- Long report generation.
- Notification delivery retries.
- Any workflow likely to exceed 2-5 seconds.

## Filament Table Rules

- Always define a default sort.
- Eager-load relationships used by visible columns.
- Use `withCount()`/aggregate subqueries instead of per-row counts.
- Hide technical columns by default.
- Use session persistence only on investigative resources.
- Avoid computed columns that perform per-row service calls.
- Avoid Graph calls during table render.
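
The `withCount()` rule above amounts to replacing one count query per table row with a single query that carries a correlated count subquery. The following is a minimal, framework-neutral sketch in Python with an in-memory SQLite database, not TenantPilot code. The `policies` and `policy_versions` tables, their columns, and the function names are illustrative assumptions, not the actual schema.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory database (hypothetical schema, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE policies (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE policy_versions (
            id INTEGER PRIMARY KEY,
            policy_id INTEGER NOT NULL REFERENCES policies(id)
        );
        INSERT INTO policies (id, name) VALUES (1, 'baseline'), (2, 'kiosk'), (3, 'empty');
        INSERT INTO policy_versions (policy_id) VALUES (1), (1), (1), (2);
        """
    )
    return conn


def counts_per_row(conn):
    """N+1 shape: one query for the list, then one count query per listed row."""
    queries = 1
    rows = conn.execute("SELECT id, name FROM policies ORDER BY id").fetchall()
    result = {}
    for policy_id, name in rows:
        (count,) = conn.execute(
            "SELECT COUNT(*) FROM policy_versions WHERE policy_id = ?",
            (policy_id,),
        ).fetchone()
        queries += 1
        result[name] = count
    return result, queries


def counts_aggregated(conn):
    """Single query: the count is a correlated subquery in the select list."""
    rows = conn.execute(
        """
        SELECT p.name,
               (SELECT COUNT(*) FROM policy_versions v WHERE v.policy_id = p.id)
        FROM policies p
        ORDER BY p.id
        """
    ).fetchall()
    return dict(rows), 1


if __name__ == "__main__":
    conn = make_db()
    print(counts_per_row(conn))
    print(counts_aggregated(conn))
```

Both functions return the same counts, but the per-row variant issues one extra query for every row on the page, which is the cost that grows with table page size.
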
## Database Rules

- Prefer `jsonb` for raw Graph snapshots, backup payloads, restore previews/results, evidence summaries, and audit metadata that must be queried.
- Add GIN indexes only when a query path exists; prefer expression indexes for common JSON paths.
- Add composite indexes for workspace/tenant/time/status list filters.
- Add partial unique indexes for active run/idempotency constraints.
- Keep migrations incremental and reversible where practical.

## Queue Strategy

MVP:

- Database queue is acceptable for local and low-volume staging.
- Jobs must be idempotent and observable.
- Worker timeout must be lower than `retry_after`.

Scale-up:

- Move production queues to Redis.
- Split queues: `high`, `default`, `graph`, `restore`, `reports`, `notifications`.
- Run separate worker counts per queue.
- Use process supervision in Dokploy/container runtime.
- Restart/reload workers on every deploy.

## Caching Strategy

- Cache stable config-derived capability maps.
- Cache dashboard aggregates only when invalidation is clear.
- Do not cache tenant authorization decisions across membership changes unless invalidation is proven.
- Avoid caching raw Graph secrets or token payloads.
- Use Redis for locks and cache in production when queue/scheduler scale increases.

## Monitoring Metrics

- HTTP p50/p95/p99 response time by route/panel.
- Livewire request duration and error rate.
- DB query count and slow queries by page/action.
- Queue depth, job latency, failures, retries, max runtime.
- Scheduler last-success timestamp per scheduled command.
- Graph 429/503 count, retry-after seconds, retry exhaustion.
- OperationRun created/running/failed/partial counts.
- Audit log write failures.
- Backup/restore duration and item failure rate.

## Load Test Recommendations

- List 10k policies and 100k policy versions per workspace.
- Render backup and restore tables with 50k backup items.
- Simulate concurrent backup schedule runs for multiple tenants.
- Simulate Graph 429/503 responses and verify retry/backoff budgets.
- Exercise dashboard widgets with realistic operation/finding history.
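
To make "retry/backoff budgets" concrete when simulating Graph 429/503 responses, the delay logic can be reasoned about separately from the HTTP client. The following is a minimal sketch in Python of exponential backoff with full jitter that honors `Retry-After`; it is not the project's implementation. The function names, the `base` and `cap` defaults, and the assumption that `Retry-After` has already been parsed into seconds are all illustrative.

```python
import random


def retry_delay(attempt, retry_after=None, base=1.0, cap=60.0, rng=random.random):
    """Return seconds to wait before retry number `attempt` (1-based).

    `retry_after` is the server hint in seconds, assumed already parsed from
    the `Retry-After` header. `base` and `cap` are illustrative defaults.
    """
    if retry_after is not None and retry_after >= 0:
        # The server hint wins; a small jitter keeps workers from waking together.
        return retry_after + rng() * base
    # Exponential ceiling: base, 2*base, 4*base, ... limited by cap.
    ceiling = min(cap, base * 2 ** (attempt - 1))
    # Full jitter: pick a uniformly random delay below the ceiling.
    return rng() * ceiling


def worst_case_budget(max_attempts, base=1.0, cap=60.0):
    """Upper bound on total wait time when no `Retry-After` hint is given."""
    return sum(min(cap, base * 2 ** (a - 1)) for a in range(1, max_attempts + 1))


if __name__ == "__main__":
    for attempt in range(1, 6):
        print(attempt, round(retry_delay(attempt), 2))
    print("budget:", worst_case_budget(5))
```

A load test can compare the observed total wait per job against `worst_case_budget` (plus any `Retry-After` values returned by the simulated responses) to confirm that retries stay inside the job's timeout and `retry_after` settings.
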