TenantAtlas/docs/deployment-checklist.md

TenantPilot Deployment Checklist

Status: 2026-05-15
Target: Sail locally, Dokploy-first staging/production, PostgreSQL, container-based deployment.

Production Readiness Checklist

  • Staging environment exists and is the mandatory production gate.
  • APP_ENV=production and APP_DEBUG=false.
  • APP_KEY is stable, secret, and backed up securely.
  • Database is PostgreSQL 16-compatible.
  • Storage volumes/private object storage are persistent.
  • Queue workers and scheduler are explicitly configured.
  • Health check route /up is monitored.
  • Logs are collected outside the container.
  • Backups are encrypted and restore-tested.
  • Dependency audits are clean or exceptions are approved.
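
The environment items above lend themselves to an automated guard that runs before a release. The following is a minimal sketch, assuming the values are exposed as environment variables in the deploy shell; the function name `check_production_env` and its messages are hypothetical, not part of the project.

```shell
# Hypothetical pre-deploy guard; the function name and messages are illustrative.
# Prints one line per violated rule and returns non-zero if any rule fails.
check_production_env() {
  local failures=0

  if [ "${APP_ENV:-}" != "production" ]; then
    echo "APP_ENV must be 'production' (got '${APP_ENV:-unset}')"
    failures=$((failures + 1))
  fi

  if [ "${APP_DEBUG:-}" != "false" ]; then
    echo "APP_DEBUG must be 'false' (got '${APP_DEBUG:-unset}')"
    failures=$((failures + 1))
  fi

  # Only checks presence; key stability and backups still need a manual review.
  if [ -z "${APP_KEY:-}" ]; then
    echo "APP_KEY must be set"
    failures=$((failures + 1))
  fi

  [ "$failures" -eq 0 ]
}
```

Calling `check_production_env || exit 1` at the top of a deploy script would stop the release before any build step runs.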

Build and Release Checklist

  1. cd apps/platform && composer install --no-dev --optimize-autoloader
  2. cd apps/platform && corepack pnpm install --frozen-lockfile
  3. cd apps/platform && corepack pnpm build
  4. cd apps/platform && php artisan filament:assets
  5. cd apps/platform && php artisan migrate --force
  6. cd apps/platform && php artisan optimize
  7. Restart or reload long-running services with php artisan reload or php artisan queue:restart depending on runtime setup.
  8. Verify /up.
  9. Verify login, tenant selection, queue dispatch, and audit write on staging.
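
Steps 1 to 6 above can be chained so that a failed step stops the release before later steps, such as the migration, run. The sketch below shows one way to do that; `run_step`, `release`, and the `DRY_RUN` switch are hypothetical helpers introduced for illustration, and only the commands themselves come from the checklist.

```shell
# Sketch only: DRY_RUN and the helper names are assumptions, not project tooling.

# Echoes the command in dry-run mode, otherwise executes it.
run_step() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Runs the checklist steps in order inside a subshell, so the caller's working
# directory is untouched. The && chain stops at the first failing step.
release() (
  app_dir="${1:-apps/platform}"
  if [ "${DRY_RUN:-0}" != "1" ]; then
    cd "$app_dir" || exit 1
  fi
  run_step composer install --no-dev --optimize-autoloader &&
    run_step corepack pnpm install --frozen-lockfile &&
    run_step corepack pnpm build &&
    run_step php artisan filament:assets &&
    run_step php artisan migrate --force &&
    run_step php artisan optimize
)
```

Running `DRY_RUN=1 release` prints the planned commands without executing them, which is a cheap way to review the order on staging. Restarting workers and verifying /up remain separate steps.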

Queue Worker Checklist

Do not use queue:listen for production workers.

Recommended baseline:

php artisan queue:work database --queue=high,default,graph,restore,reports,notifications --sleep=3 --tries=3 --timeout=300

When Redis is enabled:

php artisan queue:work redis --queue=high,default,graph,restore,reports,notifications --sleep=3 --tries=3 --timeout=300

Rules:

  • Use process supervision so exited workers restart.
  • Keep worker --timeout lower than queue retry_after, so a stuck job is terminated before the queue hands it to another worker.
  • Reload/restart workers on deploy.
  • Track queue depth and failed jobs.
  • Run destructive restore and backup jobs on separate queues when volume grows.
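
The timeout rule above is easy to get wrong when the two values live in different places (the worker command line and the queue connection config). A small guard like the following can catch the mismatch in CI or at deploy time. This is a sketch: `timeout_is_safe` is a hypothetical helper, and the `retry_after` value of 360 in the usage note is illustrative, not the project's setting.

```shell
# Hypothetical helper: returns 0 when the worker timeout is strictly lower
# than retry_after. Both arguments are whole seconds.
timeout_is_safe() {
  local worker_timeout="$1" retry_after="$2"
  [ "$worker_timeout" -lt "$retry_after" ]
}
```

For example, `timeout_is_safe 300 360` succeeds, while `timeout_is_safe 300 300` fails because equal values leave no headroom.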

PDF Renderer Checklist

Spec 378 approves Gotenberg 8 Chromium only as an internal renderer service for report-style PDFs.

  • Use a pinned gotenberg/gotenberg:8.34.0-chromium image or an explicitly reviewed immutable digest.
  • Keep the renderer on the internal Dokploy/container network only; do not publish a public port.
  • Configure /health monitoring for the renderer service.
  • Set TENANTPILOT_PDF_RENDERER_BASE_URL to the internal service URL, for example http://gotenberg:3000.
  • Set explicit renderer timeouts and limits: TENANTPILOT_PDF_RENDERER_TIMEOUT_SECONDS, TENANTPILOT_PDF_RENDERER_CONNECT_TIMEOUT_SECONDS, TENANTPILOT_PDF_RENDERER_MAX_HTML_BYTES, TENANTPILOT_PDF_RENDERER_MAX_ASSET_BYTES, and TENANTPILOT_PDF_RENDERER_MAX_OUTPUT_BYTES.
  • Set matching Gotenberg service controls: API_TIMEOUT, API_BODY_LIMIT, API_CORRELATION_ID_HEADER, CHROMIUM_START_TIMEOUT, CHROMIUM_MAX_QUEUE_SIZE, and CHROMIUM_MAX_CONCURRENCY.
  • Keep API_DISABLE_DOWNLOAD_FROM=true, WEBHOOK_DISABLE=true, CHROMIUM_ALLOW_FILE_ACCESS_FROM_FILES=true, CHROMIUM_ALLOW_LIST=^file:///tmp/.*$, CHROMIUM_DENY_PRIVATE_IPS=true, and CHROMIUM_DENY_PUBLIC_IPS=true unless a later spec approves external asset fetches. The file allow-list is required for Gotenberg's uploaded index.html conversion path and must not be widened to external URLs.
  • Do not install Node, Puppeteer, Chrome, Chromium, or browser binaries in Laravel web or queue containers for production PDF rendering.
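
To see what the file allow-list above admits, the pattern can be tried against sample URLs. The sketch below uses `grep -E` as a stand-in matcher, which is an assumption: Gotenberg evaluates the pattern with its own regex engine, although a pattern this simple should behave the same way. The helper name `url_allowed` and the sample paths are hypothetical.

```shell
# Pattern copied from the checklist; grep -E is used here only as a stand-in matcher.
ALLOW_LIST='^file:///tmp/.*$'

# Hypothetical helper: returns 0 when the URL matches the allow-list pattern.
url_allowed() {
  printf '%s\n' "$1" | grep -Eq "$ALLOW_LIST"
}
```

With this pattern, `url_allowed 'file:///tmp/abc123/index.html'` succeeds, while `url_allowed 'https://example.com/logo.png'` and `url_allowed 'file:///etc/passwd'` both fail.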

Scheduler Checklist

  • One scheduler instance per environment.
  • Use Laravel scheduler with withoutOverlapping() for recurring jobs.
  • Monitor last successful scheduler tick and per-command failures.
  • Long-running scheduled work dispatches jobs rather than doing Graph work inline.

Migration Checklist

  • Review locks and table size before staging.
  • Backfill in chunks where needed.
  • Avoid irreversible destructive schema changes after the production launch unless a forward-only rollback plan is documented.
  • JSON to JSONB conversions need staging timing proof.
  • Composite FK and partial index changes need PostgreSQL CI/staging validation.

Rollback Checklist

  • Keep previous image available.
  • Know whether rollback is code-only or code+schema.
  • For forward-only migrations, ship a forward fix instead of an unsafe down migration.
  • Pause workers before risky rollback if queued payload formats changed.
  • Verify audit logs and operation runs remain readable.

Backup/Restore Checklist

  • Database backups encrypted.
  • Storage backups encrypted.
  • Provider credentials excluded from logs and exports.
  • Restore tested on staging from a real backup.
  • Backup retention and deletion documented.
  • Restore runbook includes queue/scheduler coordination.

Monitoring Checklist

  • /up uptime check.
  • Laravel logs and container logs centralized.
  • Queue failures and long-running jobs alerted.
  • Scheduler missed-run alert.
  • Database connections, slow queries, disk, and backup freshness monitored.
  • Graph 429/503 rates visible.
  • Error tracking integrated before production.

Dokploy Notes

  • Treat Dokploy as the process/orchestration layer, not as application governance.
  • Ensure web, queue, and scheduler processes are separate service definitions or entrypoints.
  • Persist storage/, database volumes, and uploaded/private files.
  • Do not bake .env into images.
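
One common way to keep web, queue, and scheduler as separate service definitions while sharing a single image is a role-based entrypoint. The following is a minimal sketch under stated assumptions: the `role_command` helper and the `CONTAINER_ROLE` variable are hypothetical, and the web command assumes PHP-FPM, which the checklist does not specify. The queue command is the database baseline from the Queue Worker Checklist.

```shell
# Hypothetical entrypoint helper: prints the long-running command for a role.
# Unknown roles fail loudly instead of silently starting the wrong process.
role_command() {
  case "$1" in
    web)
      # Assumption: the web image runs PHP-FPM; swap in your actual server.
      echo "php-fpm --nodaemonize"
      ;;
    queue)
      echo "php artisan queue:work database --queue=high,default,graph,restore,reports,notifications --sleep=3 --tries=3 --timeout=300"
      ;;
    scheduler)
      # schedule:work keeps the Laravel scheduler running in the foreground.
      echo "php artisan schedule:work"
      ;;
    *)
      echo "unknown role: $1" >&2
      return 1
      ;;
  esac
}

# Example entrypoint usage (commented out so the sketch has no side effects):
#   cmd="$(role_command "${CONTAINER_ROLE:-web}")" || exit 1
#   exec $cmd
```

Each Dokploy service would then set its own `CONTAINER_ROLE`, and only one service should use the scheduler role per environment.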