lms/specs/001-add-dokploy-deploy/runbooks/failed-deploy.md

1.0 KiB

Runbook: Investigating a failed Dokploy deployment

  1. Open Dokploy project and find the failed run. Note commit_sha, run id, and timestamp.
  2. Stream or download the run logs. Identify service-level errors (image pull failures, healthcheck failures, compose syntax errors).
  3. Common checks:
    • Image pull/auth errors → verify REGISTRY_USERNAME/REGISTRY_PASSWORD Dokploy secrets.
    • Compose parsing errors → inspect docker-compose.yml in the referenced commit.
    • Healthcheck failures → check application logs and health endpoint behavior.
  4. If the failure is due to the new commit, consider rollback: trigger manual deploy for the last successful commit SHA via Dokploy UI or the manual deploy payload.
  5. If credentials are incorrect, rotate secrets in Dokploy and re-run the deployment.
  6. Document root cause in the deployment run notes and open an incident if production impact is high.

Escalation:

  • If unable to resolve within 30 minutes and service is degraded, notify on-call operator and provide run id and logs.