lms/specs/001-add-dokploy-deploy/runbooks/failed-deploy.md

# Runbook: Investigating a failed Dokploy deployment
1. Open the Dokploy project and find the failed run. Note the `commit_sha`, run ID, and timestamp.
2. Stream or download the run logs. Identify service-level errors (image pull failures, healthcheck failures, compose syntax errors).
3. Common checks:
   - Image pull/auth errors → verify `REGISTRY_USERNAME`/`REGISTRY_PASSWORD` Dokploy secrets.
   - Compose parsing errors → inspect `docker-compose.yml` in the referenced commit.
   - Healthcheck failures → check application logs and health endpoint behavior.
4. If the failure is due to the new commit, consider a rollback: trigger a manual deploy for the last successful commit SHA via the Dokploy UI or the manual deploy payload.
5. If the credentials are incorrect, rotate the secrets in Dokploy and re-run the deployment.
6. Document the root cause in the deployment run notes and open an incident if the production impact is high.
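
The triage in steps 2 and 3 can be partly automated on downloaded logs. The following is a minimal sketch, not part of the Dokploy tooling: the regular expressions in `FAILURE_PATTERNS` are assumptions about typical Docker and Compose error text, and the names `classify_line` and `triage` are hypothetical. Adjust the patterns to the messages your runs actually emit.

```python
import re
from collections import Counter

# Assumed log fragments for each failure class from step 3.
# These patterns are illustrative guesses, not an official list.
FAILURE_PATTERNS = {
    "image_pull": re.compile(r"pull access denied|unauthorized|failed to pull", re.I),
    "compose_syntax": re.compile(r"yaml:|invalid compose|additional property", re.I),
    "healthcheck": re.compile(r"unhealthy|health ?check.*(failed|timeout)", re.I),
}

# What the runbook says to check for each failure class.
NEXT_CHECK = {
    "image_pull": "Verify REGISTRY_USERNAME/REGISTRY_PASSWORD Dokploy secrets.",
    "compose_syntax": "Inspect docker-compose.yml in the referenced commit.",
    "healthcheck": "Check application logs and health endpoint behavior.",
}


def classify_line(line):
    """Return the first matching failure class for a log line, or None."""
    for category, pattern in FAILURE_PATTERNS.items():
        if pattern.search(line):
            return category
    return None


def triage(log_text):
    """Count how many log lines fall into each failure class."""
    counts = Counter()
    for line in log_text.splitlines():
        category = classify_line(line)
        if category is not None:
            counts[category] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "Error response from daemon: pull access denied for example/app\n"
        "container app is unhealthy\n"
        "starting service app\n"
    )
    for category, hits in triage(sample).most_common():
        print(f"{category}: {hits} line(s). {NEXT_CHECK[category]}")
```

A count per class is only a hint about where to look first; the run logs remain the source of truth for the root cause.
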

Escalation:
- If the failure is not resolved within 30 minutes and the service is degraded, notify the on-call operator and provide the run ID and logs.
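
The escalation rule can be expressed as a small check, for example inside a chat bot or a monitoring script. This is a sketch under assumptions: the function name `should_escalate` and its parameters are hypothetical, and it treats the 30-minute window as starting at the failed run's timestamp noted in step 1.

```python
from datetime import datetime, timedelta, timezone

# Window taken from the escalation rule above.
ESCALATION_WINDOW = timedelta(minutes=30)


def should_escalate(failed_at, now, service_degraded, resolved=False):
    """Return True when the on-call operator should be notified.

    failed_at and now are timezone-aware datetimes; failed_at is assumed
    to be the timestamp of the failed run.
    """
    if resolved or not service_degraded:
        return False
    return now - failed_at >= ESCALATION_WINDOW


if __name__ == "__main__":
    failed_at = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)  # example value
    now = failed_at + timedelta(minutes=35)
    if should_escalate(failed_at, now, service_degraded=True):
        print("Notify on-call: include the run ID and logs.")
```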