TenantAtlas/specs/406-governance-artifact-lifecycle-retention/plan.md
ahmido bd6f59bb7c feat: add governance artifact lifecycle retention contracts (#477)
Automated PR provided by Codex via Gitea API.

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #477
2026-06-24 08:29:30 +00:00

# Implementation Plan: Spec 406 - Governance Artifact Lifecycle & Retention
**Branch**: `406-governance-artifact-lifecycle-retention` | **Date**: 2026-06-23 | **Spec**: `specs/406-governance-artifact-lifecycle-retention/spec.md`
**Input**: Feature specification from `specs/406-governance-artifact-lifecycle-retention/spec.md`
## Summary
Prepare a bounded runtime hardening slice for existing governance artifacts. Implementation must inventory review packs, stored reports/management PDFs, evidence snapshots, customer-review outputs, OperationRun proof packages where exposed, and finding/exception artifacts; classify lifecycle behavior; add only necessary current-owner metadata/actions/jobs; prove hold/delete/export/download/file consistency; and produce a final lifecycle matrix. This is not a new artifact portal, purge platform, export center, compliance product, or full browser audit.
## Technical Context
**Language/Version**: PHP 8.4.15, Laravel 12.52.0, Filament 5.2.1, Livewire 4.1.4.
**Primary Dependencies**: Laravel models, migrations, policies/gates, storage disks, queues/scheduler where already used, Filament v5 actions/resources/pages, Pest 4, browser lane.
**Storage**: PostgreSQL; private `exports` disk for generated artifacts; existing artifact tables such as `review_packs`, `stored_reports`, `evidence_snapshots`, `operation_runs`, `finding_exceptions`, and related review/output tables.
**Testing**: Pest 4 Feature/Filament/Livewire/action tests, focused storage/file tests, targeted scheduler/command tests, focused browser proof.
**Validation Lanes**: fast-feedback, confidence, browser; PostgreSQL lane if migrations/constraints/indexes are added.
**Target Platform**: TenantPilot Laravel monolith under `apps/platform`, deployed through Sail locally and Dokploy for staging/production.
**Constraints**: no new major UI surface, no new panel/provider, no global artifact registry, no broad purge/export framework, no compliance claim, no evidence/currentness rewrite, no PDF layout/template rewrite, no JSONB/data-layer scope, and no completed-spec rewrite.
## Inherited Baseline
- Spec 267 delivered the shared read-only lifecycle/retention contract on evidence, tenant-review, review-pack, stored-report, and accepted-risk seams, then explicitly deferred hold and deletion-request persistence.
- `ReviewPack` has status, fingerprint, file disk/path/size/SHA, generated/expiry timestamps, evidence/review links, and retention config through `tenantpilot.review_pack`.
- `StoredReport` has report type/format/status/profile, payload, file disk/path/size/SHA, generated timestamp, operation/source links, and JSONB payload indexing.
- `EvidenceSnapshot` has status, completeness state, summary, fingerprints, generated/expiry timestamps, and links to review packs/environment reviews.
- `OperationRun` has separate execution truth through status/outcome/context/summary counts and must not become artifact lifecycle truth.
- `PruneReviewPacksCommand` and `PruneStoredReportsCommand` already provide retention-like behavior, but require lifecycle/hold/file consistency proof before broader claims.
- Specs 403, 404, and 405 are proof lineage for evidence/currentness, management-report PDF runtime, and JSONB storage. Their contracts must not regress.
- Specs 404 and 405 are `PASS WITH CONDITIONS` because external Staging/Dokploy proof remains unavailable. Spec 406 may harden local runtime semantics, but must not claim production/staging readiness for PDF, storage, deployment, or download paths unless new proof is collected or the carry-forward condition is explicitly not applicable.
## Technical Approach
1. Record branch, HEAD, dirty state, and `git diff --check`.
2. Create `implementation-report.md` before runtime changes and populate the lifecycle matrix as inventory proceeds.
3. Inventory in-scope artifact families, actions, routes, policies, commands/jobs, file dependencies, retention config, audit paths, and tests.
4. Classify each artifact family as `PASS`, `PASS WITH EXCEPTION`, `MISSING PROOF`, `DEFECT FOUND`, `PRODUCT DECISION REQUIRED`, or `DEFERRED`, and separately classify hold/delete support as `SUPPORTED_NOW`, `DEFERRED`, or `PRODUCT_DECISION_REQUIRED`.
5. Add tests before fixes where feasible for hold, delete/archive/expire, export/download, direct authorization, customer-safe output, and file/database consistency.
6. Implement only confirmed in-scope lifecycle behavior on existing artifact owners.
7. Run focused tests and browser proof.
8. Complete the implementation-report gate result and next recommendation.
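The two classification vocabularies from step 4 can be sketched as PHP enums, with one matrix row per artifact family. This is a minimal illustration, not the repository's code: `GateClassification`, `HoldDeleteSupport`, `LifecycleMatrixRow`, and the three-column row layout are hypothetical names and choices. Only the case values come from this plan, and the row at the end uses placeholder values rather than a real inventory result.

```php
<?php

declare(strict_types=1);

// Hypothetical enum: per-family gate classification (values taken from step 4).
enum GateClassification: string
{
    case Pass = 'PASS';
    case PassWithException = 'PASS WITH EXCEPTION';
    case MissingProof = 'MISSING PROOF';
    case DefectFound = 'DEFECT FOUND';
    case ProductDecisionRequired = 'PRODUCT DECISION REQUIRED';
    case Deferred = 'DEFERRED';
}

// Hypothetical enum: hold/delete support, classified separately from the gate result.
enum HoldDeleteSupport: string
{
    case SupportedNow = 'SUPPORTED_NOW';
    case Deferred = 'DEFERRED';
    case ProductDecisionRequired = 'PRODUCT_DECISION_REQUIRED';

    // Only SUPPORTED_NOW families may receive hold/delete columns, actions, or retention branches.
    public function permitsRuntimeChanges(): bool
    {
        return $this === self::SupportedNow;
    }
}

// Hypothetical value object for one line of the lifecycle matrix.
final readonly class LifecycleMatrixRow
{
    public function __construct(
        public string $family,
        public GateClassification $classification,
        public HoldDeleteSupport $holdDeleteSupport,
    ) {
    }

    // Renders one Markdown table row suitable for implementation-report.md.
    public function toMarkdown(): string
    {
        return sprintf(
            '| %s | %s | %s |',
            $this->family,
            $this->classification->value,
            $this->holdDeleteSupport->value,
        );
    }
}

// Placeholder values for illustration only; the real classification comes from the inventory.
$row = new LifecycleMatrixRow('review_packs', GateClassification::MissingProof, HoldDeleteSupport::Deferred);
echo $row->toMarkdown(), PHP_EOL;
```

Keeping the two enums separate mirrors the plan's requirement that a family's gate result and its hold/delete support are classified independently.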
## Likely Affected Repository Surfaces
Implementation must verify exact paths before editing.
```text
apps/platform/app/Models/ReviewPack.php
apps/platform/app/Models/StoredReport.php
apps/platform/app/Models/EvidenceSnapshot.php
apps/platform/app/Models/OperationRun.php
apps/platform/app/Models/FindingException.php
apps/platform/app/Models/FindingExceptionDecision.php
apps/platform/app/Filament/Resources/ReviewPackResource.php
apps/platform/app/Filament/Resources/ReviewPackResource/Pages/ViewReviewPack.php
apps/platform/app/Filament/Resources/EvidenceSnapshotResource.php
apps/platform/app/Filament/Resources/EvidenceSnapshotResource/Pages/ViewEvidenceSnapshot.php
apps/platform/app/Filament/Pages/Reviews/CustomerReviewWorkspace.php
apps/platform/app/Http/Controllers/ReviewPackDownloadController.php
apps/platform/app/Http/Controllers/ManagementReportPdfDownloadController.php
apps/platform/app/Support/Ui/GovernanceArtifactTruth/ArtifactTruthPresenter.php
apps/platform/app/Support/Ui/GovernanceArtifactTruth/ArtifactTruthEnvelope.php
apps/platform/app/Support/Badges/BadgeCatalog.php
apps/platform/app/Support/Badges/BadgeRenderer.php
apps/platform/app/Support/Badges/Domains/GovernanceArtifactLifecycleBadge.php
apps/platform/app/Support/Badges/Domains/GovernanceArtifactRetentionBadge.php
apps/platform/app/Support/Audit/AuditActionId.php
apps/platform/app/Services/Audit/WorkspaceAuditLogger.php
apps/platform/app/Services/ReviewPackService.php
apps/platform/app/Services/Evidence/EvidenceSnapshotService.php
apps/platform/app/Services/ManagementReports/
apps/platform/app/Console/Commands/PruneReviewPacksCommand.php
apps/platform/app/Console/Commands/PruneStoredReportsCommand.php
apps/platform/config/tenantpilot.php
apps/platform/config/filesystems.php
apps/platform/database/migrations/
apps/platform/tests/
```
## UI / Surface Guardrail Plan
- **Guardrail scope**: existing lifecycle/download/action/status surfaces.
- **Affected routes/pages/actions/states/navigation/panel/provider surfaces**: existing review-pack, evidence, stored-report/PDF, customer-review, and finding/exception artifact surfaces only.
- **No-impact class**: N/A.
- **Native vs custom classification summary**: native Filament resources/pages plus existing controllers.
- **Shared-family relevance**: evidence/report viewers, status messaging, download actions, dangerous lifecycle actions, audit links.
- **State layers in scope**: artifact lifecycle, retention, file availability, evidence/currentness, customer-safe availability, execution proof.
- **Audience modes in scope**: operator/MSP, customer/read-only, support/platform internal.
- **Decision/diagnostic/raw hierarchy plan**: decision-first; diagnostics secondary; support/raw evidence third.
- **Raw/support gating plan**: raw evidence, provider payloads, source keys, OperationRun internals, file paths, and technical errors remain hidden or capability-gated.
- **One-primary-action / duplicate-truth control**: each surface shows one lifecycle summary and one dominant next action; secondary lifecycle actions move to More/danger placement.
- **Handling modes by drift class or surface**: review-mandatory for all changed existing surfaces; exception-required for any new route/nav/page or broad lifecycle framework.
- **Repository-signal treatment**: hard-stop-candidate if implementation adds new panel/provider, generic artifact table, portal, export center, or purge framework without spec update.
- **Special surface test profiles**: standard-native-filament, shared-detail-family, download/controller, browser.
- **Required tests or manual smoke**: feature/action tests plus focused browser proof.
- **Exception path and spread control**: none planned. Any product surface exception must be recorded in the implementation report.
- **Active feature PR close-out entry**: Guardrail / Product Surface.
## Product Surface Contract Plan
- **Product Surface Contract reference**: `docs/product/standards/product-surface-contract.md`.
- **No-legacy posture**: canonical lifecycle behavior; no fallback readers, duplicate old labels, hidden compatibility routes, or old lifecycle copy.
- **Page archetype and surface budget plan**: Report Page, Receipt Page, Decision Page, Technical Annex, Search/Index Page; budgets expected to pass.
- **Technical Annex and deep-link demotion plan**: raw IDs, provider payloads, source keys, detector names, OperationRun internals, file paths, and low-level logs are demoted from customer/product defaults.
- **Canonical status vocabulary plan**: map to Ready, Needs attention, Blocked, Running, Failed, Expired, Historical, Superseded, Unknown, and severity vocabulary where visible.
- **Product Surface exceptions**: none planned.
- **Browser verification plan**: focused browser proof for representative lifecycle state, allowed download/export, blocked customer/internal access, `SUPPORTED_NOW` held-delete block where applicable, and missing-file/not-downloadable state.
- **Human Product Sanity plan**: review changed rendered behavior and final implementation report.
- **Visible complexity outcome target**: neutral or decreased.
- **Implementation report target**: `specs/406-governance-artifact-lifecycle-retention/implementation-report.md`.
## Filament / Livewire / Deployment Posture
- **Livewire v4 compliance**: Livewire 4.1.4 confirmed; any changed Filament page/resource remains Livewire v4 compatible.
- **Panel provider registration location**: Laravel 12 panel providers remain in `apps/platform/bootstrap/providers.php`; no provider registration change is planned.
- **Global search posture**: no new globally searchable resource is planned. Existing affected resources must either remain non-searchable or have safe View/Edit pages and scoped queries.
- **Destructive/high-impact action posture**: delete, hard-delete, purge-like, hold release that enables deletion, archive/expire where availability changes, export, and regeneration-affecting lifecycle actions are high-impact. Filament actions must use `->action(...)`, `->requiresConfirmation()`, policy/gate authorization, and audit proof.
- **Asset strategy**: no new assets or `FilamentAsset::register()` work planned. Existing deployment baseline for `php artisan filament:assets` remains unchanged unless implementation adds registered assets, which should stop for spec update.
- **Testing plan**: Feature/Filament/Livewire action tests, storage/file tests, command/job tests, direct-route authorization tests, customer-safe output tests, focused browser proof.
- **Deployment impact**: possible migrations, config/env additions for retention defaults, scheduler/command behavior, queue/storage implications, and Dokploy staging validation. No Graph scopes or provider credentials expected.
## RBAC / Security / Audit Plan
- Workspace membership and managed-environment entitlement remain isolation boundaries.
- Wrong workspace/environment/non-member access returns 404.
- In-scope capability denial returns 403.
- Customer reviewers remain read-only and limited to customer-safe output.
- UI visibility is not authorization; direct routes/actions must enforce policies/gates.
- Audit events must redact secrets, raw provider payloads, stack traces, internal exception bodies, and sensitive file paths.
- Lifecycle audit metadata should include actor, workspace, managed environment, artifact family, safe artifact reference, old state, new state, result, and failure reason.
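The 404/403 split above can be read as an ordering rule: isolation checks run before capability checks. The sketch below shows that ordering in plain PHP. The function name `lifecycleAccessStatus` and its boolean inputs are hypothetical; in the application these decisions would come from the existing policies and gates, not from a helper like this.

```php
<?php

declare(strict_types=1);

// Hypothetical helper: maps the plan's isolation and capability rules to an HTTP status.
function lifecycleAccessStatus(
    bool $isWorkspaceMember,
    bool $isEntitledToEnvironment,
    bool $hasCapability,
): int {
    // Isolation boundaries are checked first, so out-of-scope callers get 404
    // regardless of which capabilities they hold.
    if (! $isWorkspaceMember || ! $isEntitledToEnvironment) {
        return 404;
    }

    // The caller is in scope but lacks the capability for this action.
    if (! $hasCapability) {
        return 403;
    }

    return 200;
}

// A non-member with the capability still receives 404, never 403.
echo lifecycleAccessStatus(false, true, true), PHP_EOL;
```

Direct-route and action tests can assert the same ordering against real routes, which keeps UI visibility from standing in for authorization.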
## Data / Migration Plan
- Reuse existing columns first: `status`, `expires_at`, file metadata, fingerprints, source links, and generated timestamps.
- Add current-table fields only when required for behavior, such as `held_at`, `held_by_user_id`, `hold_reason`, `archived_at`, `deleted_at`, `deleted_by_user_id`, or deletion-request fields.
- Hold/delete metadata requires per-family `SUPPORTED_NOW` classification before migration. `DEFERRED` or `PRODUCT_DECISION_REQUIRED` means no partial column, action, retention branch, or controller behavior is added for that family in Spec 406.
- Do not add a generic artifact table by default.
- If migrations are needed, make them reversible where possible and document PostgreSQL/staging risk.
- Add indexes only for proven query paths such as retention scans or held-state lookup.
- Do not create compatibility shims unless the spec is updated with an explicit exception.
## Retention / File Consistency Plan
- Review `PruneReviewPacksCommand`, `PruneStoredReportsCommand`, `tenantpilot.review_pack.*`, `tenantpilot.stored_reports.*`, and `filesystems.exports`.
- File/database consistency must be proven before any ready/downloadable state is shown.
- Retention cleanup must skip held artifacts only for families with hold support classified `SUPPORTED_NOW`; deferred families must preserve current behavior and record explicit follow-up/no-runtime-mutation rationale.
- Missing/zero-byte/corrupt/wrong-disk files must fail closed.
- Delete/archive/expire behavior must define whether files are removed, blocked, or retained.
- Signed URLs must re-check current artifact state and file availability at request time.
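As an illustration of the fail-closed rule, the check below compares a stored file record with what is actually on disk and returns a blocking reason on any mismatch. It is a sketch under stated assumptions: `ArtifactFileRecord`, `fileInconsistency`, and the reason strings are hypothetical, and plain filesystem functions stand in for whatever storage abstraction the application uses for the `exports` disk.

```php
<?php

declare(strict_types=1);

// Hypothetical value object mirroring the stored file metadata (disk, path, size, SHA).
final readonly class ArtifactFileRecord
{
    public function __construct(
        public string $disk,
        public string $absolutePath,
        public int $sizeBytes,
        public string $sha256,
    ) {
    }
}

// Returns null when the file matches its record, otherwise a reason to block the download.
function fileInconsistency(ArtifactFileRecord $record, string $expectedDisk = 'exports'): ?string
{
    if ($record->disk !== $expectedDisk) {
        return 'wrong_disk';
    }

    // Drop cached stat results so the check reflects the file as it is at request time.
    clearstatcache(true, $record->absolutePath);

    if (! is_file($record->absolutePath)) {
        return 'missing';
    }

    $actualSize = filesize($record->absolutePath);
    if ($actualSize === false) {
        return 'unreadable';
    }
    if ($actualSize === 0) {
        return 'zero_byte';
    }
    if ($actualSize !== $record->sizeBytes) {
        return 'size_mismatch';
    }

    $actualHash = hash_file('sha256', $record->absolutePath);
    if ($actualHash === false || ! hash_equals($record->sha256, $actualHash)) {
        return 'checksum_mismatch';
    }

    return null;
}

// Usage with a temporary file standing in for a generated artifact.
$bytes = 'example artifact bytes';
$path = tempnam(sys_get_temp_dir(), 'artifact_');
file_put_contents($path, $bytes);
$record = new ArtifactFileRecord('exports', $path, strlen($bytes), hash('sha256', $bytes));

$whenIntact = fileInconsistency($record);    // null: record and file agree
file_put_contents($path, '');
$whenTruncated = fileInconsistency($record); // 'zero_byte'
unlink($path);
$whenDeleted = fileInconsistency($record);   // 'missing'
```

Running the same check both before showing a ready/downloadable state and again inside the signed-URL handler is one way to satisfy the request-time re-check.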
## OperationRun / Observability Plan
- Artifact lifecycle state must not replace `OperationRun.status` or `OperationRun.outcome`.
- Existing review-pack/PDF generation OperationRun paths remain shared.
- Long-running export/delete/purge-class work should reuse the existing OperationRun start/completion UX. If that work exceeds this slice, split it.
- Direct local lifecycle mutations may stay audit-only when they are bounded DB/file operations that are neither queued nor cross-resource.
- Final report must classify which lifecycle actions created OperationRuns and which produced audit-only proof.
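One possible reading of the split above, sketched as a small decision function that the final report's classification could follow. `LifecycleProof` and `proofFor` are hypothetical names, and treating any one of the three flags as sufficient to require an OperationRun is an interpretation of the bullets, not a rule stated by the plan.

```php
<?php

declare(strict_types=1);

// Hypothetical enum: the kind of proof a lifecycle action leaves behind.
enum LifecycleProof: string
{
    case OperationRun = 'operation_run';
    case AuditOnly = 'audit_only';
}

// Assumption: long-running, queued, or cross-resource work goes through an OperationRun;
// bounded direct DB/file mutations stay audit-only.
function proofFor(bool $isLongRunning, bool $isQueued, bool $isCrossResource): LifecycleProof
{
    if ($isLongRunning || $isQueued || $isCrossResource) {
        return LifecycleProof::OperationRun;
    }

    return LifecycleProof::AuditOnly;
}

// A bounded, synchronous, single-resource mutation is classified as audit-only.
echo proofFor(false, false, false)->value, PHP_EOL;
```

Either way, artifact lifecycle state stays on the artifact owner; the function only decides where the proof of the action is recorded.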
## Test Governance Check
- **Test purpose / classification by changed surface**: Feature and Filament/Livewire action tests for behavior; browser for rendered action/status proof; PostgreSQL lane when migrations/indexes are added.
- **Affected validation lanes**: fast-feedback, confidence, browser, pgsql conditional.
- **Why this lane mix is sufficient**: lifecycle correctness is behavioral and authorization-sensitive, and rendered/browser proof is needed only for representative product-surface and action-state behavior.
- **Narrowest proving commands**:
  - `cd apps/platform && ./vendor/bin/sail artisan test --compact --filter=Spec406`
  - Targeted existing suites for ReviewPack, StoredReport/management PDF, Evidence, CustomerReviewWorkspace, Findings, and OperationRun.
  - `cd apps/platform && ./vendor/bin/sail php vendor/bin/pest tests/Browser --filter=Spec406 --compact` if a focused browser test is added.
  - `cd apps/platform && ./vendor/bin/sail pint --dirty`
  - `git diff --check`
- **Fixture/helper cost risks**: moderate; use existing factories and focused fixtures, not a broad artifact matrix harness unless implementation report justifies it.
- **Heavy-family additions**: no heavy-governance family planned.
- **Browser proof**: required and focused.
- **Budget/baseline impact**: record any material browser or retention-job runtime in implementation report.
- **Escalation path**: `document-in-feature` for bounded family-local exceptions; `follow-up-spec` for purge/export-before-delete/portal; `reject-or-split` for broad registry/framework scope.
## Implementation Phases
### Phase 1 - Inventory and lifecycle matrix
Inventory each artifact family, existing states, file dependencies, customer-safe boundaries, policy/capability checks, actions, retention config, commands/jobs, audit proof, tests, and browser coverage. Populate the implementation-report matrix before code edits.
### Phase 2 - Tests for current gaps
Add failing tests for confirmed gaps: `SUPPORTED_NOW` held delete block where applicable, unauthorized direct action, cross-workspace access, missing-file download, failed-to-released prevention, expired/current distinction, customer-safe export boundary, and retention job idempotency.
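Two of these gaps, the held-artifact skip and retention job idempotency, can be illustrated with an in-memory prune pass. This is a sketch of the behavior a test might pin down, not the logic of `PruneReviewPacksCommand` or `PruneStoredReportsCommand`: `PrunableArtifact`, `pruneExpired`, and the `$holdSupportedNow` flag are hypothetical.

```php
<?php

declare(strict_types=1);

// Hypothetical in-memory stand-in for an artifact row seen by a retention job.
final class PrunableArtifact
{
    public function __construct(
        public readonly string $id,
        public readonly DateTimeImmutable $expiresAt,
        public readonly bool $held,
        public bool $pruned = false,
    ) {
    }
}

/**
 * Prunes expired artifacts and returns the IDs pruned in this pass.
 *
 * @param list<PrunableArtifact> $artifacts
 * @return list<string>
 */
function pruneExpired(array $artifacts, DateTimeImmutable $now, bool $holdSupportedNow): array
{
    $prunedIds = [];

    foreach ($artifacts as $artifact) {
        // Already pruned or not yet expired: nothing to do, which keeps reruns idempotent.
        if ($artifact->pruned || $artifact->expiresAt > $now) {
            continue;
        }

        // Holds are honored only for families classified SUPPORTED_NOW.
        if ($holdSupportedNow && $artifact->held) {
            continue;
        }

        $artifact->pruned = true;
        $prunedIds[] = $artifact->id;
    }

    return $prunedIds;
}

$now = new DateTimeImmutable('2026-07-01T00:00:00+00:00');
$artifacts = [
    new PrunableArtifact('expired', $now->modify('-1 day'), held: false),
    new PrunableArtifact('expired-held', $now->modify('-1 day'), held: true),
    new PrunableArtifact('current', $now->modify('+1 day'), held: false),
];

$firstPass = pruneExpired($artifacts, $now, holdSupportedNow: true);  // ['expired']
$secondPass = pruneExpired($artifacts, $now, holdSupportedNow: true); // []
```

The second pass returning an empty list is the idempotency property; the held artifact surviving both passes is the held-delete block.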
### Phase 3 - Bounded runtime hardening
Implement only the smallest confirmed changes on existing artifact owners, services, commands, controllers, and Filament actions. Use current-table metadata where necessary. Stop if a change requires a portal, registry, purge engine, or export center.
### Phase 4 - UI/browser proof and product sanity
Run focused browser proof on representative artifact surfaces. Complete Human Product Sanity and record visible complexity outcome.
### Phase 5 - Final gate and handoff
Run focused validation, complete implementation-report sections, classify remaining findings, and recommend `PASS`, `PASS WITH CONDITIONS`, or `FAIL`.
## Stop Conditions
- A new generic artifact registry table appears necessary.
- Irreversible purge or export-before-delete workflow is required for correctness.
- A new route/navigation/panel/customer portal is needed.
- Legal/compliance claims are required to define behavior.
- Lifecycle behavior depends on rewriting completed specs.
- Staging/production data compatibility requires shims not approved by this spec.
- A Spec 406 readiness claim depends on Spec 404/405 carry-forward conditions that can be neither proven nor ruled not applicable.
## Rollout Considerations
- Migrations must be staged and reversible where possible.
- Scheduler/command changes need staging validation before production.
- Storage changes must account for private `exports` persistence and Dokploy volumes.
- Queue/OperationRun changes require worker deployment notes.
- Config/env additions must be documented in `.env.example` or deployment docs only if implementation adds them.
- Production promotion requires staging validation of lifecycle actions, downloads, retention job behavior, and browser proof.
## Deferred Follow-ups
- Retention & Purge Governance v1.
- Data Export Before Deletion v1.
- Workspace & Tenant Closure Lifecycle v1.
- External/customer artifact portal.
- Full Browser/UX Runtime Audit after Spec 406 gate.
## Complexity Tracking
| Violation | Why Needed | Simpler Alternative Rejected Because |
|---|---|---|
| BLOAT-001 - lifecycle/action semantics across several artifact families | Existing artifacts already serve as customer and audit proof, and unsafe lifecycle behavior can delete, expose, or overstate retained proof | Family-local one-off labels and tests cannot prove cross-artifact hold/export/delete/file consistency; a generic artifact registry is wider than current-release truth |
## Project Structure
```text
specs/406-governance-artifact-lifecycle-retention/
├── checklists/
│   └── requirements.md
├── spec.md
├── plan.md
└── tasks.md
```
Implementation may add:
```text
specs/406-governance-artifact-lifecycle-retention/implementation-report.md
apps/platform/database/migrations/
apps/platform/app/... existing artifact owner/service/controller/resource paths
apps/platform/tests/... focused Spec406 and existing family tests
```
## Why This Plan Is Narrow Enough
The repo already has the artifact families, generated-file metadata, signed downloads, retention commands, customer-output gates, evidence/currentness semantics, OperationRun proof, and audit model. Spec 406 uses those seams to prove lifecycle behavior instead of introducing a new product surface or lifecycle platform. Broader purge/export/closure/customer-portal work remains explicitly deferred.