TenantAtlas/specs/405-json-to-jsonb-data-layer-hardening/plan.md
ahmido 686947d26c feat: harden json to jsonb data layer for trust payloads (#476)
Automated PR provided by Codex via Gitea API.

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #476
2026-06-23 21:36:35 +00:00

# Implementation Plan: Spec 405 - JSON-to-JSONB Data-layer Hardening
**Branch**: `405-json-to-jsonb-data-layer-hardening` | **Date**: 2026-06-23 | **Spec**: `specs/405-json-to-jsonb-data-layer-hardening/spec.md`
**Input**: Feature specification from `specs/405-json-to-jsonb-data-layer-hardening/spec.md`
## Summary
Prepare, and later implement, a database-hardening slice that inventories every live PostgreSQL `json` and `jsonb` column, classifies every `json` column, converts only appropriate, queryable trust-layer payload columns to `jsonb`, adds only query-backed indexes, proves semantic payload preservation, and records both local and staging-like validation. The implementation must not add product behavior, UI surfaces, broad abstractions, normalized replacement tables, or speculative indexes.
## Technical Context
**Language/Version**: PHP 8.4.15, Laravel 12.52.0, Filament 5.2.1, Livewire 4.1.4.
**Primary Dependencies**: Laravel migrations/schema builder, PostgreSQL, Pest 4, Filament/Livewire test helpers where focused browser or rendered-surface proof is needed.
**Storage**: PostgreSQL via Sail/Dokploy; target change is existing column type conversion from `json` to `jsonb` where classified `CONVERT`.
**Testing**: Pest 4, PostgreSQL lane, targeted feature tests, focused browser smoke.
**Validation Lanes**: pgsql, confidence/feature, focused browser, optional profiling/explain for new indexes.
**Target Platform**: TenantPilot Laravel monolith under `apps/platform`; Spec Kit artifacts under `specs/405-json-to-jsonb-data-layer-hardening/`.
**Project Type**: Laravel web application plus Spec Kit workflow.
**Performance Goals**: Avoid speculative indexes; each new index must have an existing query path and bounded write-overhead rationale.
**Constraints**: No new product concepts, UI surfaces, Graph calls, authorization model, lifecycle behavior, normalized replacement tables, provider semantics, broad abstractions, or completed-spec rewrites.
**Scale/Scope**: All live PostgreSQL `json` and `jsonb` columns; conversion limited to selected existing columns with proof.
## Repository Truth And Initial Signals
Laravel Boost confirmed PostgreSQL and current packages:
```text
PHP 8.4.15
Laravel 12.52.0
Filament 5.2.1
Livewire 4.1.4
Pest 4.3.1
PostgreSQL
```
Laravel 12 migrations support `$table->jsonb('column')`; implementation should use Laravel migrations where possible and raw PostgreSQL `ALTER TABLE ... ALTER COLUMN ... TYPE jsonb USING ...::jsonb` where column type alteration requires explicit SQL.
Live PostgreSQL schema inspection on 2026-06-23 found active `json` columns requiring classification in:
```text
alert_deliveries.payload
alert_rules.tenant_allowlist
audit_logs.metadata
backup_items.assignments
backup_items.metadata
backup_items.payload
backup_schedules.days_of_week
backup_schedules.policy_types
backup_sets.metadata
managed_environment_onboarding_sessions.state
managed_environment_permissions.details
managed_environments.metadata
managed_environments.rbac_canary_results
managed_environments.rbac_last_warnings
policies.metadata
policy_versions.assignments
policy_versions.metadata
policy_versions.scope_tags
policy_versions.secret_fingerprints
policy_versions.snapshot
restore_runs.group_mapping
restore_runs.metadata
restore_runs.preview
restore_runs.requested_items
restore_runs.results
tenant_settings.value
workspace_settings.value
```
Newer trust-layer paths already use `jsonb`, including `operation_runs.summary_counts`, `operation_runs.failure_summary`, `operation_runs.context`, baseline `scope_jsonb`/`summary_jsonb`/`meta_jsonb`, evidence summaries, findings evidence, review pack summaries/options, stored report payloads, provider connection metadata/scopes, and review publication resolution payloads.
## Technical Approach
1. Record branch, HEAD, dirty state, and `git diff --check`.
2. Query PostgreSQL `information_schema.columns` for all `json` and `jsonb` columns and collect row counts, null counts, defaults, indexes, and constraints.
3. Map each column to model casts, factories/fixtures, query usages, Filament/rendered usage, tests, and sensitive-data boundaries.
4. Classify each column as `CONVERT`, `KEEP_JSON`, `ALREADY_JSONB`, `DEPRECATED`, or `DECISION_REQUIRED`.
5. Write one or more focused migrations converting only `CONVERT` columns.
6. Add only query-backed JSONB indexes with explicit proof and rollback/drop strategy.
7. Update casts or query code only if required by tests after conversion.
8. Add PostgreSQL/feature tests proving type conversion, semantic preservation, model read/write behavior, scope/authorization boundaries, and representative domain regressions.
9. Run focused browser proof over existing payload-backed surfaces.
10. Produce `implementation-report.md` with the inventory matrix, validation results, remaining findings, staging status, and final gate result.
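Steps 2 to 4 above could be prototyped outside the application before any migration is written. The following Python sketch is illustrative only: the `InventoryRow` shape, the `reviewed` mapping, and the function name `classify` are assumptions, not repository code. It encodes only two rules stated in this plan: `jsonb` columns are `ALREADY_JSONB`, and a `json` column without an explicit reviewed decision falls back to `DECISION_REQUIRED`.

```python
from dataclasses import dataclass

# Classification labels taken from step 4 of the plan.
LABELS = {"CONVERT", "KEEP_JSON", "ALREADY_JSONB", "DEPRECATED", "DECISION_REQUIRED"}


@dataclass(frozen=True)
class InventoryRow:
    """One inventory row (hypothetical shape, e.g. derived from information_schema.columns)."""
    table: str
    column: str
    data_type: str  # "json" or "jsonb"


def classify(row: InventoryRow, reviewed: dict) -> str:
    """Return the classification for one column.

    `reviewed` maps "table.column" to a label chosen by a human reviewer.
    """
    if row.data_type == "jsonb":
        return "ALREADY_JSONB"
    decision = reviewed.get(f"{row.table}.{row.column}", "DECISION_REQUIRED")
    if decision not in LABELS or decision == "ALREADY_JSONB":
        raise ValueError(f"invalid decision {decision!r} for a json column")
    return decision


rows = [
    InventoryRow("operation_runs", "context", "jsonb"),
    InventoryRow("audit_logs", "metadata", "json"),
    InventoryRow("tenant_settings", "value", "json"),
]
# Example decision only; the real classification is an output of Phase 1.
reviewed = {"audit_logs.metadata": "CONVERT"}
matrix = {f"{r.table}.{r.column}": classify(r, reviewed) for r in rows}
print(matrix)
```

Defaulting to `DECISION_REQUIRED` means that a column nobody has reviewed can never be converted by accident.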
## Likely Affected Repository Surfaces
Preparation identifies likely inspection and implementation surfaces; implementation must verify exact paths before editing.
```text
apps/platform/database/migrations/
apps/platform/app/Models/
apps/platform/database/factories/
apps/platform/tests/
apps/platform/app/Support/
apps/platform/app/Services/
apps/platform/app/Jobs/
apps/platform/app/Filament/
apps/platform/resources/views/
specs/405-json-to-jsonb-data-layer-hardening/implementation-report.md
```
Likely model/cast inspection targets include:
```text
App\Models\AlertDelivery
App\Models\AlertRule
App\Models\AuditLog
App\Models\BackupItem
App\Models\BackupSchedule
App\Models\BackupSet
App\Models\ManagedEnvironment
App\Models\ManagedEnvironmentOnboardingSession
App\Models\ManagedEnvironmentPermission
App\Models\Policy
App\Models\PolicyVersion
App\Models\RestoreRun
App\Models\TenantSetting
App\Models\WorkspaceSetting
```
## UI / Surface Guardrail Plan
- **Guardrail scope**: backend data-layer hardening with focused rendered regression proof.
- **Affected routes/pages/actions/states/navigation/panel/provider surfaces**: none changed by implementation unless the spec is updated first.
- **No-impact class**: backend-only schema/storage conversion.
- **Native vs custom classification summary**: N/A.
- **Shared-family relevance**: evidence/report viewers, provider readiness, OperationRun proof, backup/restore proof, and audit metadata are regression consumers only.
- **State layers in scope**: persistence and existing rendered state regression only.
- **Audience modes in scope**: operator/MSP and customer/read-only only for regression proof; no disclosure behavior changes.
- **Decision/diagnostic/raw hierarchy plan**: unchanged; raw payloads remain technical/audit detail.
- **Raw/support gating plan**: unchanged.
- **One-primary-action / duplicate-truth control**: unchanged.
- **Handling modes by drift class or surface**: report-only unless conversion causes a confirmed regression, then minimal in-scope fix.
- **Repository-signal treatment**: review-mandatory for changed database/runtime paths.
- **Special surface test profiles**: backend storage conversion plus focused browser proof.
- **Required tests or manual smoke**: PostgreSQL migration tests, feature regressions, focused browser smoke.
- **Exception path and spread control**: any required UI edit is out of scope and must stop implementation for spec/plan update.
- **Active feature PR close-out entry**: Guardrail / Database Hardening.
- **UI/Productization coverage decision**: No UI surface impact.
- **Coverage artifacts to update**: none unless implementation unexpectedly changes rendered UI; then stop and update spec/plan first.
- **No-impact rationale**: storage type conversion and proof only.
- **Navigation / Filament provider-panel handling**: no panel/provider changes.
- **Screenshot or page-report need**: focused browser proof may produce screenshots/logs as evidence; no page report required without UI changes.
## Product Surface Contract Plan
- **Product Surface Contract reference**: `docs/product/standards/product-surface-contract.md` as regression lens only.
- **No-legacy posture**: canonical replacement; no compatibility shims or dual-write paths.
- **Page archetype and surface budget plan**: N/A for changed code; browser proof names existing inspected archetypes.
- **Technical Annex and deep-link demotion plan**: unchanged; conversion must not expose raw payloads, internal IDs, OperationRun links, or evidence deep links by default.
- **Canonical status vocabulary plan**: unchanged.
- **Product Surface exceptions**: none.
- **Browser verification plan**: focused existing-surface regression proof required.
- **Human Product Sanity plan**: final implementation report confirms unchanged trust semantics.
- **Visible complexity outcome target**: neutral.
- **Implementation report target**: `specs/405-json-to-jsonb-data-layer-hardening/implementation-report.md`.
## Filament / Livewire / Deployment Posture
- **Livewire v4 compliance**: Livewire 4.1.4 confirmed; no Livewire code change planned.
- **Panel provider registration location**: Laravel 12 panel providers remain in `apps/platform/bootstrap/providers.php`; no panel change.
- **Global search posture**: no resource global search posture changed. If implementation touches a resource unexpectedly, verify View/Edit/global search safety or stop for spec update.
- **Destructive/high-impact action posture**: no actions added or changed. Existing destructive/high-impact action proof from Specs 401-404 must not regress.
- **Asset strategy**: no assets, no `FilamentAsset` registration, no new `filament:assets` requirement beyond existing deployment baseline.
- **Testing plan**: database/migration tests, feature/domain regressions, focused browser proof for existing payload-backed surfaces.
- **Deployment impact**: migrations and staging/Dokploy validation. No env vars, queues, scheduler, storage, assets, routes, provider scopes, or panel providers planned.
## Shared Pattern & System Fit
- **Cross-cutting feature marker**: yes at data storage level.
- **Systems touched**: database schema, existing casts, existing query paths, tests, implementation report.
- **Shared abstractions reused**: Laravel migrations, existing models/casts, existing scoped query paths, existing tests/browser fixtures.
- **New abstraction introduced? why?**: none planned.
- **Why the existing abstraction was sufficient or insufficient**: existing models and services already own payload meaning; storage conversion does not require a new runtime layer.
- **Bounded deviation / spread control**: any new helper must be justified as a narrow test/support helper, not a runtime framework.
## OperationRun UX Impact
- **Touches OperationRun start/completion/link UX?**: no.
- **Central contract reused**: N/A.
- **Delegated UX behaviors**: N/A.
- **Surface-owned behavior kept local**: N/A.
- **Queued DB-notification policy**: N/A.
- **Terminal notification path**: N/A.
- **Exception path**: none.
## Provider Boundary & Portability Fit
- **Shared provider/platform boundary touched?**: storage inspection only.
- **Provider-owned seams**: provider raw/permission/readiness payload keys remain unchanged.
- **Platform-core seams**: storage type, workspace/managed-environment scope, audit/report/evidence ownership, migration safety.
- **Neutral platform terms / contracts preserved**: workspace, managed environment, provider, connection, operation, evidence, report, backup, restore.
- **Retained provider-specific semantics and why**: existing Microsoft/Intune payload keys remain provider-owned payload content.
- **Bounded extraction or follow-up path**: none for this spec.
## Domain / Model Implications
- No new model, table, persisted artifact, enum, status family, route, action, provider type, or source of truth is introduced.
- Existing model casts may remain unchanged because Laravel reads and writes both `json` and `jsonb` columns as JSON text at the application layer; implementation must update casts only if tests prove that current behavior breaks.
- Existing raw payload keys and product meaning remain unchanged.
- Columns with unclear ownership or product semantics must be classified `DECISION_REQUIRED`, not converted by assumption.
## Data / Migration Implications
Migration pattern for direct conversions:
```sql
ALTER TABLE table_name
ALTER COLUMN column_name TYPE jsonb
USING column_name::jsonb;
```
Rollback pattern where feasible:
```sql
ALTER TABLE table_name
ALTER COLUMN column_name TYPE json
USING column_name::json;
```
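For a list of `CONVERT` columns, the forward and rollback statements above are mechanical to derive. The Python sketch below is a hypothetical helper, not repository code, that builds both statements and double-quotes identifiers. The function names are assumptions, and `restore_runs.preview` is used purely as an example; whether it is classified `CONVERT` is decided during inventory.

```python
def quote_ident(name: str) -> str:
    """Double-quote a PostgreSQL identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def conversion_sql(table: str, column: str, target: str = "jsonb") -> str:
    """Build an ALTER statement following the pattern above.

    target is "jsonb" for the forward migration or "json" for the rollback.
    """
    if target not in ("json", "jsonb"):
        raise ValueError("target must be 'json' or 'jsonb'")
    t, c = quote_ident(table), quote_ident(column)
    return f"ALTER TABLE {t} ALTER COLUMN {c} TYPE {target} USING {c}::{target};"


up = conversion_sql("restore_runs", "preview")
down = conversion_sql("restore_runs", "preview", target="json")
print(up)
print(down)
```

Generating both directions from one function keeps the rollback statement in step with the forward one.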
Implementation must preserve:
- nullable state
- defaults
- constraints
- existing indexes unless intentionally replaced
- row counts
- application read/write behavior
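Implementation must document rollback limitations:
- `jsonb` does not preserve object key order or insignificant whitespace
- `jsonb` keeps only the last value when an object contains duplicate keys
- casting back to `json` therefore cannot restore the original text; semantic content must remain preserved, but textual representation may differ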
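The limitations above can be illustrated without a database. The Python sketch below uses the standard `json` module as a stand-in for semantic comparison: it compares parsed values instead of raw text. Python's parser, like `jsonb`, keeps the last value when an object repeats a key. The helper name `semantically_equal` and the sample documents are assumptions made for illustration.

```python
import json


def semantically_equal(a: str, b: str) -> bool:
    """Compare two JSON documents by parsed value, ignoring key order and whitespace."""
    return json.loads(a) == json.loads(b)


original = '{"b": 1,   "a": {"y": [1, 2], "x": null}}'
# The same document with different key order and whitespace,
# as a jsonb round trip might return it.
round_tripped = '{"a": {"x": null, "y": [1, 2]}, "b": 1}'

print(original == round_tripped)                    # False: raw text differs
print(semantically_equal(original, round_tripped))  # True: parsed values match

# Duplicate keys: only the last value survives parsing.
print(json.loads('{"k": 1, "k": 2}'))               # {'k': 2}

# Array order is significant and must not be treated as equal.
print(semantically_equal("[1, 2]", "[2, 1]"))       # False
```

A raw string comparison would report a false regression for the first pair, which is why preservation tests should compare parsed values.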
## Index Strategy
Allowed only with existing query proof:
- GIN index for existing containment/query usage.
- Expression index for an existing, frequently queried JSONB key.
- Partial index for existing filtered JSONB key access.
Rejected:
- index every converted column
- index because `jsonb` supports it
- index for future lifecycle/export/dashboard guesses
- index without write-overhead note
Each index requires a row in the implementation report:
```text
Index | Table/Column | Query Path | Reason | Expected Benefit | Write Overhead Risk | Proof
```
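A report row is only useful if every column is filled in. As a hedged illustration, the Python sketch below checks a candidate row against the header above and lists the fields that are absent or blank. The function `missing_fields` and the sample values are hypothetical.

```python
REQUIRED = [
    "Index", "Table/Column", "Query Path", "Reason",
    "Expected Benefit", "Write Overhead Risk", "Proof",
]


def missing_fields(row: dict) -> list:
    """Return the required report fields that are absent or blank in `row`."""
    return [name for name in REQUIRED if not str(row.get(name, "")).strip()]


# Hypothetical candidate: no query path, no write-overhead note, and no proof,
# so under the rules above it would be rejected as speculative.
candidate = {
    "Index": "example_gin_idx",
    "Table/Column": "example_table.payload",
    "Reason": "jsonb supports GIN",
    "Expected Benefit": "unknown",
    "Write Overhead Risk": "",
}
print(missing_fields(candidate))  # ['Query Path', 'Write Overhead Risk', 'Proof']
```

An empty result would mean the row is complete enough to review; it would not by itself prove that the index is justified.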
## RBAC / Security / Audit Implications
- No RBAC behavior changes.
- Any changed query involving payload keys must retain existing workspace and managed-environment scoping.
- Non-member and wrong-scope access must remain deny-as-not-found where applicable.
- Customer-safe report/review/evidence paths must not expose raw payloads because of conversion or debug output.
- Reports, logs, screenshots, and test fixtures must not include secrets, tokens, raw credential payloads, sensitive raw provider payloads, or customer-sensitive raw payloads.
- Audit metadata conversion must preserve actor/context fields.
## OperationRun / Evidence / Result Truth Implications
The plan distinguishes:
- **Execution truth**: existing `OperationRun` status/outcome/summary/context; live columns are already `jsonb`, but regression proof still checks rendering.
- **Artifact truth**: `ReviewPack`, `StoredReport`, evidence snapshots/items, backup sets/items.
- **Backup/snapshot truth**: policy versions, backup payloads, baseline/evidence payloads.
- **Recovery/evidence truth**: restore previews/results, evidence currentness, report receipts.
- **Operator next action**: unchanged existing UI states and actions.
## Test Strategy
Required test groups:
1. PostgreSQL schema/type tests for every converted column.
2. Migration semantic preservation tests for representative non-sensitive payloads.
3. Model cast/read-write tests where converted columns are written by Eloquent models.
4. Query-path tests for any changed JSON key query and any new index.
5. Evidence/currentness regression tests for converted evidence/report/review payloads.
6. OperationRun/audit regression tests where summary/context/audit metadata is in scope.
7. Provider readiness/freshness/permission regression tests where provider/environment payloads are converted.
8. Backup/restore payload regression tests for backup items/sets/schedules and restore preview/results.
9. Review/report receipt regression tests for review pack/stored report/customer output.
10. Authorization/scope tests for changed payload queries.
11. Focused browser smoke for representative existing payload-backed pages.
Preferred validation commands:
```bash
cd apps/platform && ./vendor/bin/sail php vendor/bin/pest -c phpunit.pgsql.xml --filter=Spec405
cd apps/platform && ./vendor/bin/sail artisan test --filter=Evidence
cd apps/platform && ./vendor/bin/sail artisan test --filter=OperationRun
cd apps/platform && ./vendor/bin/sail artisan test --filter=Provider
cd apps/platform && ./vendor/bin/sail artisan test --filter=Backup
cd apps/platform && ./vendor/bin/sail artisan test --filter=Restore
cd apps/platform && ./vendor/bin/sail artisan test --filter=ReviewPack
cd apps/platform && ./vendor/bin/sail artisan test --filter=StoredReport
```
Use narrower commands where implementation creates focused Spec 405 tests.
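Test group 2 (migration semantic preservation) amounts to comparing each row before and after conversion. The Python sketch below shows the comparison logic in isolation, assuming the JSON text of each row has been captured by primary key before and after the migration. The function `preservation_report`, the snapshot shape, and the sample rows are assumptions; real tests would run in the Pest PostgreSQL lane.

```python
import json
from typing import Optional

_SQL_NULL = object()  # keeps SQL NULL distinct from a JSON null value


def _parse(text: Optional[str]):
    """Parse JSON text, mapping SQL NULL (None) to a dedicated sentinel."""
    return _SQL_NULL if text is None else json.loads(text)


def preservation_report(before: dict, after: dict) -> dict:
    """Compare per-row JSON text captured before and after conversion.

    Both arguments map primary key -> JSON text (or None for SQL NULL).
    """
    shared = before.keys() & after.keys()
    changed = sorted(pk for pk in shared if _parse(before[pk]) != _parse(after[pk]))
    return {
        "row_count_matches": len(before) == len(after),
        "missing_ids": sorted(before.keys() - after.keys()),
        "changed_ids": changed,
    }


before = {1: '{"a": 1, "b": [1, 2]}', 2: None, 3: '{"x": "y"}'}
after = {1: '{"b": [1, 2], "a": 1}', 2: None, 3: '{"x": "z"}'}
report = preservation_report(before, after)
print(report)
```

Row 1 differs only in key order and is not reported, row 2 stays `NULL`, and row 3 is flagged because its value changed.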
## Rollout And Deployment Considerations
- Local: run through Laravel Sail against PostgreSQL.
- Staging: validate migration execution, app boot, queue- and browser-relevant proof, representative pages, and rollback/forward notes where safe.
- Production: do not claim production readiness unless staging-like validation passes or the final report explicitly records `PASS WITH CONDITIONS`.
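- Migrations: assess table size/row count and lock risk before direct conversion; `ALTER COLUMN ... TYPE` from `json` to `jsonb` rewrites the table and holds an `ACCESS EXCLUSIVE` lock while it runs.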
- Env vars: none expected.
- Queues/scheduler/storage/assets: none expected, but affected flows may rely on existing workers/storage and must not regress.
- Dokploy: database migration and app boot validation required before production promotion.
## Risk Controls
- Stop before migration if inventory is incomplete.
- Stop before conversion if a high-risk column has unclear product ownership.
- Use semantic JSON comparisons, not raw string comparisons.
- Classify large-table conversion risk before direct `ALTER COLUMN`.
- Do not add speculative indexes.
- Do not print sensitive payload samples.
- Do not rewrite completed specs.
- If a conversion causes UI/rendered regression, fix the data-layer cause if bounded; otherwise stop for spec update.
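One way to honour "Do not print sensitive payload samples" while still producing useful evidence is to log only the shape of a payload. The Python sketch below is an assumption about how such a summary could look, not an existing helper: it replaces every scalar with its JSON type name. Key names are still printed, so it would only be suitable where the key names themselves are not sensitive.

```python
import json


def shape(value):
    """Replace scalar values with JSON type names so no payload content is printed."""
    if isinstance(value, dict):
        return {key: shape(value[key]) for key in sorted(value)}
    if isinstance(value, list):
        # Summarize a list by the distinct shapes of its elements.
        seen = []
        for item in value:
            item_shape = shape(item)
            if item_shape not in seen:
                seen.append(item_shape)
        return seen
    if value is None:
        return "null"
    if isinstance(value, bool):  # checked before numbers: bool is an int subclass
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    return "string"


# Made-up payload for illustration; not taken from any real table.
payload = json.loads('{"token": "s3cr3t", "attempts": 3, "tags": ["a", "b"], "ok": true}')
summary = json.dumps(shape(payload))
print(summary)
# {"attempts": "number", "ok": "boolean", "tags": ["string"], "token": "string"}
```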
## Constitution Check
- Inventory-first: PASS; inventory and source-of-truth classification precede conversion.
- Read/write separation: PASS; no Graph/write product behavior is added.
- Graph contract path: PASS; no Graph calls are added.
- Deterministic capabilities: N/A; no capability resolver changes.
- RBAC-UX: PASS; authorization must not change and changed payload queries require scope tests.
- Workspace isolation: PASS; scoped query proof required.
- Tenant/managed-environment isolation: PASS; managed-environment scope remains enforced.
- Run observability: N/A; no new OperationRun.
- OperationRun start UX: N/A.
- Ops-UX lifecycle/summary counts: no changes to lifecycle; current `operation_runs` JSONB posture is inspected.
- Data minimization: PASS; sensitive payloads are not dumped.
- Test governance: PASS; PostgreSQL and focused browser proof are explicit.
- Proportionality: PASS; storage conversion only, no new product truth.
- No premature abstraction: PASS; no new framework.
- Persisted truth: PASS; no new persisted entity/table.
- Behavioral state: PASS; no new state.
- UI semantics: PASS; no UI semantics added.
- Shared pattern first: PASS; existing models/services/tests are reused.
- Provider boundary: PASS; provider payload keys unchanged.
- V1 explicitness / few layers: PASS.
- Spec discipline / bloat check: PASS; one coherent data-layer hardening package.
- Product Surface Contract Gate: PASS; no rendered surface changes, with focused regression proof.
## Test Governance Check
- **Test purpose / classification by changed surface**: PostgreSQL migration/type proof, feature/domain regression, focused browser smoke.
- **Affected validation lanes**: pgsql, confidence, browser.
- **Why this lane mix is the narrowest sufficient proof**: storage type is PostgreSQL-specific and payload-backed pages require rendered regression proof, but no full browser audit is needed.
- **Narrowest proving command(s)**: focused Spec 405 pgsql/feature/browser tests once created.
- **Fixture / helper / factory / seed / context cost risks**: minimal explicit fixtures only; no shared default broadening.
- **Expensive defaults or shared helper growth introduced?**: none planned.
- **Heavy-family additions, promotions, or visibility changes**: focused browser proof only.
- **Surface-class relief / special coverage rule**: no UI code changes; browser is regression proof.
- **Closing validation and reviewer handoff**: verify inventory coverage, migration proof, no sensitive payload output, no speculative indexes, and final gate result.
- **Budget / baseline / trend follow-up**: record if pgsql/browser runtime materially increases.
- **Review-stop questions**: Is every `json` column classified? Is every conversion justified? Does every new index have existing query proof? Did any UI or product behavior change? Was staging-like validation completed or properly conditioned?
- **Escalation path**: document-in-feature for bounded findings; follow-up-spec for unsafe conversion, online migration need, or unresolved product/schema decision.
- **Active feature PR close-out entry**: Guardrail / Database Hardening.
- **Why no dedicated follow-up spec is needed**: the slice is bounded unless implementation finds large-table online migration or unresolved product/schema ownership.
## Implementation Phases
### Phase 1 - Inventory and Classification
Build schema inventory from PostgreSQL, migrations, models, casts, factories, query paths, and tests. Produce the draft inventory matrix before any migration.
### Phase 2 - Conversion and Index Design
Write focused migration(s) only for `CONVERT` columns. Add no index unless tied to an existing query path. Keep rollback explicit.
### Phase 3 - Regression and Preservation Proof
Add PostgreSQL and feature tests proving column type, semantic preservation, read/write behavior, scope/authorization, and representative domain behavior.
### Phase 4 - Browser, Staging, and Report
Run focused browser proof, staging-like validation where available, final dirty-state checks, and complete the implementation report with gate result.
## Complexity Tracking
| Violation | Why Needed | Simpler Alternative Rejected Because |
|-----------|------------|--------------------------------------|
| None planned | N/A | N/A |
## Proportionality Review
- **Current operator problem**: trust-layer payload storage is inconsistent and can undermine future evidence/report/lifecycle work.
- **Existing structure is insufficient because**: older `json` payload columns conflict with current PostgreSQL/jsonb posture and query/index expectations.
- **Narrowest correct implementation**: classify all columns, convert only selected existing columns, add only query-backed indexes, and prove behavior.
- **Ownership cost created**: migration review, PostgreSQL tests, focused browser proof, and implementation report.
- **Alternative intentionally rejected**: blanket conversion plus blanket indexes.
- **Release truth**: current-release data-layer readiness.