TenantAtlas/specs/093-scope-001-workspace-id-isolation/plan.md
ahmido 92a36ab89e SCOPE-001: DB-level workspace isolation via workspace_id (#112)
Implements Spec 093 (SCOPE-001) workspace isolation at the data layer.

What changed
- Adds `workspace_id` to 12 tenant-owned tables and enforces correct binding.
- Model write-path enforcement derives workspace from tenant + rejects mismatches.
- Prevents `tenant_id` changes (immutability) on tenant-owned records.
- Adds queued backfill command + job (`tenantpilot:backfill-workspace-ids`) with OperationRun + AuditLog observability.
- Enforces DB constraints (NOT NULL + FK `workspace_id` → `workspaces.id` + composite FK `(tenant_id, workspace_id)` → `tenants(id, workspace_id)`), plus audit_logs invariant.

UI / operator visibility
- Monitor backfill runs in **Monitoring → Operations** (OperationRun).

Tests
- `vendor/bin/sail artisan test --compact tests/Feature/WorkspaceIsolation`

Notes
- Backfill is queued: ensure a queue worker is running (`vendor/bin/sail artisan queue:work`).

Spec package
- `specs/093-scope-001-workspace-id-isolation/` (plan, tasks, contracts, quickstart, research)

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #112
2026-02-14 22:34:02 +00:00

# Implementation Plan: Spec 093 — SCOPE-001 Workspace ID Isolation

**Branch**: `093-scope-001-workspace-id-isolation` | **Date**: 2026-02-14

**Spec**: `specs/093-scope-001-workspace-id-isolation/spec.md`

**Spec (absolute)**: `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/093-scope-001-workspace-id-isolation/spec.md`

**Input**: `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/093-scope-001-workspace-id-isolation/spec.md`

## Summary
Enforce DB-level workspace isolation for tenant-owned data by adding `workspace_id` to 12 tenant-owned tables, safely backfilling legacy rows, and then enforcing NOT NULL + referential integrity.
Additionally, fix the audit trail invariant: if an `audit_logs` entry references a tenant, it must also reference a workspace.
Rollout is staged to avoid downtime:
1) Add nullable `workspace_id` columns.
2) Enforce write-path derivation + mismatch rejection.
3) Backfill in batches with resumability, locking, and observability (`OperationRun` + `AuditLog`).
4) Enforce constraints and add final indexes.
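
Step 2 of the rollout (write-path derivation plus mismatch rejection) can be sketched independently of the framework. The following is a minimal illustration in plain Python rather than the project's Laravel models; `Tenant`, `WorkspaceIsolationError`, `resolve_workspace_id_on_create`, and `guard_update` are hypothetical names chosen for this sketch, not identifiers from the codebase.

```python
from dataclasses import dataclass
from typing import Optional


class WorkspaceIsolationError(ValueError):
    """Hypothetical error: a write would break the tenant/workspace binding."""


@dataclass(frozen=True)
class Tenant:
    id: int
    workspace_id: int


def resolve_workspace_id_on_create(
    tenant: Tenant, explicit_workspace_id: Optional[int] = None
) -> int:
    """Derive workspace_id from the tenant; reject an explicit value that disagrees."""
    if explicit_workspace_id is not None and explicit_workspace_id != tenant.workspace_id:
        raise WorkspaceIsolationError(
            f"workspace_id {explicit_workspace_id} does not match tenant {tenant.id}"
        )
    return tenant.workspace_id


def guard_update(original: dict, changes: dict, tenant: Tenant) -> dict:
    """Return the merged row, rejecting tenant_id changes and workspace mismatches."""
    if "tenant_id" in changes and changes["tenant_id"] != original["tenant_id"]:
        raise WorkspaceIsolationError("tenant_id is immutable on tenant-owned records")
    if "workspace_id" in changes and changes["workspace_id"] != tenant.workspace_id:
        raise WorkspaceIsolationError("workspace_id does not match the record's tenant")
    return {**original, **changes}
```

Callers never need to pass `workspace_id` themselves; an explicit value is tolerated only when it agrees with the tenant, which keeps legacy call sites working while the constraints are still absent.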
## Technical Context

**Language/Version**: PHP 8.4 (Laravel 12)

**Primary Dependencies**: Filament v5, Livewire v4, Laravel Sail, Tailwind CSS v4

**Storage**: PostgreSQL (primary), with SQLite support patterns used in migrations for tests/CI

**Testing**: Pest v4 (`vendor/bin/sail artisan test --compact`)

**Target Platform**: Web (admin SaaS)

**Project Type**: Laravel monolith (Filament panels + Livewire + Artisan commands)

**Performance Goals**:
- Backfill updates run in batches to avoid long locks.
- Postgres uses `CONCURRENTLY` for large index creation where applicable.

**Constraints**:
- No new HTTP routes/pages.
- No planned downtime; staged rollout.
- Backfill is idempotent, resumable, and aborts on tenant→workspace mapping failures.

**Scale/Scope**: Potentially large datasets (unknown upper bound); plan assumes millions of rows are possible across inventory/backup/history tables.

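
The backfill constraints above (batched, idempotent, resumable, aborting on unresolved tenant→workspace mappings) can be illustrated with a small sketch. It uses Python's standard-library `sqlite3` module instead of the project's queued Laravel job, and the `policies` table in the usage note, `UnresolvedTenantMapping`, and `backfill_workspace_ids` are hypothetical names for illustration only.

```python
import sqlite3


class UnresolvedTenantMapping(RuntimeError):
    """Hypothetical error carrying (table, sample_ids) for rows that cannot be mapped."""


def backfill_workspace_ids(conn: sqlite3.Connection, table: str, batch_size: int = 500) -> int:
    """Fill NULL workspace_id values from tenants in batches; return rows updated.

    `table` is interpolated into SQL, so it must come from a trusted allowlist.
    """
    # Abort before mutating anything if a pending row has no resolvable workspace.
    orphans = conn.execute(
        f"SELECT t.id FROM {table} AS t "
        f"LEFT JOIN tenants AS tn ON tn.id = t.tenant_id "
        f"WHERE t.workspace_id IS NULL AND tn.workspace_id IS NULL "
        f"ORDER BY t.id LIMIT 5"
    ).fetchall()
    if orphans:
        raise UnresolvedTenantMapping(table, [row[0] for row in orphans])

    total = 0
    while True:
        # Only rows that are still NULL are selected, so a re-run resumes where
        # the previous run stopped and is a no-op once everything is filled.
        ids = [
            row[0]
            for row in conn.execute(
                f"SELECT id FROM {table} WHERE workspace_id IS NULL ORDER BY id LIMIT ?",
                (batch_size,),
            )
        ]
        if not ids:
            return total
        placeholders = ",".join("?" * len(ids))
        with conn:  # one short transaction per batch keeps locks brief
            conn.execute(
                f"UPDATE {table} SET workspace_id = ("
                f"SELECT workspace_id FROM tenants WHERE tenants.id = {table}.tenant_id) "
                f"WHERE id IN ({placeholders})",
                ids,
            )
        total += len(ids)
```

Calling `backfill_workspace_ids(conn, "policies", batch_size=1000)` a second time after a successful run returns `0`, which is the idempotency property the plan asks for. Locking and `OperationRun` progress reporting are deliberately left out of this sketch.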
## Constitution Check
*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*
- Inventory-first / snapshots: PASS (schema-only + backfill; no changes to inventory/snapshot semantics).
- Read/write separation: PASS (writes are limited to migrations + operator backfill; no UI “write surfaces” are added).
- Graph contract path: PASS (no Graph calls).
- Deterministic capabilities: PASS (no capability resolver changes).
- Workspace isolation: PASS (strengthens isolation by enforcing workspace binding at the data layer).
- Tenant isolation: PASS (tenant-owned tables remain tenant-scoped; DB constraints prevent cross-workspace mismatches).
- RBAC-UX / planes: PASS (no changes to `/admin` vs `/system`; no new access surfaces).
- Run observability: PASS (backfill is operationally relevant and will be tracked via `OperationRun` + `AuditLog`).
- Filament Action Surface Contract: N/A (no Filament Resource/Page changes).
## Project Structure
### Documentation (this feature)
```text
specs/093-scope-001-workspace-id-isolation/
├── plan.md
├── research.md
├── data-model.md
├── quickstart.md
├── contracts/
│ ├── openapi.yaml
│ └── cli.md
└── tasks.md
```
### Source Code (repository root)
```text
app/
├── Console/
│ └── Commands/
├── Models/
└── Support/ (or Services/)
database/
└── migrations/
tests/
└── Feature/
```
**Structure Decision**: Implement as Laravel migrations + an Artisan operator command + model-level enforcement helpers, with Pest feature tests.
## Phase Plan
### Phase 0 — Research (complete)

Outputs:
- `specs/093-scope-001-workspace-id-isolation/research.md`

Key decisions captured:
- Tenant↔workspace consistency will be enforced with composite FKs on Postgres/MySQL.
- Audit invariant enforced with a check constraint.

### Phase 1 — Design & Contracts (complete)

Outputs:
- `specs/093-scope-001-workspace-id-isolation/data-model.md`
- `specs/093-scope-001-workspace-id-isolation/contracts/openapi.yaml` (no new routes)
- `specs/093-scope-001-workspace-id-isolation/contracts/cli.md` (Artisan backfill contract)
- `specs/093-scope-001-workspace-id-isolation/quickstart.md`

**Post-design constitution re-check**: PASS (no new external calls; operational backfill is observable).

### Phase 2 — Implementation Planning (next)
Implementation will be delivered as small, test-driven slices aligned to the staged rollout.
1) Phase 1 migrations — add nullable `workspace_id`
   - Add `workspace_id` (nullable) + index to the 12 tenant-owned tables.
   - Add baseline scoping indexes for expected query patterns (at minimum `workspace_id` and `(workspace_id, tenant_id)` where useful).
   - Ensure migrations follow existing multi-driver patterns (SQLite fallbacks where needed).
2) Phase 1.5 — write-path enforcement (application)
   - For each affected model/write path:
     - On create: derive `workspace_id` from `tenant.workspace_id`.
     - On update: reject changes to `tenant_id` (immutability) and reject explicit workspace mismatches.
   - Ensure audit log writer sets `workspace_id` when `tenant_id` is present.
3) Phase 2 — backfill command (operator-only)
   - Add `tenantpilot:backfill-workspace-ids`.
   - Safety requirements:
     - Acquire lock to prevent concurrent execution.
     - Batch updates per table and allow resume/checkpoint.
     - Abort and report table + sample IDs if a tenant→workspace mapping cannot be resolved.
   - Observability:
     - Create/reuse an `OperationRun` describing the backfill run.
     - Write `AuditLog` summary entries for start/end/outcome.
   - Execution strategy (queued):
     - The command MUST be a lightweight start surface: authorize → acquire lock → create/reuse OperationRun → dispatch queued jobs → print a “View run” pointer.
     - The actual backfill mutations MUST execute inside queued jobs (batch/table scoped) so large datasets do not require a single long-running synchronous CLI process.
     - Implementation maps to `app/Console/Commands/TenantpilotBackfillWorkspaceIds.php` + `app/Jobs/BackfillWorkspaceIdsJob.php`.
     - Jobs MUST update OperationRun progress/counters and record failures with stable reason codes + sanitized messages.
4) Phase 3 — constraints + validation + final indexes
   - Tenant-owned tables:
     - Set `workspace_id` to NOT NULL (after validating that no rows with a NULL `workspace_id` remain).
     - Add FK `workspace_id → workspaces.id`.
     - Add composite FK `(tenant_id, workspace_id) → tenants(id, workspace_id)` on Postgres/MySQL.
     - For Postgres, prefer adding the FKs as `NOT VALID` and then running `VALIDATE CONSTRAINT` to reduce lock time.
   - Tenants:
     - Add a unique constraint/index on `(id, workspace_id)` to support composite FKs; it must exist before the composite FKs are created, because they reference it.
   - Audit logs:
     - Backfill `workspace_id` for rows where `tenant_id` is present.
     - Add check constraint: `tenant_id IS NULL OR workspace_id IS NOT NULL`.
   - Index strategy:
     - Use `CREATE INDEX CONCURRENTLY` on Postgres for large tables. Postgres rejects this statement inside a transaction block, so these migrations must not run in a transaction.
5) Pest tests (minimal, high-signal)
   - Backfill correctness on a representative table (seed missing `workspace_id`, run backfill, assert set).
   - DB constraint tests (where supported by test DB):
     - `tenant_id` + mismatched `workspace_id` cannot be persisted after Phase 3 constraints.
     - Audit invariant: tenant-scoped audit requires workspace; workspace-only and platform-only are allowed.
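
The Phase 3 constraints and the behaviour the constraint tests are meant to assert can be reproduced in a few lines. This is a rough sketch using Python's standard-library `sqlite3` with an in-memory database, not the project's Pest tests or Postgres migrations; the `policies` table and the helpers `build_demo_db` and `is_rejected` are hypothetical stand-ins, and the schema is reduced to the columns that matter for isolation.

```python
import sqlite3

SCHEMA = """
CREATE TABLE workspaces (id INTEGER PRIMARY KEY);
CREATE TABLE tenants (
    id INTEGER PRIMARY KEY,
    workspace_id INTEGER NOT NULL REFERENCES workspaces(id),
    UNIQUE (id, workspace_id)              -- target of the composite FK
);
CREATE TABLE policies (                    -- stand-in for one tenant-owned table
    id INTEGER PRIMARY KEY,
    tenant_id INTEGER NOT NULL,
    workspace_id INTEGER NOT NULL REFERENCES workspaces(id),
    FOREIGN KEY (tenant_id, workspace_id) REFERENCES tenants(id, workspace_id)
);
CREATE TABLE audit_logs (
    id INTEGER PRIMARY KEY,
    tenant_id INTEGER,
    workspace_id INTEGER,
    CHECK (tenant_id IS NULL OR workspace_id IS NOT NULL)
);
INSERT INTO workspaces (id) VALUES (1), (2);
INSERT INTO tenants (id, workspace_id) VALUES (10, 1);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with the isolation constraints in place."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript(SCHEMA)
    return conn


def is_rejected(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> bool:
    """Return True if the statement violates a constraint, False if it is stored."""
    try:
        with conn:  # rolls back automatically when the statement fails
            conn.execute(sql, params)
    except sqlite3.IntegrityError:
        return True
    return False
```

With this schema, inserting a `policies` row for tenant 10 with workspace 2 is rejected by the composite FK even though workspace 2 exists, and an `audit_logs` row with a tenant but no workspace is rejected by the check constraint. Workspace-only and platform-only audit rows (tenant NULL) are stored.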