2026-01-07 17:10:57 +00:00
4 changed files with 236 additions and 37 deletions
--- a/specs/040-inventory-core/checklists/requirements.md
+++ b/specs/040-inventory-core/checklists/requirements.md
@ -0,0 +1,45 @@
 # Requirements Checklist — Inventory Core (040)
 ## Scope
 - [x] This checklist applies only to Spec 040 (Inventory Core).
 ## Constitution Gates
 - [x] Inventory-first: inventory stores “last observed” only (no snapshot/backup side effects)
 - [x] Read/write separation: no Intune write paths introduced
 - [x] Single contract path to Graph: Graph access only via GraphClientInterface + contracts (if used)
 - [x] Tenant isolation: all reads/writes tenant-scoped
 - [x] Automation is idempotent & observable: run records + locks + stable error codes
 - [x] Data minimization & safe logging: no secrets/tokens; DB run records are the source of truth
 ## Functional Requirements Coverage
 - [x] FR-001 Inventory catalog exists (inventory_items)
 - [x] FR-002 Stable upsert identity prevents duplicates
 - [x] FR-003 Sync runs recorded (inventory_sync_runs)
 - [x] FR-004 Tenant isolation enforced
 - [x] FR-005 Deterministic selection_hash implemented and tested
 - [x] FR-006 Sync does NOT create snapshots/backups (explicitly tested)
 - [x] FR-007 Missing is derived (not persisted) relative to latestRun(tenant_id, selection_hash)
 - [x] FR-008 meta_jsonb whitelist enforced; unknown keys dropped; never fails sync
 - [x] FR-009 Locks/idempotency/observable failures implemented
 ## Non-Functional Requirements Coverage
 - [x] NFR-001 Global + per-tenant concurrency limits (source + defaults + behavior defined)
 - [x] NFR-002 Throttling resilience (429/503 backoff+jitter) with stable error codes
 - [x] NFR-003 Deterministic behavior is testable (Pest)
 - [x] NFR-004 Data minimization enforced (no raw payload storage)
 - [x] NFR-005 Safe logging: no secrets/tokens; rely on run records not log parsing
 ## Tests (Pest)
 - [x] Upsert prevents duplicates + updates last_seen fields
 - [x] selection_hash determinism (order invariant)
 - [x] Missing semantics per latest completed run and selection isolation
 - [x] Low-confidence missing when latest run partial/failed/had_errors
 - [x] meta whitelist drops unknown keys
 - [x] Lock prevents overlapping runs for same tenant+selection
 - [x] No snapshot/backup tables are written during sync
 - [x] Error reporting uses stable error codes; no secrets/tokens persisted in run error details
--- a/specs/040-inventory-core/plan.md
+++ b/specs/040-inventory-core/plan.md
@ -1,32 +1,126 @@
 # Implementation Plan: Inventory Core (040)
-**Branch**: `feat/040-inventory-core` | **Date**: 2026-01-07 | **Spec**: `specs/040-inventory-core/spec.md`
+**Branch**: `spec/040-inventory-core` | **Date**: 2026-01-07 | **Spec**: `specs/040-inventory-core/spec.md`
 **Scope (this step)**: Produce a clean, implementable `plan.md` + consistent `tasks.md` for Spec 040 only.
 ## Summary
-Implement a tenant-scoped inventory catalog (“last observed”) and an observable sync run system with deterministic selection scoping. Ensure no snapshots/backups are created by sync.
+Implement tenant-scoped Inventory + Sync Run tracking as the foundational substrate for later Inventory UI and higher-order features.
 Key outcomes:
 - Inventory is “last observed” (not backup), stored as metadata + whitelisted `meta_jsonb`.
 - Sync runs are observable, selection-scoped via deterministic `selection_hash`.
 - “Missing” is derived relative to latest completed run for the same `(tenant_id, selection_hash)`.
 - Automation is safe: locks, idempotency, throttling handling, global+per-tenant concurrency limits.
 ## Technical Context
 - **Language/Version**: PHP 8.4
 - **Framework**: Laravel 12
 - **Admin UI**: Filament v4 + Livewire v3
 - **Storage**: PostgreSQL (JSONB available)
 - **Queue/Locks**: Laravel queue + cache/Redis locks (as configured)
 - **Testing**: Pest v4 (`php artisan test`)
 - **Target Platform**: Sail-first local dev; container deploy (Dokploy)
 ## Constitution Check
- Inventory-first, snapshots-second (sync never creates snapshots)
+- Inventory-first: inventory stores last observed state only (no snapshot/backup side effects).
- Read/write separation (sync is read-only; any future writes require preview/confirmation/audit/tests)
+- Read/write separation: this feature introduces no Intune write paths.
- Single contract path to Graph (Graph access only through existing abstractions/contracts)
+- Single contract path to Graph: Graph reads (if needed) go via Graph abstraction and contracts.
- Deterministic capabilities (capabilities resolver output testable)
+- Tenant isolation: all reads/writes tenant-scoped; no cross-tenant shortcuts.
- Tenant isolation (non-negotiable)
+- Automation: locked + idempotent + observable; handle 429/503 with backoff+jitter.
- Automation observable + idempotent (locks, run records, stable error codes, 429/503 handling)
+- Data minimization: no payload-heavy storage; safe logs.
 - Data minimization + safe logging
-## Deliverables (Phase-friendly)
+No constitution violations expected.
- Data model for inventory items and sync runs
+## Project Structure (Impacted Areas)
 - Sync engine orchestration and locking strategy
 - Deterministic selection hashing
 - Capabilities resolver output snapshot tests
 - Minimal Filament/CLI surface to trigger and observe sync runs (if required by tasks)
-## Out of Scope
+```text
 specs/040-inventory-core/
 ├── spec.md
 ├── plan.md
 └── tasks.md
- Dependency graph hydration (spec 042)
+app/
- Cross-tenant promotion (spec 043)
+├── Models/
- Drift reporting (spec 044)
+├── Jobs/
- Lifecycle “deleted” semantics (feature 900)
+├── Services/
 └── Support/
 database/migrations/
 tests/Feature/
 tests/Unit/
 ```
 ## Implementation Approach
 ### Phase A — Data Model + Migrations
 1. Add `inventory_items` table
   - Identity: unique constraint to prevent duplicates, recommended:
     - `(tenant_id, policy_type, external_id)`
   - Fields: `display_name`, `platform`/`category` (if applicable), `meta_jsonb`, `last_seen_at`, `last_seen_run_id`.
   - Indexing: indexes supporting tenant/type listing; consider partials as needed.
 2. Add `inventory_sync_runs` table
   - Identity: `tenant_id`, `selection_hash`
   - Status fields: `status`, `started_at`, `finished_at`, `had_errors`
   - Counters: `items_observed_count`, `items_upserted_count`, `errors_count`
   - Error reporting: stable error code(s) list or summary field.
 ### Phase B — Selection Hash (Deterministic)
 Implement canonicalization exactly as Spec Appendix:
 - Only include scope-affecting keys in `selection_payload`.
 - Sort object keys; sort `policy_types[]` and `categories[]` arrays.
 - Compute `selection_hash = sha256(canonical_json(selection_payload))`.
 ### Phase C — Sync Run Lifecycle + Upsert
 - Create a service that:
  - acquires a lock for `(tenant_id, selection_hash)`
  - creates a run record
  - enumerates selected policy types
  - upserts inventory items by identity key
  - updates `last_seen_at` and `last_seen_run_id` per observed item
  - finalizes run status + counters
  - never creates/modifies snapshot/backup records (`policy_versions`, `backup_*`)
 ### Phase D — Derived “Missing” Semantics
 - Implement “missing” as a computed state relative to `latestRun(tenant_id, selection_hash)`.
 - Do not persist “missing” or “deleted”.
 - Mark missing as low-confidence when `latestRun.status != success` or `latestRun.had_errors = true`.
 ### Phase E — Meta Whitelist
 - Define a whitelist of allowed `meta_jsonb` keys.
 - Enforce by dropping unknown keys (never fail sync).
 ### Phase F — Concurrency Limits
 - Enforce global concurrency (across tenants) and per-tenant concurrency.
 - The implementation may be via queue worker limits, semaphore/lock strategy, or both; the behavior must be testable.
 - When limits are hit, create an observable run record with `status=skipped`, `had_errors=true`, and stable error code(s).
 ## Test Plan (Pest)
 Minimum required coverage aligned to Spec test cases:
 - Upsert identity prevents duplicates; `last_seen_*` updates.
 - `selection_hash` determinism (array ordering invariant).
 - Missing derived per latest completed run for same `(tenant_id, selection_hash)`.
 - Low-confidence missing when latest run is partial/failed or had_errors.
 - Meta whitelist drops unknown keys.
 - Lock prevents overlapping runs per tenant+selection.
 - No snapshot/backup rows are created/modified by inventory sync.
 - Error reporting uses stable `error_codes` and stores no secrets/tokens.
 ## Out of Scope (Explicit)
 - Any UI (covered by Spec 041)
 - Any snapshot/backup creation
 - Any restore/promotion/remediation write paths
--- a/specs/040-inventory-core/spec.md
+++ b/specs/040-inventory-core/spec.md
@ -140,11 +140,64 @@ ### Meta Whitelist (Fail-safe)
 - `meta_jsonb` has a documented whitelist of allowed keys.
 - **AC:** Unknown `meta_jsonb` keys are dropped (not persisted) and MUST NOT cause sync to fail.
 #### Initial `meta_jsonb` whitelist (v1)
 Allowed keys (all optional; if not applicable for a type, omit):
 - `odata_type`: string (copied from Graph `@odata.type`)
 - `etag`: string|null (Graph etag if available; never treated as a secret)
 - `scope_tag_ids`: array<string> (IDs only; no display names required)
 - `assignment_target_count`: int|null (count only; no target details)
 - `warnings`: array<string> (bounded, human-readable, no secrets)
 **AC:** Any other key is dropped silently (not persisted) and MUST NOT fail sync.
 ### Observed Run
 - `inventory_items.last_seen_run_id` and `inventory_items.last_seen_at` are updated when an item is observed.
 - `last_seen_run_id` implies the selection via `sync_runs.selection_hash`; no per-item selection hash is required for core.
 ### Run Error Codes (taxonomy)
 Sync runs record:
 - `status`: one of `success|partial|failed|skipped`
 - `had_errors`: bool (true if any non-ideal condition occurred)
 - `error_codes[]`: array of stable machine-readable codes (no secrets)
 Minimal taxonomy (3–8 codes):
 - `lock_contended` (a run could not start because the per-tenant+selection lock is held)
 - `concurrency_limit_global` (global concurrency limit reached; run skipped)
 - `concurrency_limit_tenant` (per-tenant concurrency limit reached; run skipped)
 - `graph_throttled` (429 encountered; run partial/failed depending on recovery)
 - `graph_transient` (503/timeout/other transient errors)
 - `graph_forbidden` (403/insufficient permission)
 - `unexpected_exception` (unexpected failure; message must be safe/redacted)
 **Rule:** Run records MUST store codes (and safe, bounded context) rather than raw exception dumps or tokens.
 ### Concurrency Limits (source, defaults, behavior)
 **Source:** Config (recommended keys):
 - `tenantpilot.inventory_sync.concurrency.global_max`
 - `tenantpilot.inventory_sync.concurrency.per_tenant_max`
 **Defaults (if not configured):**
 - global_max = 2
 - per_tenant_max = 1
 **Behavior when limits are hit:**
 - The system MUST create a Sync Run record with:
  - `status = skipped`
  - `had_errors = true` (so missing stays low-confidence for that selection)
  - `error_codes[]` includes `concurrency_limit_global` or `concurrency_limit_tenant`
  - `started_at`/`finished_at` set (observable)
 - No inventory items are mutated in a skipped run.
 ## Testing Guidance (non-implementation)
 These are test cases expressed in behavior terms (not code).
@ -154,6 +207,7 @@ ### Test Cases — Sync and Upsert
 - **TC-001**: Sync creates or updates inventory items and sets `last_seen_at`.
 - **TC-002**: Re-running sync for the same tenant+selection updates existing records and does not create duplicates.
 - **TC-003**: Inventory queries scoped to Tenant A never return Tenant B’s items.
 - **TC-004**: Inventory sync does not create or modify snapshot/backup records (e.g., no new rows in `policy_versions`, `backup_sets`, `backup_items`, `backup_schedules`, `backup_schedule_runs`).
 ### Test Cases — Selection Hash Determinism
--- a/specs/040-inventory-core/tasks.md
+++ b/specs/040-inventory-core/tasks.md
@ -4,29 +4,35 @@ # Tasks: Inventory Core (040)
 ## P1 — MVP (US1/US2)
- [ ] T001 [US1] Define Inventory Item data model (tenant-scoped identity + last_seen fields)
+- [ ] T001 [US1] Add migrations: `inventory_items` (unique: tenant_id+policy_type+external_id; indexes; last_seen fields)
- [ ] T002 [US1] Define Sync Run data model (tenant_id, selection_hash, status, timestamps, counts, stable error codes)
+- [ ] T002 [US1] Add migrations: `inventory_sync_runs` (tenant_id, selection_hash, status, started/finished, counts, stable error codes)
- [ ] T003 [US1] Implement deterministic selection hashing (canonical json + sha256)
+- [ ] T003 [US1] Implement deterministic `selection_hash` (canonical JSON: sorted keys + sorted arrays; sha256)
- [ ] T004 [US1] Implement inventory upsert semantics (no duplicates)
+- [ ] T004 [US1] Implement inventory upsert semantics (idempotent, no duplicates)
- [ ] T005 [US1] Enforce tenant isolation in all inventory/run queries
+- [ ] T005 [US1] Enforce tenant isolation for all inventory + run read/write paths
- [ ] T006 [US2] Implement derived “missing” computation relative to latest completed run (tenant_id + selection_hash)
+- [ ] T006 [US2] Implement derived “missing” query semantics vs latest completed run for same (tenant_id, selection_hash)
- [ ] T007 [US2] Ensure low-confidence missing when latestRun is partial/failed or had_errors
+- [ ] T007 [US2] Missing confidence rule: partial/failed or had_errors => low confidence
- [ ] T008 [US2] Implement meta_jsonb whitelist enforcement (drop unknown keys, never fail sync)
+- [ ] T008 [US2] Enforce `meta_jsonb` whitelist (drop unknown keys; never fail sync)
 - [ ] T009 [US1] Guardrail: inventory sync must not create snapshots/backups (no writes to `policy_versions`/`backup_*`)
 ## P2 — Observability & Safety (US3 + NFR)
- [ ] T009 [US3] Ensure run records include stable error codes and counts
+- [ ] T010 [US3] Run lifecycle: ensure run records include counts + stable error codes (visible and actionable)
- [ ] T010 [NFR] Add idempotency + locks to prevent overlapping runs per tenant+selection
+- [ ] T011 [NFR] Locking/idempotency: prevent overlapping runs per (tenant_id, selection_hash)
- [ ] T011 [NFR] Add global + per-tenant concurrency limiting strategy
+- [ ] T012 [NFR] Concurrency: enforce global + per-tenant limits (queue/semaphore strategy)
- [ ] T012 [NFR] Implement throttling handling strategy (backoff + jitter for transient Graph failures)
+- [ ] T013 [NFR] Throttling resilience: backoff + jitter for transient Graph failures (429/503)
 - [ ] T014 [NFR] Safe logging & safe persistence: store only stable `error_codes` + bounded safe context in run records (no secrets/tokens; no log parsing required)
 ## Tests (Required for runtime behavior)
- [ ] T020 [US1] Tests: upsert does not create duplicates; last_seen updated
+- [ ] T020 [US1] Pest: upsert prevents duplicates; `last_seen_at` and `last_seen_run_id` update
- [ ] T021 [US2] Tests: missing derived per latestRun(selection_hash); selection isolation
+- [ ] T021 [US2] Pest: missing derived per latest completed run for same (tenant_id, selection_hash)
- [ ] T022 [US2] Tests: partial/failed run => low confidence missing
+- [ ] T022 [US2] Pest: selection isolation (run for selection Y does not affect selection X)
- [ ] T023 [US2] Tests: meta whitelist drops unknown keys without failing
+- [ ] T023 [US2] Pest: partial/failed/had_errors => missing is low confidence
- [ ] T024 [NFR] Tests: selection_hash determinism (array ordering)
+- [ ] T024 [US2] Pest: meta whitelist drops unknown keys (no exception; no persistence)
 - [ ] T025 [NFR] Pest: selection_hash determinism (array ordering invariant)
 - [ ] T026 [NFR] Pest: lock prevents overlapping runs for same (tenant_id, selection_hash)
 - [ ] T027 [NFR] Pest: run error persistence contains no secrets/tokens (assert error context is bounded; no “Bearer ” / access token patterns; prefer error_codes)
 - [ ] T028 [US1] Pest: inventory sync creates no rows in `policy_versions` and `backup_*` tables (assert counts unchanged)
 ## Notes