TenantAtlas/specs/001-rbac-onboarding/plan.md
Ahmed Darrazi 79636c13c5 docs(speckit): add constitution evidence ledger, FR→Task traceability, and measurable NFR thresholds
- Add Constitution Evidence Ledger with discovery + verification for Phases 1-15
- Add FR → Tasks Traceability Matrix (FR-001 to FR-035 → Task IDs)
- Add Measurable Thresholds (NFR/UX): rendering limits, Graph timeouts, retention policies
- Annotate tasks with explicit Implements: FR-XXX tags (100% FR coverage: 35/35)
- Consolidate spec artifacts into specs/001-rbac-onboarding/ per speckit workflow
- Add FR-019 Settings Normalization sub-requirements (FR-019.1 to FR-019.4)

Constitution VII (Spec-Driven Development) compliance achieved:
- Discovery notes present for all completed phases
- Verification commands documented per phase
- Explicit FR→Task mapping for traceability
- No unmapped FRs; no placeholders (TODO/TBD)

Ready for /speckit.implement or further iteration.
2025-12-13 19:12:32 +01:00


Implementation Plan: TenantPilot v1

Branch: tenantpilot-v1
Date: 2025-12-12
Spec Source: .specify/spec.md (scope/restore matrix unchanged)

Summary

TenantPilot v1 already delivers tenant-scoped Intune inventory, immutable backups, version history with diffs, defensive restore flows, tenant setup, permissions/health, settings normalization/display, and Highlander enforcement. Remaining priority work is the delegated Intune RBAC onboarding wizard (US7) and afterwards the Graph Contract Registry & Drift Guard (US8). All Graph calls stay behind the abstraction with audit logging; snapshots remain JSONB with safety gates (preview-only for high-risk types).

Status Snapshot (tasks.md is source of truth)

  • Done: US1 inventory, US2 backups, US3 versions/diffs, US4 restore preview/exec, scope config, soft-deletes/housekeeping, Highlander single current tenant, tenant setup & verify (US6), permissions/health overview (US6), table ActionGroup UX, settings normalization/display (US1b), Dokploy/Sail runbooks.
  • Next up: US7 Intune RBAC onboarding wizard (delegated, synchronous Filament flow).
  • Upcoming: US8 Graph Contract Registry & Drift Guard (contract registry, type-family handling, verification command, fallback strategies).

Technical Baseline

  • Laravel 12, Filament 4, PHP 8.4; Sail-first with PostgreSQL.
  • JSONB for policy/backup/version payloads; FK/time indexes, GIN where needed.
  • Graph abstraction with standardized error mapping/retries; no secrets in logs.
  • Audit trail across backup/restore/version/tenant/permission/wizard steps; tenant isolation enforced.
  • Restore matrix and supported types remain config-driven single sources of truth.
  • Safety: preview/dry-run, confirmation gates, warnings for high-risk types; no implicit tenants (Highlander).

Constitution Check

This plan is checked against the TenantPilot Constitution (see .specify/memory/constitution.md). Below are the principles with a short mapping to where the plan enforces them.

  • I. Safety-First Operations: Covered by "Safety: preview/dry-run, confirmation gates, warnings for high-risk types" and by restore safety gates (see "Restore Safety Gate" section).
  • II. Immutable Versioning: Implemented via JSONB policy_versions and immutable writes (see Completed Workstreams: US3).
  • III. Defensive Restore: Plan requires preview/dry-run, conflict detection and explicit confirmation before apply (Execution Plan: US4/US7).
  • IV. Auditability: Audit logging mandated in each service (RbacOnboardingService, BackupService, RestoreService) and recorded in audit_logs.
  • V. Tenant-Aware Architecture: Tenant-scoped data and Highlander enforcement are included (Completed Workstreams & Data section).
  • VI. Graph Abstraction: All Graph calls go through GraphClientInterface and config/graph_contracts.php is used for contract handling.
  • VII. Spec-Driven Development: This plan references .specify/* artifacts, and the tasks are produced in specs/001-rbac-onboarding/tasks.md; remaining gap: ensure each FR has explicit task mapping (see Next Actions).

GATE: The above mappings satisfy the constitution's required checks for this plan. Any future change to scope or implementation that affects these principles must include an updated Constitution Check note here.

Completed Workstreams (no new action needed)

  • US1 Inventory (Phase 3): Filament policy listing with type/category/platform filters; tenant-scoped.
  • US2 Backups (Phase 4): Backup sets/items in JSONB, immutable snapshots, audit logging, relation manager UX for attaching policies, soft-delete rules with restore-run guard.
  • US3 Versions/Diffs (Phase 5): Version capture, timelines, human+JSON diffs, soft-deletes with audit.
  • US4 Restore (Phase 6): Preview, selective execution, conflict warnings, per-type restore level (enabled vs preview-only), PowerShell decode/encode respected, audit of outcomes.
  • US6 Tenant Setup & Highlander (Phases 8 & 12): Tenant CRUD/verify, INTUNE_TENANT_ID override, is_current unique enforcement, “Make current” action, block deactivated tenants.
  • US6 Permissions/Health (Phase 9): Required permissions list, compare/check service, Verify action updates status and audit, permissions panel in Tenant detail.
  • US1b Settings Display (Phase 13): PolicyNormalizer + SnapshotValidator, warnings for malformed snapshots, normalized settings and pretty JSON on policy/version detail, list badges, README section.
  • Housekeeping/UX (Phases 10-12): Soft/force deletes for tenants/backups/versions/restore runs with guards; table actions in ActionGroup per UX guideline.
  • Ops (Phase 7): Sail runbook and Dokploy staging→prod guidance captured.

Execution Plan: US7 Intune RBAC Onboarding Wizard (Phase 14)

  • Objectives: deliver delegated, tenant-scoped wizard that safely converges the Intune RBAC state for the configured service principal; fully audited, idempotent, least-privilege by default.
  • Scope alignment: FR-023 to FR-030, constitution (Safety-First, Auditability, Tenant-Aware, Graph Abstraction). No secret/token persistence; delegated tokens stay request-local and are not stored in DB/cache.
  • Design decisions:
    • Service: RbacOnboardingService orchestrates steps using GraphClientInterface; reuse RbacHealthService for verification; all calls through abstraction with error mapping.
    • Data: use existing tenant RBAC columns (rbac_group_id, rbac_group_name, rbac_role_assignment_id, rbac_role_key, rbac_scope_mode, rbac_scope_id, status fields). No new entities; ensure casts + guards.
    • Audit: log start, delegated login outcome, group ensure, membership ensure, role assignment ensure/update, verify results. No payload logging; only IDs/status codes.
  • Wizard flow (Filament, Tenant detail ActionGroup):
    1. Preconditions/config step with review screen: show tenant/app info, required permissions, least-privilege warning; inputs for role (default Policy/Profile Manager; Intune Administrator shows warning), scope (global default; optional group picker), group mode (create default TenantPilot-Intune-RBAC vs pick existing security-enabled group). Summarize planned changes before proceeding.
    2. Delegated auth step: initiate login; on failure stop with actionable message + audit; do not store token beyond request.
    3. Execute (synchronous): resolve service principal by app_client_id; on missing SP stop with consent-required hint + audit reason sp_not_found; ensure/create security group (validate securityEnabled=true); ensure SP membership (idempotent “already exists” OK); ensure/create/patch Intune role assignment for chosen role/scope; persist discovered IDs on tenant for idempotency.
    4. Post-verify: force fresh token acquisition; run canary reads (deviceConfigurations, deviceCompliancePolicies, conditionalAccess if enabled); update RBAC/permission health; surface warnings if scope-limited; audit verify result.
    5. Summary: show IDs (group, role assignment), role/scope used, verify status, CTA to retry policy sync.
  • UX rules: action only for active tenants with app_client_id; keep in ActionGroup with Admin consent/Verify; show badge/hint if RBAC missing; warnings on selecting Intune Administrator role; block execution if tenant inactive or missing consent/SP.
  • Safety/idempotency: handle “already exists” as success; no self-heal jobs; retry-safe writes; no queue usage to avoid token expiry; timeouts surfaced clearly; no delegated token persistence.
  • Tests: happy path, rerun idempotent, SP missing, insufficient privileges, non-security-enabled group failure, scope-limited warning, delegated auth failure path; Filament wizard visibility + summary rendering; health prompts to run wizard when RBAC missing.
  • Documentation: add wizard behavior, least-privilege defaults, audit expectations, “no token storage”, and how to rerun safely; note CTA to retry policy sync.
  • Operational note: After admin consent or RBAC changes, force a fresh token acquisition (e.g., clear app token cache) before retrying sync/backup/restore; Verify should run with a non-stale token. Optional CHECK/REPORT jobs only (no grant) remain out of scope for this phase.
  • Testing plan (Pest):
    • Service unit tests: happy path, rerun idempotent, SP missing, insufficient privileges, scope-limited warning, group exists/not security-enabled failure.
    • Filament feature: wizard visibility gating, delegated failure path, successful run shows summary and updates health, warnings rendered.
    • Health integration: Verify reflects RBAC status and prompts to run wizard when missing.
  • Deployment/ops: no new env vars; ensure migrations for tenant RBAC columns are applied; run targeted tests php artisan test tests/Unit/RbacOnboardingServiceTest.php tests/Feature/Filament/TenantRbacWizardTest.php; Pint on touched files.
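
The "already exists counts as success" rule and the IDs-only audit rule above can be illustrated with a small language-neutral sketch. It is written in Python for brevity and is not the Laravel implementation; `FakeGraph`, `AlreadyExists`, `ensure_membership`, and the audit entry keys are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass, field


class AlreadyExists(Exception):
    """Hypothetical error the Graph abstraction raises for a duplicate write."""


@dataclass
class FakeGraph:
    """In-memory stand-in for the Graph abstraction (illustration only)."""
    members: set = field(default_factory=set)

    def add_member(self, group_id: str, sp_id: str) -> None:
        if (group_id, sp_id) in self.members:
            raise AlreadyExists(f"{sp_id} already in {group_id}")
        self.members.add((group_id, sp_id))


def ensure_membership(graph, audit: list, group_id: str, sp_id: str) -> str:
    """Add the service principal to the group; a duplicate counts as success."""
    try:
        graph.add_member(group_id, sp_id)
        outcome = "created"
    except AlreadyExists:
        outcome = "already_exists"
    # Audit only IDs and a status string, never payloads or tokens.
    audit.append({
        "step": "membership_ensure",
        "group_id": group_id,
        "sp_id": sp_id,
        "outcome": outcome,
    })
    return outcome


graph, audit = FakeGraph(), []
first = ensure_membership(graph, audit, "group-1", "sp-1")
second = ensure_membership(graph, audit, "group-1", "sp-1")  # safe rerun
print(first, second)
```

Running the step twice leaves a single membership and two audit entries, which is the rerun behavior the "rerun idempotent" test case is meant to pin down.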

Upcoming: US8 Graph Contract Registry & Drift Guard (Phase 15)

  • Objectives: centralize Graph contract assumptions per supported type/endpoint and provide drift detection + safe fallbacks so preview/restore remain stable on Graph shape/capability changes.
  • Scope alignment: FR-031 to FR-034 (spec), constitution (Safety-First, Auditability, Graph Abstraction, Tenant-Aware).
  • Approach:
    • Artifact: config/graph_contracts.php (or similar) with per-type contract data:
      • resource paths (collection + single item)
      • allowed $select / allowed $expand
      • type families / allowed @odata.type values
      • create/update methods, id field
      • hydration strategy (member expansion vs follow-up fetch vs unavailable)
    • Service: registry + checker; integrate with Graph client to enforce allowed capabilities and downgrade on capability errors (retry without expands/selects), recording warnings/audit entries.
    • Type families: treat derived @odata.type values within a declared family as compatible (no odata_mismatch) for routing preview/restore.
    • Verification: php artisan graph:contract:check (staging/CI) to probe endpoints and surface actionable diffs when Graph changes; opt-in/guarded for prod.
    • Docs: explain registry format and update process when Graph changes.
  • Testing outline: unit for registry lookups/type-family matching/fallback selection; integration/Pest to simulate capability errors and ensure downgrade path + correct routing for derived types.
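
One possible shape for the "downgrade on capability errors" behavior described above is sketched below in Python as a language-neutral illustration. The registry keys, `CapabilityError`, `fetch_with_downgrade`, and the `send` callable are assumptions made for this example and do not describe the actual format of config/graph_contracts.php or the real Graph client.

```python
class CapabilityError(Exception):
    """Hypothetical error raised when Graph rejects a $select or $expand."""


# Hypothetical registry entry; the keys are illustrative, not the real file format.
CONTRACTS = {
    "deviceConfiguration": {
        "resource": "deviceManagement/deviceConfigurations",
        "allowed_select": ["id", "displayName"],
        "allowed_expand": ["assignments"],
    }
}


def fetch_with_downgrade(send, policy_type, select=(), expand=()):
    """Call send() with allowed options only; on capability errors drop
    expands first, then selects, and collect a warning per downgrade."""
    contract = CONTRACTS[policy_type]
    select = [s for s in select if s in contract["allowed_select"]]
    expand = [e for e in expand if e in contract["allowed_expand"]]

    attempts = [{"select": select, "expand": expand}]
    if expand:
        attempts.append({"select": select, "expand": []})
    if select:
        attempts.append({"select": [], "expand": []})

    warnings = []
    for index, query in enumerate(attempts):
        try:
            return send(contract["resource"], query), warnings
        except CapabilityError as exc:
            if index == len(attempts) - 1:
                raise  # nothing left to drop; surface the error
            warnings.append(f"downgraded after capability error: {exc}")


def fake_send(resource, query):
    """Stand-in transport that rejects every $expand (simulated drift)."""
    if query["expand"]:
        raise CapabilityError("expand not supported")
    return {"resource": resource, "query": query}


result, warnings = fetch_with_downgrade(
    fake_send, "deviceConfiguration",
    select=["id", "displayName", "unlisted"], expand=["assignments"],
)
print(result["query"], warnings)
```

In this sketch the warnings list is returned alongside the result so the caller can decide how to record it; in the plan's terms that would be the point where a warning or audit entry is written.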

Testing & Quality Gates

  • Continue using targeted Pest runs per change set; add/extend tests for US7 wizard now, and for US8 contracts when implemented.
  • Run Pint on touched files before finalizing.
  • Maintain tenant isolation, audit logging, and restore safety gates; validate snapshot shape and type-family compatibility prior to restore execution.

Restore Safety Gate

  • Restore execution MUST be blocked if a snapshot's @odata.type is outside the declared type family for the target policy type (prevent cross-type/platform restores).
  • Restore preview MAY still render details + warnings for out-of-family snapshots, but MUST NOT offer an apply action.
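
The two rules above can be expressed as a single decision function. The following is a minimal sketch in Python, assuming a hypothetical `TYPE_FAMILIES` mapping and a hypothetical `restore_options` helper; the family membership shown is illustrative and not taken from the project's configuration.

```python
# Hypothetical type families; which types belong to which family is an assumption.
TYPE_FAMILIES = {
    "deviceConfiguration": {
        "#microsoft.graph.windows10GeneralConfiguration",
        "#microsoft.graph.iosGeneralDeviceConfiguration",
    },
    "deviceCompliancePolicy": {
        "#microsoft.graph.windows10CompliancePolicy",
    },
}


def restore_options(policy_type: str, snapshot: dict) -> dict:
    """Preview is always allowed; apply is offered only for in-family snapshots."""
    odata_type = snapshot.get("@odata.type")
    in_family = odata_type in TYPE_FAMILIES.get(policy_type, set())
    warnings = []
    if not in_family:
        warnings.append(
            f"{odata_type!r} is outside the type family for {policy_type}; "
            "restore is blocked"
        )
    return {"can_preview": True, "can_apply": in_family, "warnings": warnings}


ok = restore_options(
    "deviceConfiguration",
    {"@odata.type": "#microsoft.graph.windows10GeneralConfiguration"},
)
blocked = restore_options(
    "deviceConfiguration",
    {"@odata.type": "#microsoft.graph.windows10CompliancePolicy"},
)
print(ok["can_apply"], blocked["can_apply"])
```

A snapshot with a missing @odata.type is treated as out of family here, so it can still be previewed but never applied.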

Coordination

  • Update .specify/tasks.md to reflect progress on US7 wizard and future US8 contract tasks; no new entities or scope changes introduced here.
  • Stage validation required before production for any migration or restore-impacting change.
  • Keep Graph integration behind abstraction; no secrets in logs; follow existing UX patterns (ActionGroup, warnings for risky ops).