TenantAtlas/.specify/plan.md
2025-12-14 20:23:18 +01:00

Implementation Plan: TenantPilot v1

Branch: tenantpilot-v1
Date: 2025-12-12
Spec Source: .specify/spec.md (scope/restore matrix unchanged)

Summary

TenantPilot v1 already delivers tenant-scoped Intune inventory, immutable backups, version history with diffs, defensive restore flows, tenant setup, permissions/health, settings normalization/display, and Highlander enforcement. Remaining priority work is the delegated Intune RBAC onboarding wizard (US7) and afterwards the Graph Contract Registry & Drift Guard (US8). All Graph calls stay behind the abstraction with audit logging; snapshots remain JSONB with safety gates (preview-only for high-risk types).

Status Snapshot (tasks.md is source of truth)

  • Done: US1 inventory, US2 backups, US3 versions/diffs, US4 restore preview/exec, scope config, soft-deletes/housekeeping, Highlander single current tenant, tenant setup & verify (US6), permissions/health overview (US6), table ActionGroup UX, settings normalization/display (US1b), Dokploy/Sail runbooks.
  • Next up: US7 Intune RBAC onboarding wizard (delegated, synchronous Filament flow).
  • Upcoming: US8 Graph Contract Registry & Drift Guard (contract registry, type-family handling, verification command, fallback strategies).

Technical Baseline

  • Laravel 12, Filament 4, PHP 8.4; Sail-first with PostgreSQL.
  • JSONB for policy/backup/version payloads; FK/time indexes, GIN where needed.
  • Graph abstraction with standardized error mapping/retries; no secrets in logs.
  • Audit trail across backup/restore/version/tenant/permission/wizard steps; tenant isolation enforced.
  • Restore matrix and supported types remain config-driven single sources of truth.
  • Safety: preview/dry-run, confirmation gates, warnings for high-risk types; no implicit tenants (Highlander).
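The "standardized error mapping/retries" item in the baseline can be illustrated with a small sketch. It is written in Python purely as neutral pseudocode (the project itself is Laravel/PHP), and every name in it (`GraphResponse`, `GraphError`, `call_with_retries`, the error kinds, the retryable status set) is a hypothetical stand-in rather than the actual abstraction. The error carries only a kind and a status code, in line with the "no secrets in logs" rule.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class GraphResponse:
    status: int
    body: dict


class GraphError(Exception):
    """Normalized error: carries a kind and HTTP status only, never payloads or tokens."""

    def __init__(self, kind: str, status: int):
        super().__init__(f"{kind} (HTTP {status})")
        self.kind = kind
        self.status = status


# Assumed mapping for illustration; the real list lives in the Graph abstraction.
ERROR_KINDS = {401: "unauthenticated", 403: "forbidden", 404: "not_found", 429: "throttled"}
RETRYABLE = {429, 502, 503, 504}


def map_error(status: int) -> str:
    if status in ERROR_KINDS:
        return ERROR_KINDS[status]
    return "server_error" if status >= 500 else "client_error"


def call_with_retries(
    send: Callable[[], GraphResponse],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> GraphResponse:
    """Retry transient failures with exponential backoff, then raise a mapped error."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        response = send()
        if response.status < 400:
            return response
        if response.status in RETRYABLE and attempt < max_attempts:
            sleep(base_delay * 2 ** (attempt - 1))  # 0.5s, 1.0s, ...
            continue
        raise GraphError(map_error(response.status), response.status)
```

Passing `sleep` as a parameter keeps the backoff testable without real delays; callers only ever see `GraphError.kind`, so UI messages and audit entries never depend on raw Graph responses.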

Completed Workstreams (no new action needed)

  • US1 Inventory (Phase 3): Filament policy listing with type/category/platform filters; tenant-scoped.
  • US2 Backups (Phase 4): Backup sets/items in JSONB, immutable snapshots, audit logging, relation manager UX for attaching policies, soft-delete rules with restore-run guard.
  • US3 Versions/Diffs (Phase 5): Version capture, timelines, human+JSON diffs, soft-deletes with audit.
  • US4 Restore (Phase 6): Preview, selective execution, conflict warnings, per-type restore level (enabled vs preview-only), PowerShell decode/encode respected, audit of outcomes.
  • US6 Tenant Setup & Highlander (Phases 8 & 12): Tenant CRUD/verify, INTUNE_TENANT_ID override, is_current unique enforcement, “Make current” action, block deactivated tenants.
  • US6 Permissions/Health (Phase 9): Required permissions list, compare/check service, Verify action updates status and audit, permissions panel in Tenant detail.
  • US1b Settings Display (Phase 13): PolicyNormalizer + SnapshotValidator, warnings for malformed snapshots, normalized settings and pretty JSON on policy/version detail, list badges, README section.
  • Housekeeping/UX (Phases 10-12): Soft/force deletes for tenants/backups/versions/restore runs with guards; table actions in ActionGroup per UX guideline.
  • Ops (Phase 7): Sail runbook and Dokploy staging→prod guidance captured.

Execution Plan: US7 Intune RBAC Onboarding Wizard (Phase 14)

  • Objectives: deliver delegated, tenant-scoped wizard that safely converges the Intune RBAC state for the configured service principal; fully audited, idempotent, least-privilege by default.
  • Scope alignment: FR-023 to FR-030, constitution (Safety-First, Auditability, Tenant-Aware, Graph Abstraction). No secret/token persistence; delegated tokens stay request-local and are not stored in DB/cache.
  • Design decisions:
    • Service: RbacOnboardingService orchestrates steps using GraphClientInterface; reuse RbacHealthService for verification; all calls through abstraction with error mapping.
    • Data: use existing tenant RBAC columns (rbac_group_id, rbac_group_name, rbac_role_assignment_id, rbac_role_key, rbac_scope_mode, rbac_scope_id, status fields). No new entities; ensure casts + guards.
    • Audit: log start, delegated login outcome, group ensure, membership ensure, role assignment ensure/update, verify results. No payload logging; only IDs/status codes.
  • Wizard flow (Filament, Tenant detail ActionGroup):
    1. Preconditions/config step with review screen: show tenant/app info, required permissions, least-privilege warning; inputs for role (default Policy/Profile Manager; Intune Administrator shows warning), scope (global default; optional group picker), group mode (create default TenantPilot-Intune-RBAC vs pick existing security-enabled group). Summarize planned changes before proceeding.
    2. Delegated auth step: initiate login; on failure stop with actionable message + audit; do not store token beyond request.
    3. Execute (synchronous): resolve service principal by app_client_id; on missing SP stop with consent-required hint + audit reason sp_not_found; ensure/create security group (validate securityEnabled=true); ensure SP membership (idempotent “already exists” OK); ensure/create/patch Intune role assignment for chosen role/scope; persist discovered IDs on tenant for idempotency.
    4. Post-verify: force fresh token acquisition; run canary reads (deviceConfigurations, deviceCompliancePolicies, conditionalAccess if enabled); update RBAC/permission health; surface warnings if scope-limited; audit verify result.
    5. Summary: show IDs (group, role assignment), role/scope used, verify status, CTA to retry policy sync.
  • UX rules: action only for active tenants with app_client_id; keep in ActionGroup with Admin consent/Verify; show badge/hint if RBAC missing; warnings on selecting Intune Administrator role; block execution if tenant inactive or missing consent/SP.
  • Safety/idempotency: handle “already exists” as success; no self-heal jobs; retry-safe writes; no queue usage to avoid token expiry; timeouts surfaced clearly; no delegated token persistence.
  • Tests: happy path, rerun idempotent, SP missing, insufficient privileges, non-security-enabled group failure, scope-limited warning, delegated auth failure path; Filament wizard visibility + summary rendering; health prompts to run wizard when RBAC missing.
  • Documentation: add wizard behavior, least-privilege defaults, audit expectations, “no token storage”, and how to rerun safely; note CTA to retry policy sync.
  • Operational note: after admin consent or RBAC changes, force a fresh token acquisition (e.g., clear the app token cache) before retrying sync/backup/restore; Verify should run with a non-stale token. Optional CHECK/REPORT jobs only (no grant) remain out of scope for this phase.
  • Testing plan (Pest):
    • Service unit tests: happy path, rerun idempotent, SP missing, insufficient privileges, scope-limited warning, group exists/not security-enabled failure.
    • Filament feature: wizard visibility gating, delegated failure path, successful run shows summary and updates health, warnings rendered.
    • Health integration: Verify reflects RBAC status and prompts to run wizard when missing.
  • Deployment/ops: no new env vars; ensure migrations for tenant RBAC columns are applied; run targeted tests php artisan test tests/Unit/RbacOnboardingServiceTest.php tests/Feature/Filament/TenantRbacWizardTest.php; Pint on touched files.
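The "ensure" semantics of the Execute step (treat "already exists" as success, persist discovered IDs on the tenant, audit only IDs/status codes) can be sketched as follows. This is a minimal illustration in Python used as neutral pseudocode, not the planned RbacOnboardingService: `FakeGraph` is a hypothetical in-memory stand-in for GraphClientInterface, and the audit event names and the `OnboardingError` class are assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional


class OnboardingError(Exception):
    """Carries a short reason code; only the code is audited, never payloads."""

    def __init__(self, reason: str):
        super().__init__(reason)
        self.reason = reason


@dataclass
class Tenant:
    app_client_id: str
    rbac_group_id: Optional[str] = None
    rbac_role_assignment_id: Optional[str] = None


@dataclass
class FakeGraph:
    """Hypothetical in-memory stand-in for the Graph abstraction."""
    service_principals: dict                          # app_client_id -> service principal id
    groups: dict = field(default_factory=dict)        # group id -> group record
    assignments: dict = field(default_factory=dict)   # assignment id -> (group id, role, scope)

    def create_group(self, name: str) -> str:
        group_id = f"group-{len(self.groups) + 1}"
        self.groups[group_id] = {"name": name, "securityEnabled": True, "members": set()}
        return group_id

    def create_assignment(self, group_id: str, role: str, scope: str) -> str:
        assignment_id = f"assignment-{len(self.assignments) + 1}"
        self.assignments[assignment_id] = (group_id, role, scope)
        return assignment_id


def run_onboarding(tenant, graph, audit, role="Policy/Profile Manager", scope="global"):
    # Resolve the service principal; stop with an audited reason if it is missing.
    sp_id = graph.service_principals.get(tenant.app_client_id)
    if sp_id is None:
        audit.append(("rbac.execute", "sp_not_found"))
        raise OnboardingError("sp_not_found")

    # Ensure the group: reuse the persisted ID when it still resolves, else create.
    if tenant.rbac_group_id not in graph.groups:
        tenant.rbac_group_id = graph.create_group("TenantPilot-Intune-RBAC")
        audit.append(("rbac.group", "created"))
    else:
        audit.append(("rbac.group", "exists"))
    group = graph.groups[tenant.rbac_group_id]
    if not group["securityEnabled"]:
        audit.append(("rbac.group", "not_security_enabled"))
        raise OnboardingError("group_not_security_enabled")

    # Ensure membership: "already a member" counts as success.
    already_member = sp_id in group["members"]
    group["members"].add(sp_id)
    audit.append(("rbac.membership", "exists" if already_member else "added"))

    # Ensure the role assignment: create, patch on drift, or leave untouched.
    wanted = (tenant.rbac_group_id, role, scope)
    current = graph.assignments.get(tenant.rbac_role_assignment_id)
    if current is None:
        tenant.rbac_role_assignment_id = graph.create_assignment(*wanted)
        audit.append(("rbac.role_assignment", "created"))
    elif current != wanted:
        graph.assignments[tenant.rbac_role_assignment_id] = wanted
        audit.append(("rbac.role_assignment", "updated"))
    else:
        audit.append(("rbac.role_assignment", "exists"))
    return tenant
```

Running `run_onboarding` twice against the same tenant leaves one group and one role assignment in place and records only "exists" entries on the second pass, which is the behaviour the "rerun idempotent" test case is meant to pin down.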

Upcoming: US8 Graph Contract Registry & Drift Guard (Phase 15)

  • Objectives: centralize Graph contract assumptions per supported type/endpoint and provide drift detection + safe fallbacks so preview/restore remain stable on Graph shape/capability changes.
  • Scope alignment: FR-031 to FR-034 (spec), constitution (Safety-First, Auditability, Graph Abstraction, Tenant-Aware).
  • Approach:
    • Artifact: config/graph_contracts.php (or similar) with per-type contract data:
      • resource paths (collection + single item)
      • allowed $select / allowed $expand
      • type families / allowed @odata.type values
      • create/update methods, id field
      • hydration strategy (member expansion vs follow-up fetch vs unavailable)
    • Service: registry + checker; integrate with Graph client to enforce allowed capabilities and downgrade on capability errors (retry without expands/selects), recording warnings/audit entries.
    • Type families: treat derived @odata.type values within a declared family as compatible (no odata_mismatch) for routing preview/restore.
    • Verification: php artisan graph:contract:check (staging/CI) to probe endpoints and surface actionable diffs when Graph changes; opt-in/guarded for prod.
    • Docs: explain registry format and update process when Graph changes.
  • Testing outline: unit for registry lookups/type-family matching/fallback selection; integration/Pest to simulate capability errors and ensure downgrade path + correct routing for derived types.
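A rough sketch of how a contract entry and the downgrade path could fit together, assuming a registry shaped like the artifact described above. Python is used only as neutral pseudocode; the real artifact is planned as a PHP config file. The keys, the sample entry, `CapabilityError`, and `fetch_with_downgrade` are all hypothetical names chosen for illustration.

```python
# Illustrative registry entry; keys and values are assumptions, not the final format.
GRAPH_CONTRACTS = {
    "deviceConfiguration": {
        "resource": "deviceManagement/deviceConfigurations",
        "allowed_select": ["id", "displayName"],
        "allowed_expand": ["assignments"],
    },
}


class CapabilityError(Exception):
    """Stand-in for a Graph error that rejects a $select or $expand option."""


def build_query(contract, select, expand):
    """Drop any requested option the contract does not allow."""
    return {
        "select": [name for name in select if name in contract["allowed_select"]],
        "expand": [name for name in expand if name in contract["allowed_expand"]],
    }


def fetch_with_downgrade(policy_type, select, expand, send, warnings):
    """Try the full query, then without expands, then without selects."""
    contract = GRAPH_CONTRACTS[policy_type]
    query = build_query(contract, select, expand)

    # Build the downgrade ladder, skipping steps identical to an earlier one.
    ladder = []
    for candidate in (query, {**query, "expand": []}, {"select": [], "expand": []}):
        if candidate not in ladder:
            ladder.append(candidate)

    for index, attempt in enumerate(ladder):
        try:
            return send(contract["resource"], attempt)
        except CapabilityError:
            if index == len(ladder) - 1:
                raise  # nothing left to drop; surface the error
            # Record the downgrade so it can be shown as a warning and audited.
            warnings.append(f"{policy_type}: capability error, retrying with reduced query")
```

The ladder order (drop expands first, then selects) mirrors the "retry without expands/selects" wording; whether a follow-up fetch replaces a dropped expand would depend on the hydration strategy declared for the type.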

Testing & Quality Gates

  • Continue using targeted Pest runs per change set; add/extend tests for US7 wizard now, and for US8 contracts when implemented.
  • Run Pint on touched files before finalizing.
  • Maintain tenant isolation, audit logging, and restore safety gates; validate snapshot shape and type-family compatibility prior to restore execution.

Restore Safety Gate

  • Restore execution MUST be blocked if a snapshot's @odata.type is outside the declared type family for the target policy type (prevent cross-type/platform restores).
  • Restore preview MAY still render details + warnings for out-of-family snapshots, but MUST NOT offer an apply action.
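The two rules above amount to a single decision with separate preview and apply outcomes. A minimal sketch, in Python as neutral pseudocode and under the assumption that type families are available as a simple lookup; `TYPE_FAMILIES`, `RestoreDecision`, `evaluate_restore`, and `execute_restore` are hypothetical names, and the two listed @odata.type values are only sample family members.

```python
from dataclasses import dataclass

# Sample family for illustration; the real families would come from the contract registry.
TYPE_FAMILIES = {
    "deviceConfiguration": {
        "#microsoft.graph.windows10GeneralConfiguration",
        "#microsoft.graph.iosGeneralDeviceConfiguration",
    },
}


@dataclass(frozen=True)
class RestoreDecision:
    can_preview: bool
    can_apply: bool
    warnings: tuple


def evaluate_restore(policy_type: str, snapshot: dict) -> RestoreDecision:
    """Preview is always allowed; apply only when the snapshot type is in the family."""
    family = TYPE_FAMILIES.get(policy_type, set())
    odata_type = snapshot.get("@odata.type")
    if odata_type in family:
        return RestoreDecision(can_preview=True, can_apply=True, warnings=())
    warning = f"@odata.type {odata_type!r} is outside the declared family for {policy_type}"
    return RestoreDecision(can_preview=True, can_apply=False, warnings=(warning,))


def execute_restore(policy_type: str, snapshot: dict) -> str:
    """Re-check the gate at execution time instead of trusting the preview result."""
    decision = evaluate_restore(policy_type, snapshot)
    if not decision.can_apply:
        raise PermissionError(decision.warnings[0])
    return "restore-started"
```

Checking the gate again inside `execute_restore` keeps the block effective even if a caller bypasses the preview screen; an unknown policy type has an empty family here, so it is preview-only by default.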

Coordination

  • Update .specify/tasks.md to reflect progress on US7 wizard and future US8 contract tasks; no new entities or scope changes introduced here.
  • Stage validation required before production for any migration or restore-impacting change.
  • Keep Graph integration behind abstraction; no secrets in logs; follow existing UX patterns (ActionGroup, warnings for risky ops).