spec: 108 provider access hardening v1 — write-path RBAC gate (Intune)
parent 8bee824966
commit 2a8d3c3bc4
@@ -0,0 +1,36 @@
# Specification Quality Checklist: Provider Access Hardening v1

**Purpose**: Validate specification completeness and quality before proceeding to planning
**Created**: 2026-02-22
**Feature**: [spec.md](../spec.md)

## Content Quality

- [x] No implementation details (languages, frameworks, APIs)
- [x] Focused on user value and business needs
- [x] Written for non-technical stakeholders
- [x] All mandatory sections completed

## Requirement Completeness

- [x] No [NEEDS CLARIFICATION] markers remain
- [x] Requirements are testable and unambiguous
- [x] Success criteria are measurable
- [x] Success criteria are technology-agnostic (no implementation details)
- [x] All acceptance scenarios are defined
- [x] Edge cases are identified
- [x] Scope is clearly bounded
- [x] Dependencies and assumptions identified

## Feature Readiness

- [x] All functional requirements have clear acceptance criteria
- [x] User scenarios cover primary flows
- [x] Feature meets measurable outcomes defined in Success Criteria
- [x] No implementation details leak into specification

## Notes

- Spec is ready for `/speckit.clarify` or `/speckit.plan`.
- No [NEEDS CLARIFICATION] markers — all decisions were informed by the detailed user input and existing codebase context.
- The spec references existing codebase concepts (OperationRun, ProviderOperationStartGate, rbac_status fields) as domain terms, not implementation details.

184
specs/108-provider-access-hardening/spec.md
Normal file
@@ -0,0 +1,184 @@
# Feature Specification: Provider Access Hardening v1 — Write-Path RBAC Gate (Intune)

**Feature Branch**: `108-provider-access-hardening`
**Created**: 2026-02-22
**Status**: Draft
**Input**: Server-side write gate for Intune operations requiring RBAC hardening to be configured and healthy before any Graph mutations can execute.

## Spec Scope Fields *(mandatory)*

- **Scope**: tenant
- **Primary Routes**: Tenant View page (RBAC card), Restore Run start actions, any Intune write-trigger actions
- **Data Ownership**: tenant-owned (`tenants.rbac_status`, `tenants.rbac_last_checked_at`, `operation_runs`)
- **RBAC**: workspace membership required + tenant-context access; write operations additionally gated by Intune RBAC hardening status (DB-persisted)

## User Scenarios & Testing *(mandatory)*

### User Story 1 — Write Operations Blocked When RBAC Not Configured (Priority: P1)

An operator attempts to restore an Intune policy (or restore assignments) on a tenant where Intune RBAC hardening has not been configured. The system blocks the operation at the server level before any Graph write occurs. The operator receives a clear explanation and a call-to-action directing them to configure Intune RBAC.

**Why this priority**: This is the core safety gate. Without it, write operations can execute with full app-only permissions and no blast-radius control — the primary compliance and trust risk this feature addresses.

**Independent Test**: Can be fully tested by attempting a restore start on a tenant with `rbac_status = null` and verifying the operation is blocked with the correct reason code.

**Acceptance Scenarios**:

1. **Given** a tenant with `rbac_status` not set (null), **When** an operator triggers a restore action, **Then** the system blocks the operation, does not enqueue any job, and returns a reason code `intune_rbac.not_configured` with a human-readable message.
2. **Given** a tenant with `rbac_status = not_configured`, **When** an operator triggers a restore assignments action, **Then** the system blocks the operation with reason code `intune_rbac.not_configured` and provides a CTA to "Setup Intune RBAC".
3. **Given** a tenant with `rbac_status = ok` and a fresh `rbac_last_checked_at`, **When** an operator triggers a restore action, **Then** the operation proceeds normally without any gate interference.

---

### User Story 2 — Write Operations Blocked When RBAC Unhealthy or Stale (Priority: P1)

An operator attempts a write operation on a tenant where RBAC hardening was previously configured but is now in a degraded, failed, or stale state. The system blocks the operation and explains why, offering relevant recovery actions.

**Why this priority**: As critical as US1. A configured-but-broken RBAC state is arguably more dangerous because operators may assume it is safe.

**Independent Test**: Can be tested by setting `rbac_status = degraded` or `rbac_last_checked_at` to a date older than the freshness threshold, then attempting a write operation.

**Acceptance Scenarios**:

1. **Given** a tenant with `rbac_status = degraded`, **When** a write operation is attempted, **Then** the system blocks it with reason code `intune_rbac.unhealthy` and a CTA to "Run health check".
2. **Given** a tenant with `rbac_status = failed`, **When** a write operation is attempted, **Then** the system blocks it with reason code `intune_rbac.unhealthy`.
3. **Given** a tenant with `rbac_status = ok` but `rbac_last_checked_at` older than the configured freshness threshold, **When** a write operation is attempted, **Then** the system blocks it with reason code `intune_rbac.stale` and a CTA to "Run health check".

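The decision rules in User Stories 1 and 2 can be read as a single pure function over the two persisted fields. The following is a minimal sketch in Python for illustration only, not the project's implementation: the function name, its parameters, and the fail-closed handling of unknown statuses and of a missing `rbac_last_checked_at` are assumptions; the status values and reason codes come from the acceptance scenarios above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Reason codes as named in the acceptance scenarios.
NOT_CONFIGURED = "intune_rbac.not_configured"
UNHEALTHY = "intune_rbac.unhealthy"
STALE = "intune_rbac.stale"


def evaluate_write_gate(
    rbac_status: Optional[str],
    rbac_last_checked_at: Optional[datetime],
    now: datetime,
    freshness_threshold_hours: int = 24,
) -> Optional[str]:
    """Return a blocking reason code, or None when the write may proceed."""
    if rbac_status in (None, "not_configured"):
        return NOT_CONFIGURED
    if rbac_status != "ok":
        # Covers "degraded" and "failed". Assumption: any other unknown
        # status also fails closed as unhealthy.
        return UNHEALTHY
    if rbac_last_checked_at is None:
        # Assumption: "ok" without a recorded check time is treated as stale.
        return STALE
    if now - rbac_last_checked_at > timedelta(hours=freshness_threshold_hours):
        return STALE
    return None


now = datetime(2026, 2, 22, 12, 0, tzinfo=timezone.utc)
print(evaluate_write_gate("ok", now - timedelta(hours=25), now))
```

Because the function reads only values passed in, it needs no network access, which matches the requirement that the gate rely on persisted state.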
---

### User Story 3 — Defense in Depth: Job-Level Gate (Priority: P1)

Even if a write operation is somehow enqueued (race condition, direct dispatch, future code path), the job itself must re-check the gate before executing any Graph write call. If blocked, the job marks its OperationRun as failed with a stable reason code and does not attempt any Graph mutation.

**Why this priority**: Defense in depth is non-negotiable for enterprise SaaS. The job-level gate is the last line of defense before actual Graph writes.

**Independent Test**: Can be tested by directly instantiating a restore job with a tenant in blocked state and verifying the OperationRun is marked failed without any Graph calls.

**Acceptance Scenarios**:

1. **Given** a tenant with `rbac_status = not_configured`, **When** `ExecuteRestoreRunJob` runs, **Then** the job marks the OperationRun as failed with reason code `intune_rbac.not_configured` and performs zero Graph write calls.
2. **Given** a tenant with `rbac_status = ok` but stale `rbac_last_checked_at`, **When** `RestoreAssignmentsJob` runs, **Then** the job marks the OperationRun as failed with `intune_rbac.stale`.
3. **Given** a tenant with `rbac_status = ok` and fresh health check, **When** a restore job runs, **Then** the gate passes and the job proceeds to execute Graph writes.

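The job-level re-check described above can be sketched as follows, in Python for illustration only. Everything here is hypothetical scaffolding: the `OperationRun` dataclass, the `RecordingGraphClient` test double, the `execute_restore_job` function, the `"succeeded"` status, and the message text are assumptions, not the real job classes. The point is the ordering: the gate is consulted first, and a blocked run returns before any write is attempted.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class OperationRun:
    """Hypothetical stand-in for the existing OperationRun entity."""
    status: str = "running"
    reason_code: Optional[str] = None
    reason_message: Optional[str] = None


class RecordingGraphClient:
    """Test double that counts write calls instead of calling Graph."""

    def __init__(self) -> None:
        self.write_calls = 0

    def write(self, payload: dict) -> None:
        self.write_calls += 1


def execute_restore_job(
    run: OperationRun,
    graph: RecordingGraphClient,
    gate: Callable[[], Optional[str]],
    payloads: List[dict],
) -> OperationRun:
    # Re-check the gate at execution time, even if the start surface
    # already checked it before enqueueing.
    reason = gate()
    if reason is not None:
        run.status = "failed"
        run.reason_code = reason
        run.reason_message = "Intune write blocked by RBAC hardening gate."
        return run  # no Graph write was attempted
    for payload in payloads:
        graph.write(payload)
    run.status = "succeeded"  # assumed success status name
    return run


blocked = execute_restore_job(
    OperationRun(), RecordingGraphClient(), lambda: "intune_rbac.stale", [{"id": 1}]
)
print(blocked.status, blocked.reason_code)
```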
---

### User Story 4 — UI: Disabled Actions with Reason and CTA (Priority: P2)

When the Intune write gate would block an operation, the UI should proactively disable write-trigger actions (e.g., "Execute restore", "Restore assignments") and show the operator why the action is unavailable, along with a relevant CTA.

**Why this priority**: Good UX prevents confusion and reduces support burden. However, server-side enforcement (US1–US3) is the security boundary; UI is an affordance.

**Independent Test**: Can be tested by rendering a restore action on a tenant with blocked RBAC status and verifying the action is disabled with the correct tooltip/helper text.

**Acceptance Scenarios**:

1. **Given** a tenant with `rbac_status = null`, **When** the operator views restore actions, **Then** write-trigger actions are visible but disabled, with a helper explaining "Intune RBAC not configured" and a link to the setup wizard.
2. **Given** a tenant with `rbac_status = degraded`, **When** the operator views write actions, **Then** actions are disabled with a helper explaining the degraded state and a CTA to run a health check.
3. **Given** a tenant with `rbac_status = ok` and fresh health, **When** the operator views write actions, **Then** actions are enabled normally.

---

### User Story 5 — Tenant RBAC Status Card (Progressive Disclosure) (Priority: P2)

On the tenant view page, the RBAC hardening status is displayed as a compact card with a badge, short explanation, and contextual actions — replacing the current approach of showing many individual RBAC fields.

**Why this priority**: Improves operator understanding of RBAC posture at a glance. Supports the write gate UX by making status visible before operators attempt writes.

**Independent Test**: Can be tested by viewing a tenant page with various `rbac_status` values and verifying the card renders the correct badge, text, and actions.

**Acceptance Scenarios**:

1. **Given** a tenant with `rbac_status = ok`, **When** the operator views the tenant page, **Then** a card displays "Intune Access Hardening" with a "Healthy" badge and a "Run health check" action.
2. **Given** a tenant with `rbac_status = null`, **When** the operator views the tenant page, **Then** a card displays a "Not Configured" badge and a "Setup Intune RBAC" action.
3. **Given** a tenant with `rbac_status = degraded`, **When** the operator views the tenant page, **Then** a card displays a "Degraded" badge and both "Run health check" and "View details" actions.

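The three card states in the scenarios above amount to a lookup table from status to badge and actions. A small illustrative sketch in Python follows; `RbacCard` and `rbac_card` are hypothetical names, and only the states the scenarios define are mapped. Statuses the scenarios do not cover (for example `failed` or a stale check) are deliberately left out rather than guessed.

```python
from typing import Dict, NamedTuple, Optional, Tuple

CARD_TITLE = "Intune Access Hardening"


class RbacCard(NamedTuple):
    badge: str
    actions: Tuple[str, ...]


# Only the three states named in the acceptance scenarios are mapped.
_CARDS: Dict[Optional[str], RbacCard] = {
    "ok": RbacCard("Healthy", ("Run health check",)),
    None: RbacCard("Not Configured", ("Setup Intune RBAC",)),
    "degraded": RbacCard("Degraded", ("Run health check", "View details")),
}


def rbac_card(rbac_status: Optional[str]) -> RbacCard:
    """Return badge and actions for a status, or raise if unspecified."""
    try:
        return _CARDS[rbac_status]
    except KeyError:
        raise ValueError(
            f"no card defined in this sketch for status {rbac_status!r}"
        ) from None


card = rbac_card("degraded")
print(CARD_TITLE, card.badge, card.actions)
```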
---

### User Story 6 — Auditable Blocked Write Attempts (Priority: P3)

When a write operation is blocked by the gate, the event is recorded for audit and compliance purposes. At the job level this is captured via the OperationRun failure. At the UI level, an optional AuditLog entry records the blocked attempt.

**Why this priority**: Important for compliance and post-incident review but not a functional blocker for the gate itself.

**Independent Test**: Can be tested by triggering a blocked write and verifying the OperationRun or AuditLog contains the expected reason code and metadata.

**Acceptance Scenarios**:

1. **Given** a blocked write attempt in a job, **When** the gate blocks execution, **Then** the OperationRun is marked failed with `reason_code`, `reason_message`, and no sensitive data.
2. **Given** a blocked write attempt at the UI start surface, **When** the gate prevents operation start, **Then** an AuditLog entry is created with the action `intune_rbac.write_blocked`, the tenant ID, and the operation type.

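One possible shape for the UI-level audit record in scenario 2 is sketched below in Python, for illustration only. The `BlockedWriteAuditEntry` type, the in-memory list used as a log, and the example operation type `restore.execute` are hypothetical. The action name, tenant ID, and operation type come from the scenario; carrying the reason code as well is an assumption based on the Independent Test above.

```python
from dataclasses import asdict, dataclass
from typing import List


@dataclass(frozen=True)
class BlockedWriteAuditEntry:
    tenant_id: int
    operation_type: str
    reason_code: str  # assumption: included so audits can be filtered by cause
    action: str = "intune_rbac.write_blocked"


def audit_blocked_write(
    log: List[dict], tenant_id: int, operation_type: str, reason_code: str
) -> dict:
    """Append a blocked-write entry to the log and return it."""
    # Only identifiers and codes are stored; no tokens or payloads.
    entry = asdict(BlockedWriteAuditEntry(tenant_id, operation_type, reason_code))
    log.append(entry)
    return entry


audit_log: List[dict] = []
print(audit_blocked_write(audit_log, 42, "restore.execute", "intune_rbac.stale"))
```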
---

### Edge Cases

- What happens when `rbac_status` is transitioning (health check running concurrently with a write attempt)? The gate reads the persisted status at gate-check time; an in-flight health check does not change the outcome. The operator must wait for the health check to complete and then retry.
- What happens when the freshness threshold configuration changes? The gate uses the threshold value at evaluation time. Lowering the threshold may immediately block previously-allowed operations if `rbac_last_checked_at` is now considered stale.
- What happens when a tenant has `rbac_status = ok` but the underlying RBAC artifacts were removed externally in Entra/Intune? The gate will allow the write (status is persisted). The next health check will detect the problem and update `rbac_status` to `degraded` or `failed`. This is by design — no live Graph calls in the gate path.
- How does the gate interact with `ProviderOperationStartGate`? The write hardening gate runs as an additional check within or alongside the existing start gate. If the provider connection is unresolved, that blocking reason takes precedence. If the connection is resolved but RBAC is unhealthy, the write hardening gate blocks.

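Two of the edge cases above are easy to make concrete: the effect of lowering the freshness threshold, and the precedence between the existing start gate and the write hardening gate. The Python sketch below is illustrative only; both function names and the connection reason code `provider_connection.unresolved` are hypothetical, since the spec does not name the start gate's reason codes.

```python
from datetime import timedelta
from typing import Optional


def is_stale(age_since_check: timedelta, threshold_hours: int) -> bool:
    """Staleness is judged against the threshold in force at evaluation time."""
    return age_since_check > timedelta(hours=threshold_hours)


def first_blocking_reason(
    connection_reason: Optional[str], rbac_reason: Optional[str]
) -> Optional[str]:
    """An unresolved provider connection takes precedence over RBAC state."""
    if connection_reason is not None:
        return connection_reason
    return rbac_reason


# A check performed 18 hours ago passes at 24h but blocks once lowered to 12h.
age = timedelta(hours=18)
print(is_stale(age, 24), is_stale(age, 12))
```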
## Requirements *(mandatory)*

**Constitution alignment (required):** This feature does not introduce new Microsoft Graph calls or new OperationRun types. It adds a server-side gate that blocks existing write operations. Blocked writes at the job level are recorded in the existing OperationRun (failed status + reason code). Blocked writes at the UI level optionally record an AuditLog entry. No new contract registry entries are required; this feature gates operations that already have registered contracts.

**Constitution alignment (RBAC-UX):**
- Authorization plane: tenant-context `/admin/t/{tenant}/...`
- The write gate is not an RBAC capability check — it is a tenant-health prerequisite check that applies regardless of the operator's role.
- Existing RBAC capability checks (workspace membership, manage permission) remain enforced before the write gate is evaluated.
- 404 vs 403 semantics: The write gate returns a 422-family response (operation precondition not met), not 403 or 404, since the operator is authorized but the tenant's RBAC posture is insufficient.
- No new capability strings are introduced.
- Destructive actions (restore) already require confirmation; the gate adds a pre-check before the confirmation flow even applies.

**Constitution alignment (BADGE-001):** The tenant RBAC status card uses the existing `TenantRbacStatus` badge domain from `BadgeDomain`. New badge values (`stale`) must be registered in the centralized badge semantics. Tests will cover all badge states.

**Constitution alignment (Filament Action Surfaces):** This feature modifies existing Filament action surfaces (restore start actions on RestoreRunResource and tenant view page). No new Resources or Pages are created. The UI Action Matrix below covers the changes.

**Constitution alignment (UX-001):** The tenant RBAC card is placed within the existing tenant View page infolist, inside a Section. No naked inputs. Badge semantics use BADGE-001.

### Functional Requirements

- **FR-001**: System MUST evaluate Intune RBAC hardening status before allowing any Intune write operation to start or execute.
- **FR-002**: System MUST block write operations when the tenant's RBAC status is `null`, `not_configured`, `degraded`, or `failed`.
- **FR-003**: System MUST block write operations when `rbac_last_checked_at` is older than a configurable freshness threshold (default: 24 hours).
- **FR-004**: The write gate MUST use only persisted database state (no synchronous Graph calls during evaluation).
- **FR-005**: When a write operation is blocked at the job level, the system MUST mark the associated OperationRun as failed with a stable reason code (`intune_rbac.not_configured`, `intune_rbac.unhealthy`, or `intune_rbac.stale`) and a sanitized message.
- **FR-006**: When a write operation is blocked at the UI start surface, the system MUST prevent job enqueue and display the reason with a CTA to the operator.
- **FR-007**: Blocked write operations MUST NOT perform any Microsoft Graph mutations — zero write calls.
- **FR-008**: The write gate MUST be enforced at both the start surface (UI/command) and the job execution layer (defense in depth).
- **FR-009**: The tenant view page MUST display a compact RBAC hardening status card with badge, explanation, and contextual actions (setup wizard, run health check).
- **FR-010**: Write-trigger Filament actions MUST be disabled with a reason tooltip when the write gate would block the operation.
- **FR-011**: The gate design MUST be provider-agnostic in its interface, even though v1 only implements the Intune check. Future providers can plug in without redesign.
- **FR-012**: A "Refresh Intune RBAC status" action on the tenant page MUST start an OperationRun that runs the health check asynchronously.
- **FR-013**: The gate MUST be toggleable via configuration (`tenantpilot.hardening.intune_write_gate.enabled`, default: `true`) for rollback safety.

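FR-011 asks for a provider-agnostic interface with only an Intune implementation in v1. One way to picture that, sketched in Python for illustration, is a single-method protocol plus a registry keyed by provider. The names `WriteGate`, `IntuneStatusGate`, and `check_write` are hypothetical, the Intune gate here checks status only (freshness is omitted for brevity), and letting providers without a registered gate pass is an assumption the spec does not state.

```python
from typing import Dict, Optional, Protocol


class WriteGate(Protocol):
    """Hypothetical provider-agnostic interface: one method, one result."""

    def blocking_reason(self, tenant: dict) -> Optional[str]:
        """Return a stable reason code, or None when writes are allowed."""
        ...


class IntuneStatusGate:
    """Simplified Intune gate: status only, freshness check omitted here."""

    def blocking_reason(self, tenant: dict) -> Optional[str]:
        status = tenant.get("rbac_status")
        if status in (None, "not_configured"):
            return "intune_rbac.not_configured"
        if status != "ok":
            return "intune_rbac.unhealthy"
        return None


def check_write(
    provider: str, tenant: dict, gates: Dict[str, WriteGate]
) -> Optional[str]:
    gate = gates.get(provider)
    if gate is None:
        # Assumption: a provider with no registered gate is not gated in v1.
        return None
    return gate.blocking_reason(tenant)


gates: Dict[str, WriteGate] = {"intune": IntuneStatusGate()}
print(check_write("intune", {"rbac_status": "degraded"}, gates))
```

A future provider would add its own class with the same method and a new registry entry, leaving callers of `check_write` unchanged.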
## UI Action Matrix *(mandatory when Filament is changed)*

| Surface | Location | Header Actions | Inspect Affordance | Row Actions | Bulk Actions | Empty-State CTA(s) | View Header Actions | Create/Edit Save+Cancel | Audit log? | Notes / Exemptions |
|---|---|---|---|---|---|---|---|---|---|---|
| TenantResource ViewTenant | Tenant View page | — | RBAC status card with badge | — | — | — | "Refresh RBAC status" (OperationRun), "Setup RBAC" (wizard link) | — | Yes (blocked writes) | Card replaces raw field list for RBAC section |
| RestoreRunResource | Restore Run actions | "Execute" action: disabled when gate blocks | — | — | — | — | "Execute": disabled + tooltip when blocked | — | Yes (OperationRun failure) | Gate check added before existing confirmation |

### Key Entities

- **IntuneRbacWriteGate**: Central service responsible for evaluating whether an Intune write operation is allowed. Reads tenant RBAC status fields, returns allowed or throws a domain exception with a stable reason code.
- **ProviderAccessHardeningRequired**: Domain exception carrying tenant ID, operation identifier, reason code, and a safe human-readable message. Used by both start surfaces and job-level enforcement.
- **Tenant (existing)**: Existing entity with `rbac_status`, `rbac_status_reason`, `rbac_last_checked_at` fields used as the gate's data source.
- **OperationRun (existing)**: Existing entity that captures job outcomes. Blocked write operations store the reason code in the run's failure metadata.

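The "returns allowed or throws" contract between the gate and the domain exception can be sketched as below, in Python for illustration. The exception name and its four fields follow the entity description above; the `assert_write_allowed` helper, the attribute names, the example operation identifier, and the message wording are assumptions.

```python
from typing import Optional


class ProviderAccessHardeningRequired(Exception):
    """Domain exception raised when the write gate blocks an operation."""

    def __init__(
        self, tenant_id: int, operation: str, reason_code: str, message: str
    ) -> None:
        super().__init__(message)
        self.tenant_id = tenant_id
        self.operation = operation
        self.reason_code = reason_code
        self.message = message  # safe, human-readable, no sensitive data


def assert_write_allowed(
    tenant_id: int, operation: str, reason_code: Optional[str]
) -> None:
    """Return silently when allowed; raise with a stable code when blocked."""
    if reason_code is not None:
        raise ProviderAccessHardeningRequired(
            tenant_id,
            operation,
            reason_code,
            "Intune write blocked: RBAC hardening prerequisite not met.",
        )


try:
    assert_write_allowed(7, "restore.execute", "intune_rbac.stale")
except ProviderAccessHardeningRequired as exc:
    print(exc.tenant_id, exc.operation, exc.reason_code)
```

Start surfaces and jobs can catch the same exception type and differ only in how they report it: a disabled action or notification in the UI, a failed OperationRun in a job.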
## Assumptions

- The existing `rbac_status`, `rbac_status_reason`, and `rbac_last_checked_at` fields on the Tenant model are sufficient for gate evaluation — no schema migrations are required.
- The existing periodic health check job (or `ProviderConnectionHealthCheckJob`) already updates RBAC status fields, or will be extended to do so as part of this feature.
- The freshness threshold defaults to 24 hours and is configured via `config('tenantpilot.hardening.intune_write_gate.freshness_threshold_hours')`.
- `ProviderOperationStartGate` is the existing entry point for starting provider-backed operations and can be extended to invoke the write hardening gate for write-classified operations.
- The gate applies to all Intune write operations: restore, restore assignments, and any future write operation types.

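The two configuration keys named in this spec, with their stated defaults, can be modelled as a dotted-key lookup over a nested mapping. The Python sketch below is illustrative only: `config_get` is a hypothetical helper that imitates dotted-key access and is not the application's own `config()` function, and the nested-dict layout is an assumption.

```python
from collections.abc import Mapping
from typing import Any

ENABLED_KEY = "tenantpilot.hardening.intune_write_gate.enabled"
THRESHOLD_KEY = "tenantpilot.hardening.intune_write_gate.freshness_threshold_hours"

# Defaults as stated in FR-013 and in the Assumptions list.
DEFAULTS = {ENABLED_KEY: True, THRESHOLD_KEY: 24}


def config_get(config: Mapping, dotted_key: str) -> Any:
    """Walk a nested mapping by dotted key, falling back to the default."""
    node: Any = config
    for part in dotted_key.split("."):
        if not isinstance(node, Mapping) or part not in node:
            return DEFAULTS[dotted_key]
        node = node[part]
    return node


# Rollback scenario: the gate is switched off, the threshold keeps its default.
rollback = {"tenantpilot": {"hardening": {"intune_write_gate": {"enabled": False}}}}
print(config_get(rollback, ENABLED_KEY), config_get(rollback, THRESHOLD_KEY))
```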
## Success Criteria *(mandatory)*

### Measurable Outcomes

- **SC-001**: 100% of Intune write operations are blocked when the tenant's RBAC hardening status is not "ok" or is stale — verified by automated tests covering all three reason codes.
- **SC-002**: Zero Graph write calls occur when the gate blocks an operation — verified by mocking the Graph client and asserting zero invocations in gate-blocked scenarios.
- **SC-003**: Operators see a clear reason and CTA within the UI when a write action is blocked — verified by Livewire component tests asserting disabled state and helper text.
- **SC-004**: OperationRun failures from gate blocks contain stable, parseable reason codes — verified by asserting the `reason_code` field in job-level tests.
- **SC-005**: The gate adds no synchronous Graph calls to any UI render or action request — verified by architectural tests asserting no HTTP calls during gate evaluation.
- **SC-006**: Full test suite remains green with the gate enabled (default) and with the gate disabled (config toggle) — regression safety.