Implements LIST `$expand` parity with GET by forwarding caller-provided, contract-allowlisted expands. Key changes: - Entra Admin Roles scan now requests `expand=principal` for role assignments so `principal.displayName` can render. - `$expand` normalization/sanitization: top-level comma split (commas inside balanced parentheses preserved), trim, dedupe, allowlist exact match, caps (max 10 tokens, max 200 chars/token). - Diagnostics when expands are removed/truncated (non-prod warning, production low-noise). Tests: - Adds/extends unit coverage for Graph contract sanitization, list request shaping, and the EntraAdminRolesReportService. Spec artifacts included under `specs/112-list-expand-parity/`. Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de> Reviewed-on: #136
87 lines
4.3 KiB
Markdown
87 lines
4.3 KiB
Markdown
# Research — Graph Contracts: LIST `$expand` Parity Fix
|
||
|
||
## Context
|
||
This feature fixes a capability mismatch between LIST vs GET Graph calls:
|
||
- `MicrosoftGraphClient::getPolicy()` already supports `$expand` (via query sanitization).
|
||
- `MicrosoftGraphClient::listPolicies()` currently does **not** forward `$expand` at all.
|
||
|
||
Downstream impact: `EntraAdminRolesReportService` lists `entraRoleAssignments` and then reads `principal.displayName`. The contract allowlist permits `principal`, but it is never requested on LIST today, so the report frequently falls back to `"Unknown"`.
|
||
|
||
## Decisions
|
||
|
||
### 1) `$expand` input normalization
|
||
**Decision**: Accept `expand` as either:
|
||
- comma-separated string (e.g. `"principal,roleDefinition($select=id,displayName)"`)
|
||
- array of strings (e.g. `["principal", "roleDefinition($select=id,displayName)"]`)
|
||
|
||
Normalization steps (applied consistently inside `GraphContractRegistry::sanitizeQuery()`):
|
||
- Split on **top-level commas only** for string input (commas inside balanced parentheses are not separators).
|
||
- Trim whitespace.
|
||
- Drop empty tokens.
|
||
- Dedupe while preserving first-seen order.
|
||
|
||
**Rationale**: This matches the spec clarifications and keeps all LIST/GET query-shape logic in the contract registry (single source of truth).
|
||
|
||
Note: For subquery/select-style expansions, prefer passing an array form (one token per expansion) to avoid ambiguity.
|
||
|
||
**Alternatives considered**:
|
||
- Normalize in `MicrosoftGraphClient` only: rejected because sanitization must remain centralized.
|
||
|
||
### 2) Capability allowlist enforcement
|
||
**Decision**: Keep exact-match allowlist behavior via `array_intersect()`.
|
||
|
||
**Rationale**: Prevents partial matches and blocks accidental subquery injection. Existing registry behavior is already exact-match.
|
||
|
||
### 3) Performance/safety caps for `$expand`
|
||
**Decision**: Apply a hard cap during normalization:
|
||
- `maxItems = 10` allowed expand tokens (keep first 10 allowed after allowlist)
|
||
- `maxTokenLen = 200` per token (tokens exceeding are treated as invalid and dropped)
|
||
|
||
**Rationale**:
|
||
- Contract allowlists are small in this repo (most policy types allow 0–2 expands).
|
||
- A small cap prevents pathological caller inputs from bloating URLs and logs.
|
||
- `200` aligns with the existing bounded-string pattern used in `InventoryMetaSanitizer::boundedStringList($values, 25, 200)`.
|
||
|
||
**Alternatives considered**:
|
||
- `maxItems = 25` to match `InventoryMetaSanitizer`: rejected as unnecessary for Graph query params.
|
||
- Only cap total string length: rejected because token count is easier to reason about and test.
|
||
|
||
### 4) Diagnostics when sanitize removes/truncates `$expand`
|
||
**Decision**: Emit a structured log record when:
|
||
- disallowed expand tokens are dropped
|
||
- tokens are dropped due to maxTokenLen
|
||
- allowed tokens are truncated due to maxItems
|
||
|
||
Logging behavior:
|
||
- Non-production: `Log::warning('Graph query sanitized', [...])`
|
||
- Production: `Log::debug('Graph query sanitized', [...])`
|
||
|
||
Context fields to include:
|
||
- `policy_type`
|
||
- `query_key` = `'$expand'`
|
||
- `requested` (normalized tokens)
|
||
- `allowed` (filtered tokens)
|
||
- `removed` (dropped tokens)
|
||
- `reason` (e.g. `disallowed`, `too_long`, `max_items`)
|
||
|
||
**Rationale**:
|
||
- The spec requires that misuse is detectable in non-production.
|
||
- Production should be low-noise.
|
||
- This matches existing patterns where services log structured `type`/`reason` context (e.g. inventory dependency extraction warnings).
|
||
|
||
**Alternatives considered**:
|
||
- Rely on `GraphResponse->warnings` only: rejected because `GraphLogger` logs warnings at info level, which can be noisy in production and doesn’t guarantee visibility in non-prod.
|
||
- Introduce a new event/telemetry system: rejected as out of scope for this fix.
|
||
|
||
## Testing approach (selected)
|
||
- Unit test `GraphContractRegistry::sanitizeQuery()` for `$expand` normalization:
|
||
- comma-separated string is split
|
||
- disallowed tokens dropped
|
||
- max-items behavior keeps first N allowed
|
||
- emits `Log::warning` in non-prod when removal occurs
|
||
- Unit test `MicrosoftGraphClient::listPolicies()` request shape using `Http::fake()` + `Http::assertSent()`:
|
||
- when `expand` is provided and allowlisted, outbound URL includes `%24expand=...`
|
||
- Unit/feature test for `EntraAdminRolesReportService`:
|
||
- ensures role assignment list call requests `expand=principal`
|
||
- ensures payload uses `principal.displayName` when present
|