Implements Spec 116 baseline drift engine v1 (meta fidelity) with coverage guard, stable finding identity, and Filament UI surfaces.

Highlights
- Baseline capture/compare jobs and supporting services (meta contract hashing via InventoryMetaContract + DriftHasher)
- Coverage proof parsing + compare partial outcome behavior
- Filament pages/resources/widgets for baseline compare + drift landing improvements
- Pest tests for capture/compare/coverage guard and UI start surfaces
- Research report: docs/research/golden-master-baseline-drift-deep-analysis.md

Validation
- `vendor/bin/sail bin pint --dirty`
- `vendor/bin/sail artisan test --compact --filter="Baseline"`

Notes
- No destructive user actions added; compare/capture remain queued jobs.
- Provider registration unchanged (Laravel 11+/12 uses bootstrap/providers.php for panel providers; not touched here).

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #141
665 lines
42 KiB
Markdown
# Golden Master / Baseline Drift: Deep Settings-Drift (Content-Fidelity) Analysis

> Enterprise Research Report for TenantAtlas / TenantPilot
> Date: 2025-07-15
> Scope: Architecture, code evidence, implementation proposal

---

## Table of Contents

1. [Executive Summary](#1-executive-summary)
2. [System Map: Side-by-Side Comparison](#2-system-map)
3. [Architecture Decision Record (ADR-001): Unify vs Separate](#3-adr-001)
4. [Deep-Dive: Why Settings Changes Don't Produce Baseline Drift](#4-deep-dive)
5. [Code Evidence Table](#5-code-evidence)
6. [Type Coverage Matrix](#6-type-coverage-matrix)
7. [Proposal: Deep Drift Implementation Plan](#7-deep-drift-plan)
8. [Test Plan (Enterprise)](#8-test-plan)
9. [Open Questions / Assumptions](#9-open-questions)
10. [Key Questions Answered (KQ-01 through KQ-06)](#10-key-questions)

---

## 1. Executive Summary

1. **Two parallel drift systems exist**: *Baseline Compare* (meta fidelity, inventory-sourced) and *Backup Drift* (content fidelity, PolicyVersion-sourced). They share `DriftHasher` but are otherwise separate data paths with separate finding generators.

2. **The core gap**: `CompareBaselineToTenantJob` hashes `InventoryMetaContract` v1, which carries only identity fields plus four meta signals (`odata_type`, `etag`, `scope_tag_ids`, `assignment_target_count`) and never actual policy settings. When an admin changes a Wi-Fi password or a compliance threshold in Intune, _none of these meta signals necessarily change_.

3. **Inventory sync uses Graph LIST endpoints**, which return metadata and display fields only. Per-item GET (which fetches settings, assignments, scope tags) is only performed during _Backup_ via `PolicyCaptureOrchestrator`.

4. **`DriftFindingGenerator`** (the backup drift system) _does_ detect settings changes: it normalizes `PolicyVersion.snapshot` via `SettingsNormalizer` → `PolicyNormalizer::flattenForDiff()` → type-specific normalizers, then hashes with `DriftHasher`.

5. **Spec 116 already designs v2** with a provider precedence chain (`PolicyVersion → Inventory content → Meta fallback`), which is the correct architectural direction. The v1 meta baseline shipped first as a deliberate, safe-to-ship initial milestone.

6. **Unification is recommended** (provider chain approach): not by merging the two jobs, but by enabling `CompareBaselineToTenantJob` to optionally consume `PolicyVersion` snapshots as a content-fidelity provider, falling back to `InventoryMetaContract` when no PolicyVersion is available.

7. **28 supported policy types** are registered in `tenantpilot.php`, plus 3 foundation types. Of these, 10+ have complex hydration (settings catalog, group policy, security baselines, compliance actions) and would benefit most from deep-drift detection.

8. **The `etag` signal** is unreliable as a settings-change proxy: Microsoft Graph etag semantics vary per resource type, and etag may or may not change when settings are modified. It is useful as a _hint_ but not a _guarantee_.

9. **API cost is the primary constraint**: content-fidelity compare requires per-item GET calls (or a recent Backup that already captured PolicyVersions). The hybrid provider chain avoids this cost by opportunistically _reusing_ existing PolicyVersions without requiring a full backup before every compare.

10. **Coverage Guard is critical for v2**: the baseline system must know _which types have fresh PolicyVersions_ and suppress content-fidelity findings for types where no recent version exists (falling back to meta fidelity).

11. **Risk profile**: Shipping deep drift for types that lack proper per-type normalization could produce false positives. Type-specific normalizers already exist for the backup drift path; reusing them is safe.

12. **Recommended phasing**: v1.5 (current sprint) = add `content_hash_source` column to `baseline_snapshot_items` + provider chain in compare job. v2.0 = on-demand per-item GET during baseline capture for types lacking recent PolicyVersions.

---

## 2. System Map

### Side-by-Side Comparison Table

| Dimension | System A: Baseline Compare | System B: Backup Drift |
|---|---|---|
| **Entry point** | `CompareBaselineToTenantJob` | `GenerateDriftFindingsJob` → `DriftFindingGenerator` |
| **Data source (current)** | `InventoryItem` (from LIST sync) | `PolicyVersion` (from per-item GET backup) |
| **Data source (baseline)** | `BaselineSnapshotItem` (captured from inventory) | Earlier `PolicyVersion` from prior `OperationRun` |
| **Hash contract** | `InventoryMetaContract` v1 → `DriftHasher::hashNormalized()` | `SettingsNormalizer` → `PolicyNormalizer::flattenForDiff()` → `DriftHasher::hashNormalized()` |
| **Hash inputs** | `version`, `policy_type`, `subject_external_id`, `odata_type`, `etag`, `scope_tag_ids`, `assignment_target_count` | Full `PolicyVersion.snapshot` JSON (with volatile key removal) |
| **Fidelity** | `meta` (persisted as `fidelity='meta'` in snapshot context) | `content` (settings + assignments + scope_tags) |
| **Dimensions detected** | `missing_policy`, `different_version`, `unexpected_policy` | `policy_snapshot` (added/removed/modified), `policy_assignments` (modified), `policy_scope_tags` (modified) |
| **Finding identity** | `recurrence_key = sha256(tenantId\|snapshotId\|policyType\|extId\|changeType)` | `recurrence_key = sha256(drift:tenantId:scopeKey:subjectType:extId:dimension)` |
| **Scope key** | `baseline_profile:{profileId}` | `DriftScopeKey::fromSelectionHash()` |
| **Auto-close** | `BaselineAutoCloseService` (stale finding resolution) | `resolveStaleDriftFindings()` within `DriftFindingGenerator` |
| **Coverage guard** | `InventoryCoverage::fromContext()` → uncovered types → partial outcome | None (trusts backup captured all types) |
| **Graph API calls** | Zero at compare time (reads from DB) | Zero at compare time (reads PolicyVersions from DB) |
| **Graph API calls (capture)** | Zero (inventory sync did LIST) | Per-item GET via `PolicyCaptureOrchestrator` |
| **Normalizer pipeline** | None (meta contract is the normalization) | `SettingsNormalizer` → `PolicyNormalizer` → type normalizers |
| **Shared components** | `DriftHasher`, `Finding` model | `DriftHasher`, `Finding` model |
| **Trigger** | After inventory sync, on schedule/manual | After backup, on schedule/manual |

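
The two finding-identity schemes in the table can be sketched as follows. This is a minimal illustration that assumes plain string concatenation with the separators shown in the table; the function names are hypothetical, and the real key builders may order or encode the fields differently.

```php
<?php
// Illustrative only: field order and separators follow the comparison table.
// Both helper names are hypothetical and are not TenantPilot APIs.

function baselineRecurrenceKey(int $tenantId, int $snapshotId, string $policyType, string $extId, string $changeType): string
{
    // System A: pipe-separated, scoped to a baseline snapshot.
    return hash('sha256', implode('|', [$tenantId, $snapshotId, $policyType, $extId, $changeType]));
}

function backupDriftRecurrenceKey(int $tenantId, string $scopeKey, string $subjectType, string $extId, string $dimension): string
{
    // System B: colon-separated with a literal "drift" prefix, scoped by scope key.
    return hash('sha256', implode(':', ['drift', $tenantId, $scopeKey, $subjectType, $extId, $dimension]));
}

$a = baselineRecurrenceKey(7, 42, 'settingsCatalogPolicy', 'abc-123', 'different_version');
$b = baselineRecurrenceKey(7, 42, 'settingsCatalogPolicy', 'abc-123', 'different_version');
$c = baselineRecurrenceKey(7, 43, 'settingsCatalogPolicy', 'abc-123', 'different_version');

var_dump($a === $b); // bool(true): identical inputs map to the same finding
var_dump($a === $c); // bool(false): a different snapshot id yields a new identity
```

Because the key is a pure function of its inputs, a repeated compare run can upsert the existing finding rather than create a duplicate.
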
### Data Flow Diagrams

```
SYSTEM A - Baseline Compare (Meta Fidelity)
============================================
Graph LIST ──► InventorySyncService ──► InventoryItem (meta_jsonb)
                                              │
                                              ▼
                                   CaptureBaselineSnapshotJob
                                   ├─ InventoryMetaContract.build()
                                   ├─ DriftHasher.hashNormalized()
                                   └─► BaselineSnapshotItem (baseline_hash)
                                              │
                                              ▼
                                   CompareBaselineToTenantJob
                                   ├─ loadCurrentInventory() → InventoryItem
                                   ├─ BaselineSnapshotIdentity.hashItemContent()
                                   │    └─ InventoryMetaContract.build()
                                   │    └─ DriftHasher.hashNormalized()
                                   ├─ computeDrift() → hash compare
                                   └─ upsertFindings() → Finding records


SYSTEM B - Backup Drift (Content Fidelity)
============================================
Graph GET ──► PolicySnapshotService.fetch() ──► full JSON snapshot
                                              │
                                              ▼
                                   PolicyCaptureOrchestrator.capture()
                                   ├─ assignments GET
                                   ├─ scope tags resolve
                                   └─► VersionService.captureVersion() ──► PolicyVersion
                                              │
                                              ▼
                                   DriftFindingGenerator.generate()
                                   ├─ versionForRun() → baseline/current PV
                                   ├─ SettingsNormalizer.normalizeForDiff()
                                   │    └─ PolicyNormalizer.flattenForDiff()
                                   ├─ DriftHasher.hashNormalized() × 3
                                   │    (snapshot, assignments, scope_tags)
                                   └─ upsertDriftFinding() → Finding records
```

---

## 3. ADR-001: Unify vs Separate

### Title

ADR-001: Golden Master Baseline Compare, Provider Chain for Content Fidelity

### Status

PROPOSED

### Context

TenantPilot has two drift detection systems that evolved independently:

- **System A (Baseline Compare)**: Designed for the "does the tenant still match the golden master?" use case. Ships with meta fidelity (v1): fast, cheap, zero additional Graph calls at compare time. Detects structural drift (policy added/removed/meta-changed) but is blind to _settings_ changes.

- **System B (Backup Drift)**: Designed for the "what changed between two backup points?" use case. Content fidelity: full PolicyVersion snapshots with per-type normalization. Detects settings, assignments, and scope tag changes.

The two systems cannot be merged into one without fundamentally changing their triggering, scoping, and API cost models. However, System A's accuracy can be dramatically improved by _consuming_ data already produced by System B.

### Decision

**Adopt the Provider Chain pattern** as already designed in Spec 116 v2:

```
ContentProvider = PolicyVersion → InventoryContent → MetaFallback
```

Specifically:

1. `CompareBaselineToTenantJob` gains a `ContentProviderChain` that, for each `(policy_type, external_id)`:
   - **First**: Looks for a `PolicyVersion` captured since the last baseline snapshot timestamp. If found, normalizes via `SettingsNormalizer` → `DriftHasher` → returns `content` fidelity hash.
   - **Second (future)**: Looks for enriched inventory content if inventory sync is upgraded to capture settings (v2.0+).
   - **Fallback**: Builds `InventoryMetaContract` v1 → `DriftHasher` → returns `meta` fidelity hash.

2. Each baseline snapshot item records its `fidelity` (`meta` | `content`) and `content_hash_source` (`inventory_meta_v1` | `policy_version:{id}` | `inventory_content_v2`).

3. Compare findings carry `fidelity` in evidence, enabling the UI to display a confidence level.

4. Coverage Guard is extended: a type is `content-covered` only if PolicyVersions exist for ≥N% of items. Below that threshold, the compare falls back to meta fidelity for that type (findings are not suppressed).

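
The threshold rule in item 4 can be illustrated with a small classifier. This is a sketch under stated assumptions: the function name, the input shape, and the `0.9` threshold are all hypothetical, since the ADR leaves N open.

```php
<?php
// Hypothetical helper: classifies each policy type as content- or meta-covered.
// $threshold stands in for the "N%" that the ADR does not fix.

/**
 * @param array<string, array{total: int, with_policy_version: int}> $counts
 * @return array<string, string> policy_type => 'content' | 'meta'
 */
function classifyContentCoverage(array $counts, float $threshold): array
{
    $result = [];
    foreach ($counts as $policyType => $c) {
        // A type with no items cannot be content-covered.
        $ratio = $c['total'] > 0 ? $c['with_policy_version'] / $c['total'] : 0.0;
        $result[$policyType] = $ratio >= $threshold ? 'content' : 'meta';
    }

    return $result;
}

$coverage = classifyContentCoverage([
    'settingsCatalogPolicy'  => ['total' => 40, 'with_policy_version' => 38], // 95%
    'deviceCompliancePolicy' => ['total' => 10, 'with_policy_version' => 3],  // 30%
], 0.9);

print_r($coverage);
```

With these inputs, `settingsCatalogPolicy` is classified as `content` and `deviceCompliancePolicy` as `meta`; the second type would still be compared, only at meta fidelity.
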
### Consequences

- **Positive**: No new Graph API calls needed (reuses existing PolicyVersions from backups). Zero additional infrastructure. Incremental rollout per policy type. Existing meta-fidelity behavior preserved as fallback.
- **Negative**: Content fidelity depends on backup recency. If a tenant hasn't been backed up, only meta fidelity is available. Could create "mixed fidelity" findings within a single compare run.
- **Rejected Alternative**: Full merge of System A and B into a single system. Rejected because they serve different use cases (golden master comparison vs point-in-time drift), have different scoping models (BaselineProfile vs selection_hash), and different triggering models (post-inventory-sync vs post-backup).
- **Rejected Alternative**: Always-GET during baseline compare. Rejected due to API cost (30+ types × 100s of policies = 1000s of GET calls per tenant per compare run).

### Compliance Notes

- Livewire v4.0+ / Filament v5: no UI changes in core ADR; provider chain is purely backend.
- Provider registration: n/a (backend services only).
- No destructive actions.
- Asset strategy: no new assets.

---

## 4. Deep-Dive: Why Settings Changes Don't Produce Baseline Drift

### The Root Cause Chain

**Step 1: Inventory Sync captures only LIST metadata**

`InventorySyncService::executeSelectionUnderLock()` (line ~340-450) calls Graph LIST endpoints. For each policy, it extracts:

- `display_name`, `category`, `platform` (display fields)
- `odata_type`, `etag`, `scope_tag_ids`, `assignment_target_count` (meta signals)

These are stored in `InventoryItem.meta_jsonb`. **No settings values are fetched or stored.**

**Step 2: Baseline Capture hashes only the Meta Contract**

`CaptureBaselineSnapshotJob::collectSnapshotItems()` reads from `InventoryItem`, then calls `BaselineSnapshotIdentity::hashItemContent()`:

```php
// BaselineSnapshotIdentity.php, line 56-67
public function hashItemContent(string $policyType, string $subjectExternalId, array $metaJsonb): string
{
    $contract = $this->metaContract->build(
        policyType: $policyType,
        subjectExternalId: $subjectExternalId,
        metaJsonb: $metaJsonb,
    );
    return $this->hasher->hashNormalized($contract);
}
```

The `InventoryMetaContract::build()` output is:

```php
[
    'version' => 1,
    'policy_type' => 'settingsCatalogPolicy',
    'subject_external_id' => '<guid>',
    'odata_type' => '#microsoft.graph.deviceManagementConfigurationPolicy',
    'etag' => '"abc..."', // ← unreliable change indicator
    'scope_tag_ids' => ['0'],
    'assignment_target_count' => 3,
]
```

**This is ALL that gets hashed.** Actual policy settings (the Wi-Fi password, the compliance threshold, the firewall rule) are _nowhere_ in this contract.

**Step 3: Baseline Compare re-computes the same meta hash**

`CompareBaselineToTenantJob::loadCurrentInventory()` (line 367-409) reads current `InventoryItem` records and calls the same `BaselineSnapshotIdentity::hashItemContent()` with the same `InventoryMetaContract`, producing the same hash structure.

`computeDrift()` (line 435-500) then compares `baseline_hash` vs `current_hash`:

```php
if ($baselineItem['baseline_hash'] !== $currentItem['current_hash']) {
    $drift[] = ['change_type' => 'different_version', ...];
}
```

**If the admin changed a policy setting but the meta signals (etag, scope_tag_ids, assignment_target_count) stayed the same, `baseline_hash === current_hash` and NO drift is detected.**

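
The blind spot can be reproduced in isolation. The sketch below is a stand-in for `InventoryMetaContract` + `DriftHasher`, not their actual code: `metaHash()` is a hypothetical helper that hashes only the seven contract fields listed above, and the scenario assumes the etag did not change.

```php
<?php
// Stand-in for the meta contract hash; the real classes may normalize differently.
function metaHash(string $policyType, string $externalId, array $meta): string
{
    $contract = [
        'version' => 1,
        'policy_type' => $policyType,
        'subject_external_id' => $externalId,
        'odata_type' => $meta['odata_type'] ?? null,
        'etag' => $meta['etag'] ?? null,
        'scope_tag_ids' => $meta['scope_tag_ids'] ?? [],
        'assignment_target_count' => $meta['assignment_target_count'] ?? null,
    ];

    return hash('sha256', json_encode($contract));
}

$meta = [
    'odata_type' => '#microsoft.graph.deviceManagementConfigurationPolicy',
    'etag' => '"abc"',
    'scope_tag_ids' => ['0'],
    'assignment_target_count' => 3,
];

// Two observations of one policy: the passphrase changed in between, but the
// LIST-derived meta signals (including etag, in this scenario) did not.
$before = ['meta' => $meta, 'settings' => ['wifi_passphrase' => 'old-secret']];
$after  = ['meta' => $meta, 'settings' => ['wifi_passphrase' => 'new-secret']];

$baselineHash = metaHash('settingsCatalogPolicy', 'guid-1', $before['meta']);
$currentHash  = metaHash('settingsCatalogPolicy', 'guid-1', $after['meta']);

var_dump($baselineHash === $currentHash); // bool(true): the settings change is invisible
```

The `settings` key never reaches the hash input, which is exactly why the compare reports no drift.
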
### Why etag is unreliable

Microsoft Graph etag behavior varies by resource type:

- **Some types** update etag on any property change (including settings)
- **Some types** update etag only on top-level property changes (not nested settings)
- **Settings Catalog policies** may or may not update the parent resource etag when child `settings` are modified (the settings are a separate subresource at `/configurationPolicies/{id}/settings`)
- **Group Policy Configurations** have settings in `definitionValues` → `presentationValues` (multi-level nesting); etag at root level may not reflect these changes

### The Contrast: How Backup Drift _Does_ Detect Settings Changes

`DriftFindingGenerator::generate()` (line 32-80) operates on `PolicyVersion.snapshot`, the full JSON captured via per-item GET:

```php
$baselineSnapshot = $baselineVersion->snapshot; // Full JSON from Graph GET
$currentSnapshot = $currentVersion->snapshot;

$baselineNormalized = $this->settingsNormalizer->normalizeForDiff($baselineSnapshot, $policyType, $platform);
$currentNormalized = $this->settingsNormalizer->normalizeForDiff($currentSnapshot, $policyType, $platform);

$baselineSnapshotHash = $this->hasher->hashNormalized($baselineNormalized);
$currentSnapshotHash = $this->hasher->hashNormalized($currentNormalized);

if ($baselineSnapshotHash !== $currentSnapshotHash) {
    // → Drift finding with change_type = 'modified'
}
```

This pipeline captures actual settings values, normalizes them per policy type, strips volatile metadata, and hashes the result. If a setting changed, the hash changes, and drift is detected.

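
A reduced version of that normalize-then-hash pipeline is sketched below. It is an assumption-laden illustration: `VOLATILE_KEYS`, `normalizeSnapshot()`, and `contentHash()` are hypothetical names, and the real volatile-key list and per-type normalizers are more involved.

```php
<?php
// Hypothetical volatile keys; the real list lives in the application's hasher/normalizers.
const VOLATILE_KEYS = ['lastModifiedDateTime', 'createdDateTime', '@odata.context'];

function normalizeSnapshot(array $snapshot): array
{
    $out = [];
    foreach ($snapshot as $key => $value) {
        if (is_string($key) && in_array($key, VOLATILE_KEYS, true)) {
            continue; // drop timestamps and transport metadata
        }
        $out[$key] = is_array($value) ? normalizeSnapshot($value) : $value;
    }
    ksort($out); // key order must not influence the hash

    return $out;
}

function contentHash(array $snapshot): string
{
    return hash('sha256', json_encode(normalizeSnapshot($snapshot)));
}

$baseline = [
    'displayName' => 'Corp Wi-Fi',
    'lastModifiedDateTime' => '2025-07-01T10:00:00Z',
    'settings' => ['passphrase' => 'old-secret'],
];
// Same content, different timestamp and key order.
$touched = [
    'settings' => ['passphrase' => 'old-secret'],
    'lastModifiedDateTime' => '2025-07-02T08:30:00Z',
    'displayName' => 'Corp Wi-Fi',
];
// A real settings change.
$changed = ['settings' => ['passphrase' => 'new-secret']] + $baseline;

var_dump(contentHash($baseline) === contentHash($touched)); // bool(true)
var_dump(contentHash($baseline) === contentHash($changed)); // bool(false)
```

Stripping volatile keys keeps a mere re-save from registering as drift, while a genuine settings change still alters the hash.
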
### Summary Visualization

```
Admin changes Wi-Fi password in Intune
             │
             ▼
┌─────────────────────────────────┐
│ Graph LIST (inventory sync)     │
│ returns: displayName, etag, ... │
│                                 │
│ etag MAY change, settings NOT   │
│ returned by LIST endpoint       │
└────────────┬────────────────────┘
             │
     ┌───────┴────────┐
     ▼                ▼
InventoryItem    PolicyVersion
(meta only)      (if backup ran)
     │                │
     ▼                ▼
Meta Contract    Full Snapshot
hash unchanged   hash CHANGED
     │                │
     ▼                ▼
Baseline         Backup Drift:
Compare:         "modified" ✅
NO DRIFT ❌
```

---

## 5. Code Evidence Table

| # | Class / File | Lines | Role | Key Finding |
|---|---|---|---|---|
| 1 | `app/Jobs/CompareBaselineToTenantJob.php` | 785 total; L367-409 (loadCurrentInventory), L435-500 (computeDrift) | Core baseline compare job | Reads from `InventoryItem` only; hashes via `InventoryMetaContract` → **blind to settings** |
| 2 | `app/Services/Baselines/InventoryMetaContract.php` | 75 total; L30-57 (build) | Meta hash contract builder | Hashes only: version, policy_type, external_id, odata_type, etag, scope_tag_ids, assignment_target_count; **no settings content** |
| 3 | `app/Services/Baselines/BaselineSnapshotIdentity.php` | 73 total; L56-67 (hashItemContent) | Per-item hash via meta contract | Delegates to `InventoryMetaContract.build()` → `DriftHasher.hashNormalized()` |
| 4 | `app/Jobs/CaptureBaselineSnapshotJob.php` | 305 total | Captures snapshot from inventory | Reads `InventoryItem`, stores `fidelity='meta'` and `source='inventory'` |
| 5 | `app/Services/Drift/DriftFindingGenerator.php` | 484 total; L32-80 (generate), L250-267 (recurrenceKey) | Backup drift finding generator | Uses `PolicyVersion.snapshot` with `SettingsNormalizer` → **detects settings changes** |
| 6 | `app/Services/Drift/DriftHasher.php` | 100 total; L13-24 (hashNormalized) | Shared hasher | `sha256(json_encode(normalized))` with volatile key removal. **SHARED by both systems.** |
| 7 | `app/Services/Drift/Normalizers/SettingsNormalizer.php` | 22 total | Thin wrapper | Delegates to `PolicyNormalizer::flattenForDiff()`. Used by System B only. |
| 8 | `app/Services/Intune/PolicyNormalizer.php` | 67 total | Type-specific normalizer router | Routes to per-type normalizers for diff operations |
| 9 | `app/Services/Inventory/InventorySyncService.php` | 652 total; L340-450 (executeSelectionUnderLock) | LIST-based sync | Fetches from Graph LIST endpoints; extracts meta signals only; upserts `InventoryItem` |
| 10 | `app/Services/Intune/BackupService.php` | 438 total | Backup orchestration | Creates `BackupSet`, uses `PolicyCaptureOrchestrator` for per-item GET → PolicyVersion |
| 11 | `app/Services/Intune/PolicyCaptureOrchestrator.php` | 429 total | Per-item GET + hydration | Fetches full snapshot, assignments, scope tags; creates PolicyVersion with all content |
| 12 | `app/Services/Intune/PolicySnapshotService.php` | 852 total | Per-item Graph GET | Type-specific hydration (hydrateConfigurationPolicySettings, hydrateGroupPolicyConfiguration, etc.) |
| 13 | `app/Services/Intune/VersionService.php` | 312 total; L1-150 (captureVersion) | PolicyVersion persistence | Transactional, locking, consecutive version_number |
| 14 | `app/Models/PolicyVersion.php` | Model | PolicyVersion model | Casts: snapshot(array), assignments(array), scope_tags(array), plus hash columns |
| 15 | `app/Models/InventoryItem.php` | Model | Inventory item model | Casts: meta_jsonb(array); **no settings content** |
| 16 | `app/Models/BaselineSnapshotItem.php` | Model | Snapshot item model | Has `baseline_hash(64)`, `meta_jsonb` |
| 17 | `app/Support/Inventory/InventoryCoverage.php` | 173 total | Coverage parser | `fromContext()` extracts per-type status from sync run context |
| 18 | `app/Services/Drift/DriftRunSelector.php` | ~60 total | Run pair selector | Selects 2 most recent sync runs with same `selection_hash` (System B only) |
| 19 | `app/Jobs/GenerateDriftFindingsJob.php` | ~200 total | Dispatcher for System B | Dispatches `DriftFindingGenerator` for policy-version-based drift |
| 20 | `config/graph_contracts.php` | 867 total | Policy type registry | Defines endpoints, hydration strategies, subresources, type families per policy type |
| 21 | `config/tenantpilot.php` | 385 total; L18-293 (supported_policy_types) | Application config | 28 supported policy types + 3 foundation types |
| 22 | `specs/116-baseline-drift-engine/spec.md` | 237 total | Feature spec | Defines v1 (meta) and v2 (content fidelity) requirements |
| 23 | `specs/116-baseline-drift-engine/research.md` | 200 total | Phase 0 research | 6 key decisions including v2 architecture strategy |
| 24 | `specs/116-baseline-drift-engine/plan.md` | 259 total | Implementation plan | Steps 1-7 for v1; v2 deferred |

---

## 6. Type Coverage Matrix

Coverage assessment for deep-drift feasibility: **which types have per-type normalization and hydration support?**

| # | `policy_type` | Label | Hydration | Subresources | Per-Type Normalizer | Deep-Drift Feasible | Notes |
|---|---|---|---|---|---|---|---|
| 1 | `settingsCatalogPolicy` | Settings Catalog Policy | `configurationPolicies` | `settings` (list) | Yes (via PolicyNormalizer) | **YES** | Most impactful; complex nested settings |
| 2 | `endpointSecurityPolicy` | Endpoint Security Policies | `configurationPolicies` | `settings` (list) | Yes (shared with settings catalog) | **YES** | Same endpoint family as settings catalog |
| 3 | `securityBaselinePolicy` | Security Baselines | `configurationPolicies` | `settings` (list) | Yes (shared) | **YES** | Same pipeline |
| 4 | `groupPolicyConfiguration` | Administrative Templates | `groupPolicyConfigurations` | `definitionValues` → `presentationValues` | Yes (via PolicyNormalizer) | **YES** | Multi-level nesting; hydration required |
| 5 | `deviceConfiguration` | Device Configuration | `deviceConfigurations` | None (properties on root) | Yes (via PolicyNormalizer) | **YES** | Properties directly on resource |
| 6 | `deviceCompliancePolicy` | Device Compliance | `deviceCompliancePolicies` | `scheduledActionsForRule` (expand) | Yes (via PolicyNormalizer) | **YES** | Actions subresource needs expand |
| 7 | `windowsUpdateRing` | Software Update Ring | `deviceConfigurations` (filtered) | None (properties on root) | Yes (shared with deviceConfig) | **YES** | Subset of deviceConfiguration |
| 8 | `appProtectionPolicy` | App Protection (MAM) | `managedAppPolicies` | None (properties) | Partial (via PolicyNormalizer) | **YES** | Mobile-focused |
| 9 | `conditionalAccessPolicy` | Conditional Access | `identity/conditionalAccess/policies` | None (properties) | Yes (via PolicyNormalizer) | **YES** | High-risk, preview-only restore |
| 10 | `deviceManagementScript` | PowerShell Scripts | `deviceManagementScripts` | None (scriptContent base64) | Partial | **PARTIAL** | Script content is base64 in snapshot |
| 11 | `deviceShellScript` | macOS Shell Scripts | `deviceShellScripts` | None (scriptContent base64) | Partial | **PARTIAL** | Same pattern as PS scripts |
| 12 | `deviceHealthScript` | Proactive Remediations | `deviceHealthScripts` | None | Partial | **PARTIAL** | Detection + remediation scripts |
| 13 | `deviceComplianceScript` | Custom Compliance Scripts | `deviceComplianceScripts` | None | Partial | **PARTIAL** | Script content |
| 14 | `windowsFeatureUpdateProfile` | Feature Updates | `windowsFeatureUpdateProfiles` | None | Yes | **YES** | Simple properties |
| 15 | `windowsQualityUpdateProfile` | Quality Updates | `windowsQualityUpdateProfiles` | None | Yes | **YES** | Simple properties |
| 16 | `windowsDriverUpdateProfile` | Driver Updates | `windowsDriverUpdateProfiles` | None | Yes | **YES** | Simple properties |
| 17 | `mamAppConfiguration` | App Config (MAM) | `targetedManagedAppConfigurations` | None | Partial | **YES** | Properties-based |
| 18 | `managedDeviceAppConfiguration` | App Config (Device) | `mobileAppConfigurations` | None | Partial | **YES** | Properties-based |
| 19 | `windowsAutopilotDeploymentProfile` | Autopilot Profiles | `windowsAutopilotDeploymentProfiles` | None | Minimal | **YES** | Properties-based |
| 20 | `windowsEnrollmentStatusPage` | Enrollment Status Page | `deviceEnrollmentConfigurations` | None | Minimal | **META-ONLY** | Enrollment types have limited settings |
| 21 | `deviceEnrollmentLimitConfiguration` | Enrollment Limits | `deviceEnrollmentConfigurations` | None | Minimal | **META-ONLY** | Numeric limit only |
| 22 | `deviceEnrollmentPlatformRestrictionsConfiguration` | Platform Restrictions | `deviceEnrollmentConfigurations` | None | Minimal | **META-ONLY** | Nested restriction config |
| 23 | `deviceEnrollmentNotificationConfiguration` | Enrollment Notifications | `deviceEnrollmentConfigurations` | None | Minimal | **META-ONLY** | Template snapshots nested |
| 24 | `enrollmentRestriction` | Enrollment Restrictions | `deviceEnrollmentConfigurations` | None | Minimal | **META-ONLY** | Mixed config type |
| 25 | `termsAndConditions` | Terms & Conditions | `termsAndConditions` | None | Yes | **YES** | bodyText, acceptanceStatement |
| 26 | `endpointSecurityIntent` | Endpoint Security Intents | `intents` | categories/settings (legacy) | Partial | **PARTIAL** | Legacy intent API; migrating to configPolicies |
| 27 | `mobileApp` | Applications | `mobileApps` | None | Minimal | **META-ONLY** | Metadata-only backup per config |
| 28 | `policySet` | Policy Sets | (if supported) | assignments | Minimal | **META-ONLY** | Container for other policies |

**Foundation Types:**

| # | `foundation_type` | Label | Deep-Drift | Notes |
|---|---|---|---|---|
| F1 | `assignmentFilter` | Assignment Filter | **YES** | `rule` property is key content |
| F2 | `roleScopeTag` | Scope Tag | **META-ONLY** | displayName + description only |
| F3 | `notificationMessageTemplate` | Notification Template | **PARTIAL** | Localized messages are subresource |

**Summary:**

- **Full content-fidelity feasible**: 16 policy types (settingsCatalog, endpointSecurity, securityBaseline, groupPolicy, deviceConfig, compliance, updateRings/profiles, appProtection, conditionalAccess, appConfigs, autopilot, termsAndConditions), plus the `assignmentFilter` foundation type
- **Partial** (script content / legacy APIs): 5 policy types, plus the `notificationMessageTemplate` foundation type
- **Meta-only sufficient**: 7 policy types (enrollment configs, mobileApp, policySet), plus the `roleScopeTag` foundation type

---

## 7. Proposal: Deep Drift Implementation Plan

### Phase v1.5: Provider Chain (Opportunistic Content Fidelity)

**Goal**: Enable baseline compare to use existing PolicyVersions for content-fidelity hash when available, with meta-fidelity fallback.

**Estimated effort**: 3-5 days

#### Step 1: ContentHashProvider Interface

```php
// app/Contracts/Baselines/ContentHashProvider.php
use Carbon\CarbonImmutable;

interface ContentHashProvider
{
    /**
     * Resolve a hash for one item, or null if this provider has no data for it.
     *
     * @return array{hash: string, fidelity: string, source: string}|null
     */
    public function resolve(string $policyType, string $externalId, int $tenantId, CarbonImmutable $since): ?array;
}
```

#### Step 2: PolicyVersionContentProvider

```php
// app/Services/Baselines/PolicyVersionContentProvider.php
// Looks up the latest PolicyVersion for (tenant_id, external_id, policy_type)
// captured_at >= $since (baseline snapshot timestamp)
// Returns SettingsNormalizer → DriftHasher hash with fidelity='content'
```

#### Step 3: MetaFallbackProvider (existing logic)

```php
// Wraps InventoryMetaContract → DriftHasher → fidelity='meta'
```

#### Step 4: ContentProviderChain

```php
// Iterates [PolicyVersionContentProvider, MetaFallbackProvider]
// Returns first non-null result
```

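
Steps 1 through 4 can be wired together as in the following sketch. It is not the proposed implementation: `DateTimeImmutable` replaces `CarbonImmutable` so the snippet runs without Laravel, `InMemoryPolicyVersionProvider` is a hypothetical stand-in for a database lookup, and both providers hash raw arrays instead of going through `SettingsNormalizer`, `InventoryMetaContract`, and `DriftHasher`.

```php
<?php
// Sketch only: in-memory stand-ins, no normalization, no database access.

interface ContentHashProvider
{
    /** @return array{hash: string, fidelity: string, source: string}|null */
    public function resolve(string $policyType, string $externalId, int $tenantId, DateTimeImmutable $since): ?array;
}

final class InMemoryPolicyVersionProvider implements ContentHashProvider
{
    /** @param array<string, array{id: int, captured_at: DateTimeImmutable, snapshot: array}> $versions keyed by external id */
    public function __construct(private array $versions) {}

    public function resolve(string $policyType, string $externalId, int $tenantId, DateTimeImmutable $since): ?array
    {
        $version = $this->versions[$externalId] ?? null;
        if ($version === null || $version['captured_at'] < $since) {
            return null; // no version, or captured before the baseline snapshot
        }

        return [
            'hash' => hash('sha256', json_encode($version['snapshot'])),
            'fidelity' => 'content',
            'source' => 'policy_version:' . $version['id'],
        ];
    }
}

final class MetaFallbackProvider implements ContentHashProvider
{
    /** @param array<string, array> $metaByExternalId */
    public function __construct(private array $metaByExternalId) {}

    public function resolve(string $policyType, string $externalId, int $tenantId, DateTimeImmutable $since): ?array
    {
        $meta = $this->metaByExternalId[$externalId] ?? [];

        return [
            'hash' => hash('sha256', json_encode(['policy_type' => $policyType, 'subject_external_id' => $externalId] + $meta)),
            'fidelity' => 'meta',
            'source' => 'inventory_meta_v1',
        ];
    }
}

final class ContentProviderChain implements ContentHashProvider
{
    /** @var ContentHashProvider[] */
    private array $providers;

    public function __construct(ContentHashProvider ...$providers)
    {
        $this->providers = $providers;
    }

    public function resolve(string $policyType, string $externalId, int $tenantId, DateTimeImmutable $since): ?array
    {
        foreach ($this->providers as $provider) {
            $result = $provider->resolve($policyType, $externalId, $tenantId, $since);
            if ($result !== null) {
                return $result; // first provider with data wins
            }
        }

        return null;
    }
}

$since = new DateTimeImmutable('2025-07-10T00:00:00Z');
$chain = new ContentProviderChain(
    new InMemoryPolicyVersionProvider([
        'fresh' => ['id' => 11, 'captured_at' => new DateTimeImmutable('2025-07-12T00:00:00Z'), 'snapshot' => ['a' => 1]],
        'stale' => ['id' => 12, 'captured_at' => new DateTimeImmutable('2025-07-01T00:00:00Z'), 'snapshot' => ['a' => 1]],
    ]),
    new MetaFallbackProvider(['stale' => ['etag' => '"x"']]),
);

$fresh = $chain->resolve('settingsCatalogPolicy', 'fresh', 1, $since);
$stale = $chain->resolve('settingsCatalogPolicy', 'stale', 1, $since);

echo $fresh['source'], "\n"; // policy_version:11
echo $stale['source'], "\n"; // inventory_meta_v1
```

Because the fallback provider always returns a result, the chain only yields `null` when it is constructed without it, which keeps the v1 behavior as the guaranteed floor.
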
#### Step 5: Integration in CompareBaselineToTenantJob

- `loadCurrentInventory()` accepts optional `ContentProviderChain`
- For each item: try chain, record fidelity + source
- `computeDrift()` unchanged (still hash vs hash comparison)
- Finding evidence includes `fidelity` and `content_hash_source`

#### Step 6: CaptureBaselineSnapshotJob enhancement

- Optional: during capture, also try `PolicyVersionContentProvider` to store content-fidelity baseline_hash
- Store `content_hash_source` in `baseline_snapshot_items.meta_jsonb`
- This means: if a backup was taken before baseline capture, the baseline itself is content-fidelity

#### Step 7: Coverage extension

- Add `content_coverage` to compare run context: which types had PolicyVersions, which fell back to meta
- Display in operation detail UI

#### Migration

```sql
-- Optional: add column for source tracking
ALTER TABLE baseline_snapshot_items
    ADD COLUMN content_hash_source VARCHAR(255) NULL DEFAULT 'inventory_meta_v1';
```

### Phase v2.0: On-Demand Content Capture (Future)

**Goal**: For types without recent PolicyVersions, perform targeted per-item GET during baseline capture/compare.

**Estimated effort**: 5-8 days

- Introduce `BaselineContentCaptureJob` that, for a given baseline profile's scope, identifies items lacking recent PolicyVersions and performs targeted GET + PolicyVersion creation.
- Reuses existing `PolicyCaptureOrchestrator` with a new "baseline-triggered" context.
- Adds `capture_mode` to baseline profile: `meta_only` (v1), `opportunistic` (v1.5), `full_content` (v2.0).
- Rate limiting: per-tenant throttle to avoid Graph API quota issues.
- Budget guard: max N items per capture run, with continuation support.

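
One possible shape for the budget guard with continuation is sketched below. Everything here is an assumption made for illustration: the function name, the offset-based cursor, and the idea that pending items arrive in a stable order are not specified by the plan.

```php
<?php
// Hypothetical sketch: name, cursor shape, and parameters are assumptions.

/**
 * @param string[] $pendingExternalIds items lacking a recent PolicyVersion, in stable order
 * @return array{batch: string[], next_offset: int|null}
 */
function takeCaptureBatch(array $pendingExternalIds, int $offset, int $maxItems): array
{
    $batch = array_slice($pendingExternalIds, $offset, $maxItems);
    $next = $offset + count($batch);

    return [
        'batch' => $batch,
        // null signals that no continuation run is needed
        'next_offset' => $next < count($pendingExternalIds) ? $next : null,
    ];
}

$pending = ['p1', 'p2', 'p3', 'p4', 'p5'];

$first  = takeCaptureBatch($pending, 0, 2); // batch: p1, p2; next_offset: 2
$second = takeCaptureBatch($pending, $first['next_offset'], 2); // batch: p3, p4; next_offset: 4
$last   = takeCaptureBatch($pending, $second['next_offset'], 2); // batch: p5; next_offset: null

var_dump($last['next_offset']); // NULL
```

A capture job could persist `next_offset` in its run context and dispatch a follow-up job until it becomes `null`, which bounds the number of per-item GET calls per run.
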
### Phase v2.5: Inventory Content Enrichment (Future, Optional)

**Goal**: Optionally have inventory sync capture settings content inline during LIST (where type supports `$expand`).

- Some types support `$expand=settings` on LIST (settings catalog, endpoint security).
- This would give "free" content fidelity without per-item GET.
- High complexity: varies per type, may increase LIST payload size significantly.
- Evaluate ROI after v2.0 ships.

---

## 8. Test Plan (Enterprise)
|
||
|
||
### Unit Tests
|
||
|
||
| # | Test File | Scope | Key Assertions |
|
||
|---|---|---|---|
|
||
| U1 | `tests/Unit/Baselines/ContentProviderChainTest.php` | Provider chain resolution | First provider wins; null fallback; fidelity recorded correctly |
|
||
| U2 | `tests/Unit/Baselines/PolicyVersionContentProviderTest.php` | PolicyVersion lookup + normalization | Correct hash for known snapshot; returns null when no PV; respects `$since` cutoff |
|
||
| U3 | `tests/Unit/Baselines/MetaFallbackProviderTest.php` | Meta contract fallback | Produces `fidelity='meta'`; matches existing `InventoryMetaContract` behavior exactly |
|
||
| U4 | `tests/Unit/Baselines/InventoryMetaContractTest.php` | (Existing) contract stability | Null handling, ordering, versioning — extend for edge cases |

### Feature Tests

| # | Test File | Scope | Key Assertions |
|---|---|---|---|
| F1 | `tests/Feature/Baselines/BaselineCompareContentFidelityTest.php` | End-to-end compare with PolicyVersions available | Settings change → `different_version` finding with `fidelity='content'` |
| F2 | `tests/Feature/Baselines/BaselineCompareMixedFidelityTest.php` | Some types have PV, some don't | Mixed `fidelity` values in findings; coverage context records both |
| F3 | `tests/Feature/Baselines/BaselineCompareFallbackTest.php` | No PolicyVersions available | Falls back to meta fidelity; identical behavior to v1 |
| F4 | `tests/Feature/Baselines/BaselineCaptureFidelityTest.php` | Capture with PolicyVersions present | `baseline_hash` uses content fidelity; `content_hash_source` recorded |
| F5 | `tests/Feature/Baselines/BaselineCompareStaleVersionTest.php` | PolicyVersion older than snapshot | Falls back to meta (stale PV not used) |
| F6 | `tests/Feature/Baselines/BaselineCompareCoverageGuardContentTest.php` | Coverage reporting for content types | `content_coverage` in run context shows which types are content-covered |

### Existing Tests to Preserve

| # | Test File | Impact |
|---|---|---|
| E1 | `tests/Feature/Baselines/BaselineCompareFindingsTest.php` | Must still pass; meta fidelity is the default when no PV exists |
| E2 | `tests/Feature/Baselines/BaselineComparePreconditionsTest.php` | No change expected |
| E3 | `tests/Feature/Baselines/BaselineCompareStatsTest.php` | Stats remain grouped by scope_key; may need fidelity breakdown |
| E4 | `tests/Feature/Baselines/BaselineOperabilityAutoCloseTest.php` | Auto-close unaffected by fidelity source |

### Integration / Regression

| # | Test | Expected Behavior |
|---|---|---|
| I1 | Content hash stability across serialization | JSON encode/decode round-trip does not change hash |
| I2 | PolicyVersion normalizer alignment | Same snapshot → `SettingsNormalizer` produces same hash in both System A (via provider) and System B (via DriftFindingGenerator) |
| I3 | Hash collision protection | Different settings → different hashes (property-based test with sample data) |
| I4 | Empty snapshot edge case | PolicyVersion with empty/null snapshot → provider returns null → fallback works |
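
The property behind I1 and I3 can be illustrated with a small stand-in hasher. This is a Python sketch, not the `DriftHasher` implementation: `hash_normalized` and `survives_round_trip` are hypothetical names, and the canonical form (sorted keys, compact separators, SHA-256) is an assumption modelled on the "stable flags + `ksort`" description in A-1.

```python
import hashlib
import json
from typing import Any


def hash_normalized(value: Any) -> str:
    """SHA-256 over a canonical JSON form: sorted keys, compact separators."""
    canonical = json.dumps(
        value, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def survives_round_trip(value: Any) -> bool:
    """True when a JSON encode/decode cycle leaves the hash unchanged (I1)."""
    return hash_normalized(json.loads(json.dumps(value))) == hash_normalized(value)
```

Key order must not influence the hash, while list order must, because list order can be semantically meaningful in a settings payload. A real test would assert the same three properties against `DriftHasher` with representative snapshots.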

### Performance Tests

| # | Test | Acceptance Criteria |
|---|---|---|
| P1 | Compare job with 500 items, 50% with PolicyVersions | Completes in < 30s (DB-only, no Graph calls) |
| P2 | Provider chain query efficiency | PolicyVersion lookup uses batch query, not N+1 |
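
To make P2 concrete, the sketch below fetches the latest non-deleted PolicyVersion per policy in one query instead of one query per item. It is an illustration in Python against SQLite, not the Eloquent query the job would use; `latest_versions` is a hypothetical helper, and only the column names (`policy_id`, `captured_at`, `deleted_at`) come from the schema appendix.

```python
import sqlite3
from typing import Dict, Sequence, Tuple


def latest_versions(
    conn: sqlite3.Connection, policy_ids: Sequence[int], since: str
) -> Dict[int, Tuple[int, str]]:
    """Return {policy_id: (version_id, captured_at)} using a single query.

    Only versions captured after `since` and not soft-deleted are considered.
    Timestamps are assumed to be ISO-8601 strings in one consistent format.
    """
    if not policy_ids:
        return {}
    placeholders = ",".join("?" for _ in policy_ids)
    sql = f"""
        SELECT pv.policy_id, pv.id, pv.captured_at
        FROM policy_versions pv
        JOIN (
            SELECT policy_id, MAX(captured_at) AS latest
            FROM policy_versions
            WHERE policy_id IN ({placeholders})
              AND captured_at > ?
              AND deleted_at IS NULL
            GROUP BY policy_id
        ) m ON m.policy_id = pv.policy_id AND m.latest = pv.captured_at
        WHERE pv.deleted_at IS NULL
    """
    rows = conn.execute(sql, [*policy_ids, since]).fetchall()
    # If two versions share the same captured_at, the last row read wins.
    return {pid: (vid, captured_at) for pid, vid, captured_at in rows}
```

The query count stays at one regardless of how many items are in scope, which is the behavior a P2 test would assert (for example by counting executed queries).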

---

## 9. Open Questions / Assumptions

### Open Questions

| # | Question | Impact | Proposed Resolution |
|---|---|---|---|
| OQ-1 | **Staleness threshold for PolicyVersions**: How old can a PolicyVersion be before we reject it as a content source? | Determines false-negative risk | Default: PolicyVersion must be captured after the baseline snapshot's `captured_at`. Configurable per workspace. |
| OQ-2 | **Mixed fidelity UX**: How should the UI display findings with different fidelity levels? | User trust and understanding | Badge/icon on finding cards: "High confidence (content)" vs "Structural only (meta)". Filterable in findings table. |
| OQ-3 | **Should baseline capture _force_ a backup** if no recent PolicyVersions exist? | API cost vs accuracy trade-off | No for v1.5 (opportunistic only). Yes for v2.0 as opt-in `capture_mode: full_content`. |
| OQ-4 | **etag as change hint**: Should we use etag changes as a _trigger_ for on-demand PolicyVersion capture? | Could reduce unnecessary GETs | Worth investigating in v2.0. If etag changes during inventory sync, schedule targeted per-item GET for that policy only. |
| OQ-5 | **Settings Catalog `$expand=settings`** on LIST: Does Microsoft Graph support this? | Could give "free" content fidelity for settings catalog types | Needs validation against Graph API. If supported, would eliminate per-item GET for the most impactful type. |
| OQ-6 | **Retention / pruning interaction**: If old PolicyVersions are pruned, does that affect baseline compare? | Could lose content fidelity for old baselines | Baseline compare only needs versions captured _after_ baseline snapshot. Pruning policy should respect active baseline snapshots. |
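
The etag hint from OQ-4 could be reduced to a small diff between two inventory syncs. The sketch below is Python and purely illustrative: `policies_needing_capture` is a hypothetical helper, and it assumes that a missing etag carries no signal and that new items should be captured once.

```python
from typing import Dict, List, Optional


def policies_needing_capture(
    previous_etags: Dict[str, Optional[str]],
    current_etags: Dict[str, Optional[str]],
) -> List[str]:
    """Return external IDs whose etag changed, or that are new, since the last sync.

    Items without an etag in the current sync are skipped: no etag, no hint.
    """
    changed = [
        external_id
        for external_id, etag in current_etags.items()
        if etag is not None and previous_etags.get(external_id) != etag
    ]
    return sorted(changed)
```

Whether etag changes track settings changes reliably for every policy type is exactly what OQ-4 leaves open, so this could only serve as a trigger for a targeted GET, not as drift evidence by itself.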

### Assumptions

| # | Assumption | Risk if Wrong |
|---|---|---|
| A-1 | `DriftHasher::hashNormalized()` is deterministic across PHP serialization boundaries | Hash mismatch → false drift findings. **Validated**: uses `json_encode` with stable flags + `ksort`. |
| A-2 | `SettingsNormalizer` / `PolicyNormalizer` produce the same output for the same input regardless of call context (System A vs System B) | Hash inconsistency between systems. **Low risk**: same code path. |
| A-3 | PolicyVersions from backups contain complete settings (not partial hydration) | Incomplete content → false negatives or incorrect hashes. **Validated**: `PolicySnapshotService` performs full hydration per type. |
| A-4 | The `Finding` model's `fingerprint`/`recurrence_key` identity allows mixed fidelity sources | Identity collision if fidelity changes source. **Safe**: recurrence_key includes snapshot_id, not hash value. |
| A-5 | Graph LIST endpoints do NOT return settings values for any supported policy type | If wrong, inventory sync could capture settings "for free". **Validated**: LIST returns only `$select` fields per `graph_contracts.php`. |
| A-6 | Per-type normalizers in backup drift path handle all 28 supported policy types | If not, some types would produce unstable hashes. **Partially validated**: `PolicyNormalizer` has a fallback for unknown types. |

---

## 10. Key Questions Answered

### KQ-01: Are Baseline Compare and Backup Drift truly separate systems?

**Yes.** They share `DriftHasher` and the `Finding` model, but differ in:

- Data source: `InventoryItem` vs `PolicyVersion`
- Hash contract: `InventoryMetaContract` (7 fields, meta only) vs `SettingsNormalizer → PolicyNormalizer` (full snapshot)
- Finding generator: `CompareBaselineToTenantJob::computeDrift()` vs `DriftFindingGenerator::generate()`
- Finding identity: different recurrence key structures
- Scope model: `BaselineProfile`-scoped vs `selection_hash`-scoped
- Trigger: post-inventory-sync vs post-backup
- Coverage: `InventoryCoverage` guard vs none (trusts backup completeness)

### KQ-02: Should they be unified or remain separate?

**Hybrid approach (Provider Chain)**, as designed in Spec 116 v2. Keep triggering and scoping separate, but let System A _consume_ data produced by System B (PolicyVersions) via a provider chain. This avoids:

- Merging two fundamentally different scoping models
- Introducing new Graph API costs
- Disrupting existing backup drift workflows

### KQ-03: What is the minimal viable "v1.5" to bridge the gap?

Add a `PolicyVersionContentProvider` that checks for recent PolicyVersions as part of baseline compare's hash computation. For types where a PolicyVersion exists (i.e., a backup was taken), the compare immediately gains content-fidelity. For types without, meta-fidelity continues as before. **Net code change: ~200-300 lines** (interface + 2 providers + chain + integration).
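
The shape of "interface + 2 providers + chain" can be sketched as follows. This is Python for illustration only (the real classes would be PHP), the function and field names are hypothetical, and it assumes the staleness rule from OQ-1: a PolicyVersion only counts if it was captured after the baseline snapshot. Timestamps are assumed to be ISO-8601 UTC strings in one consistent format so that string comparison orders them correctly.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence, Tuple


@dataclass(frozen=True)
class ContentHash:
    value: str
    fidelity: str  # 'content' or 'meta'
    source: str    # e.g. 'policy_version:42' or 'inventory_meta_v1'


# A provider takes (external_id, since) and returns a ContentHash or None.
Provider = Callable[[str, str], Optional[ContentHash]]


def policy_version_provider(versions: Dict[str, Tuple[int, str, str]]) -> Provider:
    """versions maps external_id -> (version_id, captured_at, hash)."""
    def provide(external_id: str, since: str) -> Optional[ContentHash]:
        found = versions.get(external_id)
        if found is None:
            return None
        version_id, captured_at, digest = found
        if captured_at <= since:  # stale: not newer than the baseline snapshot
            return None
        return ContentHash(digest, "content", f"policy_version:{version_id}")
    return provide


def meta_fallback_provider(meta_hashes: Dict[str, str]) -> Provider:
    """meta_hashes maps external_id -> hash of the meta contract."""
    def provide(external_id: str, since: str) -> Optional[ContentHash]:
        digest = meta_hashes.get(external_id)
        if digest is None:
            return None
        return ContentHash(digest, "meta", "inventory_meta_v1")
    return provide


def resolve(chain: Sequence[Provider], external_id: str, since: str) -> Optional[ContentHash]:
    """The first provider that returns a non-None result wins."""
    for provider in chain:
        result = provider(external_id, since)
        if result is not None:
            return result
    return None
```

Because each provider reports its own `fidelity` and `source`, the compare job can record per-finding provenance without knowing which provider answered.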

### KQ-04: Which types benefit most from content-fidelity drift?

**Top priority** (complex settings, high change frequency):

1. `settingsCatalogPolicy`: most common, deeply nested settings
2. `groupPolicyConfiguration`: multi-level nesting (definitionValues → presentationValues)
3. `deviceCompliancePolicy`: compliance rules + scheduled actions
4. `deviceConfiguration`: broad category, many OData sub-types
5. `endpointSecurityPolicy`: critical security settings
6. `securityBaselinePolicy`: security-critical baselines
7. `conditionalAccessPolicy`: identity security gate

**Medium priority** (simpler settings but still valuable):

8. `appProtectionPolicy`, `windowsUpdateRing`, `windowsFeatureUpdateProfile`, `windowsQualityUpdateProfile`

### KQ-05: How does coverage work and how should it extend for content fidelity?

Currently, `InventoryCoverage::fromContext(latestSyncRun->context)` → `coveredTypes()` returns the types with `status=succeeded`. For uncovered types, findings are suppressed and the run outcome is `partially_succeeded`.

For v1.5: Add `content_coverage` alongside `meta_coverage`:

- `content_covered_types`: types where PolicyVersion exists post-baseline
- `meta_only_types`: types where only meta is available
- `uncovered_types`: types with no coverage at all (findings suppressed)

Finding evidence should include:

```json
{
  "fidelity": "content",
  "content_hash_source": "policy_version:42",
  "note": "Hash computed from PolicyVersion #42 captured 2025-07-14T10:30:00Z"
}
```
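
The three coverage buckets can be derived from two sets. The following Python sketch is an illustration, not the `InventoryCoverage` API: `classify_coverage` and `compare_outcome` are hypothetical, a type counts as uncovered only when neither source covers it, and `succeeded` as the non-partial outcome is an assumption.

```python
from typing import Dict, Iterable, List


def classify_coverage(
    scope_types: Iterable[str],
    inventory_covered: Iterable[str],
    content_covered: Iterable[str],
) -> Dict[str, List[str]]:
    """Split a baseline scope into content-covered, meta-only and uncovered types."""
    scope = set(scope_types)
    content = scope & set(content_covered)
    meta_only = (scope & set(inventory_covered)) - content
    return {
        "content_covered_types": sorted(content),
        "meta_only_types": sorted(meta_only),
        "uncovered_types": sorted(scope - content - meta_only),
    }


def compare_outcome(coverage: Dict[str, List[str]]) -> str:
    """Uncovered types downgrade the run outcome; their findings are suppressed."""
    return "partially_succeeded" if coverage["uncovered_types"] else "succeeded"
```

The resulting dictionary is the kind of structure that could be stored as `content_coverage` in the run context (see F6).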

### KQ-06: What is the long-term unified architecture?

**Provider precedence chain** with configurable capture modes:

```
BaselineProfile.capture_mode:
  'meta_only'      → InventoryMetaContract only (v1)
  'opportunistic'  → PolicyVersion if available → meta fallback (v1.5)
  'full_content'   → On-demand GET for missing types → PolicyVersion → meta (v2.0)

ContentProviderChain:
  1. PolicyVersionContentProvider (checks existing PolicyVersions)
  2. InventoryContentProvider (future: if inventory sync enriched)
  3. MetaFallbackProvider (InventoryMetaContract v1)
```

The long-term vision is that baseline capture + compare use the **same normalizer pipeline** as backup drift, producing identical hashes for identical content regardless of which system produced the PolicyVersion. This is achievable because `DriftHasher` and `SettingsNormalizer` are already shared code.

---

## Appendix: Database Schema Reference

### `baseline_snapshot_items` (current)

```
id                    BIGINT PK
baseline_snapshot_id  BIGINT FK → baseline_snapshots
subject_type          VARCHAR(255)   -- 'policy'
subject_external_id   VARCHAR(255)   -- Graph resource GUID
policy_type           VARCHAR(255)   -- e.g. 'settingsCatalogPolicy'
baseline_hash         VARCHAR(64)    -- sha256 of InventoryMetaContract
meta_jsonb            JSONB          -- {display_name, category, platform, meta_contract: {...}, fidelity, source}
created_at            TIMESTAMP
updated_at            TIMESTAMP
```

### `inventory_items` (current)

```
id                          BIGINT PK
tenant_id                   BIGINT FK → tenants
policy_type                 VARCHAR(255)
external_id                 VARCHAR(255)
display_name                VARCHAR(255)
category                    VARCHAR(255) NULL
platform                    VARCHAR(255) NULL
meta_jsonb                  JSONB          -- {odata_type, etag, scope_tag_ids, assignment_target_count}
last_seen_at                TIMESTAMP NULL
last_seen_operation_run_id  BIGINT NULL
created_at                  TIMESTAMP
updated_at                  TIMESTAMP
```

### `policy_versions` (current)

```
id                BIGINT PK
tenant_id         BIGINT FK → tenants
policy_id         BIGINT FK → policies
version_number    INTEGER
policy_type       VARCHAR(255)
platform          VARCHAR(255) NULL
created_by        VARCHAR(255) NULL
captured_at       TIMESTAMP
snapshot          JSON              -- FULL Graph GET response (hydrated)
metadata          JSON              -- additional metadata
assignments       JSON NULL         -- full assignments array
scope_tags        JSON NULL         -- scope tag IDs
assignments_hash  VARCHAR(64) NULL
scope_tags_hash   VARCHAR(64) NULL
created_at        TIMESTAMP
updated_at        TIMESTAMP
deleted_at        TIMESTAMP NULL    -- soft delete
```

### Proposed v1.5 Addition

```sql
ALTER TABLE baseline_snapshot_items
    ADD COLUMN content_hash_source VARCHAR(255) NULL DEFAULT 'inventory_meta_v1';
-- Values: 'inventory_meta_v1', 'policy_version:{id}', 'inventory_content_v2'
```
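
Consumers of the new column need to tell the three value shapes apart. A minimal Python sketch of such a parser follows. `parse_content_hash_source` is a hypothetical helper, and treating `NULL` like the column default is an assumption, not something the migration guarantees.

```python
import re
from typing import Optional, Tuple

_POLICY_VERSION = re.compile(r"policy_version:(\d+)")


def parse_content_hash_source(value: Optional[str]) -> Tuple[str, Optional[int]]:
    """Return (kind, policy_version_id) for a content_hash_source value."""
    # Assumption: NULL rows predate the column and behave like the default.
    if value is None or value == "inventory_meta_v1":
        return ("inventory_meta_v1", None)
    if value == "inventory_content_v2":
        return ("inventory_content_v2", None)
    match = _POLICY_VERSION.fullmatch(value)
    if match:
        return ("policy_version", int(match.group(1)))
    raise ValueError(f"unrecognized content_hash_source: {value!r}")
```

Rejecting unknown values loudly, rather than silently falling back to meta, keeps a typo in a writer from being recorded as a lower-fidelity finding.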