# Redaction / Masking / Sanitizing — Codebase Audit Report

**Auditor:** Security + Data-Integrity Codebase Auditor  
**Date:** 2026-03-06  
**Scope:** Entire TenantAtlas repo (excluding `vendor/`, `node_modules/`, compiled views)  
**Severity Classification:** SEV-1 = data loss / secret exposure in production path; SEV-2 = data quality degradation; SEV-3 = cosmetic / defense-in-depth

---

## Executive Summary

### The Bug

**`PolicySnapshotRedactor`** — the central class that masks secrets before persisting `PolicyVersion` snapshots to the database — uses **substring-matching regex patterns** (`/password/i`, `/secret/i`, `/token/i`, `/certificate/i`) against JSON keys.  This causes **false-positive redaction** of Intune configuration keys such as:

| Key | Actual Meaning | Wrongly Redacted? |
|-----|---------------|-------------------|
| `passwordMinimumLength` | Config: minimum password length (integer) | **YES** — matches `/password/i` |
| `passwordRequired` | Config: boolean toggle | **YES** — matches `/password/i` |
| `deviceCompliancePasswordRequired` | Config: boolean toggle | **YES** — matches `/password/i` |
| `localAdminPasswordRotationEnabled` | Config: boolean toggle | **YES** — matches `/password/i` |
| `passwordExpirationDays` | Config: integer | **YES** — matches `/password/i` |
| `passwordMinimumCharacterSetCount` | Config: integer | **YES** — matches `/password/i` |
| `passwordBlockSimple` | Config: boolean | **YES** — matches `/password/i` |
| `passwordPreviousPasswordBlockCount` | Config: integer | **YES** — matches `/password/i` |
| `certificateValidityPeriodScale` | Config: enum | **YES** — matches `/certificate/i` |
| `certificateRenewalThresholdPercentage` | Config: integer | **YES** — matches `/certificate/i` |
| `tokenType` | Config: enum | **YES** — matches `/token/i` |

### Impact

1. **SEV-1: Persistent data loss** — Redaction happens **before** storing `PolicyVersion.snapshot` (JSONB). The original values are **irrecoverably destroyed**. This breaks:
   - Drift detection (diffs show `[REDACTED]` vs `[REDACTED]` — always "no change")
   - Baseline compare (config values masked, comparison meaningless)
   - Restore flows (restored policies could lose critical settings)
   - Evidence packs / review pack exports

2. **SEV-2: False drift suppression** — Two snapshots with different `passwordMinimumLength` values (e.g., 8 vs 12) will both store `[REDACTED]`, producing identical hashes → drift goes undetected.

3. **Real secrets remain correctly redacted** — The existing patterns DO catch actual secrets (`password`, `wifi_password`, `clientSecret`, etc.) — the problem is OVER-matching config keys that *contain* the word "password".

### Affected Layers
- **3 distinct redaction/sanitization systems** with the same bug pattern
- **2 persistent storage paths** (PolicyVersion, AuditLog)
- **1 non-persistent rendering path** (Verification reports)
- **~8 additional downstream consumers** of already-redacted data

---

## Findings — Complete Redaction Map

### Finding F-1: `PolicySnapshotRedactor` (CRITICAL — SEV-1)

| Attribute | Value |
|-----------|-------|
| **File** | `app/Services/Intune/PolicySnapshotRedactor.php` |
| **Lines** | 14-23 (pattern definitions), 64-70 (`isSensitiveKey`), 72-93 (`redactValue`) |
| **Layer** | Storage (persistent JSONB) |
| **What's redacted** | Any key whose name substring-matches 8 regex patterns |
| **Entry points** | `PolicyCaptureOrchestrator::capture()` (L148-150), `VersionService::captureVersion()` (L47-49) |
| **Persisted** | **YES** — redacted data stored as `PolicyVersion.snapshot`, `.assignments`, `.scope_tags` |
| **Reversible** | **NO** — original values lost permanently |

**Rules (exact patterns):**
```php
private const array SENSITIVE_KEY_PATTERNS = [
    '/password/i',      // ← FALSE POSITIVES: passwordMinimumLength, passwordRequired, etc.
    '/secret/i',        // ← FALSE POSITIVES: possible but less common in Intune
    '/token/i',         // ← FALSE POSITIVES: tokenType, tokenAccountType, etc.
    '/client[_-]?secret/i',  // ← OK (specific enough)
    '/private[_-]?key/i',    // ← OK (specific enough)
    '/shared[_-]?secret/i',  // ← OK (specific enough)
    '/preshared/i',           // ← OK (specific enough)
    '/certificate/i',         // ← FALSE POSITIVES: certificateValidityPeriodScale, etc.
];
```

**False positive examples:**
- `passwordMinimumLength: 12` → becomes `passwordMinimumLength: "[REDACTED]"`
- `passwordRequired: true` → becomes `passwordRequired: "[REDACTED]"`
- `certificateValidityPeriodScale: "years"` → becomes `certificateValidityPeriodScale: "[REDACTED]"`
- `tokenType: "user"` → becomes `tokenType: "[REDACTED]"`

**Risk Score:** 🔴 **CRITICAL** — Permanent data loss in production path. Destroys drift detection, baseline compare, and restore fidelity.

**Double-redaction issue:** `VersionService::captureVersion()` (L47-49) applies the redactor, AND `PolicyCaptureOrchestrator::capture()` (L148-150) also applies it before passing to `captureVersion()`. When called through the orchestrator, data is redacted twice (harmless but indicates confusion about ownership).

---

### Finding F-2: `AuditContextSanitizer` (HIGH — SEV-2)

| Attribute | Value |
|-----------|-------|
| **File** | `app/Support/Audit/AuditContextSanitizer.php` |
| **Lines** | 36-44 (`shouldRedactKey`) |
| **Layer** | Audit logging (persistent) |
| **What's redacted** | Keys containing substrings: `token`, `secret`, `password`, `authorization`, `private_key`, `client_secret` |
| **Entry points** | `WorkspaceAuditLogger::log()` , `Intune\AuditLogger::log()` |
| **Persisted** | **YES** — stored in `AuditLog.metadata` (JSONB) |

**Rules (exact logic):**
```php
return str_contains($key, 'token')
    || str_contains($key, 'secret')
    || str_contains($key, 'password')
    || str_contains($key, 'authorization')
    || str_contains($key, 'private_key')
    || str_contains($key, 'client_secret');
```

**False positive examples:** Same as F-1 — any audit metadata key containing `password` or `token` as a substring is redacted. If audit context includes policy configuration metadata (e.g., `passwordMinimumLength`), it will be masked.

**Risk Score:** 🟠 **HIGH** — Audit log data corruption. Less severe than F-1 because audit logs typically contain operational metadata, not raw policy payloads. However, if policy config keys appear in audit context (e.g., during drift detection logging), they will be silently destroyed.

---

### Finding F-3: `VerificationReportSanitizer` (MEDIUM — SEV-3)

| Attribute | Value |
|-----------|-------|
| **File** | `app/Support/Verification/VerificationReportSanitizer.php` |
| **Lines** | 10-18 (`FORBIDDEN_KEY_SUBSTRINGS`), 131, 276, 357, 415, 443-460 |
| **Layer** | UI rendering (non-persistent) |
| **What's redacted** | Values/keys containing: `access_token`, `refresh_token`, `client_secret`, `authorization`, `password`, `cookie`, `set-cookie` |
| **Entry points** | `VerificationReportViewer` (Filament UI), `OperationRunResource` |
| **Persisted** | **NO** — sanitization at rendering time only |

**Rules:**
```php
private const FORBIDDEN_KEY_SUBSTRINGS = [
    'access_token', 'refresh_token', 'client_secret',
    'authorization', 'password', 'cookie', 'set-cookie',
];
// Used via str_contains() on key names AND value strings
```

**False positive examples:** If a verification report evidence value contains the substring "password" (e.g., a check message like "passwordMinimumLength must be ≥ 8"), the entire value would be nulled out. The `containsForbiddenKeySubstring` is also applied to **values**, not just keys (L357, L415), which could null out diagnostic strings.

**Risk Score:** 🟡 **MEDIUM** — No data loss (rendering only), but could hide useful diagnostic information from operators.

---

### Finding F-4: `RunFailureSanitizer::sanitizeMessage()` (LOW — SEV-3)

| Attribute | Value |
|-----------|-------|
| **File** | `app/Support/OpsUx/RunFailureSanitizer.php` |
| **Lines** | 119-140 |
| **Layer** | Error message sanitization (persistent in OperationRun context) |
| **What's redacted** | Bearer tokens, JWT-like strings, specific key=value patterns (`access_token`, `refresh_token`, `client_secret`, `password`) |
| **Entry points** | Various jobs, `OperationRunService` |
| **Persisted** | **YES** — stored in OperationRun error context |

**Rules:** Uses exact key names (`access_token`, `refresh_token`, `client_secret`, `password`) in regex patterns matching `key=value` or `"key":"value"` format. Also does a final `str_ireplace` of these exact substrings.

**False positive risk:** LOW. The regex targets `key=value` patterns specifically. However, the blanket `str_ireplace` on line 137 will replace the substring `password` in ANY position — a message like "Check passwordMinimumLength policy" becomes "Check [REDACTED]MinimumLength policy". But these are error messages, not structured data.

**Risk Score:** 🟢 **LOW** — Messages may be garbled but no structured data loss.

---

### Finding F-5: `GenerateReviewPackJob::redactArrayPii()` (SAFE)

| Attribute | Value |
|-----------|-------|
| **File** | `app/Jobs/GenerateReviewPackJob.php` |
| **Lines** | 340-352 |
| **Layer** | Export (ZIP download) |
| **What's redacted** | PII fields: `displayName`, `display_name`, `userPrincipalName`, `user_principal_name`, `email`, `mail` |
| **Entry points** | Review pack generation |
| **Persisted** | No (exported ZIP) |

**Rules:** Exact key match via `in_array($key, $piiKeys, true)`.

**Risk Score:** 🟢 **SAFE** — Uses exact key matching. No false positive risk for policy config keys.

---

### Finding F-6: `VerificationReportSanitizer::sanitizeMessage()` (LOW — SEV-3)

| Attribute | Value |
|-----------|-------|
| **File** | `app/Support/Verification/VerificationReportSanitizer.php` |
| **Lines** | 382-413 |
| **Layer** | UI rendering |
| **What's redacted** | Same pattern as `RunFailureSanitizer`: Bearer tokens, key=value secrets, long opaque blobs, plus blanket `str_ireplace` |

**Same issue as F-4:** The `str_ireplace` at L404-409 will corrupt any message containing "password" as a substring.

---

### Finding F-7: `DriftEvidence::sanitize()` (SAFE)

| Attribute | Value |
|-----------|-------|
| **File** | `app/Services/Drift/DriftEvidence.php` |
| **Lines** | 12-28 |
| **Layer** | Drift evidence formatting |
| **What's redacted** | Nothing — this is an allowlist-based key filter, not a secret redactor |

**Risk Score:** 🟢 **SAFE** — Uses allowlist approach. No false positives.

---

### Finding F-8: `InventoryMetaSanitizer` (SAFE)

| Attribute | Value |
|-----------|-------|
| **File** | `app/Services/Inventory/InventoryMetaSanitizer.php` |
| **Layer** | Inventory metadata normalization |
| **What's redacted** | Nothing secret-related — normalizes structural metadata (odata_type, etag, scope_tag_ids) |

**Risk Score:** 🟢 **SAFE** — Not a secret redactor at all.

---

### Finding F-9: `GraphContractRegistry::sanitizeArray()` (SAFE)

| Attribute | Value |
|-----------|-------|
| **File** | `app/Services/Graph/GraphContractRegistry.php` |
| **Lines** | 637+ |
| **Layer** | Graph API payload preparation (restore path) |
| **What's redacted** | Read-only/metadata keys stripped for Graph API compliance |

**Risk Score:** 🟢 **SAFE** — Not secret redaction. Strips OData metadata keys using exact-match lists.

---

### Finding F-10: Model-level `$hidden` (SAFE)

| Models | Hidden attributes |
|--------|------------------|
| `User` | `password`, `remember_token` |
| `ProviderCredential` | `payload` (encrypted:array) |
| `AlertDestination` | (needs verification) |
| `PlatformUser` | (needs verification) |

**Risk Score:** 🟢 **SAFE** — Standard Laravel serialization protection. Correct usage.

---

### Finding F-11: `ProviderCredentialObserver` (SAFE)

| Attribute | Value |
|-----------|-------|
| **File** | `app/Observers/ProviderCredentialObserver.php` |
| **Layer** | Audit logging for credential changes |
| **What's redacted** | Logs `'redacted_fields' => ['client_secret']` as metadata |

**Risk Score:** 🟢 **SAFE** — Explicitly documents redacted fields; doesn't perform substring matching.

---

## Root Cause Analysis

### Primary Root Cause: Substring Regex on Key Names

The `PolicySnapshotRedactor` (F-1) is the **single root cause** by severity. It uses:

```php
'/password/i'  →  preg_match('/password/i', 'passwordMinimumLength') === 1  ← FALSE POSITIVE
```

This pattern has **no word boundaries**, **no exact-match semantics**, and **no allowlist of known safe keys**.

### Secondary Root Cause: Redaction at Persistence Time

The redactor runs **before** data is stored in `PolicyVersion.snapshot`. This makes the data loss **permanent and irrecoverable**. If redaction only happened at rendering/export time, the original data would still be available for drift detection, compare, and restore.

### Tertiary Root Cause: No Centralized Redaction Service

Three independent implementations (`PolicySnapshotRedactor`, `AuditContextSanitizer`, `VerificationReportSanitizer`) each define their own rules with inconsistent approaches. There is no single source of truth for "what is a secret key."

---

## Patch Plan

### Phase 1: Quick Fix — Allowlist + Word Boundaries (PRIORITY A — do first)

**Fix `PolicySnapshotRedactor`** to use an explicit allowlist of exact secret key names instead of substring regex:

```php
// BEFORE (broken):
private const array SENSITIVE_KEY_PATTERNS = [
    '/password/i',
    '/secret/i',
    '/token/i',
    // ...
];

// AFTER (fixed):
private const array EXACT_SECRET_KEYS = [
    'password',
    'wifi_password',
    'wifiPassword',
    'clientSecret',
    'client_secret',
    'sharedSecret',
    'shared_secret',
    'sharedKey',
    'shared_key',
    'preSharedKey',
    'pre_shared_key',
    'presharedKey',
    'psk',
    'privateKey',
    'private_key',
    'certificatePassword',
    'certificate_password',
    'pfxPassword',
    'pfx_password',
    'token',              // bare "token" only — NOT tokenType
    'accessToken',
    'access_token',
    'refreshToken',
    'refresh_token',
    'bearerToken',
    'bearer_token',
    'secret',             // bare "secret" only — NOT secretReferenceValueId
    'passphrase',
];

// Match exact key name (case-insensitive):
private function isSensitiveKey(string $key): bool
{
    $normalized = strtolower(trim($key));
    foreach (self::EXACT_SECRET_KEYS as $secretKey) {
        if ($normalized === strtolower($secretKey)) {
            return true;
        }
    }
    return false;
}
```

**Apply the same fix to `AuditContextSanitizer::shouldRedactKey()`.**

### Phase 2: Move Redaction to Render Time (PRIORITY A — architectural fix)

Currently the redaction happens **before persistence**:

```
Graph API → PolicySnapshotRedactor → PolicyVersion.snapshot (REDACTED! 💀)
```

The correct architecture is:

```
Graph API → PolicyVersion.snapshot (ORIGINAL ✅)
                   ↓
         [On read/render/export]
                   ↓
         PolicySnapshotRedactor → UI / Export / Notification
```

This requires:
1. Remove `snapshotRedactor->redactPayload()` calls from `PolicyCaptureOrchestrator` and `VersionService::captureVersion()`
2. Apply redaction at rendering time in Filament resources, export jobs, and notification formatters
3. Ensure logs (`Log::info/warning`) never include raw snapshot data (they currently don't — they only log IDs)

**Migration consideration:** Existing `PolicyVersion` records with `[REDACTED]` values cannot be recovered. Document this as known data loss for historical versions.

### Phase 3: Centralized Redaction Service (PRIORITY B — proper fix)

Create a single `SecretKeyClassifier` service:

```php
final class SecretKeyClassifier
{
    // Exact secret keys from Intune/Graph schemas
    private const array SECRET_KEYS = [ /* ... */ ];
    
    // Keys that LOOK like secrets but are config values (safety net)
    private const array SAFE_CONFIG_KEYS = [
        'passwordMinimumLength',
        'passwordRequired',
        'passwordExpirationDays',
        'passwordMinimumCharacterSetCount',
        'passwordBlockSimple',
        'passwordPreviousPasswordBlockCount',
        'deviceCompliancePasswordRequired',
        'localAdminPasswordRotationEnabled',
        'certificateValidityPeriodScale',
        'certificateRenewalThresholdPercentage',
        'tokenType',
        'tokenAccountType',
        // ... extend as new policy types are added
    ];
    
    public function isSecretKey(string $key): bool { /* ... */ }
    public function isKnownSafeConfigKey(string $key): bool { /* ... */ }
}
```

All three sanitizer systems (`PolicySnapshotRedactor`, `AuditContextSanitizer`, `VerificationReportSanitizer`) should delegate to this single classifier.

### Phase 4: Schema-Driven Secret Classification (PRIORITY C — long-term)

Extend `GraphContractRegistry` with per-policy-type secret field metadata:

```php
'deviceConfiguration' => [
    // ...existing...
    'secret_fields' => ['wifi_password', 'preSharedKey', 'certificatePassword'],
],
```

The `SecretKeyClassifier` would then consult the contract registry for type-specific secret lists, providing the most precise classification possible.

---

## Required Tests

### Unit Tests (new)

```
tests/Unit/PolicySnapshotRedactorFalsePositiveTest.php
```

**Sample test data:**
```json
{
    "passwordMinimumLength": 12,
    "passwordRequired": true,
    "wifi_password": "SuperGeheimesPasswort123!",
    "clientSecret": "abc",
    "token": "xyz",
    "deviceCompliancePasswordRequired": true,
    "localAdminPasswordRotationEnabled": true,
    "certificatePassword": "pfxpw",
    "sharedKey": "psk",
    "osVersionMinimum": "10.0.22631",
    "certificateValidityPeriodScale": "years",
    "tokenType": "user",
    "passwordExpirationDays": 90,
    "passwordBlockSimple": true
}
```

**Expected after redaction:**
| Key | Expected |
|-----|----------|
| `passwordMinimumLength` | `12` (VISIBLE) |
| `passwordRequired` | `true` (VISIBLE) |
| `wifi_password` | `[REDACTED]` |
| `clientSecret` | `[REDACTED]` |
| `token` | `[REDACTED]` |
| `deviceCompliancePasswordRequired` | `true` (VISIBLE) |
| `localAdminPasswordRotationEnabled` | `true` (VISIBLE) |
| `certificatePassword` | `[REDACTED]` |
| `sharedKey` | `[REDACTED]` |
| `osVersionMinimum` | `10.0.22631` (VISIBLE) |
| `certificateValidityPeriodScale` | `years` (VISIBLE) |
| `tokenType` | `user` (VISIBLE) |
| `passwordExpirationDays` | `90` (VISIBLE) |
| `passwordBlockSimple` | `true` (VISIBLE) |

### Unit Tests (update existing)

- `tests/Unit/AuditContextSanitizerTest.php` — add false-positive key tests
- `tests/Feature/Intune/PolicySnapshotRedactionTest.php` — add config key preservation tests
- `tests/Feature/Audit/WorkspaceAuditLoggerRedactionTest.php` — add config key preservation

### Integration Tests

- Drift compare with `passwordMinimumLength` changes → must detect change
- Baseline compare with `passwordRequired` difference → must show diff
- Review pack export → config keys visible, actual secrets redacted
- Restore flow → original config values preserved through full cycle

---

## Guard Rails (Regression Prevention)

### 1. CI Regression Test
A dedicated test that asserts known Intune config keys are NEVER redacted:

```php
it('never redacts known Intune configuration keys', function (string $key, mixed $value) {
    $redactor = new PolicySnapshotRedactor();
    $result = $redactor->redactPayload([$key => $value]);
    expect($result[$key])->toBe($value, "Config key '{$key}' was incorrectly redacted");
})->with([
    ['passwordMinimumLength', 12],
    ['passwordRequired', true],
    ['passwordExpirationDays', 90],
    ['passwordMinimumCharacterSetCount', 4],
    ['passwordBlockSimple', true],
    ['passwordPreviousPasswordBlockCount', 5],
    ['deviceCompliancePasswordRequired', true],
    ['localAdminPasswordRotationEnabled', true],
    ['certificateValidityPeriodScale', 'years'],
    ['certificateRenewalThresholdPercentage', 20],
    ['tokenType', 'user'],
]);
```

### 2. Static Analysis / CI Grep
Add a CI step that fails if any new code introduces substring-matching regex against "password"/"secret"/"token" without word boundaries:

```bash
# Fail if new redaction patterns use substring matching
grep -rn --include='*.php' "preg_match.*password\|str_contains.*password" app/ \
  | grep -v "SAFE_CONFIG_KEYS\|EXACT_SECRET_KEYS\|SecretKeyClassifier" \
  && echo "FAIL: New substring-based password detection found" && exit 1
```

### 3. Central Redaction Service
All secret-classification logic must flow through `SecretKeyClassifier` (Phase 3). No component should independently define what constitutes a "secret key."

---

## Summary Table

| # | File | Layer | Persisted | False Positives | Severity | Fix Priority |
|---|------|-------|-----------|----------------|----------|-------------|
| F-1 | `PolicySnapshotRedactor.php` | Storage | **YES** | **HIGH** (password*, certificate*, token*) | **SEV-1** | **IMMEDIATE** |
| F-2 | `AuditContextSanitizer.php` | Audit | **YES** | **HIGH** (password, token substrings) | **SEV-2** | **HIGH** |
| F-3 | `VerificationReportSanitizer.php` | UI | No | **MEDIUM** (password in values) | **SEV-3** | MEDIUM |
| F-4 | `RunFailureSanitizer.php` | Errors | YES | **LOW** (str_ireplace on messages) | **SEV-3** | LOW |
| F-5 | `GenerateReviewPackJob.php` | Export | No | None (exact match) | SAFE | — |
| F-6 | `VerificationReportSanitizer::sanitizeMessage()` | UI | No | **LOW** (same as F-4) | **SEV-3** | LOW |
| F-7 | `DriftEvidence.php` | Drift | No | None (allowlist) | SAFE | — |
| F-8 | `InventoryMetaSanitizer.php` | Inventory | No | None (structural only) | SAFE | — |
| F-9 | `GraphContractRegistry.php` | API prep | No | None (exact match) | SAFE | — |
| F-10 | Model `$hidden` | Serialization | N/A | None | SAFE | — |
| F-11 | `ProviderCredentialObserver.php` | Audit | No | None (explicit docs) | SAFE | — |

---

## Assumptions & Caveats

1. **ASSUMPTION:** Intune Graph API responses contain keys like `passwordMinimumLength` at the top-level of policy JSON. Verified against Microsoft Graph documentation for deviceManagement policies — confirmed.

2. **ASSUMPTION:** No other secret redaction exists in Monolog processors or Laravel exception handlers. Verified: no custom Monolog processors found in `config/logging.php` or `app/Exceptions/`.

3. **CANNOT VERIFY:** Whether any existing `PolicyVersion` snapshots in production have already suffered data loss from this bug. **Recommendation:** Query production DB for `PolicyVersion` records where `snapshot::jsonb ? 'passwordMinimumLength'` and check if the value is `[REDACTED]`.

4. **ASSUMPTION:** Telescope / Debugbar are not deployed to production. If they are, their built-in redaction (which typically uses similar patterns) could also cause false positives in debug data.

5. **Double redaction in orchestrator path:** When `PolicyCaptureOrchestrator::capture()` calls `VersionService::captureVersion()`, the redactor runs twice (once in each). This is functionally harmless (idempotent) but indicates unclear ownership.