Merge 248-private-ai-policy-foundation into dev (#288)

Automated PR: merge branch 248-private-ai-policy-foundation into dev (created by Copilot)

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #288

2026-04-27 21:18:37 +00:00


Implementation Plan: Private AI Execution & Policy Foundation

Branch: 248-private-ai-policy-foundation | Date: 2026-04-27 | Spec: spec.md
Input: Feature specification from spec.md

Note: This template is filled in by the /speckit.plan command. See .specify/scripts/ for helper scripts.

Summary

Introduce a narrow AI governance foundation inside the existing Laravel monolith by reusing the workspace settings page for workspace-owned AI posture, reusing the system operational-controls page for a global ai.execution stop, and adding one in-process governed AI decision boundary plus a code-owned allowlist for exactly two internal-only use cases. Host-surface authorization remains a precondition; the AI boundary begins only after caller-side entitlement has already succeeded. The first slice is a preflight allow/block contract with audit-ready metadata, not a customer-facing AI workflow and not a model-provider runtime.

Filament v5 remains on Livewire v4, no panel-provider registration changes are needed (bootstrap/providers.php remains the authoritative registration location), no new globally searchable AI resource is introduced, and no new panel-only asset bundle is expected for v1.

Technical Context

Language/Version: PHP 8.4, Laravel 12
Primary Dependencies: Filament v5, Livewire v4, Pest v4, existing Settings/Audit/OperationalControls support services
Storage: PostgreSQL via existing workspace_settings, operational_control_activations, and audit_logs persistence; no new AI tables
Testing: Pest v4 (PHPUnit 12 runner), narrow unit + feature + architecture-guard coverage
Validation Lanes: fast-feedback, confidence
Target Platform: Laravel monolith in apps/platform running via Sail; admin /admin and platform /system panels
Project Type: Web application (Laravel monolith with Filament panels)
Performance Goals: decision evaluation remains synchronous and DB-only in v1; no outbound provider call or queue handoff is required to compute allow/block
Constraints: no direct external provider calls with tenant data; no OperationRun; no result or prompt persistence; reuse existing workspace settings and ops controls; keep /admin and /system auth planes separate; no new asset bundle or second AI admin surface
Scale/Scope: 2 approved use cases, 2 policy modes, 2 provider classes, 6 data classifications, 2 existing operator surfaces, 1 new governed in-process decision seam

UI / Surface Guardrail Plan

Fill for operator-facing or guardrail-relevant workflow changes. Docs-only or template-only work may use concise N/A. Copy the spec classification forward; do not rename or expand it here.

Guardrail scope: changed surfaces on the existing workspace settings and system operational-controls pages
Native vs custom classification summary: native Filament
Shared-family relevance: workspace settings, operational safety controls, audit/status copy
State layers in scope: page
Audience modes in scope: operator-MSP, operator-platform, support-platform
Decision/diagnostic/raw hierarchy plan: decision-first; diagnostics remain secondary on the control history path; no support-raw surface is introduced in v1
Raw/support gating plan: collapsed; raw prompt, source, and provider payload detail are excluded from the slice entirely
One-primary-action / duplicate-truth control: workspace settings keep Save as the single primary mutation action; the system controls card keeps Pause AI execution / Resume AI execution; workspace policy truth and runtime-stop truth stay on separate surfaces
Handling modes by drift class or surface: review-mandatory; any extra AI page, direct Run AI action, or evidence viewer is exception-required
Repository-signal treatment: review-mandatory now, future hard-stop candidate once the no-direct-provider guard exists
Special surface test profiles: standard-native-filament
Required tests or manual smoke: functional-core, state-contract
Exception path and spread control: none; v1 remains inside the two existing pages
Active feature PR close-out entry: Guardrail

Shared Pattern & System Fit

Fill when the feature touches notifications, status messaging, action links, header actions, dashboard signals/cards, navigation entry points, alerts, evidence/report viewers, or any other shared interaction family. Docs-only or template-only work may use concise N/A. Carry the same decision forward from the spec instead of renaming it here.

Cross-cutting feature marker: yes
Systems touched: WorkspaceSettings, SettingsRegistry, SettingsResolver, SettingsWriter, Controls, OperationalControlCatalog, OperationalControlEvaluator, AuditActionId, AuditRecorder, WorkspaceAuditLogger, ContextualHelpResolver, and SupportDiagnosticBundleBuilder
Shared abstractions reused: existing workspace settings persistence + audit flow, existing operational-control evaluator/catalog, existing audit recorder/logger pipeline, existing product-knowledge resolver, and existing support-diagnostics bundle builder path
New abstraction introduced? why?: one in-process governed AI decision boundary and one code-owned use-case catalog, because the current shared settings/ops/audit services do not own AI allow/block semantics
Why the existing abstraction was sufficient or insufficient: settings, ops controls, and audit are already sufficient for persistence, emergency stop, and logging; they are insufficient for AI decision evaluation because the repo currently has no app-level AI seam at all
Bounded deviation / spread control: none; future callers must depend on the new boundary rather than page-local AI helpers

OperationRun UX Impact

Fill when the feature creates, queues, deduplicates, resumes, blocks, completes, or deep-links to an OperationRun. Docs-only or template-only work may use concise N/A.

Touches OperationRun start/completion/link UX?: no
Central contract reused: N/A
Delegated UX behaviors: N/A
Surface-owned behavior kept local: initiation remains on the existing settings and controls pages only; no queued start UX is introduced
Queued DB-notification policy: N/A
Terminal notification path: N/A
Exception path: none

Provider Boundary & Portability Fit

Fill when the feature touches shared provider/platform seams, identity scope, governed-subject taxonomy, compare strategy selection, provider connection descriptors, or operator vocabulary that may leak provider-specific semantics into platform-core truth. Docs-only or template-only work may use concise N/A.

Shared provider/platform boundary touched?: yes
Provider-owned seams: none in v1; no vendor adapters, credentials, or model-selection UI are introduced
Platform-core seams: AI use-case key, provider class, data classification, workspace AI policy, and governed decision contract
Neutral platform terms / contracts preserved: AI use case, provider class, data classification, source family, workspace AI policy, and execution decision
Retained provider-specific semantics and why: none; local_private and external_public are trust classes, not vendor names
Bounded extraction or follow-up path: follow-up-spec for provider integration and usage governance; do not widen inside v1
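
A minimal sketch of how the vendor-neutral trust classes could be modelled, assuming PHP 8.1+ backed enums. The enum names AiProviderClass and AiPolicyMode and the permits() helper are hypothetical; only the string values local_private, external_public, disabled, and private_only come from this plan, and the rule inside permits() is an assumed reading of the v1 constraints.

```php
<?php

// Hypothetical names; only the backing string values appear in the plan.
enum AiProviderClass: string
{
    case LocalPrivate = 'local_private';
    case ExternalPublic = 'external_public';
}

enum AiPolicyMode: string
{
    case Disabled = 'disabled';
    case PrivateOnly = 'private_only';

    // Assumed v1 rule: only private_only permits execution,
    // and only against the local_private trust class.
    public function permits(AiProviderClass $providerClass): bool
    {
        return $this === self::PrivateOnly
            && $providerClass === AiProviderClass::LocalPrivate;
    }
}
```

Because the cases are trust classes rather than vendor names, a vendor string such as 'openai' has no matching case, so AiProviderClass::tryFrom() would return null for it.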

Constitution Check

GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.

Inventory-first / snapshot truth: N/A. This slice adds no inventory or backup truth and does not change the Intune source-of-truth model.
Read/write separation: PASS. Workspace policy writes stay on the existing settings flow, and pause/resume actions stay on the existing controls flow with confirmation + audit.
Graph contract path: PASS. No Microsoft Graph contract or outbound provider call is introduced.
Deterministic capabilities: PASS. Reuses Capabilities::WORKSPACE_SETTINGS_VIEW, Capabilities::WORKSPACE_SETTINGS_MANAGE, PlatformCapabilities::ACCESS_SYSTEM_PANEL, and PlatformCapabilities::OPS_CONTROLS_MANAGE; no raw capability strings are planned.
Workspace isolation + tenant isolation: PASS. AI decision requests require a host surface that already resolved workspace context and optional tenant entitlement; the boundary does not become a cross-tenant shortcut.
RBAC-UX plane separation: PASS. /admin/settings/workspace stays tenant-plane/workspace-scoped, /system/ops/controls stays platform-scoped, and wrong-plane access remains outside scope.
Destructive confirmation standard: PASS. Pause AI execution and Resume AI execution remain confirmation-protected actions on the existing controls page.
Global search safety: PASS / N/A. No new Resource, Global Search entry, or tenantless AI list is introduced.
OperationRun and Ops-UX: PASS by non-use. This slice creates no OperationRun, queue, notification lifecycle, or Monitoring link.
Data minimization: PASS. Audit stores decision metadata only; raw prompt, source payload, and output text remain excluded.
Test governance (TEST-GOV-001): PASS. Proof stays in narrow unit + feature + architecture-guard coverage; no browser or heavy-governance family is required by default.
Proportionality / no premature abstraction: PASS with bounded exception. One governed AI boundary and one bounded use-case catalog are justified by two concrete future consumers and safety needs; no provider marketplace, queue pipeline, or persistence layer is introduced.
Persisted truth (PERSIST-001): PASS. Workspace AI policy reuses existing workspace settings; no AI table, cache, result store, or prompt ledger is added.
Behavioral state (STATE-001): PASS. disabled and private_only directly change execution eligibility; provider classes and data classifications directly change allow/block behavior.
Shared pattern first / UI semantics / Filament native UI: PASS. Existing settings, controls, and audit primitives are reused; no custom AI shell, second status framework, or duplicate truth surface is introduced.
Provider boundary (PROV-001): PASS. Shared terms stay vendor-neutral (provider class, data classification, AI use case), and direct provider-specific seams are deferred.
Filament/Laravel panel safety: PASS. Livewire v4 remains the Filament v5 runtime, SystemPanelProvider stays on the existing /system panel, and no panel-provider registration change is needed; bootstrap/providers.php remains the authoritative registration location.
Asset strategy: PASS. No new panel-only or shared asset registration is planned; deployment keeps the normal cd apps/platform && php artisan filament:assets step if implementation later registers assets.

Gate evaluation: PASS (no constitution violation is required to deliver the narrow v1 slice).

The governed boundary is an in-process decision seam only; it does not create provider execution, queueing, or result persistence.
Workspace policy truth stays inside the existing settings stack and reuses existing audit behavior.
The system kill switch reuses the existing operational-control evaluator and controls page rather than creating a second AI control surface.

Post-design re-check: PASS (design artifacts: research.md, data-model.md, quickstart.md, contracts/private-ai-governance.openapi.yaml).
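
The preflight allow/block seam described above could look roughly like the following. This is a sketch under stated assumptions, not the shipped boundary: the class names AiDecision and GovernedAiBoundarySketch, the decide() signature, and the reason codes are hypothetical. The plan only establishes that the global ai.execution stop takes precedence; the order of the remaining checks is an assumption.

```php
<?php

// Hypothetical value object: carries the outcome plus a reason code for audit metadata.
final class AiDecision
{
    public function __construct(
        public readonly bool $allowed,
        public readonly string $reason,
    ) {}
}

// Hypothetical in-process boundary: synchronous, no provider call, no persistence.
final class GovernedAiBoundarySketch
{
    /** @param list<string> $approvedUseCases */
    public function __construct(
        private readonly array $approvedUseCases,
    ) {}

    public function decide(
        string $useCaseKey,
        string $providerClass,
        string $policyMode,
        bool $executionPaused,
    ): AiDecision {
        // The global ai.execution stop is checked first.
        if ($executionPaused) {
            return new AiDecision(false, 'execution_paused');
        }
        // Workspace policy: anything other than private_only blocks.
        if ($policyMode !== 'private_only') {
            return new AiDecision(false, 'workspace_policy_disabled');
        }
        // Code-owned allowlist of use cases.
        if (! in_array($useCaseKey, $this->approvedUseCases, true)) {
            return new AiDecision(false, 'use_case_not_approved');
        }
        // Trust-class gate: only local_private is assumed to be permitted in v1.
        if ($providerClass !== 'local_private') {
            return new AiDecision(false, 'provider_class_not_permitted');
        }

        return new AiDecision(true, 'allowed');
    }
}
```

A caller that has already passed host-surface authorization would construct the boundary with the approved keys and act only on an allowed decision; a blocked decision never reaches provider resolution.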

Test Governance Check

Fill for any runtime-changing or test-affecting feature. Docs-only or template-only work may state concise N/A or none.

Test purpose / classification by changed surface: Unit for the catalog, request/decision contract, operational-control precedence, and audit metadata shaping; Feature for the workspace settings and system controls surfaces; Feature/Guard for the no-direct-provider invariant
Affected validation lanes: fast-feedback, confidence
Why this lane mix is the narrowest sufficient proof: unit coverage proves the decision matrix without Filament boot cost, feature coverage proves the two existing operator surfaces plus authorization/audit integration, and one architecture guard protects against local provider bypasses; browser and heavy-governance coverage add cost without proving new business truth
Narrowest proving command(s):
- export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Unit/Support/Ai/AiUseCaseCatalogTest.php
- export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Unit/Support/Ai/AiDecisionAuditMetadataTest.php
- export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Unit/Support/Ai/GovernedAiExecutionBoundaryTest.php
- export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/SettingsFoundation/WorkspaceAiPolicySettingsTest.php
- export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/SettingsFoundation/WorkspaceSettingsManageTest.php
- export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/SettingsFoundation/WorkspaceSettingsViewOnlyTest.php
- export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/SettingsFoundation/WorkspaceSettingsNonMemberNotFoundTest.php
- export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/SettingsFoundation/WorkspaceSettingsAuditTest.php
- export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/System/OpsControls/AiExecutionOperationalControlTest.php
- export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/System/OpsControls/OperationalControlManagementTest.php
- export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/OperationalControls/OperationalControlAuthorizationSemanticsTest.php
- export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Guards/NoDirectAiProviderBypassTest.php
Fixture / helper / factory / seed / context cost risks: low-to-moderate; reuse existing workspace settings, membership, platform-user, and operational-control fixtures, but avoid browser harnesses, provider emulators, or seeded AI history
Expensive defaults or shared helper growth introduced?: no; the AI boundary should accept simple value objects/arrays, and feature tests should avoid broad WorkspaceSettingsManageTest.php workflow setup unless an implementation change genuinely needs that depth
Heavy-family additions, promotions, or visibility changes: none expected; do not promote this slice into browser or heavy-governance families by default
Surface-class relief / special coverage rule: standard-native-filament relief for the two existing pages, plus one direct service-level rule that blocked requests produce no provider resolution
Closing validation and reviewer handoff: rerun the twelve focused test commands above, verify that ai.execution uses the existing operational-control path, verify that workspace policy changes still reuse the existing settings authorization and audit behavior, and verify that no app-level AI provider client exists outside the governed boundary
Budget / baseline / trend follow-up: none expected; if workspace settings coverage broadens into the existing heavy-governance family, document the lane cost in-feature rather than hiding it
Review-stop questions: lane fit, breadth, hidden setup cost, architecture-guard coverage, accidental provider/runtime scope growth
Escalation path: document-in-feature for contained lane drift; reject-or-split if implementation introduces browser/heavy-governance cost, queue semantics, or provider integration
Active feature PR close-out entry: Guardrail
Why no dedicated follow-up spec is needed: routine narrow test upkeep stays inside this feature; broader AI runtime and provider workflows are already deferred to follow-up candidates
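
The no-direct-provider invariant could be guarded with a simple source scan. The following is a framework-free sketch, assuming Unix-style paths; the function name findProviderBypasses, the allowed prefix, and the marker strings are all hypothetical and do not describe the contents of NoDirectAiProviderBypassTest.php.

```php
<?php

/**
 * Returns the relative paths of PHP files under $root that mention a forbidden
 * marker while living outside the allowed directory prefix.
 * Hypothetical sketch: marker list and allowed prefix are supplied by the caller.
 *
 * @param list<string> $forbiddenMarkers
 * @return list<string>
 */
function findProviderBypasses(string $root, string $allowedPrefix, array $forbiddenMarkers): array
{
    $violations = [];
    $root = rtrim($root, '/');
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );

    foreach ($files as $file) {
        if (! $file->isFile() || $file->getExtension() !== 'php') {
            continue;
        }
        // Path relative to $root, e.g. "Support/Ai/Boundary.php".
        $relative = ltrim(substr($file->getPathname(), strlen($root)), '/');
        if (str_starts_with($relative, $allowedPrefix)) {
            continue; // The governed namespace is the only place allowed to mention providers.
        }
        $source = (string) file_get_contents($file->getPathname());
        foreach ($forbiddenMarkers as $marker) {
            if (str_contains($source, $marker)) {
                $violations[] = $relative;
                break;
            }
        }
    }

    sort($violations);

    return $violations;
}
```

A guard test would then assert that the returned list is empty for the application source directory, so any new provider reference outside the governed namespace fails the lane.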

Project Structure

Documentation (this feature)

specs/248-private-ai-policy-foundation/
├── plan.md
├── research.md
├── data-model.md
├── quickstart.md
├── contracts/
│   └── private-ai-governance.openapi.yaml
└── tasks.md             # Created later by /speckit.tasks, not by this plan step

Source Code (repository root)

apps/platform/
├── app/
│   ├── Filament/Pages/Settings/WorkspaceSettings.php
│   ├── Filament/System/Pages/Ops/Controls.php
│   ├── Providers/Filament/SystemPanelProvider.php
│   ├── Services/Audit/
│   │   ├── AuditRecorder.php
│   │   └── WorkspaceAuditLogger.php
│   ├── Services/Settings/
│   │   ├── SettingsResolver.php
│   │   └── SettingsWriter.php
│   ├── Support/Audit/AuditActionId.php
│   ├── Support/Auth/
│   │   ├── Capabilities.php
│   │   └── PlatformCapabilities.php
│   ├── Support/OperationalControls/
│   │   ├── OperationalControlCatalog.php
│   │   └── OperationalControlEvaluator.php
│   ├── Support/ProductKnowledge/ContextualHelpResolver.php
│   ├── Support/SupportDiagnostics/SupportDiagnosticBundleBuilder.php
│   └── Support/Ai/        # likely new narrow namespace if implementation proceeds
└── tests/
    ├── Feature/SettingsFoundation/
    ├── Feature/OperationalControls/
    ├── Feature/System/OpsControls/
    ├── Feature/Guards/
    ├── Unit/Support/OperationalControls/
    ├── Unit/Support/ProductKnowledge/
    └── Unit/Support/Ai/

Structure Decision: Laravel monolith. Implementation stays entirely inside apps/platform, reusing existing settings, audit, and operational-control seams while adding only one narrow AI support namespace if code work later proceeds.

Complexity Tracking

Fill when Constitution Check has violations that must be justified OR when BLOAT-001 is triggered by new persistence, abstractions, states, or semantic frameworks.

Violation	Why Needed	Simpler Alternative Rejected Because
BLOAT-001 — governed AI decision boundary	One central allow/block seam is the smallest safe place to enforce workspace policy, operational controls, provider class gating, and audit metadata before any future AI caller can reach a model	Per-surface AI helpers would duplicate policy/control/audit logic and create bypass risk across product knowledge and diagnostics
BLOAT-001 — code-owned AI use-case catalog	Two concrete future adopters need a single allowlist and stable vocabulary now	Free-form string keys spread across callers would drift and be difficult to guard or audit consistently
STATE-001 — AI policy / provider / data-classification families	These values directly change whether execution is allowed and what may cross the trust boundary	Vendor names or presentation-only labels would not be enforceable, portable, or sufficiently reviewable

Proportionality Review

Fill when the feature introduces a new enum/status family, DTO/presenter/envelope, persisted entity/table/artifact, interface/contract/registry/resolver, taxonomy/classification system, or cross-domain UI framework.

Current operator problem: TenantPilot has no safe app-level AI seam today, so future AI work would otherwise begin as local provider calls and local prompt/policy logic that bypass workspace isolation, runtime controls, and auditability.
Existing structure is insufficient because: the repo already has settings, operational controls, and audit infrastructure, but it has no place to classify AI use cases, provider trust classes, or data classifications, and no single decision service that every caller must use.
Narrowest correct implementation: add one workspace setting (ai.policy_mode), one operational control key (ai.execution), one code-owned use-case catalog for exactly two internal-only consumers, one request/decision contract, and one audit metadata shape. Do not add provider adapters, queue semantics, result persistence, or customer-visible AI surfaces.
Ownership cost created: maintain 2 use-case entries, 2 policy values, 2 provider classes, 6 data classifications, one bounded audit action/metadata shape, and one architecture guard.
Alternative intentionally rejected: local AI helpers on each future surface and a broader multi-provider AI platform were both rejected because they either create safety drift or import speculative architecture before the first real runtime need exists.
Release truth: current-release governance foundation and future-feature preflight seam; not a full AI execution product.
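
To illustrate the code-owned allowlist, here is a minimal sketch of a two-entry catalog. The class name AiUseCaseCatalogSketch and its methods are hypothetical; the two use-case keys and their tenant_context_permitted flags are the ones recorded in this plan's close-out section.

```php
<?php

// Hypothetical sketch: class and method names are assumptions.
final class AiUseCaseCatalogSketch
{
    // Code-owned allowlist: adding a use case requires a code change and review.
    private const USE_CASES = [
        'product_knowledge.answer_draft' => ['tenant_context_permitted' => false],
        'support_diagnostics.summary_draft' => ['tenant_context_permitted' => true],
    ];

    public static function isApproved(string $key): bool
    {
        return array_key_exists($key, self::USE_CASES);
    }

    public static function permitsTenantContext(string $key): bool
    {
        if (! self::isApproved($key)) {
            throw new InvalidArgumentException("Unknown AI use case: {$key}");
        }

        return self::USE_CASES[$key]['tenant_context_permitted'];
    }

    /** @return list<string> */
    public static function keys(): array
    {
        return array_keys(self::USE_CASES);
    }
}
```

Keeping the keys in a constant rather than accepting free-form strings gives callers one stable vocabulary and gives guards a single place to check.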

Phase 0 — Research (output: research.md)

Research resolved the remaining implementation-shaping decisions:

Reuse WorkspaceSettings plus SettingsRegistry / SettingsWriter for workspace-owned AI policy truth.
Reuse OperationalControlCatalog / OperationalControlEvaluator and the existing Controls page for ai.execution rather than creating a second AI control surface.
Model v1 as a governed decision boundary, not a provider runtime, queue, or result store.
Lock the first slice to two code-owned internal use cases tied to ContextualHelpResolver and the support-diagnostics bundle path.
Reuse existing audit infrastructure and keep the AI audit family minimal.

Output: research.md

Phase 1 — Design (outputs: data-model.md, contracts/, quickstart.md)

Design artifacts capture the narrow implementation shape:

Existing persisted truth reused: workspace_settings, operational_control_activations, and audit_logs.
New code-owned truth: AI policy mode, provider class, data classification, approved use-case definitions, and request/decision envelopes.
Conceptual contracts cover the existing workspace settings page, the existing system controls page, and the in-process governed decision schema.
Quickstart documents the intended slice order, validation commands, Filament/Livewire assumptions, and the no-new-assets posture.

Artifacts: data-model.md, contracts/private-ai-governance.openapi.yaml, quickstart.md

Phase 2 — Planning (for tasks.md)

Dependency-ordered implementation outline for the later tasks.md step:

Extend the existing settings registry and workspace settings page with ai.policy_mode and plain-language explanation content, without broadening the singleton settings workflow.
Add ai.execution to the operational-control catalog and controls page, keeping pause/resume confirmation-protected and audit-backed.
Introduce a narrow Support/Ai namespace containing the use-case catalog, request/decision value objects, and the governed decision boundary only.
Reuse the existing audit pipeline for workspace policy mutations and add one bounded AI decision action/metadata shape for allow/block evaluations.
Name ContextualHelpResolver and SupportDiagnosticBundleBuilder as the first adopters, but do not ship customer-facing AI UI, model-provider runtime code, or direct caller wiring beyond what the boundary contract itself requires.
Add focused unit, feature, and architecture-guard tests while keeping browser and heavy-governance families out of scope by default.
Run focused tests and Pint after implementation; no asset build is expected unless implementation later registers Filament assets.
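
The bounded audit metadata shape for allow/block evaluations could be enforced with an allowlist of keys, so that raw prompt, source payload, and output text can never reach the audit log. This is a sketch only: the function name aiDecisionAuditMetadata and every field name are assumptions, not the real audit schema.

```php
<?php

/**
 * Builds an audit-safe metadata array for one allow/block decision.
 * Hypothetical sketch: the field names are illustrative assumptions.
 *
 * @param array<string, mixed> $decision
 * @return array<string, mixed>
 */
function aiDecisionAuditMetadata(array $decision): array
{
    // Allowlist rather than denylist, so newly added raw fields are dropped by default.
    $allowedKeys = [
        'use_case',
        'provider_class',
        'data_classification',
        'policy_mode',
        'outcome',
        'reason',
    ];

    return array_intersect_key($decision, array_flip($allowedKeys));
}
```

Passing an input that also carries a prompt or output field returns only the allowlisted keys; anything else is silently discarded before it can be recorded.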

Post-Implementation Close-Out

Implementation status: Implemented and validated on 2026-04-27.
TEST-GOV-001 outcome: PASS. Proof stayed in focused Pest Unit and Feature lanes plus one architecture guard, with no browser or heavy-governance suite expansion.
Executed validation summary:
- AI boundary unit lane: 8 tests, 83 assertions passed.
- AI execution controls feature lane: 1 test, 34 assertions passed.
- Operational controls regression lane: 11 tests, 167 assertions passed.
- Workspace settings lane: 20 tests, 267 assertions passed.
- Platform authorization semantics lane: 6 tests, 26 assertions passed.
- No-direct-provider guard lane: 1 test, 1 assertion passed.
- Approved source-input lane: 2 tests, 30 assertions passed.
- Adjacent product-knowledge/support-diagnostics regression lane: 14 tests, 107 assertions passed.
- Final targeted feature validation rollup: 42 tests, 530 assertions passed.
- Formatting: export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && cd apps/platform && ./vendor/bin/sail bin pint --dirty --format agent passed.
Catalog lock and tenant-context declaration:
- product_knowledge.answer_draft: tenant_context_permitted = false
- support_diagnostics.summary_draft: tenant_context_permitted = true
- Boundary coverage plus the approved source adapters preserved that split.
Browser smoke result: PASS.
- /admin/settings/workspace: authenticated as a workspace manager, changed Workspace AI policy from the default effective disabled state to Private only, saved successfully, and confirmed the effective summary plus approved-use-case/provider-class copy updated on the real page.
- /system/ops/controls: authenticated as a platform operator, opened the AI execution card, paused execution with confirmation and reason text, confirmed the Paused globally state and success notification, then resumed execution and confirmed the enabled state returned.
Environment note: the integrated browser carried a stale or poisoned localhost system-panel session during smoke work. The product routes themselves were healthy; running the system-panel smoke path on 127.0.0.1 instead gave a clean host-scoped browser session, and the path completed successfully. This was an environment/browser-session workaround, not a feature bug.
Guardrail close-out: no confirmed in-scope findings remained after the code, validation, browser smoke, and artifact analysis loop. No new provider runtime, queue, result persistence, or customer-facing AI surface was introduced.
Follow-up-spec deferrals retained:
- public or external-provider execution
- result persistence, cache, or prompt/output history
- AI budgeting, credits, or cost controls
- queued AI execution or OperationRun semantics
- customer-facing AI workflows or approval flows
