TenantAtlas/specs/415-generic-content-backed-capture/plan.md
Ahmed Darrazi 736e61c73e
feat: add generic content-backed coverage capture
2026-06-25 21:55:27 +02:00

# Implementation Plan: Spec 415 - Generic Content-Backed Capture
**Branch**: `415-generic-content-backed-capture` | **Date**: 2026-06-25 | **Spec**: `specs/415-generic-content-backed-capture/spec.md`

**Input**: Feature specification from `/specs/415-generic-content-backed-capture/spec.md`
## Summary
Prepare Coverage v2 to store real content-backed evidence without activating it as customer/operator truth. The implementation should add concrete Coverage v2 resource and evidence persistence, resolve source contracts through the existing registry/Graph contract path, capture eligible payloads through `GraphClientInterface`, normalize/hash/redact payloads, and run remote capture through an authorized, queued, OperationRun-backed service.

The slice is intentionally backend/internal. No Filament page, route, navigation entry, customer output, review/report surface, restore readiness surface, or browser-visible v2 coverage claim is added.
## Technical Context
- **Language/Version**: PHP 8.4.15, Laravel 12.52.0
- **Primary Dependencies**: Filament 5.2.1, Livewire 4.1.4, Pest 4.3.1, Sail 1.52.0
- **Storage**: PostgreSQL; JSONB for raw payload, normalized payload, permission/source metadata
- **Testing**: Pest 4; unit, feature, and PostgreSQL lanes where database constraints/indexes require PostgreSQL proof
- **Validation Lanes**: fast-feedback, confidence, pgsql; browser N/A unless UI scope is amended
- **Target Platform**: Laravel monolith in `apps/platform`, Sail local, Dokploy container staging/production
- **Project Type**: web application backend/runtime slice
- **Performance Goals**: render-time remains DB-only/no Graph; remote capture queued; indexes limited to known query paths
- **Constraints**: no UI activation, no direct Graph calls, no endpoint guessing, no `tenant_id`, no compatibility shim, no raw payload leakage, OperationRun lifecycle service-owned
- **Scale/Scope**: initial Spec 414 resource types only; no full TCM catalog, compare/render/restore, or legacy removal
## Existing Repo Truth
- Spec 414 completed the inactive Coverage v2 kernel and contains implementation close-out evidence.
- Existing Coverage v2 kernel files include:
  - `apps/platform/app/Models/TenantConfigurationResourceType.php`
  - `apps/platform/app/Models/TenantConfigurationSupportedScope.php`
  - `apps/platform/app/Services/TenantConfiguration/ResourceTypeRegistry.php`
  - `apps/platform/app/Services/TenantConfiguration/SupportedScopeResolver.php`
  - `apps/platform/app/Services/TenantConfiguration/ClaimGuard.php`
  - `apps/platform/app/Support/TenantConfiguration/*`
  - `apps/platform/database/migrations/2026_06_25_000414_create_tenant_configuration_kernel_tables.php`
- Spec 414 deferred `tenant_configuration_resources` and `tenant_configuration_resource_evidence`.
- `OperationRunType` does not yet contain `tenant_configuration.capture`.
- `OperationSummaryKeys::all()` does not currently contain `captured` or `blocked`; Spec 415 should use existing numeric keys unless it explicitly extends the canonical list with tests.
- `config/graph_contracts.php` contains contract entries relevant to `notificationMessageTemplate`, `roleScopeTag`, and `assignmentFilter`. TCM-aligned source eligibility must still be explicit; missing contracts must block capture.
- `Capabilities::EVIDENCE_MANAGE` exists and is granted to Manager/Owner but not Operator/Readonly. It is the default planned capability unless implementation finds a more specific existing capture capability.
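
The contract rule above (eligibility must be explicit, and a missing contract blocks capture) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the repo's resolver: the `resolveSourceContract` function, the `eligible` flag, the array shape, the outcome/reason strings, and the `exampleUndeclaredType` entry are all hypothetical, and the real structure of `config/graph_contracts.php` may differ.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch: decide whether a resource type may be captured.
 * Array shape and outcome strings are illustrative assumptions, not the
 * real config/graph_contracts.php structure.
 *
 * @param array<string, array{eligible?: bool}> $contracts
 * @return array{outcome: string, reason: ?string}
 */
function resolveSourceContract(string $resourceType, array $contracts): array
{
    if (! array_key_exists($resourceType, $contracts)) {
        // No contract entry: never guess an endpoint, block instead.
        return ['outcome' => 'blocked', 'reason' => 'missing_contract'];
    }

    if (($contracts[$resourceType]['eligible'] ?? false) !== true) {
        // A contract exists, but eligibility was not declared explicitly.
        return ['outcome' => 'blocked', 'reason' => 'not_explicitly_eligible'];
    }

    return ['outcome' => 'eligible', 'reason' => null];
}

// Illustrative contract map; the eligibility values are made up.
$contracts = [
    'notificationMessageTemplate' => ['eligible' => true],
    'roleScopeTag' => ['eligible' => true],
    'exampleUndeclaredType' => [], // present, but eligibility not declared
];
```

With this shape, a type that is absent from the map and a type that is present but not declared eligible both end up blocked, with different reasons, so neither can silently fall through to a Graph call.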
## UI / Surface Guardrail Plan
- **Guardrail scope**: no operator-facing surface change.
- **Affected routes/pages/actions/states/navigation/panel/provider surfaces**: N/A.
- **No-impact class, if applicable**: backend-only internal evidence capture. Existing generic Monitoring -> Operations and central DB-notification surfaces may show OperationRun records through the shared lifecycle contract; no feature-local UI, notification copy, links, or rendered controls are added.
- **Native vs custom classification summary**: N/A.
- **Shared-family relevance**: OperationRun lifecycle and Graph service boundary only.
- **State layers in scope**: none.
- **Audience modes in scope**: N/A.
- **Decision/diagnostic/raw hierarchy plan**: raw payloads remain evidence-storage only; no default UI exposure.
- **Raw/support gating plan**: no rendered access path in this spec.
- **One-primary-action / duplicate-truth control**: no UI action introduced; no v2 customer/operator truth.
- **Handling modes by drift class or surface**: hard-stop if UI files/routes/navigation/customer output are touched without spec/plan/tasks amendment.
- **Repository-signal treatment**: review-mandatory for any UI file change, route addition, new Filament resource, or customer/report/review/evidence activation.
- **Special surface test profiles**: N/A.
- **Required tests or manual smoke**: functional-core, persistence, RBAC, OperationRun, no-UI static guards.
- **Exception path and spread control**: none.
- **Active feature PR close-out entry**: Guardrail / N/A no rendered UI surface changed.
- **UI/Productization coverage decision**: No UI surface impact.
- **Coverage artifacts to update**: none.
- **No-impact rationale**: backend-only internal capture path; no reachable UI surface changed.
- **Navigation / Filament provider-panel handling**: no panel/provider change.
- **Screenshot or page-report need**: no.
## Product Surface Contract Plan
- **Product Surface Contract reference**: `docs/product/standards/product-surface-contract.md`.
- **No-legacy posture**: canonical v2 evidence path; no compatibility exception.
- **Page archetype and surface budget plan**: N/A - no rendered product surface changed.
- **Technical Annex and deep-link demotion plan**: no default product view exposes OperationRun, raw evidence IDs, source keys, payloads, fingerprints, or logs.
- **Canonical status vocabulary plan**: N/A - no product-facing status labels.
- **Product Surface exceptions**: none.
- **Browser verification plan**: `N/A - no rendered UI surface changed`; existing generic OperationRun surfaces are not customized by this spec.
- **Human Product Sanity plan**: N/A - no rendered product surface changed.
- **Visible complexity outcome target**: neutral for rendered UI.
- **Implementation report target**: `specs/415-generic-content-backed-capture/implementation-report.md`.
## Filament / Livewire / Deployment Posture
- **Livewire v4 compliance**: Livewire v4.1.4 confirmed; no Livewire UI code planned.
- **Panel provider registration location**: no panel change; Laravel 12 panel providers remain in `apps/platform/bootstrap/providers.php`.
- **Global search posture**: no Filament Resource is added. If implementation accidentally adds a resource, stop and amend artifacts before continuing; resource must disable global search or provide safe View/Edit page and `$recordTitleAttribute`.
- **Destructive/high-impact action posture**: no rendered action. Capture start is high-impact remote/provider work and must authorize server-side, create/reuse OperationRun, audit safely, queue work, and avoid raw payloads.
- **Asset strategy**: no assets, no `filament:assets` deployment requirement from this spec.
- **Testing plan**: unit tests for resolver/normalizer/hash/redaction/outcomes; feature/pgsql tests for persistence, RBAC, provider scope, fake Graph, OperationRun, and no-legacy guards.
- **Deployment impact**: database migrations and queue workers expected; no env vars, scheduler, storage volume, routes, assets, or reverse proxy changes expected unless implementation discovers a repo-real need and amends artifacts.
## Shared Pattern & System Fit
- **Cross-cutting feature marker**: yes.
- **Systems touched**: Coverage v2 kernel, OperationRun, Graph contracts/client, capability registry, audit recorder, queue/job infrastructure.
- **Shared abstractions reused**: `ResourceTypeRegistry`, `SupportedScopeResolver`, `ClaimGuard`, `GraphClientInterface`, `GraphContractRegistry`, `OperationRunService`, `OperationSummaryKeys`, `Capabilities`, `RoleCapabilityMap`.
- **New abstraction introduced? why?**: bounded capture-specific services are introduced because registry-only Coverage v2 cannot persist content evidence, normalize/hash payloads, or produce safe per-type outcomes.
- **Why the existing abstraction was sufficient or insufficient**: existing kernel services classify resource types and claims but do not fetch, normalize, or persist payload evidence. Existing OperationRun and Graph seams are sufficient and must be reused.
- **Bounded deviation / spread control**: no new provider framework, no UI framework, no generic identity engine, no compare/render/restore pipeline.
## OperationRun UX Impact
- **Touches OperationRun start/completion/link UX?**: backend lifecycle yes; rendered UX no.
- **Central contract reused**: OperationRun lifecycle via `OperationRunService`; if a start surface appears, central OperationRun Start UX Contract is mandatory and artifacts must be amended first.
- **Delegated UX behaviors**: no local toast/link/event planned; terminal notifications remain lifecycle-owned through the existing generic OperationRun path.
- **Surface-owned behavior kept local**: initiation inputs only in an internal service/action.
- **Queued DB-notification policy**: no queued DB notifications.
- **Terminal notification path**: central lifecycle mechanism.
- **Exception path**: none.
## Provider Boundary & Portability Fit
- **Shared provider/platform boundary touched?**: yes.
- **Provider-owned seams**: TCM source class, Graph v1 fallback source class, Graph beta experimental source class, provider-specific Graph contract metadata, permission/source context.
- **Platform-core seams**: concrete Coverage v2 resource/evidence ownership, capture outcome vocabulary, OperationRun execution truth, evidence payload truth.
- **Neutral platform terms / contracts preserved**: provider, source contract, operation, evidence, resource, managed environment, capture outcome.
- **Retained provider-specific semantics and why**: Spec 414 source classes remain because this is a TCM-first Microsoft coverage path.
- **Bounded extraction or follow-up path**: Spec 416 Canonical Identity Engine after payload-backed evidence exists.
## Constitution Check
- Inventory-first / snapshots-second: PASS. This spec creates explicit evidence capture, not UI claim truth or snapshot replacement.
- Read/write separation: PASS with controls. Capture writes internal evidence and must be authorized, audited, queued, and OperationRun-backed.
- Single Graph contract path: PASS. Graph calls must go through `GraphClientInterface` and repo contract registry.
- Deterministic capabilities: PASS. Authorization uses canonical capability constants; default planned capability is `EVIDENCE_MANAGE`.
- RBAC-UX: PASS. Non-member/not entitled is 404; member missing capability is 403; server-side Gate/Policy required.
- Workspace isolation: PASS. `workspace_id` + `managed_environment_id` are required for environment-owned records.
- Tenant isolation: PASS in current terminology. No `tenant_id` ownership column is introduced.
- Provider boundary: PASS. Provider-native IDs stay metadata; provider connection must be same workspace/environment.
- OperationRun observability: PASS. Remote/provider capture uses OperationRun and queue.
- OperationRun lifecycle: PASS. Transitions must use `OperationRunService`.
- Summary counts: PASS with constraint. Summary counts use only keys from `OperationSummaryKeys::all()`; defaulting to the existing keys avoids introducing a new summary-key family.
- Test governance: PASS. Unit/feature/pgsql lanes are named; browser/heavy-governance are N/A.
- Proportionality: PASS with justified complexity. New persistence and services are needed for audit/evidence/source-of-truth correctness.
- No premature abstraction: PASS with bounded exception. Capture services are specific to current Coverage v2 evidence needs and initial resource types.
- Persisted truth: PASS. Evidence rows are durable append-only proof with independent lifecycle.
- Behavioral state: PASS. Capture outcomes affect persistence, run summaries, retry/failure handling, and reviewer gates.
- UI semantics: PASS. No UI framework or rendered status taxonomy.
- Product Surface Contract: PASS. No rendered UI surface changed.
- LEAN-001: PASS. No aliases, dual writes, fallback readers, compatibility shims, or legacy fixtures.
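
The RBAC-UX item in the check above (404 for non-members, 403 for members without the capability) can be illustrated with a small decision function. This is a sketch only: `captureAuthorizationStatus`, its boolean input, and the `'evidence.manage'` string are assumptions standing in for the repo's Gate/Policy layer and the real value of `Capabilities::EVIDENCE_MANAGE`.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch of the 404/403 split. The function name and inputs
 * are assumptions, not the repo's Gate/Policy API.
 *
 * @param list<string> $grantedCapabilities
 */
function captureAuthorizationStatus(
    bool $isEntitledMember,
    array $grantedCapabilities,
    string $requiredCapability,
): int {
    if (! $isEntitledMember) {
        // Non-members must not learn that the environment exists.
        return 404;
    }

    if (! in_array($requiredCapability, $grantedCapabilities, true)) {
        // Known member, but missing the capture capability.
        return 403;
    }

    return 200;
}
```

The membership check runs first on purpose: if the capability check came first, a non-member could distinguish existing from non-existing environments by the status code.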
## Test Governance Check
- **Test purpose / classification by changed surface**: Unit for pure capture helpers; Feature/PostgreSQL for persistence, RBAC, OperationRun, Graph fake, provider-scope constraints, no-legacy guards.
- **Affected validation lanes**: fast-feedback, confidence, pgsql.
- **Why this lane mix is the narrowest sufficient proof**: no rendered UI exists; backend behavior and persistence are the risk.
- **Narrowest proving command(s)**:
  - `cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Unit/Support/TenantConfiguration`
  - `cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/TenantConfiguration`
  - `cd apps/platform && ./vendor/bin/sail php vendor/bin/pest -c phpunit.pgsql.xml tests/Feature/TenantConfiguration` when PostgreSQL-only constraints/indexes are added
- **Fixture / helper / factory / seed / context cost risks**: managed-environment/provider-connection/Graph fake setup must remain explicit and local.
- **Expensive defaults or shared helper growth introduced?**: no; any helper must be opt-in.
- **Heavy-family additions, promotions, or visibility changes**: none.
- **Surface-class relief / special coverage rule**: N/A no rendered UI.
- **Closing validation and reviewer handoff**: verify no UI activation, no real Graph calls, no `tenant_id`, no old v1 vocabulary, same-scope provider connection, sanitized contexts, OperationRunService lifecycle.
- **Budget / baseline / trend follow-up**: none expected.
- **Review-stop questions**: lane fit, hidden Graph call risk, fixture breadth, OperationRun summary keys, provider-scope constraints.
- **Escalation path**: document-in-feature for local helper cost; follow-up-spec only for identity engine or UI activation.
- **Active feature PR close-out entry**: Guardrail / no rendered UI.
- **Why no dedicated follow-up spec is needed**: the runtime capture foundation is the current feature; identity/UI/cutover are already deferred follow-ups.
## Project Structure
### Documentation (this feature)
```text
specs/415-generic-content-backed-capture/
├── spec.md
├── plan.md
├── tasks.md
├── checklists/
│   └── requirements.md
└── implementation-report.md
```
### Source Code (likely affected in later implementation)
```text
apps/platform/app/Models/
├── TenantConfigurationResource.php
└── TenantConfigurationResourceEvidence.php
apps/platform/app/Services/TenantConfiguration/
├── CoverageSourceContractResolver.php
├── GenericPayloadNormalizer.php
├── CoverageResourceUpserter.php
├── CoverageEvidenceWriter.php
├── GenericContentEvidenceCaptureService.php
├── StartTenantConfigurationCapture.php
└── CoverageCaptureOutcomeSummarizer.php
apps/platform/app/Support/TenantConfiguration/
└── CaptureOutcome.php
apps/platform/app/Jobs/TenantConfiguration/
└── CaptureTenantConfigurationEvidenceJob.php
apps/platform/database/migrations/
└── *_create_tenant_configuration_capture_tables.php
apps/platform/database/factories/
├── TenantConfigurationResourceFactory.php
└── TenantConfigurationResourceEvidenceFactory.php
apps/platform/tests/Unit/Support/TenantConfiguration/
└── Spec415*Test.php
apps/platform/tests/Feature/TenantConfiguration/
└── Spec415*Test.php
```
**Structure Decision**: Use the existing `apps/platform` Laravel monolith. Keep capture domain code under the existing `Services/TenantConfiguration` and `Support/TenantConfiguration` namespaces. Add jobs under a tenant-configuration job namespace only if the repo does not already have a closer convention.
## Complexity Tracking
| Violation | Why Needed | Simpler Alternative Rejected Because |
|---|---|---|
| New resource/evidence tables | Durable append-only v2 evidence needs independent lifecycle and auditability | v1 snapshots or metadata-only rows would create hidden dual truth |
| New capture services | Fetch/normalize/hash/redact/persist responsibilities must stay out of Filament, models, and jobs | Putting workflow in a job or model would make authorization/audit/Graph seams harder to test |
| New capture outcome family | Implementation must distinguish missing contract, permission, beta, unsupported, captured, and failed because each has different persistence/run behavior | Reusing old gap taxonomy is explicitly forbidden and misleading |
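
One possible shape for the outcome family in the last row is a backed enum that carries the persistence behavior with each case. The sketch below is an assumption-heavy illustration: the name `CaptureOutcomeSketch`, the string values, and the rule that only a captured outcome writes evidence are hypothetical and not taken from the planned `CaptureOutcome.php`.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical outcome vocabulary; case names and values are assumptions.
 */
enum CaptureOutcomeSketch: string
{
    case Captured = 'captured';
    case MissingContract = 'missing_contract';
    case PermissionDenied = 'permission_denied';
    case BetaBlocked = 'beta_blocked';
    case Unsupported = 'unsupported';
    case Failed = 'failed';

    /** Assumed rule: only a successful capture persists an evidence row. */
    public function writesEvidence(): bool
    {
        return $this === self::Captured;
    }

    /** Blocked outcomes are decided before any remote call is attempted. */
    public function isBlocked(): bool
    {
        return match ($this) {
            self::MissingContract,
            self::PermissionDenied,
            self::BetaBlocked,
            self::Unsupported => true,
            self::Captured, self::Failed => false,
        };
    }
}
```

Because the `match` is exhaustive, adding a new case without deciding its blocked/unblocked behavior raises an `UnhandledMatchError` at runtime instead of defaulting silently.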
## Proportionality Review
- **Current operator problem**: future Coverage v2 operator/customer claims require concrete evidence proof; otherwise the product can overclaim coverage based on registry truth only.
- **Existing structure is insufficient because**: Spec 414 has registry/scope/claim guard only, while v1 runtime evidence cannot safely stand in for v2 proof.
- **Narrowest correct implementation**: initial 414 resource types, contract-driven eligibility, append-only evidence, generic normalization/hash, redaction, OperationRun-backed async execution, no UI.
- **Ownership cost created**: migrations/models/services/job/tests and ongoing care around redaction, queue behavior, and provider contract mapping.
- **Alternative intentionally rejected**: v1 snapshot promotion, metadata-only capture, UI activation, broad TCM catalog import, semantic compare/render/restore.
- **Release truth**: current-release foundation after completed Spec 414.
## Implementation Phases
### Phase 0 - Preflight
- Confirm branch, HEAD, clean/dirty state.
- Confirm Spec 414 implementation report and completed tasks remain read-only context.
- Confirm no existing `tenant_configuration_resources` / `tenant_configuration_resource_evidence` equivalent exists.
- Confirm graph contract entries and 414 registry metadata for initial resource types.
### Phase 1 - Tests First
- Add unit tests for resolver, normalizer, hash, redaction, and capture outcome behavior.
- Add feature/PostgreSQL tests for persistence, JSONB, same-scope provider connection, RBAC, OperationRun, fake Graph, and no-legacy/no-UI guards.
### Phase 2 - Persistence
- Add concrete resource and evidence tables/models/factories if missing.
- Enforce `workspace_id`, `managed_environment_id`, same-scope `provider_connection_id`, no `tenant_id`, append-only evidence, and targeted indexes.
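
The same-scope rule for `provider_connection_id` might look like the following application-level guard. It is a sketch under assumptions: `assertSameScope` and the plain-array inputs are hypothetical, and in the real schema the rule would more likely also be backed by database constraints.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical guard: the provider connection must belong to the same
 * workspace and managed environment as the evidence being written.
 *
 * @param array<string, mixed> $evidenceScope
 * @param array<string, mixed> $providerConnection
 */
function assertSameScope(array $evidenceScope, array $providerConnection): void
{
    if (array_key_exists('tenant_id', $evidenceScope)) {
        // The plan forbids a tenant_id ownership column on these records.
        throw new DomainException('tenant_id must not be used as an ownership key.');
    }

    foreach (['workspace_id', 'managed_environment_id'] as $key) {
        if (! isset($evidenceScope[$key], $providerConnection[$key])
            || $evidenceScope[$key] !== $providerConnection[$key]) {
            throw new DomainException("Provider connection is outside the evidence scope ({$key}).");
        }
    }
}
```

A matching pair passes silently; a connection from another environment, or a scope array missing either key, raises a `DomainException` before anything is persisted.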
### Phase 3 - Source Resolution And Normalization
- Resolve source contracts from Coverage v2 registry and repo Graph contract registry.
- Block missing/beta/unsupported sources safely.
- Normalize payloads minimally, hash deterministically, and redact permission/source context.
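
The normalize/hash/redact steps in this phase could be sketched as below. The approach shown (recursively sort object keys, keep list order, hash the JSON with SHA-256, mask an allow-listed set of sensitive keys) is one common technique and an assumption here; the function names `canonicalize`, `payloadHash`, and `redact` and the `[redacted]` marker are hypothetical, not the planned `GenericPayloadNormalizer` API.

```php
<?php

declare(strict_types=1);

/** Recursively sort associative keys; list order is kept because it may be meaningful. */
function canonicalize(mixed $value): mixed
{
    if (! is_array($value)) {
        return $value;
    }

    if (! array_is_list($value)) {
        ksort($value, SORT_STRING);
    }

    // array_map with a single array preserves keys.
    return array_map('canonicalize', $value);
}

/** Deterministic hash: equal content yields an equal hash, regardless of key order. */
function payloadHash(array $payload): string
{
    $json = json_encode(
        canonicalize($payload),
        JSON_THROW_ON_ERROR | JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE,
    );

    return hash('sha256', $json);
}

/**
 * Mask values whose key (case-insensitive) is in the sensitive list.
 *
 * @param list<string> $sensitiveKeys lower-case key names
 */
function redact(array $context, array $sensitiveKeys): array
{
    foreach ($context as $key => $value) {
        if (is_string($key) && in_array(strtolower($key), $sensitiveKeys, true)) {
            $context[$key] = '[redacted]';
        } elseif (is_array($value)) {
            $context[$key] = redact($value, $sensitiveKeys);
        }
    }

    return $context;
}
```

Two payloads that differ only in key order produce the same hash, while reordering a list changes it. Whether list order should count as a content change for a given resource type is a design decision this sketch does not settle.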
### Phase 4 - OperationRun Start And Queue
- Add `tenant_configuration.capture` OperationRun type/catalog entry if required by repo conventions.
- Implement authorized internal start service/action.
- Queue remote capture job and keep lifecycle transitions in `OperationRunService`.
- Use existing `OperationSummaryKeys` unless a tested canonical extension is required.
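
The summary-key constraint in this phase can be illustrated by folding per-type outcomes into an allow-listed set of counters. Everything specific here is an assumption: `summarizeOutcomes`, the outcome-to-key mapping, and the key names `succeeded`, `failed`, and `skipped` are placeholders, since the actual contents of `OperationSummaryKeys::all()` are not listed in this plan.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch: count outcomes using only canonical summary keys.
 *
 * @param list<string> $outcomes    per-type outcome strings
 * @param list<string> $allowedKeys stand-in for OperationSummaryKeys::all()
 * @return array<string, int>
 */
function summarizeOutcomes(array $outcomes, array $allowedKeys): array
{
    // Assumed mapping; anything that is neither captured nor failed counts as skipped.
    $map = ['captured' => 'succeeded', 'failed' => 'failed'];

    $summary = array_fill_keys($allowedKeys, 0);

    foreach ($outcomes as $outcome) {
        $key = $map[$outcome] ?? 'skipped';

        if (! array_key_exists($key, $summary)) {
            // Refuse to invent a summary key outside the canonical list.
            throw new InvalidArgumentException("Summary key '{$key}' is not canonical.");
        }

        $summary[$key]++;
    }

    return $summary;
}
```

Failing loudly on a non-canonical key keeps a capture-specific counter such as `captured` or `blocked` from leaking into run summaries without the tested extension this plan requires.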
### Phase 5 - Capture And Evidence Write
- Fetch via `GraphClientInterface` fakeable calls only where explicit contracts exist.
- Upsert concrete resource rows by deterministic identity.
- Write append-only evidence rows and per-type outcomes.
- Audit start/completion/failure safely through the existing `AuditRecorder` / `AuditEventBuilder` path with stable `tenant_configuration.capture.started`, `tenant_configuration.capture.completed`, and `tenant_configuration.capture.failed` action IDs.
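
The safe-audit step could be sketched as a builder that maps a lifecycle phase to one of the three stable action IDs and keeps only allow-listed scalar context. The action IDs come from this plan; the function `captureAuditEvent`, the phase names, and the safe-key list are hypothetical and do not describe the existing `AuditRecorder` / `AuditEventBuilder` API.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch: build an audit event without raw payload leakage.
 *
 * @param array<string, mixed> $context
 * @return array{action: string, context: array<string, scalar>}
 */
function captureAuditEvent(string $phase, array $context): array
{
    $actions = [
        'started' => 'tenant_configuration.capture.started',
        'completed' => 'tenant_configuration.capture.completed',
        'failed' => 'tenant_configuration.capture.failed',
    ];

    if (! isset($actions[$phase])) {
        throw new InvalidArgumentException("Unknown capture phase '{$phase}'.");
    }

    // Assumed allow-list: anything else (payloads, tokens) is dropped.
    $safeKeys = ['operation_run_id', 'resource_type', 'outcome'];

    $safe = array_filter(
        array_intersect_key($context, array_flip($safeKeys)),
        static fn (mixed $value): bool => is_scalar($value),
    );

    return ['action' => $actions[$phase], 'context' => $safe];
}
```

An allow-list is used rather than a deny-list so that a new, unreviewed context field is excluded by default.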
### Phase 6 - Report And Validation
- Complete implementation report with eligibility matrix and safety proof.
- Run focused Pint/tests/pgsql lane as needed and `git diff --check`.
- Record no-browser, no-assets, no-global-search, no-Filament-provider-change, and deployment impact.
## Stop Conditions
- Spec 414 kernel is missing or not accepted.
- Implementation needs customer/operator UI activation.
- Implementation needs v1 adapter, dual-write, fallback reader, old snapshot promotion, or old gap taxonomy.
- Implementation needs hardcoded Graph endpoints or direct HTTP/SDK calls outside `GraphClientInterface`.
- Implementation needs beta capture opt-in.
- Implementation needs broad identity engine, semantic compare, rendering, restore/apply, or full catalog import.
- Implementation changes rendered UI files, routes, navigation, reports, downloads, or evidence/review surfaces without amending spec/plan/tasks first.