TenantAtlas/specs/243-product-usage-adoption-telemetry/plan.md
ahmido 6053d87b99
feat: implement product usage adoption telemetry (#281)
## Summary
- implement spec 243 product usage adoption telemetry end-to-end
- add bounded product usage event capture, aggregation, retention pruning, and system dashboard KPIs
- add unit and feature coverage for telemetry capture, authorization, retention, privacy, and dashboard window behavior

## Validation
- ran focused Pest test suites for telemetry and system dashboard behavior
- ran Laravel Pint formatting
- verified the system dashboard telemetry widget in the integrated browser

## Notes
- branch: `243-product-usage-adoption-telemetry`
- target: `dev`

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #281
2026-04-26 20:52:38 +00:00

# Implementation Plan: Product Usage & Adoption Telemetry
**Branch**: `243-product-usage-adoption-telemetry` | **Date**: 2026-04-26 | **Spec**: `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/243-product-usage-adoption-telemetry/spec.md`
**Input**: Feature specification from `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/243-product-usage-adoption-telemetry/spec.md`
**Note**: This template is filled in by the `/speckit.plan` command. See `.specify/scripts/` for helper scripts.
## Summary
- Add one tenant-owned telemetry ledger for a bounded set of user-initiated product milestones only: onboarding checkpoint completion, support diagnostics opened, tenant-bound operation started, stored report created, and review-pack generation requested.
- Reuse existing trustworthy source seams instead of inventing passive page tracking or scraping domain tables later: `OnboardingLifecycleService`, support-diagnostics actions, `OperationRunService`, `EntraAdminRolesReportService`, `PermissionPostureFindingGenerator`, and `ReviewPackService` become the only v1 write paths.
- Surface only one read-only adoption summary on the existing system dashboard through a native widget that follows the current `SystemConsoleWindow` filter semantics, renders five visible event families in v1, and includes active-workspace participation for the selected window. No raw event browser, no customer-facing analytics, and no AuditLog or OperationRun overloading are allowed.
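The five milestones listed above could be pinned down as a closed, code-owned catalog. The following is a minimal sketch only: the enum name, case names, string values, and feature-area labels are illustrative assumptions, not the identifiers used by the actual `ProductUsageEventCatalog`.

```php
<?php
declare(strict_types=1);

// Hypothetical catalog sketch. Every name and string value below is an
// assumption made for illustration; the plan only fixes the five milestones.
enum ProductUsageEventName: string
{
    case OnboardingCheckpointCompleted = 'onboarding.checkpoint_completed';
    case SupportDiagnosticsOpened = 'support_diagnostics.opened';
    case OperationStarted = 'operation.started';
    case StoredReportCreated = 'stored_report.created';
    case ReviewPackRequested = 'review_pack.generation_requested';

    /** Feature-area label used to group an event on the dashboard (assumed labels). */
    public function featureArea(): string
    {
        return match ($this) {
            self::OnboardingCheckpointCompleted => 'onboarding',
            self::SupportDiagnosticsOpened => 'support',
            self::OperationStarted => 'operations',
            self::StoredReportCreated => 'reports',
            self::ReviewPackRequested => 'review_packs',
        };
    }

    /** Returns true only for names declared in the catalog. */
    public static function isKnown(string $name): bool
    {
        return self::tryFrom($name) !== null;
    }
}
```

Backing the catalog with an enum means an undeclared name such as a passive page view cannot be represented at all, which matches the intent that new events require a spec change rather than silent growth.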
## Technical Context
**Language/Version**: PHP 8.4 (Laravel 12)
**Primary Dependencies**: Laravel 12 + Filament v5 + Livewire v4 + Pest; existing `OnboardingLifecycleService`, `OperationRunService`, `SupportDiagnosticBundleBuilder`, `ReviewPackService`, `EntraAdminRolesReportService`, `PermissionPostureFindingGenerator`, system dashboard widgets
**Storage**: PostgreSQL via one new tenant-owned `product_usage_events` table; source truth stays on existing onboarding, operation, report, and review-pack tables
**Testing**: Pest unit + feature tests only
**Validation Lanes**: fast-feedback, confidence
**Target Platform**: Sail-backed Laravel admin and system panels under `/admin` and `/system`
**Project Type**: web
**Performance Goals**: one cheap insert per eligible source milestone, no passive page-view chatter, and one indexed aggregate query for the system dashboard time window without scanning arbitrary logs
**Constraints**: tenant-bound rows only, no pre-tenant onboarding events, no initiator-null operation telemetry, no raw payloads or free text in metadata, no third-party analytics, no raw event browser, no customer-facing analytics, and no new panel or provider registration changes
**Scale/Scope**: 5 code-owned event names, 1 dashboard widget, 1 recorder, 1 summary query, 1 prune command, 1 config-backed 90-day retention rule, and focused source-seam instrumentation only
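The write-side constraints above (tenant-bound rows only, no initiator-null capture, no free text in metadata) can be illustrated with a small in-memory recorder. This is a sketch under stated assumptions: the class names, the allowed metadata keys, the identifier pattern, and the choice to skip silently versus throw are all hypothetical and not taken from `ProductTelemetryRecorder`.

```php
<?php
declare(strict_types=1);

// Hypothetical exception type for metadata that violates the safe schema.
final class UnsafeTelemetryException extends InvalidArgumentException
{
}

// Hypothetical in-memory stand-in for the recorder; a real recorder would
// insert into product_usage_events instead of appending to an array.
final class InMemoryTelemetryRecorder
{
    /** Assumed allow-list of metadata keys. */
    private const ALLOWED_METADATA_KEYS = ['operation_type', 'report_type', 'checkpoint'];

    /** @var list<array<string, mixed>> */
    public array $rows = [];

    /** @param array<string, mixed> $metadata */
    public function record(
        string $eventName,
        ?int $workspaceId,
        ?int $tenantId,
        ?int $initiatorUserId,
        array $metadata = [],
    ): bool {
        // Pre-tenant and system-initiated milestones are out of scope, so they
        // are skipped rather than stored (skipping instead of throwing is an assumption).
        if ($workspaceId === null || $tenantId === null || $initiatorUserId === null) {
            return false;
        }

        foreach ($metadata as $key => $value) {
            if (!in_array($key, self::ALLOWED_METADATA_KEYS, true)) {
                throw new UnsafeTelemetryException("Metadata key not allowed: {$key}");
            }
            // Only short canonical identifiers pass; free text and payloads do not.
            if (!is_string($value) || preg_match('/^[a-z0-9_.]{1,64}$/', $value) !== 1) {
                throw new UnsafeTelemetryException("Metadata value is not a canonical identifier: {$key}");
            }
        }

        $this->rows[] = [
            'event_name' => $eventName,
            'workspace_id' => $workspaceId,
            'tenant_id' => $tenantId,
            'user_id' => $initiatorUserId,
            'metadata' => $metadata,
        ];

        return true;
    }
}
```

Under these assumptions, `record('operation.started', 1, 10, 7, ['operation_type' => 'inventory.sync'])` stores one row, a call with a null tenant or initiator stores nothing, and a metadata value containing spaces is rejected.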
## UI / Surface Guardrail Plan
- **Guardrail scope**: changed surfaces
- **Native vs custom classification summary**: native Filament + shared stats widget
- **Shared-family relevance**: dashboard signals/cards
- **State layers in scope**: page, widget, URL query
- **Handling modes by drift class or surface**: review-mandatory
- **Repository-signal treatment**: review-mandatory
- **Special surface test profiles**: standard-native-filament
- **Required tests or manual smoke**: functional-core, state-contract
- **Exception path and spread control**: none
- **Active feature PR close-out entry**: Guardrail
## Shared Pattern & System Fit
- **Cross-cutting feature marker**: yes
- **Systems touched**: `App\Filament\System\Pages\Dashboard`, `App\Filament\System\Widgets\ControlTowerKpis`, `App\Services\Onboarding\OnboardingLifecycleService`, `App\Support\SupportDiagnostics\SupportDiagnosticBundleBuilder`, `App\Services\OperationRunService`, `App\Services\EntraAdminRoles\EntraAdminRolesReportService`, `App\Services\PermissionPosture\PermissionPostureFindingGenerator`, `App\Services\ReviewPackService`, and the support-diagnostics page actions on `TenantDashboard` and `TenantlessOperationRunViewer`
- **Shared abstractions reused**: existing system dashboard widget conventions, existing source-owned service/action seams, and current workspace/tenant context resolution before writes
- **New abstraction introduced? why?**: one bounded `ProductTelemetryRecorder`, one code-owned event catalog, and one summary query are justified because telemetry semantics do not belong on the existing audit, operation, or user-preference models
- **Why the existing abstraction was sufficient or insufficient**: existing source seams know when a trustworthy milestone happened, but there is no shared telemetry contract or aggregate read path today
- **Bounded deviation / spread control**: no page-local counters, no direct writes from Blade or Livewire render hooks, and no domain-table-specific telemetry sidecar fields
## OperationRun UX Impact
- **Touches OperationRun start/completion/link UX?**: no
- **Central contract reused**: N/A
- **Delegated UX behaviors**: N/A
- **Surface-owned behavior kept local**: N/A
- **Queued DB-notification policy**: N/A
- **Terminal notification path**: N/A
- **Exception path**: none
## Provider Boundary & Portability Fit
- **Shared provider/platform boundary touched?**: yes
- **Provider-owned seams**: provider-backed operation types, report generation sources, support-diagnostic provider context
- **Platform-core seams**: telemetry event names, feature-area labels, safe metadata schema, system dashboard widget labels
- **Neutral platform terms / contracts preserved**: product telemetry, usage event, feature area, subject reference, active workspaces, recent signals
- **Retained provider-specific semantics and why**: stable canonical operation and report type identifiers may appear in safe metadata because they are already product-owned identifiers used across the repo
- **Bounded extraction or follow-up path**: no multi-provider telemetry abstraction beyond the bounded event catalog; later customer-health work reuses this shape rather than adding a parallel one
## Constitution Check
*GATE: Must pass before implementation begins. Re-check after design changes.*
- Inventory-first / snapshots-second: PASS - telemetry observes product usage only and does not become an external source of truth for tenant configuration, inventory, or backup state
- Read/write separation: PASS - telemetry writes are bounded product-observability writes triggered after existing source actions succeed; no tenant-changing behavior is added
- Graph contract path: PASS - the feature adds no new Graph calls
- RBAC-UX plane separation: PASS - writes originate in existing admin-plane flows after authorization; reads remain system-plane only via the existing dashboard gate
- Workspace isolation / tenant isolation: PASS - telemetry rows are tenant-owned with `workspace_id` and `tenant_id` required; no cross-tenant raw event viewer is introduced
- Run observability / Ops-UX: PASS - `OperationRun` remains execution truth only; telemetry observes a successful tenant-bound user start without altering run UX or lifecycle
- Shared pattern reuse / `XCUT-001`: PASS - widget reuse and source-seam reuse are explicit; no page-local or model-local side ledgers are planned
- Provider boundary / `PROV-001`: PASS - telemetry stores platform-neutral event names and only stable canonical type identifiers, not provider payload or provider transport truth
- Proportionality / `PROP-001` and `ABSTR-001`: PASS - the new structure is justified by a concrete operator need and kept to one bounded ledger, one recorder, one summary query, and one widget
- Persisted truth / `PERSIST-001`: PASS - telemetry rows represent independent product-observability truth with their own retention lifecycle and later reuse by Customer Health Score
- Behavioral state / `STATE-001`: PASS - the event catalog changes later operator visibility and product-health workflows; it is not presentation-only decoration
- Filament-native UI / `UI-FIL-001`: PASS - visibility stays on a native system widget only
- Global search rule: N/A - no new global-searchable resource is introduced
- Panel/provider registration: PASS - no panel or provider registration changes are planned; Livewire remains v4-compatible and provider registration stays in `bootstrap/providers.php`
- Test governance / `TEST-GOV-001`: PASS - proof stays in focused unit + feature coverage only
## Test Governance Check
- **Test purpose / classification by changed surface**: Unit for event-catalog legality, safe metadata, and summary-query behavior; Feature for source capture from real service/action seams plus dashboard access and visibility
- **Affected validation lanes**: fast-feedback, confidence
- **Why this lane mix is the narrowest sufficient proof**: the feature is server-driven and data-focused; unit tests prove the bounded contract, while feature tests prove the real write and read seams without browser duplication
- **Narrowest proving command(s)**:
  - `cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Unit/Support/ProductTelemetry/ProductUsageEventCatalogTest.php tests/Unit/Support/ProductTelemetry/ProductTelemetryRecorderTest.php tests/Unit/Support/ProductTelemetry/ProductTelemetrySafeMetadataTest.php tests/Unit/Support/ProductTelemetry/ProductTelemetrySummaryQueryTest.php`
  - `cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Onboarding/ProductTelemetryOnboardingCaptureTest.php tests/Feature/SupportDiagnostics/ProductTelemetrySupportDiagnosticsCaptureTest.php tests/Feature/Operations/ProductTelemetryOperationStartCaptureTest.php tests/Feature/Reports/ProductTelemetryReportCaptureTest.php`
  - `cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/System/ProductTelemetry/ProductTelemetryDashboardWidgetTest.php tests/Feature/System/ProductTelemetry/ProductTelemetryAuthorizationTest.php tests/Feature/System/ProductTelemetry/ProductTelemetryRetentionTest.php tests/Feature/System/ProductTelemetry/NoAdHocTelemetryBypassTest.php`
- **Fixture / helper / factory / seed / context cost risks**: reuse existing workspace, tenant, user, onboarding session, operation-run, stored-report, and review-pack fixtures; keep any telemetry helper local to this family only
- **Expensive defaults or shared helper growth introduced?**: no
- **Heavy-family additions, promotions, or visibility changes**: none
- **Surface-class relief / special coverage rule**: standard-native-filament relief is sufficient for the system widget; no browser harness is required
- **Closing validation and reviewer handoff**: reviewers should verify tenant-bound rows only, safe metadata only, no AuditLog or OperationRun overload, no passive page-view events, no initiator-null capture, and no raw event browser
- **Budget / baseline / trend follow-up**: none expected beyond ordinary feature-local upkeep
- **Review-stop questions**: did the implementation add passive page views, a raw event list, or a second telemetry store; did any metadata accept free text or raw payloads; did any read surface leave the system plane?
- **Escalation path**: `reject-or-split` if implementation widens into broad analytics or customer-facing dashboards; `document-in-feature` for small source-seam additions that stay bounded to the first-slice catalog
- **Active feature PR close-out entry**: Guardrail
## Project Structure
### Documentation (this feature)
```text
specs/243-product-usage-adoption-telemetry/
├── checklists/
│   └── requirements.md
├── spec.md
├── plan.md
└── tasks.md
```
### Source Code (repository root)
```text
apps/platform/
├── app/
│   ├── Filament/Pages/Operations/TenantlessOperationRunViewer.php
│   ├── Filament/Pages/TenantDashboard.php
│   ├── Filament/System/Pages/Dashboard.php
│   ├── Filament/System/Widgets/
│   │   └── ProductTelemetryKpis.php
│   ├── Models/
│   │   └── ProductUsageEvent.php
│   ├── Support/ProductTelemetry/
│   │   ├── ProductTelemetryRecorder.php
│   │   ├── ProductTelemetrySummaryQuery.php
│   │   └── ProductUsageEventCatalog.php
│   ├── Services/Onboarding/OnboardingLifecycleService.php
│   ├── Services/EntraAdminRoles/EntraAdminRolesReportService.php
│   ├── Services/PermissionPosture/PermissionPostureFindingGenerator.php
│   ├── Services/ReviewPackService.php
│   ├── Services/OperationRunService.php
│   ├── Support/SupportDiagnostics/SupportDiagnosticBundleBuilder.php
│   └── Console/Commands/
│       └── PruneProductUsageEventsCommand.php
├── config/
│   └── tenantpilot.php
├── database/
│   ├── factories/
│   │   └── ProductUsageEventFactory.php
│   └── migrations/
│       └── *_create_product_usage_events_table.php
├── routes/
│   └── console.php
└── tests/
    ├── Unit/Support/ProductTelemetry/
    │   ├── ProductUsageEventCatalogTest.php
    │   ├── ProductTelemetryRecorderTest.php
    │   ├── ProductTelemetrySafeMetadataTest.php
    │   └── ProductTelemetrySummaryQueryTest.php
    └── Feature/
        ├── Onboarding/ProductTelemetryOnboardingCaptureTest.php
        ├── Operations/ProductTelemetryOperationStartCaptureTest.php
        ├── Reports/ProductTelemetryReportCaptureTest.php
        ├── SupportDiagnostics/ProductTelemetrySupportDiagnosticsCaptureTest.php
        └── System/ProductTelemetry/
            ├── ProductTelemetryAuthorizationTest.php
            ├── ProductTelemetryDashboardWidgetTest.php
            ├── ProductTelemetryRetentionTest.php
            └── NoAdHocTelemetryBypassTest.php
```
**Structure Decision**: Single Laravel web application. The feature adds one bounded telemetry support namespace and one system widget while reusing existing domain services and support-diagnostics page actions as source seams.
## Complexity Tracking
No constitution violations are required. The only new persisted truth is the explicitly justified tenant-owned telemetry ledger; the only new abstractions are its bounded recorder, event catalog, and summary query.
## Proportionality Review
- **Current operator problem**: understanding product adoption and usage still requires anecdotal inference or log inspection
- **Existing structure is insufficient because**: audit, operation, report, review-pack, and tenant-preference models each describe different truths and cannot safely stand in for adoption telemetry
- **Narrowest correct implementation**: one tenant-owned event table, one bounded event catalog, one recorder, one summary query, and one aggregate system widget
- **Ownership cost created**: migration, model, recorder, query, prune command, widget, config key, scheduler entry, and focused tests
- **Alternatives intentionally rejected**: AuditLog piggyback, OperationRun-context piggyback, `UserTenantPreference` counters, passive page-view tracking, third-party analytics
- **Release truth**: current-release truth
## Rollout & Risk Controls
- Start with five code-owned event names only. Adding more events requires revisiting the spec scope, not silent catalog growth.
- Keep the first slice tenant-bound and user-initiated only. Pre-tenant onboarding and system-initiated signals are explicit non-goals.
- Keep the read surface aggregate-only on `/system`. A raw event list or customer-facing reporting requires a later spec.
- Use a config-backed 90-day retention window via `tenantpilot.product_usage_event_retention_days` and schedule `tenantpilot:product-usage:prune` daily in `apps/platform/routes/console.php` so telemetry does not become an unbounded side history.
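The retention rule above comes down to a cutoff computation that the prune command would apply each day. The sketch below is framework-free and hypothetical: the function names and the `occurred_at` field are assumptions, and a real command would read the day count from `tenantpilot.product_usage_event_retention_days` and delete rows in the database rather than filter an array.

```php
<?php
declare(strict_types=1);

/** Returns the instant before which events fall outside the retention window. */
function retentionCutoff(DateTimeImmutable $now, int $retentionDays): DateTimeImmutable
{
    if ($retentionDays < 1) {
        throw new InvalidArgumentException('Retention must be at least one day.');
    }

    return $now->sub(new DateInterval('P' . $retentionDays . 'D'));
}

/**
 * Selects the ids of events that are strictly older than the cutoff.
 *
 * @param list<array{id: int, occurred_at: DateTimeImmutable}> $events
 * @return list<int>
 */
function prunableIds(array $events, DateTimeImmutable $now, int $retentionDays = 90): array
{
    $cutoff = retentionCutoff($now, $retentionDays);
    $ids = [];

    foreach ($events as $event) {
        // An event exactly at the cutoff is kept; only older rows are pruned.
        if ($event['occurred_at'] < $cutoff) {
            $ids[] = $event['id'];
        }
    }

    return $ids;
}
```

With `now` set to 2026-04-26 00:00 UTC and the 90-day default, the cutoff is 2026-01-26 00:00 UTC, so an event from 2026-01-25 23:59:59 is prunable while one from exactly 2026-01-26 00:00:00 is retained. Whether the boundary is inclusive or exclusive is a design choice this sketch assumes, not one the plan specifies.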
## Implementation Outline
- Add the `product_usage_events` table, model, factory, bounded catalog, recorder, summary query, config-backed retention rule, and prune command.
- Instrument the five declared source seams only: onboarding checkpoint completion, support diagnostics opened, tenant-bound user-started operation, stored-report creation, and review-pack generation request.
- Add a native system dashboard widget that reuses the existing `SystemConsoleWindow` selection and shows aggregate counts only.
- Add unit and feature tests that prove safe metadata, tenant-bound scope, source capture, system access, and retention.
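The aggregate the widget needs (per-event counts plus active-workspace participation for the selected window) can be sketched in plain PHP. This is an illustration only: the function name, the array shape, and the half-open window convention are assumptions, and the real `ProductTelemetrySummaryQuery` is expected to run as a single indexed database query rather than iterate in memory.

```php
<?php
declare(strict_types=1);

/**
 * Counts events per name and distinct workspaces inside [windowStart, windowEnd).
 *
 * @param list<array{event_name: string, workspace_id: int, occurred_at: DateTimeImmutable}> $events
 * @return array{counts: array<string, int>, active_workspaces: int}
 */
function summarizeUsage(
    array $events,
    DateTimeImmutable $windowStart,
    DateTimeImmutable $windowEnd,
): array {
    $counts = [];
    $workspaces = [];

    foreach ($events as $event) {
        // Half-open window: the start is included, the end is excluded.
        if ($event['occurred_at'] < $windowStart || $event['occurred_at'] >= $windowEnd) {
            continue;
        }

        $name = $event['event_name'];
        $counts[$name] = ($counts[$name] ?? 0) + 1;

        // Using the workspace id as a key deduplicates participation.
        $workspaces[$event['workspace_id']] = true;
    }

    // Sort by event name so the output order is stable for rendering and tests.
    ksort($counts);

    return ['counts' => $counts, 'active_workspaces' => count($workspaces)];
}
```

Because the result contains only counts and a distinct-workspace total, it exposes no individual rows, which keeps the read surface aggregate-only.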
## Constitution Check (Post-Design)
Re-check result: PASS. The plan stays bounded to one tenant-owned observability ledger, reuses existing source seams and native system widgets, keeps provider specifics out of the platform-core contract, leaves `OperationRun` UX unchanged, fixes retention to one explicit config-backed 90-day rule with a daily scheduler anchor in `apps/platform/routes/console.php`, and limits proof to unit + feature coverage.