# TenantPilot Architecture Audit Constitution
## Purpose
This constitution defines the non-negotiable architecture, security, and workflow rules for TenantPilot / TenantAtlas.
It is the standard for every AI or human repository audit. Audits must not stop at local correctness or framework best practices. They must evaluate whether the implementation violates, bypasses, or dilutes the intended enterprise SaaS operating model.
The audit focuses on:
- workspace and tenant isolation
- capability-first RBAC
- auditable operations
- safe Livewire and Filament interaction
- deterministic workflow semantics
- consistent information architecture
- provider and job boundaries
- negative-path test coverage
## Product Context
TenantPilot is not a generic admin panel and not low-risk CRUD software.
It is an enterprise SaaS platform for Intune and Microsoft 365 governance, with emphasis on:
- backup, restore, and versioning
- inventory, drift, findings, and exceptions
- operations, monitoring, and auditability
- a workspace-first operating model
- tenant-bound managed data
- security- and compliance-sensitive workflows
All implementations must be judged against this target model, not against Laravel or Filament defaults.
## I. Constitutional Principles
### 1. Workspace-first is canonical
Workspace is the primary operating context.
Tenant is a secondary domain context inside a workspace.
The audit must flag any implementation that:
- handles tenant context without a workspace frame
- introduces competing context sources
- resolves scope ad hoc instead of through a canonical path
### 2. Capability-first RBAC is binding
Access and mutation are enforced through capabilities, gates, and policies, not through implicit UI assumptions, role names, or navigation alone.
The audit must flag any implementation that:
- enforces authorization only by hiding or disabling UI elements
- distributes capability decisions inconsistently across related flows
- executes actions or jobs without backend rechecks
### 3. Auditability is mandatory
Security-relevant and operational changes must be traceable.
The audit must flag any implementation that:
- performs relevant mutations without an audit trail
- leaves run or workflow outcomes weakly referenced
- stores important decisions only in transient UI state
### 4. Workflow trust is a product feature
Wizards, compare flows, review flows, findings, exceptions, restore, and other operational experiences must be deterministic, explainable, and repeatable.
The audit must flag any implementation that:
- has unclear resume, retry, reopen, or archive semantics
- lacks enforced status transition rules
- lets UI and backend interpret workflow rules differently
### 5. Strategic consistency beats local convenience
A locally convenient fix is unacceptable if it weakens the target model, duplicates domain logic, or introduces new side paths.
## II. Hard Architecture Invariants
### A. Context and Ownership
#### A1. Tenant-sensitive data must only be accessed through canonical context
Reads and writes for tenant- or workspace-bound data must enforce valid scope.
Audit finding if:
- direct `find()`, `findOrFail()`, unscoped relation loads, or free-form queries exist on sensitive models
- Filament pages, widgets, relation managers, or actions invent their own scope logic
- route model binding does not enforce ownership cleanly
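
As a rough illustration of A1, the sketch below resolves a record only through an explicit workspace and tenant context instead of a bare ID lookup. It is a framework-neutral Python sketch, not TenantPilot code: `ScopeContext`, `BackupSet`, `ScopedRepository`, and `ScopeViolation` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopeContext:
    """Hypothetical canonical context: workspace first, tenant inside it."""
    workspace_id: int
    tenant_id: int


@dataclass(frozen=True)
class BackupSet:
    """Hypothetical tenant-owned record."""
    id: int
    workspace_id: int
    tenant_id: int


class ScopeViolation(Exception):
    """Raised when a record is missing or lies outside the caller's scope."""


class ScopedRepository:
    def __init__(self, rows: list[BackupSet]) -> None:
        self._rows = {row.id: row for row in rows}

    def find_in_scope(self, ctx: ScopeContext, record_id: int) -> BackupSet:
        row = self._rows.get(record_id)
        # Missing and foreign records fail the same way, so a caller cannot
        # probe which IDs exist in another tenant.
        if row is None or (row.workspace_id, row.tenant_id) != (
            ctx.workspace_id,
            ctx.tenant_id,
        ):
            raise ScopeViolation(f"record {record_id} is not available in this scope")
        return row
```

The point of the design is that there is no lookup method without a context argument, so an unscoped read cannot be written by accident.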
#### A2. Ownership must be explicit and consistent
Every relevant model must be clearly classifiable as one of:
- workspace-owned
- tenant-owned
- system- or platform-owned
Audit finding if:
- ownership is implicit, inconsistent, or modeled differently per feature
- mixed scopes exist without explicit rules
- the data model and UI context model diverge
#### A3. Cross-tenant leakage is a Severity 1 violation
Any potential or actual path to read, mutate, or indirectly disclose another tenant's data is a constitutional failure.
### B. RBAC and Authorization
#### B1. UI visibility never replaces backend authorization
`visible()`, `hidden()`, navigation guards, and disabled buttons are UX only.
Audit finding if:
- mutating actions are only hidden in the UI
- backend actions, services, or jobs lack policy or gate enforcement
- related records are visible while underlying capability checks are absent
#### B2. Jobs must revalidate scope and authorization
Asynchronous or decoupled execution must not trust earlier UI checks.
Audit finding if:
- jobs are dispatched with bare IDs and later act on them without re-resolving scope
- authorization or scope rechecks are missing
- job execution is not bound cleanly to workspace, tenant, actor, or run context
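
One way to picture B2 is a job handler that repeats the scope and capability checks at execution time, using current persisted state rather than whatever the UI checked when the job was queued. This is a minimal Python sketch under stated assumptions: `JobPayload`, `JobRejected`, `handle_restore_job`, and the capability name `restore.execute` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobPayload:
    """Hypothetical payload: references only, no pre-approved decisions."""
    actor_id: int
    workspace_id: int
    tenant_id: int
    run_id: int


class JobRejected(Exception):
    """Raised when scope or capability no longer holds at execution time."""


def handle_restore_job(
    payload: JobPayload,
    tenant_workspace: dict,
    capabilities: dict,
) -> str:
    """Recheck scope and capability when the job runs.

    tenant_workspace maps tenant_id -> workspace_id (current persisted truth).
    capabilities maps (actor_id, tenant_id) -> set of capability names.
    """
    if tenant_workspace.get(payload.tenant_id) != payload.workspace_id:
        raise JobRejected("tenant is not part of the stated workspace")
    granted = capabilities.get((payload.actor_id, payload.tenant_id), set())
    if "restore.execute" not in granted:
        raise JobRejected("actor lacks restore.execute for this tenant")
    return f"run {payload.run_id}: restore executed for tenant {payload.tenant_id}"
```

If the actor's capability is revoked between dispatch and execution, the job is rejected instead of acting on a stale approval.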
#### B3. Capability drift is an architecture problem
If related pages, actions, APIs, jobs, or services enforce different capabilities for the same use case, that is architectural drift.
Audit finding if:
- related entry points enforce different capability models for the same operation
- role names are used where capabilities should govern behavior
### C. Livewire and Filament State Safety
#### C1. Public component state is untrusted input
Livewire and Filament state must never be treated as trusted.
Audit finding if:
- full Eloquent models are stored in public state
- sensitive attributes can appear in serialized component payloads
- IDs, tenant references, step references, or ownership-relevant values can be changed from the client without locking or server-side revalidation
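
To make the "untrusted input" idea in C1 concrete, the sketch below seals ownership-relevant values with an HMAC so that a client-side change is detected when the payload comes back. It is an illustrative Python sketch, not a description of how Livewire or TenantPilot implement this; `SERVER_KEY`, `seal`, and `unseal` are hypothetical.

```python
import hashlib
import hmac
import json

# Assumption: a server-side secret that is never sent to the client.
SERVER_KEY = b"example-only-key"


def _tag(locked: dict) -> str:
    """Compute an HMAC over a canonical JSON encoding of the locked values."""
    body = json.dumps(locked, sort_keys=True, separators=(",", ":"))
    return hmac.new(SERVER_KEY, body.encode("utf-8"), hashlib.sha256).hexdigest()


def seal(locked: dict) -> dict:
    """Build the client payload: locked values plus their checksum."""
    return {"locked": locked, "checksum": _tag(locked)}


def unseal(payload: dict) -> dict:
    """Return the locked values, or fail if they were changed on the client."""
    checksum = str(payload.get("checksum", ""))
    if not hmac.compare_digest(_tag(payload["locked"]), checksum):
        raise ValueError("locked state was tampered with")
    return payload["locked"]
```

A checksum only detects tampering. The server would still need to re-resolve the referenced record inside the canonical scope before acting on it.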
#### C2. Sensitive data must never enter frontend state
Secrets, provider credentials, tokens, internal diagnostics, or similar sensitive values must not appear in serializable public properties or view context.
Audit finding if:
- models with sensitive fields are directly bound to public state
- hidden attributes, DTOs, locked properties, or server-side reconstruction are missing
#### C3. Workflow progress must not depend only on UI state
Resume, step access, completion, and action eligibility must be derived server-side from persisted truth.
Audit finding if:
- wizard steps are controlled only by frontend state
- direct jumps are possible without backend validation
- meaningful status exists only in the form state
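
The rule in C3 can be sketched as a server-side step resolver: the step a client asks for is honored only if every earlier step is persisted as completed. The step names and the functions `highest_reachable_step` and `resolve_step` are hypothetical, chosen only for this Python illustration.

```python
STEPS = ["connect", "verify", "permissions", "complete"]  # hypothetical wizard steps


def highest_reachable_step(completed: set[str]) -> str:
    """Derive the furthest step the operator may open from persisted completions."""
    for step in STEPS:
        if step not in completed:
            return step
    return STEPS[-1]


def resolve_step(requested: str, completed: set[str]) -> str:
    """Honor a requested step only if all earlier steps are persisted as done."""
    if requested not in STEPS:
        raise ValueError(f"unknown step: {requested}")
    limit = STEPS.index(highest_reachable_step(completed))
    if STEPS.index(requested) > limit:
        raise PermissionError(f"step {requested!r} is not reachable yet")
    return requested
```

Here `completed` stands for whatever the backend has persisted, so a forged step reference in frontend state cannot skip ahead.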
### D. Workflow Integrity
#### D1. Status models need formal transitions
Status fields are domain logic, not decoration.
Audit finding if:
- free-form strings are used without transition guards
- forbidden transitions are technically possible
- different code paths interpret the same status differently
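
A transition guard of the kind D1 asks for might look like the following Python sketch. The status names and the allowed-transition table are assumptions made for illustration; the real domain model may define different states and rules.

```python
from enum import Enum


class FindingStatus(Enum):
    """Hypothetical statuses; not taken from the TenantPilot domain model."""
    OPEN = "open"
    ACKNOWLEDGED = "acknowledged"
    RESOLVED = "resolved"
    REOPENED = "reopened"


# Single source of truth for which transitions are legal.
ALLOWED = {
    FindingStatus.OPEN: {FindingStatus.ACKNOWLEDGED, FindingStatus.RESOLVED},
    FindingStatus.ACKNOWLEDGED: {FindingStatus.RESOLVED},
    FindingStatus.RESOLVED: {FindingStatus.REOPENED},
    FindingStatus.REOPENED: {FindingStatus.ACKNOWLEDGED, FindingStatus.RESOLVED},
}


class InvalidTransition(Exception):
    """Raised when a status change is not in the ALLOWED table."""


def transition(current: FindingStatus, target: FindingStatus) -> FindingStatus:
    """Return the new status, or fail if the transition is forbidden."""
    if target not in ALLOWED[current]:
        raise InvalidTransition(f"{current.value} -> {target.value} is not allowed")
    return target
```

Because every code path goes through `transition`, UI actions and backend services cannot interpret the same status differently.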
#### D2. Resume, retry, reopen, and archive must be deterministic
Every operational flow needs clear rules for:
- what resumes
- what creates a new run or object
- what is idempotent
- what is rejected
- what is auditable
Audit finding if:
- operator behavior is ambiguous
- refresh, revisit, back, or retry semantics depend on incidental state
- parallel sessions are unhandled
#### D3. Findings, exceptions, and risk acceptance need coherent lifecycles
Status, expiry, renewal, reopen, recurrence, and ownership semantics must align in both domain and implementation.
Audit finding if:
- UI actions permit more than the domain model allows
- expiry or renewal semantics are unclear or partial
- recurrence or reopen logic is inconsistent
### E. Operations, Jobs, and Provider Boundaries
#### E1. External provider access must stay inside defined boundaries
Microsoft Graph or other provider interactions must not be initiated from UI pages, Filament actions, widgets, or arbitrary services.
Audit finding if:
- direct SDK, HTTP, or Graph calls appear outside the intended gateway or resolver layer
- provider resolvers are bypassed
- provider-specific types leak unchecked into domain logic
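
The boundary described in E1 can be sketched as a gateway interface that returns domain types, so callers never see HTTP or SDK objects. This Python sketch is an assumption-laden illustration: `ProviderGateway`, `PolicySnapshot`, `FakeGraphGateway`, and `count_policies` are hypothetical, and the raw payload shape is invented for the example.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class PolicySnapshot:
    """Domain-level type; carries no provider SDK objects."""
    external_id: str
    display_name: str


class ProviderGateway(Protocol):
    """The only interface through which provider data may be requested."""

    def list_policies(self, tenant_id: int) -> list[PolicySnapshot]: ...


class FakeGraphGateway:
    """Stand-in for the single place where provider calls would live."""

    def __init__(self, raw_by_tenant: dict) -> None:
        self._raw = raw_by_tenant

    def list_policies(self, tenant_id: int) -> list[PolicySnapshot]:
        # Translate provider-shaped dicts into domain types at the boundary.
        return [
            PolicySnapshot(item["id"], item["displayName"])
            for item in self._raw.get(tenant_id, [])
        ]


def count_policies(gateway: ProviderGateway, tenant_id: int) -> int:
    """A caller depends on the gateway interface, never on HTTP or an SDK."""
    return len(gateway.list_policies(tenant_id))
```

UI pages and actions would sit even further out and call services such as `count_policies`, not the gateway itself.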
#### E2. OperationRun is canonical for operational execution
Meaningful operations must be traceable through run semantics, status, failure paths, and result references.
Audit finding if:
- significant operations run without `OperationRun`
- result artifacts are not bound to runs
- UI intent and actual execution drift apart
#### E3. Idempotency is mandatory for repeatable operations
Capture, compare, sync, review, alerts, and similar operations must not create duplicate or contradictory results without explicit design.
Audit finding if:
- fingerprints, deduplication, or run correlation are absent where required
- retry can create uncontrolled duplicates
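
As one possible reading of E3, the sketch below derives a fingerprint from scope, operation type, and parameters, and reuses an existing run when the same fingerprint is submitted again. It is a Python illustration only: `operation_fingerprint` and `RunRegistry` are hypothetical names, and a real implementation would also need to decide when a finished run may be repeated.

```python
import hashlib
import json


def operation_fingerprint(
    workspace_id: int, tenant_id: int, operation: str, params: dict
) -> str:
    """Hash a canonical encoding so key order does not change the result."""
    canonical = json.dumps(
        {
            "workspace": workspace_id,
            "tenant": tenant_id,
            "operation": operation,
            "params": params,
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class RunRegistry:
    """In-memory stand-in for persisted run correlation."""

    def __init__(self) -> None:
        self._runs: dict[str, int] = {}
        self._next_id = 1

    def start_or_reuse(self, fingerprint: str) -> tuple[int, bool]:
        """Return (run_id, created); a repeated fingerprint reuses the run."""
        if fingerprint in self._runs:
            return self._runs[fingerprint], False
        run_id = self._next_id
        self._next_id += 1
        self._runs[fingerprint] = run_id
        return run_id, True
```

A retry with identical inputs then maps to the existing run instead of creating a duplicate.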
### F. Auditability and Evidence
#### F1. Critical mutations require an audit trail
The audit must verify that the following are traceable:
- who acted
- in which scope
- with which action type or intent
- with what outcome
Audit finding if:
- mutating flows occur without `AuditLog` or an equivalent trace
- exception or risk-acceptance decisions are not durably recorded
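
The four questions in F1 (who, scope, action, outcome) map naturally onto a small record type. The Python sketch below wraps a mutation so that an entry is written whether it succeeds or fails. `AuditEntry` and `record_mutation` are hypothetical names and do not describe the project's `AuditLog` model.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass(frozen=True)
class AuditEntry:
    actor_id: int                # who acted
    workspace_id: int            # in which scope
    tenant_id: Optional[int]     # None for workspace-level actions
    action: str                  # action type or intent
    outcome: str                 # "succeeded" or "failed"
    recorded_at: datetime


def record_mutation(
    log: list,
    actor_id: int,
    workspace_id: int,
    tenant_id: Optional[int],
    action: str,
    mutate: Callable[[], object],
) -> object:
    """Run mutate() and append an audit entry for either outcome."""
    outcome = "failed"
    try:
        result = mutate()
        outcome = "succeeded"
        return result
    finally:
        # Runs on both the return path and the exception path.
        log.append(
            AuditEntry(
                actor_id, workspace_id, tenant_id, action, outcome,
                datetime.now(timezone.utc),
            )
        )
```

Recording failures as well as successes keeps the trail useful for forensic questions, not only for change history.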
#### F2. Reports, findings, evidence, runs, and exceptions must remain referential
Governance-relevant artifacts need stable relationships.
Audit finding if:
- run results exist only in transient context
- findings cannot be traced to source evidence
- reports and evidence are only loosely or implicitly connected
### G. Tests as a Security Boundary
#### G1. Happy-path-only coverage is insufficient
Security- or workflow-critical areas require negative tests.
Audit finding if:
- only visibility or standard CRUD paths are covered
- wrong-tenant, unauthorized, expired, invalid-transition, or forged-state paths are absent
#### G2. Wrong-tenant tests are mandatory
Tenant-sensitive resources, pages, actions, detail views, operations, and relevant APIs must prove that foreign-scope access fails.
Audit finding if:
- systematic wrong-tenant regression coverage is missing
#### G3. Workflow misuse tests are mandatory
Wizards, findings, exceptions, reviews, runs, retry, and resume semantics must have misuse or failure-path coverage.
Audit finding if any of the following are not tested:
- invalid status jumps
- expired exception paths
- manipulated IDs
- duplicate operations
- race or retry failures
## III. Forbidden Anti-Patterns
The auditor must explicitly search for and flag these patterns:
- tenant-sensitive `find()` or `findOrFail()` without scope hardening
- direct provider or Graph calls in Filament pages, actions, widgets, or Livewire components
- public Livewire properties containing full Eloquent models
- mutable foreign IDs or references without locking or server revalidation
- `Model::create($request->all())` or equivalent uncontrolled mass assignment in critical flows
- UI-only authorization
- jobs without repeated scope or capability checks
- free-form status strings without transition rules
- business rules duplicated across pages or resources instead of centralized
- ad hoc context determination instead of canonical resolvers
- relevant operations without `OperationRun`
- relevant mutations without audit trail
- missing wrong-tenant negative tests
## IV. Finding Classification
### 1. Constitutional Violation
Breaks a hard constitutional rule.
Examples:
- potential cross-tenant access
- missing backend authorization
- sensitive data in serialized UI state
- direct provider-boundary bypass
### 2. Architectural Drift
Code works locally but deviates from the strategic target model.
Examples:
- parallel context paths
- inconsistent run semantics
- duplicated domain rules across UI surfaces
### 3. Workflow Trust Gap
Implementation undermines operator trust, determinism, or auditability.
Examples:
- unclear resume semantics
- incomplete status transitions
- unexplained UI states
### 4. Test Blind Spot
A critical failure mode is not covered.
Examples:
- no wrong-tenant test
- no unauthorized-action test
- no retry or idempotency coverage
## V. Severity Model
### Severity 1: Critical
Immediate risk of:
- cross-tenant leakage
- unauthorized mutation
- secret exposure
- scope break
- severe audit or forensic loss
### Severity 2: High
Serious architecture or workflow-trust failure without a proven leak.
Examples:
- jobs without reauthorization
- unclear ownership
- unguarded status transitions
- direct provider bypass
### Severity 3: Medium
Architectural drift or incomplete hardening that is likely to become a safety or maintenance problem.
### Severity 4: Low
Real but non-urgent inconsistency or hardening debt.
Nothing that directly touches workspace isolation, tenant isolation, RBAC, secrets, or auditability may be rated Low.
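
The severity floor above lends itself to a mechanical check. The following Python sketch rejects a Low rating whenever a finding touches one of the protected areas. The area labels and the function `validate_severity` are hypothetical; how an audit tool tags affected areas is an assumption.

```python
from enum import IntEnum


class Severity(IntEnum):
    CRITICAL = 1
    HIGH = 2
    MEDIUM = 3
    LOW = 4


# Areas named in the rule above; the string labels are illustrative.
PROTECTED_AREAS = {
    "workspace_isolation",
    "tenant_isolation",
    "rbac",
    "secrets",
    "auditability",
}


def validate_severity(proposed: Severity, touched_areas: set) -> Severity:
    """Reject a Low rating for findings that touch a protected area."""
    hit = PROTECTED_AREAS & set(touched_areas)
    if proposed is Severity.LOW and hit:
        raise ValueError(f"Low is not permitted for: {', '.join(sorted(hit))}")
    return proposed
```

The check deliberately rejects rather than auto-escalates, since the correct higher severity is a judgment the auditor still has to make.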
## VI. Expected Audit Output
For each finding, the auditor must provide:
1. Title
2. Classification
3. Severity
4. Affected Area
5. Evidence
6. Why this matters in TenantPilot
7. Recommended structural correction
8. Delivery recommendation: `hotfix`, `follow-up refactor`, or `dedicated spec required`
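
For tooling that collects findings, the eight required fields could be captured in a structure like the one below. This is a Python sketch under stated assumptions: the class and field names are hypothetical, while the classification and delivery values are taken from this document.

```python
from dataclasses import dataclass
from enum import Enum


class Classification(Enum):
    CONSTITUTIONAL_VIOLATION = "Constitutional Violation"
    ARCHITECTURAL_DRIFT = "Architectural Drift"
    WORKFLOW_TRUST_GAP = "Workflow Trust Gap"
    TEST_BLIND_SPOT = "Test Blind Spot"


class Delivery(Enum):
    HOTFIX = "hotfix"
    FOLLOW_UP_REFACTOR = "follow-up refactor"
    DEDICATED_SPEC = "dedicated spec required"


@dataclass(frozen=True)
class Finding:
    title: str
    classification: Classification
    severity: int  # 1 (Critical) to 4 (Low)
    affected_area: str
    evidence: str
    why_it_matters: str
    recommended_correction: str
    delivery: Delivery

    def __post_init__(self) -> None:
        if self.severity not in (1, 2, 3, 4):
            raise ValueError("severity must be between 1 and 4")
        # Every free-text field is mandatory in the expected output.
        for name in ("title", "affected_area", "evidence",
                     "why_it_matters", "recommended_correction"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} must not be empty")
```

Validating at construction time means an incomplete finding cannot be reported at all.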
## VII. Hotfix vs Spec Rule
### Hotfix
Use when the correction is local, clear, and restores the constitution without redefining IA or domain semantics.
### Dedicated spec required
Use when the finding:
- affects multiple layers
- changes workflow semantics
- changes ownership or context modeling
- affects run, audit, or report models
- affects product IA or operator behavior
- requires new invariants or system-wide standardization
Rule of thumb: if the correct fix is more than adding a guard, it is probably spec-worthy.
## VIII. Audit Mandate
The auditor must not only ask whether the code is correct.
The auditor must ask:
- does this violate the workspace-first model?
- does this violate capability-first RBAC?
- can this undermine operator trust?
- can this lose scope or auditability?
- does this duplicate domain logic across UI boundaries?
- are critical failure modes missing negative tests?
## IX. Non-Goals of the Audit
The auditor must not:
- deliver generic clean-code lectures
- inflate trivial style issues
- demand arbitrary design patterns without target-model fit
- prioritize Laravel or Filament defaults over the product model
- frame cosmetic UI issues as architecture failures
- recommend local fixes when the deeper issue is systemic