
# Research: Filament/Livewire Heavy Suite Segmentation

## Decision 1: Reuse the existing lane governance infrastructure instead of adding a new runner layer

- Decision: Spec 208 should extend the current `TestLaneManifest`, `TestLaneBudget`, `TestLaneReport`, Composer lane commands, and repo-root wrapper scripts rather than introducing a second classification or execution framework.
- Rationale: Spec 206 already established canonical lane entry points, artifact generation, budget reporting, and guard tests. Spec 208's problem is semantic classification and drift control inside that existing system, not missing execution plumbing.
- Alternatives considered:
  - Create a second heavy-suite manifest or dedicated classification runner: rejected because it would duplicate the existing lane contract and create a parallel maintenance burden.
  - Perform only local per-file exclusions: rejected because the problem is repository-wide lane drift, not a single-file exception.

## Decision 2: Use a five-class heavy UI taxonomy tied to cost and purpose

- Decision: The heavy segmentation model should use exactly five classes for the first slice: `ui-light`, `ui-workflow`, `surface-guard`, `discovery-heavy`, and `browser`.
- Rationale: The spec needs enough vocabulary to distinguish localized component checks from broad discovery scans and governance-wide surface guards, but not so much taxonomy that authors and reviewers stop applying it consistently.
- Alternatives considered:
  - Reuse only the existing lane names: rejected because `fast-feedback`, `confidence`, and `heavy-governance` are execution targets, not sufficient descriptions of test character.
  - Create many sub-classes for every Filament test style: rejected because that would overfit current files and violate the proportionality constraint.

## Decision 3: Keep `heavy-governance` as the heavy operational lane instead of introducing a new fifth runtime lane

- Decision: `surface-guard` and `discovery-heavy` families should flow into the existing `heavy-governance` lane, with `browser` remaining separate.
- Rationale: The repository already has a checked-in heavy lane with command wiring, artifacts, budgets, and guard coverage. The missing piece is better segmentation inside that lane, not another operational lane.
- Alternatives considered:
  - Introduce a separate `heavy-ui` operational lane: rejected because the current problem can be solved by better family attribution inside the existing heavy lane.
  - Merge `browser` into `heavy-governance`: rejected because browser is already a distinct cost class with stronger isolation and different runtime semantics.
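
The class-to-lane relationship from Decisions 2, 3, and 9 can be sketched as a small backed enum. This is only an illustration: the enum name `HeavyUiClass` and the method `allowedLanes()` are hypothetical and not part of the existing lane infrastructure, and the placement of non-curated `ui-workflow` tests in `heavy-governance` is an assumption that this document does not state.

```php
<?php

// Hypothetical enum: the five class names come from Decision 2,
// the enum and its method are illustrative only.
enum HeavyUiClass: string
{
    case UiLight = 'ui-light';
    case UiWorkflow = 'ui-workflow';
    case SurfaceGuard = 'surface-guard';
    case DiscoveryHeavy = 'discovery-heavy';
    case Browser = 'browser';

    /**
     * Lanes in which a family of this class may run.
     *
     * @return list<string>
     */
    public function allowedLanes(): array
    {
        return match ($this) {
            // Decision 9: localized UI coverage stays in Confidence.
            self::UiLight => ['confidence'],
            // Decision 9 keeps a curated subset in Confidence. Sending the
            // remainder to heavy-governance is an assumption of this sketch.
            self::UiWorkflow => ['confidence', 'heavy-governance'],
            // Decision 3: both heavy classes share the existing heavy lane.
            self::SurfaceGuard, self::DiscoveryHeavy => ['heavy-governance'],
            // Decision 3: browser stays separate.
            self::Browser => ['browser'],
        };
    }
}

// Usage: resolve a class label and ask where it may run.
$lanes = HeavyUiClass::from('discovery-heavy')->allowedLanes();
```

Keeping the mapping on the class rather than on the lane means a new family only has to declare its class, and the permitted lanes follow from it.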

## Decision 4: Represent heavy-family ownership through a hybrid of manifest selectors, Pest groups, and explicit hotspot inventory

- Decision: The source of truth for heavy families should remain hybrid: manifest selectors for lane execution, Pest groups where group-level drift control helps, and an explicit checked-in family inventory for reviewer visibility.
- Rationale: The current suite already uses all three seams in different ways. A hybrid model gives Spec 208 enough precision to isolate heavy families now without forcing a disruptive directory-first rewrite.
- Alternatives considered:
  - Directory-only classification: rejected because mixed files and scattered hotspot tests would remain opaque.
  - Group-only classification: rejected because many heavy families are not yet grouped, and relying on groups alone would delay adoption.
  - File-list-only classification: rejected because it does not scale as families grow.
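
A hybrid inventory entry might look roughly like the sketch below, where a file belongs to a family if any one of the three seams matches. The array keys, the family names, the function `familiesForFile()`, and the test file paths are all hypothetical placeholders; the real `TestLaneManifest` structure is not shown in this document and may differ.

```php
<?php

// Hypothetical inventory shape. Family names and file paths are placeholders.
$heavyFamilies = [
    [
        'family' => 'filament-broad',
        'class' => 'discovery-heavy',
        'selectors' => ['tests/Feature/Filament/'], // path prefixes used for lane execution
        'groups' => ['discovery-heavy'],            // Pest group names
        'hotspots' => [],                           // explicitly listed files
    ],
    [
        'family' => 'rbac-wizard-enforcement',
        'class' => 'surface-guard',
        'selectors' => [],
        'groups' => ['surface-guard'],
        'hotspots' => ['tests/Feature/Rbac/ExampleWizardEnforcementTest.php'],
    ],
];

/**
 * Return the names of all families that claim a test file through
 * a selector prefix, a Pest group, or an explicit hotspot entry.
 *
 * @param list<string> $groups Pest groups declared by the file
 * @return list<string>
 */
function familiesForFile(array $inventory, string $path, array $groups = []): array
{
    $matches = [];

    foreach ($inventory as $entry) {
        $bySelector = false;
        foreach ($entry['selectors'] as $prefix) {
            if (str_starts_with($path, $prefix)) {
                $bySelector = true;
                break;
            }
        }

        $byGroup = array_intersect($entry['groups'], $groups) !== [];
        $byHotspot = in_array($path, $entry['hotspots'], true);

        if ($bySelector || $byGroup || $byHotspot) {
            $matches[] = $entry['family'];
        }
    }

    return $matches;
}
```

A file that matches more than one family is a candidate for the mixed-file rule in Decision 5.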

## Decision 5: Treat mixed files as a first-class segmentation problem

- Decision: When one file combines multiple cost patterns, maintainers should either split the file or classify the whole file by its broadest cost-driving behavior.
- Rationale: Several current families mix localized assertions with discovery or surface-wide checks. Directory placement alone cannot resolve those cases reliably, so reviewers need an explicit rule.
- Alternatives considered:
  - Let mixed files stay where they are until they become too slow: rejected because the spec is about preventing lane drift before runtime erosion becomes normal.
  - Always force a split: rejected because some files may be readable enough if their broadest cost driver is explicit and guarded.
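
The "broadest cost driver" rule can be sketched as picking the highest-ranked class observed in a file. The ranking below is an assumption made for illustration: this document does not define an order between `surface-guard` and `discovery-heavy`, and `browser` is left out because it already lives in its own suite. The names `COST_RANK` and `classifyMixedFile()` are hypothetical.

```php
<?php

// Assumed ordering from narrowest to broadest cost. Not defined by the spec.
const COST_RANK = [
    'ui-light' => 0,
    'ui-workflow' => 1,
    'surface-guard' => 2,
    'discovery-heavy' => 3,
];

/**
 * Classify a mixed file by the broadest cost pattern found in it.
 *
 * @param list<string> $observedClasses classes of the individual tests in the file
 */
function classifyMixedFile(array $observedClasses): string
{
    if ($observedClasses === []) {
        throw new InvalidArgumentException('A file needs at least one observed class.');
    }

    $broadest = null;

    foreach ($observedClasses as $class) {
        if (!array_key_exists($class, COST_RANK)) {
            throw new InvalidArgumentException("Unknown class: {$class}");
        }

        if ($broadest === null || COST_RANK[$class] > COST_RANK[$broadest]) {
            $broadest = $class;
        }
    }

    return $broadest;
}

// A file with mostly localized assertions and one reflection-based scan
// is classified by the scan.
$class = classifyMixedFile(['ui-light', 'ui-light', 'discovery-heavy']);
```

Under this rule a single broad test is enough to move the whole file, which is what makes splitting the file the cheaper option in some cases.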

## Decision 6: Seed the heavy-family inventory from the current hotspot clusters rather than attempting a full-suite rewrite

- Decision: The first inventory should focus on the current heavy clusters that already distort lane cost: broad `tests/Feature/Filament` families, action-surface and header-action discipline tests, navigation-discipline tests, RBAC relation-manager and wizard UI-enforcement families, Baseline Compare feature and Filament pages, concern-based fixture builders, and the existing Browser suite.
- Rationale: Exploratory repository analysis found these clusters repeatedly combining multi-mount Livewire tests, relation-manager breadth, wizard step flows, reflection-based discovery, header-action and navigation-discipline checks, broad action-surface validation, and expensive fixture construction.
- Alternatives considered:
  - Classify every Feature test in one pass: rejected because it would add excessive churn before the taxonomy is proven.
  - Rely only on profiling output without a starting inventory: rejected because the current hotspots are already visible enough to justify an initial catalog.

## Decision 7: Extend drift guards from coarse lane isolation to explicit class-to-lane validation

- Decision: Spec 208 should add or expand guard tests so they validate heavy classification membership, lane compatibility, and wrong-lane drift for `browser`, `surface-guard`, and `discovery-heavy` families.
- Rationale: The repository already has coarse guards for browser isolation and initial heavy-governance placement. Spec 208 needs those guards to become semantic instead of only path-based.
- Alternatives considered:
  - Rely on documentation only: rejected because lane drift is a regression problem and must fail automatically.
  - Rely only on manual review: rejected because the suite surface is already large enough that inconsistent review would reintroduce drift quickly.
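
The core of a semantic drift guard could be as small as the check below, which a Pest or PHPUnit guard test would then assert returns an empty list. It is a minimal sketch: `findLaneDrift()`, the inventory shape, and the family names are hypothetical, and the lane rules cover only the three classes named in this decision.

```php
<?php

// Lane rules for the three guarded classes (Decisions 3 and 7).
$allowedLanesByClass = [
    'browser' => ['browser'],
    'surface-guard' => ['heavy-governance'],
    'discovery-heavy' => ['heavy-governance'],
];

/**
 * Return one message per family whose class is unknown or whose
 * assigned lane is not permitted for its class.
 *
 * @return list<string>
 */
function findLaneDrift(array $families, array $allowedLanesByClass): array
{
    $violations = [];

    foreach ($families as $family) {
        $allowed = $allowedLanesByClass[$family['class']] ?? null;

        if ($allowed === null) {
            $violations[] = sprintf('%s: unknown class "%s"', $family['family'], $family['class']);
            continue;
        }

        if (!in_array($family['lane'], $allowed, true)) {
            $violations[] = sprintf(
                '%s: class "%s" must not run in lane "%s"',
                $family['family'],
                $family['class'],
                $family['lane']
            );
        }
    }

    return $violations;
}

// Hypothetical inventory with one family that drifted into a faster lane.
$families = [
    ['family' => 'navigation-discipline', 'class' => 'surface-guard', 'lane' => 'heavy-governance'],
    ['family' => 'action-surface', 'class' => 'surface-guard', 'lane' => 'confidence'],
];

$violations = findLaneDrift($families, $allowedLanesByClass);
```

Because the check works on class and lane rather than on paths, moving a file between directories does not silently change the result.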

## Decision 8: Attribute heavy cost by class and family in reporting, not just by lane

- Decision: `TestLaneReport` and the family budget contract should be extended so heavy cost can be reported by heavy class and named family, not only by lane total.
- Rationale: Once heavy families move out of faster lanes, maintainers still need to know whether drift is coming from discovery-heavy scans, surface-guard breadth, or workflow-heavy multi-mount tests.
- Alternatives considered:
  - Keep lane-only reporting: rejected because lane totals alone cannot explain which heavy family is growing.
  - Add only per-file slowest output: rejected because file-level output lacks the semantic grouping reviewers need for governance decisions.
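
Class and family attribution amounts to grouping per-test durations by two extra keys. The sketch below shows that aggregation in isolation; it does not reflect the actual `TestLaneReport` API, the function `summarizeHeavyCost()` and the input shape are hypothetical, and the durations are made-up values, not measurements.

```php
<?php

/**
 * Sum durations by heavy class and by named family, largest first.
 *
 * @param list<array{class: string, family: string, seconds: float}> $timings
 * @return array{byClass: array<string, float>, byFamily: array<string, float>}
 */
function summarizeHeavyCost(array $timings): array
{
    $byClass = [];
    $byFamily = [];

    foreach ($timings as $timing) {
        $byClass[$timing['class']] = ($byClass[$timing['class']] ?? 0.0) + $timing['seconds'];
        $byFamily[$timing['family']] = ($byFamily[$timing['family']] ?? 0.0) + $timing['seconds'];
    }

    // Sort descending by total while keeping the names as keys.
    arsort($byClass);
    arsort($byFamily);

    return ['byClass' => $byClass, 'byFamily' => $byFamily];
}

// Illustrative input only.
$summary = summarizeHeavyCost([
    ['class' => 'surface-guard', 'family' => 'action-surface', 'seconds' => 1.5],
    ['class' => 'discovery-heavy', 'family' => 'filament-broad', 'seconds' => 2.5],
    ['class' => 'discovery-heavy', 'family' => 'relation-managers', 'seconds' => 4.0],
]);
```

With both views available, a growing lane total can be traced first to a class and then to the specific family responsible.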

## Decision 9: Preserve Confidence with documented UI-Light and selected UI-Workflow coverage

- Decision: Confidence should explicitly retain localized `ui-light` coverage and a curated subset of `ui-workflow` tests, while `discovery-heavy` and broad `surface-guard` families move to `heavy-governance`.
- Rationale: The spec explicitly rejects hollowing out Confidence. The lane must still carry meaningful UI safety even as the broadest heavy governance families are removed.
- Alternatives considered:
  - Move nearly all Filament and Livewire tests to Heavy Governance: rejected because it would turn Confidence into a weak smoke-only lane.
  - Leave all current UI-heavy families in Confidence: rejected because that would fail the spec's fast-lane preservation goal.

## Decision 10: Keep the classification catalog repo-local and current-release only

- Decision: The classification catalog should stay inside repository planning and test-governance seams, with no new product persistence, runtime service, or cross-domain application taxonomy.
- Rationale: The problem is local to test-suite architecture and developer feedback loops. Anything broader would violate PROP-001 and add unnecessary maintenance cost.
- Alternatives considered:
  - Build a generic testing metadata system for future features: rejected because the current need is narrower and already satisfied by the manifest-plus-guards model.
  - Persist test family metadata outside the repo: rejected because the information is build-time governance data, not product truth.