TenantAtlas/specs/211-runtime-trend-recalibration/data-model.md

# Data Model: Test Runtime Trend Reporting & Baseline Recalibration

This feature adds repository-owned governance artifacts only. It does not add product database tables. All objects below are implemented as manifest metadata, generated JSON payloads, markdown summaries, or guard-test fixtures derived from the existing lane report outputs.

## 1. LaneTrendPolicy

**Purpose**: Defines the lane-specific rules for bounded history retention, comparable-window evaluation, hotspot visibility, and recalibration guidance.

| Field | Type | Description |
|-------|------|-------------|
| `laneId` | string | Canonical lane identifier (`fast-feedback`, `confidence`, `heavy-governance`, `browser`, `junit`, `profiling`). |
| `workflowProfile` | string | Workflow profile that owns the lane history source in CI. |
| `retentionLimit` | integer | Max history records retained for the lane. |
| `comparisonWindowSize` | integer | Number of recent comparable records used for drift evaluation. |
| `minimumComparableSamples` | integer | Minimum number of comparable samples required before any health class other than `unstable` is allowed. |
| `varianceFloorSeconds` | integer | Minimum meaningful delta for the lane, aligned with current enforcement tolerance. |
| `nearBudgetHeadroomSeconds` | integer | Headroom threshold for `budget-near`. |
| `hotspotFamilyLimit` | integer | Max family deltas shown in readable summaries. |
| `hotspotFileLimit` | integer | Max file hotspots shown in readable summaries. |
| `slowestEntryRetention` | integer | Max slowest test entries retained in JSON evidence. |
| `recalibrationPolicy` | array | Rule summary for acceptable baseline and budget recalibration triggers. |

**Relationships**

- One `LaneTrendPolicy` governs many `LaneTrendRecord` entries for the same lane.
- One `LaneTrendPolicy` informs one `TrendComparisonWindow`, one `LaneDriftAssessment`, and zero or more `RecalibrationDecisionRecord` entries per reporting cycle.

**Validation Rules**

- `retentionLimit` must be greater than or equal to `comparisonWindowSize`.
- `minimumComparableSamples` must be at least 3.
- `varianceFloorSeconds` must align with or exceed the lane's existing enforcement tolerance.
- Primary lanes must use a larger `retentionLimit` than support lanes.
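
The single-policy rules above could be checked with something like the following. This is a minimal sketch, not the repository's implementation: the `LaneTrendPolicy` dataclass, the `validate_policy` helper, and the `enforcement_tolerance_seconds` parameter are hypothetical names introduced for illustration. The primary-versus-support retention rule is left out because it needs more than one policy to evaluate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LaneTrendPolicy:
    # Field names mirror the table above in snake_case (an assumption of this sketch).
    lane_id: str
    workflow_profile: str
    retention_limit: int
    comparison_window_size: int
    minimum_comparable_samples: int
    variance_floor_seconds: int
    near_budget_headroom_seconds: int
    hotspot_family_limit: int
    hotspot_file_limit: int
    slowest_entry_retention: int
    recalibration_policy: tuple = ()


def validate_policy(policy, enforcement_tolerance_seconds):
    """Return a list of rule violations; an empty list means the policy is valid."""
    errors = []
    if policy.retention_limit < policy.comparison_window_size:
        errors.append("retentionLimit must be >= comparisonWindowSize")
    if policy.minimum_comparable_samples < 3:
        errors.append("minimumComparableSamples must be at least 3")
    if policy.variance_floor_seconds < enforcement_tolerance_seconds:
        errors.append("varianceFloorSeconds must not be below the enforcement tolerance")
    return errors
```

Returning a list of violations rather than raising on the first one lets a guard test report every broken rule in a single run.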

## 2. LaneTrendRecord

**Purpose**: Captures the per-run evidence snapshot that can safely be compared over time.

| Field | Type | Description |
|-------|------|-------------|
| `runRef` | string | Stable run reference from CI or local execution. |
| `laneId` | string | Governed lane identifier. |
| `workflowId` | string | Workflow profile or logical workflow owner for the run. |
| `triggerClass` | string | Pull request, mainline push, manual, scheduled, or local classification. |
| `generatedAt` | datetime | When the record was emitted. |
| `wallClockSeconds` | number | Current lane runtime in seconds. |
| `baselineSeconds` | number or null | Current comparison baseline for the lane if defined. |
| `baselineSource` | string | Manifest source or comparison source that supplied the baseline. |
| `budgetSeconds` | number | Current lane budget threshold in seconds. |
| `budgetStatus` | string | Current lane budget status from the existing budget evaluator. |
| `blockingStatus` | string | Whether the current CI context blocks on this outcome. |
| `comparisonFingerprint` | string | Hash or structured fingerprint capturing comparability boundaries. |
| `classificationTotals` | array | Current runtime totals grouped by classification. |
| `familyTotals` | array | Current runtime totals grouped by family. |
| `hotspotFiles` | array | Current dominant hotspot files. |
| `slowestEntries` | array | Current slowest test entries, capped by policy. |
| `artifactRefs` | array | References to the summary, report, budget, JUnit, and history artifacts backing the record. |

**Validation Rules**

- A record must derive from the same lane's current `summary.md`, `report.json`, `budget.json`, and available JUnit output.
- `comparisonFingerprint` must be present for any record eligible for comparison.
- `wallClockSeconds`, `budgetSeconds`, and `generatedAt` are required.
- `slowestEntries` must not exceed the lane policy retention cap.
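
A record-level guard for the field rules above might look like this. It is a sketch under assumptions: records are treated as plain dictionaries keyed by the field names in the table, and `validate_record` and `is_comparable` are hypothetical helpers. The rule that a record derives from the lane's `summary.md`, `report.json`, `budget.json`, and JUnit output is not checked here.

```python
REQUIRED_FIELDS = ("wallClockSeconds", "budgetSeconds", "generatedAt")


def validate_record(record, slowest_entry_retention):
    """Return a list of rule violations for one LaneTrendRecord dictionary."""
    errors = []
    for name in REQUIRED_FIELDS:
        if record.get(name) is None:
            errors.append(f"{name} is required")
    if len(record.get("slowestEntries", [])) > slowest_entry_retention:
        errors.append("slowestEntries exceeds the lane policy retention cap")
    return errors


def is_comparable(record):
    """Only records carrying a comparisonFingerprint are eligible for comparison."""
    return bool(record.get("comparisonFingerprint"))
```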

## 3. TrendComparisonWindow

**Purpose**: Represents the bounded comparable history used to evaluate one lane in one reporting cycle.

| Field | Type | Description |
|-------|------|-------------|
| `laneId` | string | Governed lane identifier. |
| `policyRef` | string | Reference to the governing `LaneTrendPolicy`. |
| `currentRecord` | object | The latest `LaneTrendRecord`. |
| `previousComparableRecord` | object or null | The most recent prior comparable record, if one exists. |
| `comparableRecords` | array | Ordered comparable records used for trend evaluation. |
| `excludedRecords` | array | Recent records skipped because of fingerprint mismatch or invalid evidence. |
| `windowStatus` | enum | `stable`, `insufficient-history`, `scope-changed`, or `noisy`. |
| `sampleCount` | integer | Number of comparable records in the active window. |

**Validation Rules**

- Every comparable record must share the same `comparisonFingerprint`.
- `sampleCount` may not exceed `comparisonWindowSize`.
- `previousComparableRecord` must be the immediately preceding entry in `comparableRecords` when present.
- `windowStatus` becomes `insufficient-history` whenever `sampleCount` is below `minimumComparableSamples`.
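
One way to assemble the window from prior history is sketched below. It assumes that `history` is ordered oldest first, that records are dictionaries, and that `comparableRecords` includes the current record as its last entry, so that `previousComparableRecord` is the entry immediately before it. `build_window` is a hypothetical helper. The sketch only derives `stable` and `insufficient-history`; the conditions for `scope-changed` and `noisy` are not spelled out in this section and are deliberately omitted.

```python
def build_window(current, history, comparison_window_size, minimum_comparable_samples):
    """Build a TrendComparisonWindow-like dictionary for one lane."""
    if comparison_window_size < 1:
        raise ValueError("comparison_window_size must be at least 1")

    fingerprint = current["comparisonFingerprint"]
    comparable, excluded = [], []
    for record in history:  # history is ordered oldest first
        if record.get("comparisonFingerprint") == fingerprint:
            comparable.append(record)
        else:
            excluded.append(record)

    # Keep only the most recent records, with the current record last.
    comparable = (comparable + [current])[-comparison_window_size:]
    previous = comparable[-2] if len(comparable) >= 2 else None
    sample_count = len(comparable)
    status = "insufficient-history" if sample_count < minimum_comparable_samples else "stable"

    return {
        "currentRecord": current,
        "previousComparableRecord": previous,
        "comparableRecords": comparable,
        "excludedRecords": excluded,
        "windowStatus": status,
        "sampleCount": sample_count,
    }
```

Because the list is truncated to `comparison_window_size` before counting, `sampleCount` cannot exceed the window size in this sketch.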

## 4. LaneDriftAssessment

**Purpose**: Summarizes the current drift verdict for one lane using the bounded comparison window.

| Field | Type | Description |
|-------|------|-------------|
| `laneId` | string | Governed lane identifier. |
| `healthClass` | enum | `healthy`, `budget-near`, `trending-worse`, `regressed`, or `unstable`. |
| `deltaToPreviousSeconds` | number or null | Current runtime delta vs previous comparable run. |
| `deltaToPreviousPercent` | number or null | Percent delta vs previous comparable run. |
| `deltaToBaselineSeconds` | number or null | Current runtime delta vs lane baseline. |
| `deltaToBaselinePercent` | number or null | Percent delta vs lane baseline. |
| `budgetHeadroomSeconds` | number | Remaining headroom before budget breach. |
| `worseningStreak` | integer | Count of recent comparable records showing meaningful worsening. |
| `varianceObservedSeconds` | number | Effective variance observed across the active window. |
| `recalibrationRecommendation` | enum | `none`, `investigate`, `review-baseline`, or `review-budget`. |
| `summaryLine` | string | Human-readable explanation emitted into markdown summaries. |

**Validation Rules**

- `healthClass` may only be non-`unstable` when the comparison window has at least `minimumComparableSamples` comparable records.
- `recalibrationRecommendation` must remain separate from `healthClass`.
- `budgetHeadroomSeconds` may be negative only when the lane is over budget.
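
The delta and headroom fields can be derived from four numbers. The sketch below assumes that percent deltas are expressed relative to the reference value (previous run or baseline) and that headroom is budget minus current runtime, which makes it negative exactly when the lane is over budget. The helper names `delta` and `assess_deltas` are hypothetical.

```python
def delta(current, reference):
    """Return (seconds, percent) against a reference, or (None, None) without one."""
    if reference is None:
        return None, None
    seconds = current - reference
    # Percent is undefined for a zero reference, so report None in that case.
    percent = round(seconds / reference * 100, 2) if reference else None
    return round(seconds, 3), percent


def assess_deltas(wall_clock_seconds, previous_seconds, baseline_seconds, budget_seconds):
    """Compute the numeric fields of a LaneDriftAssessment."""
    prev_s, prev_p = delta(wall_clock_seconds, previous_seconds)
    base_s, base_p = delta(wall_clock_seconds, baseline_seconds)
    return {
        "deltaToPreviousSeconds": prev_s,
        "deltaToPreviousPercent": prev_p,
        "deltaToBaselineSeconds": base_s,
        "deltaToBaselinePercent": base_p,
        "budgetHeadroomSeconds": round(budget_seconds - wall_clock_seconds, 3),
    }
```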

## 5. HotspotTrendSnapshot

**Purpose**: Captures how the dominant runtime contributors changed between the current and previous comparable run.

| Field | Type | Description |
|-------|------|-------------|
| `laneId` | string | Governed lane identifier. |
| `familyDeltas` | array | Top family-level deltas with current seconds, previous seconds, and delta values. |
| `fileHotspots` | array | Top file hotspots with current/previous runtime and rank movement. |
| `newEntrants` | array | Families or files newly entering the visible hotspot set. |
| `droppedEntrants` | array | Families or files leaving the visible hotspot set. |
| `evidenceAvailability` | enum | `available` or `unavailable`, used when JUnit or attribution evidence is missing. |

**Validation Rules**

- Human-readable summaries must cap output at the policy's family/file limits.
- JSON evidence may retain more detail than the readable summary, but must not retain more slowest entries than `slowestEntryRetention` allows.
- If hotspot evidence is unavailable, the summary must say so explicitly.
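
A family-level delta computation might be sketched as follows. The inputs are assumed to be mappings from family name to runtime seconds, and ranking by absolute change is an assumption of this sketch, since the section does not define the ordering. `family_deltas` and `entrants` are hypothetical helpers.

```python
def family_deltas(current_totals, previous_totals, limit):
    """Return the top `limit` family deltas, largest absolute change first."""
    names = set(current_totals) | set(previous_totals)
    rows = []
    for name in names:
        current = current_totals.get(name, 0.0)
        previous = previous_totals.get(name, 0.0)
        rows.append({
            "family": name,
            "currentSeconds": current,
            "previousSeconds": previous,
            "deltaSeconds": current - previous,
        })
    # Ties are broken by family name so the output is deterministic.
    rows.sort(key=lambda row: (-abs(row["deltaSeconds"]), row["family"]))
    return rows[:limit]


def entrants(current_visible, previous_visible):
    """Return (newEntrants, droppedEntrants) between two visible hotspot sets."""
    new = [name for name in current_visible if name not in previous_visible]
    dropped = [name for name in previous_visible if name not in current_visible]
    return new, dropped
```

Passing `hotspotFamilyLimit` as `limit` keeps the readable summary within the policy cap.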

## 6. RecalibrationDecisionRecord

**Purpose**: Records structured evidence for a proposed, approved, or rejected baseline/budget recalibration.

| Field | Type | Description |
|-------|------|-------------|
| `laneId` | string | Governed lane identifier. |
| `targetType` | enum | `baseline` or `budget`. |
| `decisionStatus` | enum | `candidate`, `approved`, or `rejected`. |
| `evidenceRunRefs` | array | Comparable runs supporting the decision. |
| `previousValueSeconds` | number | Existing baseline or budget value. |
| `proposedValueSeconds` | number or null | Proposed replacement value. |
| `rationaleCode` | enum | `lane-scope-change`, `infrastructure-shift`, `post-improvement-reset`, `sustained-erosion`, `noise-rejected`, or `manual-hold`. |
| `recordedIn` | string | Active spec path or implementation PR reference where the decision is documented. |
| `notes` | string | Concise reviewer-facing explanation. |

**Validation Rules**

- Approved baseline changes require at least one accepted rationale tied to scope or environment truth.
- Approved budget changes require a stronger evidence window than approved baseline changes.
- Rejected decisions must retain the rejection reason.
- The artifact may propose candidates, but approval remains human-controlled.
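
The rules above could be enforced by a guard along these lines. Several details are assumptions made only for illustration: the mapping of "scope or environment" to the `lane-scope-change` and `infrastructure-shift` rationale codes, the concrete evidence-run counts (3 for baseline, 5 for budget), and the choice to treat a non-empty `notes` field as the retained rejection reason. None of these values come from the spec.

```python
# Assumption: these two rationale codes are the ones tied to scope or environment.
SCOPE_OR_ENVIRONMENT_RATIONALES = {"lane-scope-change", "infrastructure-shift"}

# Hypothetical thresholds; the spec only says budget needs a stronger window than baseline.
MIN_EVIDENCE_RUNS = {"baseline": 3, "budget": 5}


def validate_decision(record):
    """Return a list of rule violations for one RecalibrationDecisionRecord dictionary."""
    errors = []
    status = record["decisionStatus"]
    target = record["targetType"]
    if status == "approved":
        if target == "baseline" and record["rationaleCode"] not in SCOPE_OR_ENVIRONMENT_RATIONALES:
            errors.append("approved baseline changes need a scope or environment rationale")
        required = MIN_EVIDENCE_RUNS[target]
        if len(record["evidenceRunRefs"]) < required:
            errors.append(f"approved {target} changes need at least {required} evidence runs")
    if status == "rejected" and not record.get("notes", "").strip():
        errors.append("rejected decisions must retain the rejection reason")
    return errors
```

The guard validates recorded decisions only; it never sets `decisionStatus` to `approved` itself, which keeps approval with a human reviewer.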

## 7. TrendSummaryCycle

**Purpose**: Represents one generated trend-aware reporting cycle across the relevant lanes.

| Field | Type | Description |
|-------|------|-------------|
| `cycleId` | string | Reporting-cycle identifier, typically anchored to the current lane run or summary generation timestamp. |
| `generatedAt` | datetime | When the cycle summary was emitted. |
| `laneSummaries` | array | Per-lane summary entries containing `laneId`, current runtime, previous comparable runtime, baseline, budget, and the embedded drift assessment used by the readable summary surface. |
| `laneAssessments` | array | `LaneDriftAssessment` items for all relevant lanes. |
| `hotspotSnapshots` | array | `HotspotTrendSnapshot` items for lanes with available evidence. |
| `recalibrationDecisions` | array | Candidate, approved, or rejected recalibration records emitted for the cycle. |
| `artifactPublicationStatus` | array | Whether required current-run and history artifacts were published successfully. |
| `warnings` | array | Legibility notes such as missing comparable history or unavailable hotspot evidence. |

**Validation Rules**

- Every relevant primary lane must have exactly one `laneSummaries` entry and exactly one `LaneDriftAssessment` in `laneAssessments` per cycle.
- Each `laneSummaries` entry must expose the current runtime, previous comparable runtime, baseline, budget, and embedded health assessment needed by the readable summary surface.
- `warnings` must be explicit when any required evidence is unavailable.
- The cycle summary must stay readable without requiring a second dashboard surface.
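
The one-entry-per-primary-lane rule lends itself to a simple count check. The sketch below assumes the cycle is a dictionary keyed by the field names in the table and that the caller supplies the list of primary lanes, since this section does not enumerate which lanes are primary. `validate_cycle` is a hypothetical helper.

```python
from collections import Counter


def validate_cycle(cycle, primary_lanes):
    """Check that each primary lane has exactly one summary and one assessment."""
    errors = []
    summary_counts = Counter(entry["laneId"] for entry in cycle["laneSummaries"])
    assessment_counts = Counter(entry["laneId"] for entry in cycle["laneAssessments"])
    for lane in primary_lanes:
        if summary_counts[lane] != 1:
            errors.append(f"{lane}: expected 1 laneSummaries entry, found {summary_counts[lane]}")
        if assessment_counts[lane] != 1:
            errors.append(f"{lane}: expected 1 LaneDriftAssessment, found {assessment_counts[lane]}")
    return errors
```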

## State Transitions

### LaneDriftAssessment.healthClass

- `unstable` -> `healthy`: allowed once there are enough comparable samples and the lane is comfortably below budget without sustained worsening.
- `unstable` -> `budget-near`: allowed once there are enough comparable samples and budget headroom falls inside the near-budget window.
- `unstable` -> `trending-worse`: allowed once there are enough comparable samples and worsening exceeds the lane variance floor across the bounded window.
- `healthy` <-> `budget-near`: allowed as headroom enters or leaves the near-budget band.
- `healthy` or `budget-near` -> `trending-worse`: allowed when sustained worsening appears without a budget breach.
- `trending-worse` -> `regressed`: allowed when the lane breaches budget, or when repeated worsening is material enough that it can no longer be described as gradual erosion.
- Any state -> `unstable`: allowed when comparability breaks, history is insufficient, or the window is too noisy to classify reliably.
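
The transitions listed above can be encoded as a lookup table for use in a guard test. This sketch treats any transition not listed as disallowed (for example `regressed` -> `healthy`), which may be stricter than intended, and it assumes that remaining in the same state is always permitted. `ALLOWED_HEALTH_TRANSITIONS` and `is_allowed_transition` are hypothetical names.

```python
ALLOWED_HEALTH_TRANSITIONS = {
    "unstable": {"healthy", "budget-near", "trending-worse"},
    "healthy": {"budget-near", "trending-worse"},
    "budget-near": {"healthy", "trending-worse"},
    "trending-worse": {"regressed"},
    "regressed": set(),
}


def is_allowed_transition(previous, current):
    """Return True when moving from `previous` to `current` matches the listed rules."""
    if previous not in ALLOWED_HEALTH_TRANSITIONS or current not in ALLOWED_HEALTH_TRANSITIONS:
        raise ValueError("unknown health class")
    if current == previous:
        return True  # assumption: staying in the same class is always allowed
    if current == "unstable":
        return True  # any state may fall back to unstable
    return current in ALLOWED_HEALTH_TRANSITIONS[previous]
```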

### RecalibrationDecisionRecord.decisionStatus

- `candidate` -> `approved`: allowed only by explicit human review with structured evidence.
- `candidate` -> `rejected`: allowed when the evidence is noisy, incomplete, or policy says repository truth should not move.
- `approved` and `rejected`: terminal statuses for the recorded decision.
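
The decision lifecycle above is small enough to sketch directly. The `reviewer` argument is a hypothetical stand-in for "explicit human review"; the data model has no such field, so the sketch requires it for approval but does not store it. `transition_decision` is likewise an illustrative name.

```python
TERMINAL_STATUSES = {"approved", "rejected"}


def transition_decision(record, new_status, reviewer=None):
    """Return a copy of `record` with the new status, enforcing the listed transitions."""
    current = record["decisionStatus"]
    if current in TERMINAL_STATUSES:
        raise ValueError(f"{current} is terminal and cannot change")
    if new_status not in TERMINAL_STATUSES:
        raise ValueError("a candidate may only become approved or rejected")
    if new_status == "approved" and not reviewer:
        # Approval stays human-controlled: tooling may propose but never approve.
        raise ValueError("approval requires an explicit human reviewer")
    return {**record, "decisionStatus": new_status}
```

Returning a new dictionary instead of mutating the input keeps the original candidate record available as evidence.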