# Data Model: Test Runtime Trend Reporting & Baseline Recalibration

This feature adds repository-owned governance artifacts only. It does not add product database tables. All objects below are implemented as manifest metadata, generated JSON payloads, markdown summaries, or guard-test fixtures derived from the existing lane report outputs.

## 1. LaneTrendPolicy

**Purpose**: Defines the lane-specific rules for bounded history retention, comparable-window evaluation, hotspot visibility, and recalibration guidance.

| Field | Type | Description |
|-------|------|-------------|
| `laneId` | string | Canonical lane identifier (`fast-feedback`, `confidence`, `heavy-governance`, `browser`, `junit`, `profiling`). |
| `workflowProfile` | string | Workflow profile that owns the lane history source in CI. |
| `retentionLimit` | integer | Max history records retained for the lane. |
| `comparisonWindowSize` | integer | Number of recent comparable records used for drift evaluation. |
| `minimumComparableSamples` | integer | Required sample count before a stable non-`unstable` health class is allowed. |
| `varianceFloorSeconds` | integer | Minimum meaningful delta for the lane, aligned with current enforcement tolerance. |
| `nearBudgetHeadroomSeconds` | integer | Headroom threshold for `budget-near`. |
| `hotspotFamilyLimit` | integer | Max family deltas shown in readable summaries. |
| `hotspotFileLimit` | integer | Max file hotspots shown in readable summaries. |
| `slowestEntryRetention` | integer | Max slowest test entries retained in JSON evidence. |
| `recalibrationPolicy` | array | Rule summary for acceptable baseline and budget recalibration triggers. |

**Relationships**

- One `LaneTrendPolicy` governs many `LaneTrendRecord` entries for the same lane.
- One `LaneTrendPolicy` informs one `TrendComparisonWindow`, one `LaneDriftAssessment`, and zero or more `RecalibrationDecisionRecord` entries per reporting cycle.

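
As a rough illustration only, the `LaneTrendPolicy` fields could be carried as a small typed structure, with the numeric validation rules of this section checked when the manifest is loaded. The Python names (`LaneTrendPolicy` as a dataclass, `validate_policy`, `enforcement_tolerance_seconds`) and every example value are hypothetical; the document does not prescribe an implementation language or concrete limits, and the cross-lane rule about primary versus support lanes is left out because it needs more than one policy.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LaneTrendPolicy:
    """Snake-case mirror of the manifest fields (hypothetical mapping)."""
    lane_id: str
    workflow_profile: str
    retention_limit: int
    comparison_window_size: int
    minimum_comparable_samples: int
    variance_floor_seconds: int
    near_budget_headroom_seconds: int
    hotspot_family_limit: int
    hotspot_file_limit: int
    slowest_entry_retention: int
    recalibration_policy: tuple = ()


def validate_policy(policy, enforcement_tolerance_seconds):
    """Return a list of rule violations; an empty list means the policy passes.

    enforcement_tolerance_seconds is an assumed input standing in for the
    lane's existing enforcement tolerance.
    """
    errors = []
    if policy.retention_limit < policy.comparison_window_size:
        errors.append("retentionLimit must be >= comparisonWindowSize")
    if policy.minimum_comparable_samples < 3:
        errors.append("minimumComparableSamples must be at least 3")
    if policy.variance_floor_seconds < enforcement_tolerance_seconds:
        errors.append("varianceFloorSeconds must align with or exceed the enforcement tolerance")
    return errors


# Illustrative values only; none of these numbers come from the spec.
example_policy = LaneTrendPolicy(
    lane_id="fast-feedback",
    workflow_profile="example-profile",
    retention_limit=20,
    comparison_window_size=10,
    minimum_comparable_samples=3,
    variance_floor_seconds=15,
    near_budget_headroom_seconds=30,
    hotspot_family_limit=5,
    hotspot_file_limit=5,
    slowest_entry_retention=10,
)

# A deliberately invalid variant: window larger than retention, too few samples.
broken_policy = replace(example_policy, retention_limit=5, minimum_comparable_samples=2)

print(validate_policy(example_policy, enforcement_tolerance_seconds=15))
print(validate_policy(broken_policy, enforcement_tolerance_seconds=15))
```

Returning a list of violations rather than raising on the first one is a design choice for this sketch: a guard test can then report every broken rule for a lane in a single run.
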

**Validation Rules**

- `retentionLimit` must be greater than or equal to `comparisonWindowSize`.
- `minimumComparableSamples` must be at least 3.
- `varianceFloorSeconds` must align with or exceed the lane's existing enforcement tolerance.
- Primary lanes use a larger retention window than support lanes.

## 2. LaneTrendRecord

**Purpose**: Captures the per-run evidence snapshot that can safely be compared over time.

| Field | Type | Description |
|-------|------|-------------|
| `runRef` | string | Stable run reference from CI or local execution. |
| `laneId` | string | Governed lane identifier. |
| `workflowId` | string | Workflow profile or logical workflow owner for the run. |
| `triggerClass` | string | Pull request, mainline push, manual, scheduled, or local classification. |
| `generatedAt` | datetime | When the record was emitted. |
| `wallClockSeconds` | number | Current lane runtime in seconds. |
| `baselineSeconds` | number or null | Current comparison baseline for the lane if defined. |
| `baselineSource` | string | Manifest source or comparison source that supplied the baseline. |
| `budgetSeconds` | number | Current lane budget threshold in seconds. |
| `budgetStatus` | string | Current lane budget status from the existing budget evaluator. |
| `blockingStatus` | string | Whether the current CI context blocks on this outcome. |
| `comparisonFingerprint` | string | Hash or structured fingerprint capturing comparability boundaries. |
| `classificationTotals` | array | Runtime grouped by current classification totals. |
| `familyTotals` | array | Runtime grouped by current family totals. |
| `hotspotFiles` | array | Current dominant hotspot files. |
| `slowestEntries` | array | Current slowest test entries, capped by policy. |
| `artifactRefs` | array | References to the summary, report, budget, JUnit, and history artifacts backing the record. |

**Validation Rules**

- A record must derive from the same lane's current `summary.md`, `report.json`, `budget.json`, and available JUnit output.
- `comparisonFingerprint` must be present for any record eligible for comparison.
- `wallClockSeconds`, `budgetSeconds`, and `generatedAt` are required.
- `slowestEntries` must not exceed the lane policy retention cap.

## 3. TrendComparisonWindow

**Purpose**: Represents the bounded comparable history used to evaluate one lane in one reporting cycle.

| Field | Type | Description |
|-------|------|-------------|
| `laneId` | string | Governed lane identifier. |
| `policyRef` | string | Reference to the governing `LaneTrendPolicy`. |
| `currentRecord` | object | The latest `LaneTrendRecord`. |
| `previousComparableRecord` | object or null | The most recent prior comparable record, if one exists. |
| `comparableRecords` | array | Ordered comparable records used for trend evaluation. |
| `excludedRecords` | array | Recent records skipped because of fingerprint mismatch or invalid evidence. |
| `windowStatus` | enum | `stable`, `insufficient-history`, `scope-changed`, or `noisy`. |
| `sampleCount` | integer | Number of comparable records in the active window. |

**Validation Rules**

- Every comparable record must share the same `comparisonFingerprint`.
- `sampleCount` may not exceed `comparisonWindowSize`.
- `previousComparableRecord` must be the immediately preceding entry in `comparableRecords` when present.
- `windowStatus` becomes `insufficient-history` whenever `sampleCount` is below `minimumComparableSamples`.

## 4. LaneDriftAssessment

**Purpose**: Summarizes the current drift verdict for one lane using the bounded comparison window.

| Field | Type | Description |
|-------|------|-------------|
| `laneId` | string | Governed lane identifier. |
| `healthClass` | enum | `healthy`, `budget-near`, `trending-worse`, `regressed`, or `unstable`. |
| `deltaToPreviousSeconds` | number or null | Current runtime delta vs previous comparable run. |
| `deltaToPreviousPercent` | number or null | Percent delta vs previous comparable run. |
| `deltaToBaselineSeconds` | number or null | Current runtime delta vs lane baseline. |
| `deltaToBaselinePercent` | number or null | Percent delta vs lane baseline. |
| `budgetHeadroomSeconds` | number | Remaining headroom before budget breach. |
| `worseningStreak` | integer | Count of recent comparable records showing meaningful worsening. |
| `varianceObservedSeconds` | number | Effective variance observed across the active window. |
| `recalibrationRecommendation` | enum | `none`, `investigate`, `review-baseline`, or `review-budget`. |
| `summaryLine` | string | Human-readable explanation emitted into markdown summaries. |

**Validation Rules**

- `healthClass` may only be non-`unstable` when the comparison window has at least `minimumComparableSamples` comparable records.
- `recalibrationRecommendation` must remain separate from `healthClass`.
- `budgetHeadroomSeconds` may be negative only when the lane is over budget.

## 5. HotspotTrendSnapshot

**Purpose**: Captures how the dominant runtime contributors changed between the current and previous comparable run.

| Field | Type | Description |
|-------|------|-------------|
| `laneId` | string | Governed lane identifier. |
| `familyDeltas` | array | Top family-level deltas with current seconds, previous seconds, and delta values. |
| `fileHotspots` | array | Top file hotspots with current/previous runtime and rank movement. |
| `newEntrants` | array | Families or files newly entering the visible hotspot set. |
| `droppedEntrants` | array | Families or files leaving the visible hotspot set. |
| `evidenceAvailability` | enum | `available` or `unavailable`, used when JUnit or attribution evidence is missing. |

**Validation Rules**

- Human-readable summaries must cap output at the policy's family/file limits.
- JSON evidence may retain more detail, but must not exceed `slowestEntryRetention`.
- If hotspot evidence is unavailable, the summary must say so explicitly.

## 6. RecalibrationDecisionRecord

**Purpose**: Records structured evidence for a proposed, approved, or rejected baseline/budget recalibration.

| Field | Type | Description |
|-------|------|-------------|
| `laneId` | string | Governed lane identifier. |
| `targetType` | enum | `baseline` or `budget`. |
| `decisionStatus` | enum | `candidate`, `approved`, or `rejected`. |
| `evidenceRunRefs` | array | Comparable runs supporting the decision. |
| `previousValueSeconds` | number | Existing baseline or budget value. |
| `proposedValueSeconds` | number or null | Proposed replacement value. |
| `rationaleCode` | enum | `lane-scope-change`, `infrastructure-shift`, `post-improvement-reset`, `sustained-erosion`, `noise-rejected`, or `manual-hold`. |
| `recordedIn` | string | Active spec path or implementation PR reference where the decision is documented. |
| `notes` | string | Concise reviewer-facing explanation. |

**Validation Rules**

- Approved baseline changes require at least one accepted rationale tied to scope or environment truth.
- Approved budget changes require a stronger evidence window than approved baseline changes.
- Rejected decisions must retain the rejection reason.
- The artifact may propose candidates, but approval remains human-controlled.

## 7. TrendSummaryCycle

**Purpose**: Represents one generated trend-aware reporting cycle across the relevant lanes.

| Field | Type | Description |
|-------|------|-------------|
| `cycleId` | string | Reporting-cycle identifier, typically anchored to the current lane run or summary generation timestamp. |
| `generatedAt` | datetime | When the cycle summary was emitted. |
| `laneSummaries` | array | Per-lane summary entries containing `laneId`, current runtime, previous comparable runtime, baseline, budget, and the embedded drift assessment used by the readable summary surface. |
| `laneAssessments` | array | `LaneDriftAssessment` items for all relevant lanes. |
| `hotspotSnapshots` | array | `HotspotTrendSnapshot` items for lanes with available evidence. |
| `recalibrationDecisions` | array | Candidate, approved, or rejected recalibration records emitted for the cycle. |
| `artifactPublicationStatus` | array | Whether required current-run and history artifacts were published successfully. |
| `warnings` | array | Legibility notes such as missing comparable history or unavailable hotspot evidence. |

**Validation Rules**

- Every relevant primary lane must have exactly one `laneSummaries` entry and exactly one `LaneDriftAssessment` per cycle.
- Each `laneSummaries` entry must expose the current runtime, previous comparable runtime, baseline, budget, and embedded health assessment needed by the readable summary surface.
- `warnings` must be explicit when any required evidence is unavailable.
- The cycle summary must stay readable without requiring a second dashboard surface.

## State Transitions

### LaneDriftAssessment.healthClass

- `unstable` -> `healthy`: allowed once there are enough comparable samples and the lane is comfortably below budget without sustained worsening.
- `unstable` -> `budget-near`: allowed once there are enough comparable samples and budget headroom falls inside the near-budget window.
- `unstable` -> `trending-worse`: allowed once there are enough comparable samples and worsening exceeds the lane variance floor across the bounded window.
- `healthy` <-> `budget-near`: allowed as headroom enters or leaves the near-budget band.
- `healthy` or `budget-near` -> `trending-worse`: allowed when sustained worsening appears without a budget breach.
- `trending-worse` -> `regressed`: allowed when the lane breaches budget or shows a materially worse repeated trend strong enough to stop calling it merely erosion.
- Any state -> `unstable`: allowed when comparability breaks, history is insufficient, or the window is too noisy to classify reliably.

### RecalibrationDecisionRecord.decisionStatus

- `candidate` -> `approved`: allowed only by explicit human review with structured evidence.
- `candidate` -> `rejected`: allowed when the evidence is noisy, incomplete, or policy says repository truth should not move.
- `approved` and `rejected`: terminal statuses for the recorded decision.
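
The two transition lists above can be read as small lookup tables. The sketch below is one literal encoding of them, suitable for a guard test. It models only which state pairs are listed, not the conditions attached to each transition (sample counts, headroom, human review). The function names are hypothetical, and treating "stay in the same state" as always permitted is an assumption, since the lists only describe changes.

```python
# Transitions listed for LaneDriftAssessment.healthClass, excluding the
# blanket "any state -> unstable" rule, which is handled in the function.
HEALTH_TRANSITIONS = {
    "unstable": {"healthy", "budget-near", "trending-worse"},
    "healthy": {"budget-near", "trending-worse"},
    "budget-near": {"healthy", "trending-worse"},
    "trending-worse": {"regressed"},
    "regressed": set(),  # no outgoing transition is listed except to unstable
}

# Transitions listed for RecalibrationDecisionRecord.decisionStatus.
DECISION_TRANSITIONS = {
    "candidate": {"approved", "rejected"},
    "approved": set(),  # terminal
    "rejected": set(),  # terminal
}


def health_transition_allowed(current, target):
    """Check a healthClass change against the listed transitions."""
    for state in (current, target):
        if state not in HEALTH_TRANSITIONS:
            raise ValueError(f"unknown healthClass: {state}")
    if target == current:
        return True  # assumption: remaining in a state is not a transition
    if target == "unstable":
        return True  # any state may fall back to unstable
    return target in HEALTH_TRANSITIONS[current]


def decision_transition_allowed(current, target):
    """Check a decisionStatus change; approved and rejected are terminal."""
    for state in (current, target):
        if state not in DECISION_TRANSITIONS:
            raise ValueError(f"unknown decisionStatus: {state}")
    return target in DECISION_TRANSITIONS[current]


print(health_transition_allowed("healthy", "budget-near"))
print(health_transition_allowed("unstable", "regressed"))
print(decision_transition_allowed("approved", "candidate"))
```

Read this way, the lists do not include a direct path from `unstable` to `regressed`, or any path out of `regressed` other than back to `unstable`. Whether that is intended is a question for the spec owners; the sketch only mirrors what is written.
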