Benchmark
Tracked since May 19, 2026

Add hard-gated cold-start walltime regression checks for agent-deck CLI

This change adds a dedicated walltime regression suite for the CLI cold-start path, making startup latency a CI-gated performance contract instead of a latent post-merge issue. The primary gate centers on `TestPerf_ColdStart_Help` and `TestPerf_ColdStart_Version`, with `make bench` retained as advisory trend output.

agent-deck CLI cold start / TestPerf_* / ColdBudget / TrimmedMean

What Happened

  • This change adds a dedicated walltime regression suite for the CLI cold-start path, making startup latency a CI-gated performance contract instead of a latent post-merge issue. The primary gate centers on `TestPerf_ColdStart_Help` and `TestPerf_ColdStart_Version`, with `make bench` retained as advisory trend output.
  • 1 evidence item attached for review.

What is Different

Before

Scattered source updates, isolated context, and manual follow-up across multiple feeds.

Now

Introduced a hard-gated cold-start performance test path (`TestPerf_ColdStart_Help`, `TestPerf_ColdStart_Version`) with budget-based walltime assertions in CI, including shared helper utilities (`ColdBudget`, `TrimmedMean`) and workflow wiring so startup regressions are automatically detected during PR checks.
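As a rough illustration of what a cold-start walltime measurement involves, the sketch below launches a fresh process for every sample and records the wall-clock time of each run. It is not taken from the change itself: `sampleWalltime` and `coldStart` are hypothetical names, and the `./agent-deck` path and `--help` flag are assumptions inferred from the test name `TestPerf_ColdStart_Help`.

```go
package main

import (
	"fmt"
	"os/exec"
	"time"
)

// sampleWalltime runs fn n times and records the wall-clock duration of each run.
// It stops at the first error so a failing binary is not mistaken for a fast one.
func sampleWalltime(n int, fn func() error) ([]time.Duration, error) {
	samples := make([]time.Duration, 0, n)
	for i := 0; i < n; i++ {
		start := time.Now()
		if err := fn(); err != nil {
			return nil, fmt.Errorf("run %d: %w", i, err)
		}
		samples = append(samples, time.Since(start))
	}
	return samples, nil
}

// coldStart returns a runner that spawns a new process on every call,
// so each sample pays the full process startup cost.
func coldStart(bin string, args ...string) func() error {
	return func() error {
		return exec.Command(bin, args...).Run()
	}
}

func main() {
	// Hypothetical binary path and flag; adjust to the real build output.
	samples, err := sampleWalltime(11, coldStart("./agent-deck", "--help"))
	if err != nil {
		fmt.Println("measurement failed:", err)
		return
	}
	fmt.Println("samples:", samples)
}
```

Passing the runner as a function keeps the sampling loop independent of how the process is launched, which makes the loop itself easy to test without a built binary.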

Why Track This

Why It Matters

This protects operators and users from silent CLI startup regressions: a change that makes cold starts materially slower now fails the CI performance gate during PR checks, before it is merged or released. The gate classifies cold-start timing as COLD, applies the stricter budget formula (`max(base×5, 1ms)` with a CI multiplier of 2.0), and measures with an 11-sample trimmed mean to reduce noise. Watch for CI timing instability, and verify whether other latency-dominant paths (such as storage-backed lifecycle operations) still bypass this guard.
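The budget arithmetic and the trimmed mean can be sketched as follows. This is a minimal illustration under stated assumptions, not the helpers' actual code: the signatures of `ColdBudget` and `TrimmedMean` are guessed, the CI multiplier is assumed to apply after the 1ms floor, the trim is assumed to drop one sample from each end, and the 3ms base and the sample values are invented for the example.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// ColdBudget returns the allowed walltime for a COLD measurement:
// max(base*5, 1ms), doubled when running in CI.
// Assumption: the floor is applied first, then the CI multiplier.
func ColdBudget(base time.Duration, ci bool) time.Duration {
	budget := base * 5
	if budget < time.Millisecond {
		budget = time.Millisecond
	}
	if ci {
		budget *= 2 // CI multiplier of 2.0
	}
	return budget
}

// TrimmedMean sorts a copy of samples, drops trim values from each end,
// and averages what is left. It returns 0 if nothing would remain.
func TrimmedMean(samples []time.Duration, trim int) time.Duration {
	if trim < 0 || len(samples) <= 2*trim {
		return 0
	}
	sorted := append([]time.Duration(nil), samples...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	kept := sorted[trim : len(sorted)-trim]
	var sum time.Duration
	for _, d := range kept {
		sum += d
	}
	return sum / time.Duration(len(kept))
}

// millis builds a sample slice from whole-millisecond values (demo helper).
func millis(vals ...int) []time.Duration {
	out := make([]time.Duration, len(vals))
	for i, v := range vals {
		out[i] = time.Duration(v) * time.Millisecond
	}
	return out
}

func main() {
	// Eleven invented samples; the 90ms outlier stands in for a noisy CI run.
	samples := millis(12, 11, 13, 12, 90, 11, 12, 13, 12, 11, 12)
	mean := TrimmedMean(samples, 1)
	budget := ColdBudget(3*time.Millisecond, true) // hypothetical 3ms base
	fmt.Printf("trimmed mean=%v budget=%v pass=%v\n", mean, budget, mean <= budget)
}
```

With these invented numbers the 90ms outlier is discarded, the trimmed mean is 12ms, and the CI budget is 30ms, so the check passes. The 1ms floor matters for very small baselines, where `base×5` would fall near the clock's resolution.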


What To Watch Next

  • Watch whether agent-deck CLI cold start becomes a repeated pattern.
  • Track follow-up changes around Observability and Tracing.
  • Compare future signals against this evidence trail.
  • Re-check risk flags: ci_timing_variance_false_positives, clock_resolution_floor_sensitivity.
Risk flags: ci_timing_variance_false_positives / clock_resolution_floor_sensitivity / cold_start_measurement_noise_at_scale / coverage_gap_storage_layer_lifecycle

Supporting Evidence