Code
Tracked since May 19, 2026

Make cancel endpoint idempotent for already-interrupted runs

Fixed a race in `cancel_run` where two concurrent cancel requests could both pass the pre-check and one call returned `409` after another had already interrupted the run; the API now re-checks state after `cancel()` fails and returns `202` when the run is already interrupted or already cleaned up.

bytedance/deer-flow, cancel_run, run status, interrupted

What Happened

Fixed a race in `cancel_run` where two concurrent cancel requests could both pass the pre-check and one call returned `409` after another had already interrupted the run; the API now re-checks state after `cancel()` fails and returns `202` when the run is already interrupted or already cleaned up.
1 evidence item attached for review.

What is Different

Before

Two concurrent cancel requests could both pass the pre-check; once one of them had interrupted the run, the other's `cancel()` call failed and the handler returned `409`.

Now

Added explicit post-`cancel()` re-read logic in the run-cancel path: if cancellation fails, the handler now inspects the latest run record and returns idempotent success for `interrupted` or missing records (cleanup races), with conflict only for completed/error/timeout states.
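The re-read step described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `RunRecord`, `InMemoryRunManager`, and the status strings other than `interrupted` are hypothetical stand-ins, and the pre-check from the real handler is left out so that only the path after a failed `cancel()` is shown.

```python
from dataclasses import dataclass
from typing import Dict, Optional

ACCEPTED = 202
CONFLICT = 409


@dataclass
class RunRecord:
    run_id: str
    status: str  # assumed values: "running", "interrupted", "completed", ...


class InMemoryRunManager:
    """Hypothetical stand-in for the gateway's run store."""

    def __init__(self) -> None:
        self._runs: Dict[str, RunRecord] = {}

    def add(self, record: RunRecord) -> None:
        self._runs[record.run_id] = record

    def get(self, run_id: str) -> Optional[RunRecord]:
        return self._runs.get(run_id)

    def cancel(self, run_id: str) -> bool:
        """Interrupt a running run; return False if there is nothing to interrupt."""
        record = self._runs.get(run_id)
        if record is None or record.status != "running":
            return False
        record.status = "interrupted"
        return True


def cancel_run(manager: InMemoryRunManager, run_id: str) -> int:
    """Return the HTTP status a cancel request would receive in this sketch."""
    if manager.cancel(run_id):
        return ACCEPTED

    # cancel() failed: re-read the latest record instead of returning 409 directly.
    latest = manager.get(run_id)
    if latest is None or latest.status == "interrupted":
        # Another request already interrupted the run, or cleanup removed it.
        return ACCEPTED
    # The run reached some other state (for example completed, error, timeout).
    return CONFLICT
```

With this shape, a second cancel for the same run gets `202` just like the first, while a cancel against a run that already finished still gets `409`.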

Why Track This

Why It Matters

Operators and integrations that send duplicate cancel requests (because of retries, timeouts, or concurrent controllers) now get a deterministic success response instead of intermittent `409` failures when a run was already interrupted. This reduces false alarms and unnecessary re-cancel loops while still protecting against canceling work that has already finished. After a failed `cancel()` call, the gateway re-fetches the run state and maps `interrupted` or absent records to `202`, while preserving `409` for runs that have already finished (completed, error, or timeout). Watch for clients that previously treated any `409` as a terminal cancel error, and monitor conflict-rate telemetry under high-concurrency cancellation storms to catch any remaining race regressions.
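On the client side, the status codes above might be interpreted along these lines. This is a hedged sketch only: the function names and outcome labels are hypothetical, and the only facts taken from the change are that `202` now covers duplicate cancels and `409` is reserved for runs that already finished.

```python
from collections import Counter
from typing import Iterable


def classify_cancel_response(status_code: int) -> str:
    """Map a cancel response code to a coarse outcome label (labels are illustrative)."""
    if status_code == 202:
        # Accepted: the run was interrupted, was already interrupted, or was cleaned up.
        return "stopped"
    if status_code == 409:
        # Conflict: the run had already finished, so there was nothing to cancel.
        return "finished"
    return "unexpected"


def conflict_rate(status_codes: Iterable[int]) -> float:
    """Fraction of cancel responses that were 409, for simple telemetry."""
    counts = Counter(classify_cancel_response(code) for code in status_codes)
    total = sum(counts.values())
    return counts["finished"] / total if total else 0.0
```

A client that treats `finished` as "nothing left to do" rather than as an error avoids re-cancel loops, and a rising `conflict_rate` during cancellation storms would be one signal that a race remains.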

Impact

What To Watch Next

Watch whether bytedance/deer-flow becomes a repeated pattern.
Track follow-up changes around Agent Orchestration Platforms.
Compare future signals against this evidence trail.
Re-check risk flags: idempotent_cancel_handling_client_expectations, remaining_race_conditions_at_record_cleanup.

Risk flags: idempotent_cancel_handling_client_expectations / remaining_race_conditions_at_record_cleanup / high_concurrency_cancel_conflict_visibility / retry_backoff_interactions_with_202

Supporting Evidence

GITHUB PULL REQUEST
High Trust

bytedance/deer-flow PR #3057: fix(gateway): make cancel idempotent for already-interrupted runs

The PR changes gateway cancel logic so duplicate or racing cancel calls are handled as successful idempotent operations in interrupted/cleanup races, while keeping `409` for genuinely non-cancelable completed runs.