Code / Tracked since May 19, 2026

Preserve interrupted run partial AI messages in journal

Fixes a user-visible data-loss path: when a user stopped generation mid-stream, `on_llm_end` never wrote the message back, so the partial assistant output was not recorded and disappeared after refresh. The change buffers streamed chunks per message and writes a partial AI message when a run is interrupted, so interrupted sessions keep their context and the history of an interrupted run stays usable.

RunJournal / AIMessageChunk / RunStatus.interrupted / record_partial_ai_message

What Happened

When a user stopped generation mid-stream, `on_llm_end` did not fire for the active LLM stream, so nothing was written back: the partial assistant output was never recorded and disappeared after refresh. The change adds buffered per-message chunk tracking and writes a partial AI message when a run is interrupted, so interrupted sessions no longer lose context.
1 evidence item attached for review.

What is Different

Before

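Stopping generation mid-stream left `RunJournal` with no record of the partial output, because `on_llm_end` did not fire for the active LLM stream; the response was gone after a page refresh.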

Now

Adds interruption-safe persistence for streaming LLM output by buffering `AIMessageChunk` payloads per stable message ID in the worker, then writing them as `llm.ai.partial` on interrupted exits; adds completed-message ID tracking in `RunJournal` to avoid duplicate writes when completion and interruption overlap.
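The buffering and deduplication described above can be sketched roughly as follows. This is a minimal illustration, not the code from the pull request: `ChunkBuffer`, `SketchJournal`, `flush_on_interrupt`, and the `llm.ai.final` event name are hypothetical stand-ins, and plain strings take the place of `AIMessageChunk` payloads. Only `llm.ai.partial`, `on_llm_end`, and `record_partial_ai_message` are names taken from the summary.

```python
from dataclasses import dataclass, field


@dataclass
class ChunkBuffer:
    """Hypothetical per-message buffer for streamed text."""

    parts: dict[str, list[str]] = field(default_factory=dict)

    def add(self, message_id: str, text: str) -> None:
        # Chunks that share a message ID belong to the same assistant message.
        self.parts.setdefault(message_id, []).append(text)

    def drain(self) -> dict[str, str]:
        # Join each message's chunks in arrival order, then empty the buffer.
        joined = {mid: "".join(chunks) for mid, chunks in self.parts.items()}
        self.parts.clear()
        return joined


@dataclass
class SketchJournal:
    """Hypothetical stand-in for RunJournal; not the project's class."""

    events: list[dict] = field(default_factory=list)
    completed_ids: set[str] = field(default_factory=set)

    def on_llm_end(self, message_id: str, content: str) -> None:
        # Normal completion: record the full message and remember its ID.
        # "llm.ai.final" is a placeholder event name for this sketch.
        self.events.append(
            {"type": "llm.ai.final", "id": message_id, "content": content}
        )
        self.completed_ids.add(message_id)

    def record_partial_ai_message(self, message_id: str, content: str) -> bool:
        # Skip IDs that were already finalized, and skip empty buffers.
        if message_id in self.completed_ids or not content:
            return False
        self.events.append(
            {"type": "llm.ai.partial", "id": message_id, "content": content}
        )
        return True


def flush_on_interrupt(buffer: ChunkBuffer, journal: SketchJournal) -> int:
    """Write every buffered message as a partial; return how many were written."""
    return sum(
        journal.record_partial_ai_message(mid, text)
        for mid, text in buffer.drain().items()
    )
```

In this sketch, a message that finished normally before the interruption stays a single final entry, while a message that was still streaming is written once as `llm.ai.partial`.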

Why Track This

Why It Matters

Users who cancel a generation with Stop now keep the partial assistant output in run history after refresh, so they can continue from the last emitted content instead of seeing an empty recovery state. The worker accumulates stream chunks and persists them via `journal.record_partial_ai_message` when the run ends with `RunStatus.interrupted`, while `RunJournal` skips message IDs already finalized through `on_llm_end` to avoid duplicate entries. Teams should check that message IDs stay stable across chunks, and that the caches for interrupted sessions are bounded and cleaned up so repeated rapid cancellations do not accumulate stale partial data.
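To make the interruption path concrete, here is a small sketch of a worker whose stream is cancelled partway through. It assumes that Stop is delivered as an `asyncio` task cancellation, which the summary does not state; `fake_llm_stream`, `run_worker`, `journal_write`, and `demo` are hypothetical names, and this `RunStatus` enum is a local stand-in that borrows only the `interrupted` member name.

```python
import asyncio
from enum import Enum


class RunStatus(str, Enum):
    # Local stand-in; only the "interrupted" member name comes from the summary.
    success = "success"
    interrupted = "interrupted"


async def fake_llm_stream():
    # Emits two chunks, then stalls like a model that is still generating.
    yield "msg-1", "Hello, "
    yield "msg-1", "wor"
    await asyncio.sleep(3600)
    yield "msg-1", "ld"


async def run_worker(stream, journal_write):
    """Consume (message_id, text) chunks; flush buffered text if cancelled."""
    buffers: dict[str, list[str]] = {}
    try:
        async for message_id, text in stream:
            buffers.setdefault(message_id, []).append(text)
        status = RunStatus.success
    except asyncio.CancelledError:
        # Cancellation ends the stream early, so no end-of-stream callback runs.
        status = RunStatus.interrupted
    if status is RunStatus.interrupted:
        for message_id, parts in buffers.items():
            journal_write(message_id, "".join(parts))
    return status


async def demo():
    written = []
    task = asyncio.create_task(
        run_worker(fake_llm_stream(), lambda mid, text: written.append((mid, text)))
    )
    await asyncio.sleep(0.01)  # let the worker consume the first two chunks
    task.cancel()              # simulate the user pressing Stop
    status = await task
    return status, written
```

Running `asyncio.run(demo())` returns the interrupted status together with the text emitted before the cancellation. The sketch swallows `CancelledError` so it can return a status; production code would more commonly flush the buffer and then re-raise.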

Impact

What To Watch Next

Watch whether RunJournal becomes a repeated pattern.
Track follow-up changes around Agent Evaluation and Observability.
Compare future signals against this evidence trail.
Re-check risk flags: message_id_stability_changes, interrupted_then_completed_race.

Risk flags: message_id_stability_changes / interrupted_then_completed_race / partial_chunk_cache_growth / dedupe_set_consistency
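The `partial_chunk_cache_growth` flag is about buffered partial text piling up across many cancelled runs. One way to bound such a cache is sketched below; the `BoundedChunkCache` class, its limits, and its eviction policy (drop the least recently updated message, truncate overlong text) are illustrative assumptions, not behavior described in the pull request.

```python
from collections import OrderedDict


class BoundedChunkCache:
    """Hypothetical bounded cache for partial text, keyed by message ID."""

    def __init__(self, max_messages: int = 64, max_chars: int = 100_000) -> None:
        self.max_messages = max_messages
        self.max_chars = max_chars
        self._data: OrderedDict[str, str] = OrderedDict()

    def add(self, message_id: str, text: str) -> None:
        # Re-inserting moves the entry to the "most recently updated" end.
        current = self._data.pop(message_id, "")
        # Cap each message; overflow beyond max_chars is dropped.
        self._data[message_id] = (current + text)[: self.max_chars]
        # Evict the least recently updated messages once over capacity.
        while len(self._data) > self.max_messages:
            self._data.popitem(last=False)

    def pop(self, message_id: str) -> str:
        # Remove an entry after it has been written or finalized.
        return self._data.pop(message_id, "")

    def __len__(self) -> int:
        return len(self._data)
```

Calling `pop` once a message has been written as a partial, or finalized through the normal path, keeps entries from outliving the run that produced them.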

Supporting Evidence

GITHUB PULL REQUEST / High Trust

bytedance/deer-flow PR #3039: fix(journal): preserve partial AI response on run interruption (#3036)

When users stop generation mid-call, `on_llm_end` was not firing for the active LLM stream, so `RunJournal` stored no partial output and the response was lost after page refresh; PR #3039 adds partial accumulation plus interrupted-run persistence.