Code
Tracked since May 19, 2026

Persist interrupted LLM partial responses in run journal

In DeerFlow PR #3039, the run worker now buffers streamed `AIMessageChunk` data by stable message ID and, on `RunStatus.interrupted`, writes it as `llm.ai.partial` journal output so a stopped run can return partial assistant text instead of an empty run record after refresh.

RunJournal / AIMessageChunk / partial_ai_content / record_partial_ai_message

What Happened

In DeerFlow PR #3039, the run worker now buffers streamed `AIMessageChunk` data by stable message ID and, on `RunStatus.interrupted`, writes it as `llm.ai.partial` journal output so a stopped run can return partial assistant text instead of an empty run record after refresh.
1 evidence item attached for review.

What is Different

Before

When a run was stopped mid-generation, the streamed assistant output that had not yet completed was lost, and the run record appeared empty after refresh.

Now

The PR adds interruption-safe persistence of partial messages: the worker buffers streamed chunks during LLM generation and persists them when a run is manually stopped, and the journal records completed message IDs to prevent duplicate entries. In-progress output that was previously lost becomes recoverable history.

Why Track This

Why It Matters

Developers and operators who reload a run after an explicit stop now see the partial assistant response instead of a blank or truncated conversation. This preserves diagnostic context and reduces the need to replay or discard interrupted sessions. The worker buffers chunked output and emits `llm.ai.partial` events in the interrupt finalization path; completed-message ID tracking suppresses duplicates if normal completion happened just before the stop. Watch next for partial-buffer cleanup and dedup behavior under rapid stop/retry cycles or high-concurrency streaming, since stale buffers or race windows could reintroduce duplicated or missing fragments.
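The buffering step described above can be illustrated with a minimal Python sketch. It is not the code from the PR: `Chunk`, `PartialBuffer`, and all method names are hypothetical stand-ins, and the event payload shape is an assumption. Only the `llm.ai.partial` event name and the idea of keying the buffer by a stable message ID come from the summary.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical stand-in for a streamed AIMessageChunk."""
    id: str
    content: str


@dataclass
class PartialBuffer:
    """Accumulates streamed text per stable message ID (illustrative only)."""
    parts: dict[str, list[str]] = field(default_factory=dict)

    def accumulate(self, chunk: Chunk) -> None:
        # Append the fragment under its message ID so that chunks of the
        # same message are joined in arrival order.
        self.parts.setdefault(chunk.id, []).append(chunk.content)

    def discard(self, message_id: str) -> None:
        # Drop the buffer for a message that completed normally, so it is
        # never re-emitted as a partial and does not linger in memory.
        self.parts.pop(message_id, None)

    def flush_on_interrupt(self) -> list[dict]:
        # Build one llm.ai.partial event per unfinished, non-empty message,
        # then clear the buffer so a second flush cannot repeat them.
        events = [
            {"event": "llm.ai.partial", "message_id": mid, "content": "".join(frags)}
            for mid, frags in self.parts.items()
            if any(frags)
        ]
        self.parts.clear()
        return events
```

In this sketch, accumulating `"Hel"` and `"lo"` under the same ID and then flushing yields a single event whose content is `"Hello"`; a message that was discarded after normal completion produces no event. Clearing the buffer inside the flush is one way to address the stale-buffer concern raised above, though the PR may handle cleanup differently.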

Impact

What To Watch Next

Watch whether RunJournal becomes a repeated pattern.
Track follow-up changes around AI Agents.
Compare future signals against this evidence trail.
Re-check risk flags: partial_buffer_cleanup_on_abort, high_concurrency_duplicate_suppression.

partial_buffer_cleanup_on_abort / high_concurrency_duplicate_suppression / long_running_stream_memory_growth

Supporting Evidence

GITHUB PULL REQUEST / High Trust

bytedance/deer-flow PR #3039: fix(journal): preserve partial AI response on run interruption (#3036)

Adds `_accumulate_ai_chunk` buffering plus `journal.record_partial_ai_message` in the interruption path, and `RunJournal` now tracks completed IDs to avoid duplicate writes.
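The journal-side duplicate suppression might look roughly like the sketch below. The method name `record_partial_ai_message` is taken from the PR description, but its signature, the `SketchJournal` class, the `record_ai_message` method, the `llm.ai` event name for completed messages, and the tuple-based event store are all assumptions made for illustration.

```python
class SketchJournal:
    """Hypothetical journal that tracks which message IDs were already written."""

    def __init__(self) -> None:
        self.events: list[tuple[str, str, str]] = []
        self._completed_ids: set[str] = set()
        self._partial_ids: set[str] = set()

    def record_ai_message(self, message_id: str, content: str) -> None:
        # Normal completion path: store the full message and remember its ID.
        self.events.append(("llm.ai", message_id, content))
        self._completed_ids.add(message_id)

    def record_partial_ai_message(self, message_id: str, content: str) -> bool:
        # Interruption path: skip empty text, messages that already completed,
        # and partials that were already written for this ID.
        if not content:
            return False
        if message_id in self._completed_ids or message_id in self._partial_ids:
            return False
        self.events.append(("llm.ai.partial", message_id, content))
        self._partial_ids.add(message_id)
        return True
```

With this shape, a stop that arrives just after a message completed writes nothing extra for that message, and a repeated stop does not write the same partial twice. Tracking written partial IDs separately from completed IDs is a design choice of this sketch, not something the source confirms.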