Track important changes in Reasoning Models, including capabilities, product updates, adoption signals, risks, and evidence worth continued monitoring.
The pull request updates client payload construction so requests to Azure-hosted DeepSeek v4 endpoints no longer include `extra_body` (including `extra_body.thinking`), which had been rejected with a 400 error, while still sending `reasoning_effort`, a field Azure supports.
Why It Matters: Operators using Azure-hosted DeepSeek v4 endpoints will no longer hit the 400 failures caused by `extra_body.thinking`, so reasoning workloads should see fewer failed requests and less manual fallback. The code now checks endpoint hostnames and suppresses `extra_body` only for Azure while keeping `reasoning_effort` in the payload, so the remaining compatibility gap is limited to nonstandard host patterns or backend changes. Watch for Azure endpoint URL variations and for changes in accepted request fields that could silently reintroduce the incompatibility.
Final score: 81. Confidence: 98. 1 evidence item. Tags: DeepSeek-Reasonix, DeepSeek v4, Azure Foundry, extra_body, reasoning_effort, baseUrl, buildPayload, _isAzureEndpoint(), src/client.ts
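The hostname check and payload guard described for the Azure-hosted DeepSeek v4 fix could look roughly like the TypeScript sketch below. The `Payload` shape, the `isAzureEndpoint` helper name, and the `.azure.com` suffix match are assumptions made for illustration; the PR's actual detection logic in `src/client.ts` may differ.

```typescript
// Hypothetical request shape; only `reasoning_effort` and `extra_body`
// are named in the summary above.
interface Payload {
  model: string;
  messages: { role: string; content: string }[];
  reasoning_effort?: string;
  extra_body?: { thinking?: { type: string } };
}

// Assumption: Azure-hosted endpoints can be recognised by hostname suffix.
const AZURE_HOST_SUFFIXES = [".azure.com"];

function isAzureEndpoint(baseUrl: string): boolean {
  try {
    const host = new URL(baseUrl).hostname.toLowerCase();
    return AZURE_HOST_SUFFIXES.some((suffix) => host.endsWith(suffix));
  } catch {
    return false; // unparseable URL: treat as non-Azure
  }
}

// Drop `extra_body` for Azure endpoints, keep every other field
// (including `reasoning_effort`) untouched.
function buildPayload(baseUrl: string, payload: Payload): Payload {
  if (!isAzureEndpoint(baseUrl)) return payload;
  const copy: Payload = { ...payload };
  delete copy.extra_body;
  return copy;
}
```

Matching on the parsed hostname rather than on the raw URL string avoids false positives when "azure" appears only in a path or query parameter.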
The About modal’s update button was changed to call the existing Tauri updater plugin’s `check()` API instead of performing a WebView `fetch()` to GitHub’s releases endpoint, which was blocked by CSP and produced a `Failed to fetch` error. The old manual release-tag parsing and semver compare path was removed in favor of plugin-managed update versioning and endpoint handling.
Why It Matters: Desktop users pressing "Check for updates" in the About screen no longer get a misleading `Failed to fetch` error, because update checks now go through the intended updater path instead of being blocked by the WebView's CSP. The fix sidesteps the CSP restriction by moving the request to Tauri IPC and Rust-side updater networking, with version comparison handled by the plugin. Watch whether updater-plugin error messages remain clear in offline or restricted networks, and whether version-check edge cases (e.g., release channel mismatches) are surfaced to users as actionable status rather than false negatives.
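As a rough illustration of the updater-based flow, the sketch below wraps a `check()`-style function and maps its result to a UI status. The `checkForUpdates` wrapper, the `UpdateStatus` type, and the dependency-injected `check` parameter are assumptions for this example; in the app, the injected function would presumably be the `check` export of `@tauri-apps/plugin-updater`, which resolves to an update object or `null`.

```typescript
// Minimal structural type for what the updater's check() resolves to.
interface UpdateInfo {
  version: string;
}
type CheckFn = () => Promise<UpdateInfo | null>;

// Hypothetical status shape for the About modal.
type UpdateStatus =
  | { kind: "up-to-date" }
  | { kind: "available"; version: string }
  | { kind: "error"; message: string };

// No fetch() and no manual semver comparison here: the plugin decides
// whether a newer version exists.
async function checkForUpdates(check: CheckFn): Promise<UpdateStatus> {
  try {
    const update = await check();
    return update
      ? { kind: "available", version: update.version }
      : { kind: "up-to-date" };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { kind: "error", message };
  }
}
```

Returning a typed status instead of throwing makes it easier to show distinct messages for "no update", "update available", and network failures.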
This PR fixes a compatibility regression for DeepSeek Flash-style models on `api.deepseek.com`: when a request includes `tools`, the client now auto-removes `reasoning_effort`/`reasoning` for models that route through `deepseek-reasoner` (e.g., IDs ending in `-flash`, containing `deepseek-r1`, or equal to `deepseek-reasoner`), preventing the previous immediate 400 error path. The change is implemented via a new `OpenAICompat` flag, provider-side auto-detection wiring, and request-parameter filtering in `buildParams`, with regression tests added to lock behavior.
Why It Matters: Tool-enabled integrations using DeepSeek Flash (and similar reasoner-routed model IDs) will stop failing on the first request with a 400 error, so agents, subagents, and application workflows can execute tool calls without abrupt aborts or retry-heavy error handling. The fix adds `disableReasoningWhenToolsPresent` and routes it through compat resolution, so affected models drop the unsupported reasoning parameters before dispatch. Monitor model routing metadata and DeepSeek capability flags for drift: changes in route patterns or tool-support semantics could make the heuristic match too broadly or miss newly affected models.
Final score: 81. Confidence: 98. 1 evidence item. Tags: deepseek-v4-flash, deepseek-reasoner, api.deepseek.com, reasoning_effort, tools, OpenAICompat, buildParams, disableReasoningWhenToolsPresent
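The model-ID heuristic and parameter filtering described for the DeepSeek tools fix can be sketched as follows. The `ChatParams` shape, the `routesThroughReasoner` helper name, and the rule that an empty `tools` array does not count as "tools present" are assumptions; only the three ID patterns and the `disableReasoningWhenToolsPresent` flag name come from the summary.

```typescript
// Hypothetical request-parameter shape.
interface ChatParams {
  model: string;
  tools?: unknown[];
  reasoning_effort?: string;
  reasoning?: unknown;
}

// Heuristic from the summary: IDs ending in "-flash", containing
// "deepseek-r1", or equal to "deepseek-reasoner".
function routesThroughReasoner(modelId: string): boolean {
  const id = modelId.toLowerCase();
  return id.endsWith("-flash") || id.includes("deepseek-r1") || id === "deepseek-reasoner";
}

// Strip reasoning parameters only when the flag is set AND tools are present.
function buildParams(params: ChatParams, disableReasoningWhenToolsPresent: boolean): ChatParams {
  const hasTools = Array.isArray(params.tools) && params.tools.length > 0;
  if (!disableReasoningWhenToolsPresent || !hasTools) return params;
  const out: ChatParams = { ...params };
  delete out.reasoning_effort;
  delete out.reasoning;
  return out;
}
```

With this shape, a model such as `deepseek-v4-pro` does not match the heuristic and keeps its reasoning parameters, which is consistent with the summary's note that unaffected paths are preserved.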
DeepSeek-Reasonix now fixes its heap re-exec flow so child processes restart from a stable package entrypoint derived from import.meta.url instead of process.argv-derived paths, while reusing existing execArgv and skipping redundant re-exec for Vitest workers already in re-run state.
Why It Matters: Operators and developers running CLI or heap-limited flows get more reliable launches, because re-spawned children now start from the correct package entrypoint with the same runtime flags instead of resolving against a potentially shifted working directory, which could break execution and tests. Technically, re-exec avoids path drift from `process.argv`-based resolution and stops Vitest workers from being re-executed repeatedly. Monitor environments where `import.meta.url` resolution differs from the previous argv behavior, and any tooling that relies on mutating `execArgv` at spawn time.
Final score: 79. Confidence: 95. 1 evidence item. Tags: import.meta.url, process.argv.slice(1), execArgv, heap re-exec, Vitest
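The re-exec argument construction described for DeepSeek-Reasonix might be sketched as below. Everything except `import.meta.url`, `execArgv`, and the Vitest guard idea is an assumption: the `planReexec` name, the `heapFlags` parameter, and the `alreadyReexeced` flag (standing in for whatever marker the project uses to detect Vitest workers or an earlier re-exec) are hypothetical.

```typescript
import { fileURLToPath } from "node:url";

interface ReexecPlan {
  command: string;
  args: string[];
}

interface ReexecOptions {
  execPath: string;        // e.g. process.execPath
  execArgv: string[];      // e.g. process.execArgv, reused as-is
  heapFlags: string[];     // extra runtime flags for the child (assumption)
  entryUrl: string;        // e.g. import.meta.url of the CLI entry module
  userArgs: string[];      // arguments to forward to the child
  alreadyReexeced: boolean; // true for a child or a Vitest worker (assumption)
}

// Build the child command line from a file URL rather than from
// process.argv, so the entry path does not depend on how or where
// the parent was launched.
function planReexec(opts: ReexecOptions): ReexecPlan | null {
  if (opts.alreadyReexeced) return null; // skip redundant re-exec
  const entry = fileURLToPath(opts.entryUrl);
  return {
    command: opts.execPath,
    args: [...opts.execArgv, ...opts.heapFlags, entry, ...opts.userArgs],
  };
}
```

Keeping this as a pure function that returns a plan, with the actual spawn done by the caller, makes the path logic testable without starting a process.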
This change replaces regex-based automatic lifecycle arming with an explicit host-side EngineeringLifecycleRuntime flow where `engineeringLifecycle.mode` defaults to `off` and strict behavior is enabled only by `/plan strict` or explicit config, with unknown values also falling back to `off`.
Why It Matters: Developers and operators using planning workflows now get predictable default behavior: strict lifecycle checks no longer activate implicitly, so normal runs avoid unexpected prompt or tooling changes and the extra static-prefix overhead, while teams that want tighter control can enable strict mode explicitly. Watch for regressions in strict-mode command classification edge cases and for state consistency across lifecycle transitions, since these updates also change how lifecycle errors are surfaced.
Final score: 79. Confidence: 93. 1 evidence item. Tags: EngineeringLifecycleRuntime, engineeringLifecycle.mode, /plan strict, runtime/plan store, shell command tokenizer, interceptor
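The opt-in mode resolution described for `engineeringLifecycle.mode` can be illustrated with a small sketch. The `resolveLifecycleMode` function name and the assumption that the strict config value is the literal string `"strict"` are hypothetical; the default of `off`, the `/plan strict` command, and the fallback for unknown values come from the summary.

```typescript
type LifecycleMode = "off" | "strict";

// Strict mode is enabled only by an explicit command or explicit config.
// Anything else, including unknown or malformed values, resolves to "off".
function resolveLifecycleMode(configValue: unknown, command?: string): LifecycleMode {
  if (command !== undefined && command.trim() === "/plan strict") return "strict";
  return configValue === "strict" ? "strict" : "off";
}
```

Accepting `unknown` for the config value forces the function to validate input rather than trust a typed config object, which is where the "unknown values fall back to `off`" guarantee is enforced.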
The PR makes Perplexity a first-class chat provider in Roo Code by wiring a new handler, provider registry entries, settings/schema updates, UI configuration, and tests so requests can be sent through Perplexity’s OpenAI-compatible endpoint using the `sonar` model family.
Why It Matters: Roo Code users and operators can now select Perplexity from the existing provider list and send chat completions through the same interface, which makes it easier to evaluate another model source without changing clients or rewiring workflows. Technically, this is implemented via a new OpenAI-compatible provider path to `https://api.perplexity.ai`, the `sonar`, `sonar-pro`, and `sonar-reasoning` variants with 128k context, and an explicit API key fallback order (`perplexityApiKey` -> `PERPLEXITY_API_KEY` -> `PPLX_API_KEY`), with requests routed through the provider factory and validation layer. Next, watch deployment behavior when switching between configured keys and environment-provided keys, and real-world endpoint reliability.
Final score: 78. Confidence: 97. 1 evidence item. Tags: Perplexity, Roo Code, api.perplexity.ai, PerplexityHandler, BaseOpenAiCompatibleProvider, perplexityApiKey, PERPLEXITY_API_KEY, PPLX_API_KEY, sonar, sonar-pro, ProviderName, createHandlerForProvider
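The settings-first key fallback order listed for the Perplexity provider (`perplexityApiKey` -> `PERPLEXITY_API_KEY` -> `PPLX_API_KEY`) could be expressed as in the sketch below. The `resolvePerplexityApiKey` function name, the explicit `env` parameter, and the choice to treat blank strings as unset are assumptions, not details confirmed by the PR.

```typescript
// Resolve the API key in the documented order: settings first,
// then the two environment variables.
function resolvePerplexityApiKey(
  settings: { perplexityApiKey?: string },
  env: Record<string, string | undefined>,
): string | undefined {
  const candidates = [
    settings.perplexityApiKey,
    env.PERPLEXITY_API_KEY,
    env.PPLX_API_KEY,
  ];
  // Assumption: empty or whitespace-only values count as "not configured".
  return candidates.find((value) => typeof value === "string" && value.trim() !== "");
}
```

Passing the environment in as a parameter, instead of reading a global, keeps the precedence logic easy to test with different key combinations.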
Kilo-Org/kilocode updates its message schema to accept `reasoning` as a valid `interleaved.field` value and sets Cerebras Zai-GLM to use that field by default, replacing the unsupported top-level `reasoning_content` path.
Why It Matters: Developers building on Kilocode with Cerebras Zai-GLM can avoid failed or rejected calls when sending reasoning or chained content, so reasoning workflows run without extra format shims. The change aligns payloads with Cerebras’s rule that reasoning must be inline in `content`; operators should watch mixed-model or multi-provider setups for regressions where clients still emit `reasoning_content` or assume the old default field.
Final score: 78. Confidence: 97. 1 evidence item. Tags: Cerebras API, Cerebras Zai-GLM, interleaved.field, reasoning, reasoning_content, Kilo-Org/kilocode, CLI message payload
This PR upgrades `sublinear-time-solver` from 1.7.1 to 1.7.2, and the primary behavior change is the added `find_anomalous_rows_in_subset` contrastive primitive, which switches top-k anomaly checks from full scans to scanning only a caller-provided candidate set (cost moving from O(n log k) to O(|candidates| log k) when candidates are sparse).
Why It Matters: Ruflo operators running anomaly ranking on large graphs can get lower latency and lower compute cost on sparse updates, because checks now cover only the supplied candidate rows instead of scanning every row each time. Watch for lost recall or missed anomalies when candidate sets are poorly selected, and validate behavior on dense workloads where |candidates| is close to n and the cost approaches that of a full scan.
Final score: 77. Confidence: 92. 1 evidence item. Tags: sublinear-time-solver, find_anomalous_rows_in_subset, contrastive top-k, candidate set, Ruflo graph-intelligence
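To make the candidate-restricted idea concrete, the sketch below selects the top-k rows by score while visiting only a caller-provided candidate list. This is not the `sublinear-time-solver` API: the `topKInSubset` name, the `score` callback, and the data shapes are hypothetical. It also uses a small sorted buffer (O(k) per insertion) for brevity, whereas the O(|candidates| log k) bound quoted above would need a heap.

```typescript
interface Scored {
  row: number;
  score: number;
}

// Return up to k candidate rows with the highest scores, in descending
// order. Only rows listed in `candidates` are ever scored.
function topKInSubset(
  score: (row: number) => number,
  candidates: readonly number[],
  k: number,
): Scored[] {
  const best: Scored[] = []; // sorted by descending score, length <= k
  if (k <= 0) return best;
  for (const row of candidates) {
    const s = score(row);
    // Buffer full and this row cannot beat the current k-th best: skip.
    if (best.length === k && s <= best[k - 1].score) continue;
    let i = best.length;
    while (i > 0 && best[i - 1].score < s) i--;
    best.splice(i, 0, { row, score: s });
    if (best.length > k) best.pop();
  }
  return best;
}
```

The number of `score` calls equals the number of candidates, which is where the saving over a full scan comes from; rows outside the candidate list can never be reported, which is the recall risk noted above.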
This change updates `AppendContent` and `AppendReasoningContent` to stop scanning after the first matching `TextContent`/`ReasoningContent` entry and return immediately, preventing a streamed delta from being appended to every matching block when a message has multiple parts.
Why It Matters: Developers building streaming integrations on Crush get safer behavior if future messages include multiple text or reasoning blocks, because one delta will no longer be duplicated across several parts and distort visible output. The patch aligns these two helpers with their existing sibling functions and is validated by focused tests. Teams should watch for protocol changes where providers intentionally emit multiple same-type content blocks, which would require a different merge policy.
Final score: 74. Confidence: 99. 1 evidence item. Tags: AppendContent, AppendReasoningContent, TextContent, ReasoningContent, Message.Parts, OnTextDelta, OnReasoningDelta, go test
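The single-match semantics described for `AppendContent` can be illustrated with the sketch below, transliterated into TypeScript for illustration (the tags suggest the original helpers are Go). The `Part` and `Message` shapes are hypothetical, and appending a new text part when none exists is an assumption about the fallback behavior.

```typescript
// Hypothetical message model with several part kinds.
type Part =
  | { type: "text"; text: string }
  | { type: "reasoning"; thinking: string };

interface Message {
  parts: Part[];
}

// Append a streamed delta to the FIRST text part only, then return.
// Without the early return, every text part would receive the delta.
function appendContent(msg: Message, delta: string): void {
  for (const part of msg.parts) {
    if (part.type === "text") {
      part.text += delta;
      return;
    }
  }
  // Assumption: start a new text part if the message has none yet.
  msg.parts.push({ type: "text", text: delta });
}
```

The same pattern would apply to a reasoning-content variant, matching on the reasoning part type instead.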
The release moves regex search into a worker thread and adds explicit, longer-but-bounded deadlines (search worker 5s→60s, repo walk 15s→120s), with `search_content` walk-level limits and ESC-based cancellation, to prevent search requests from stalling the app.
Why It Matters: Users and operators can keep working in Reasonix search without the app becoming unresponsive when a regex query is expensive or malicious, because heavy pattern evaluation is now offloaded to a worker and cancelable within a bounded runtime. This reduces freeze risk during codebase investigation and review workflows. Next, monitor whether legitimate long-running regex queries are being cut off by the timeouts, and whether background worker fallback behavior remains stable as repository sizes and query complexity grow.
Final score: 73. Confidence: 88. 1 evidence item. Tags: regex, search worker thread, search_content, walk deadline, ESC preemption, ReDoS
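The walk-level deadline and cancellation part of this change could be sketched as below. This covers only the bounded walk: it checks the clock and a cancel flag between files, so it cannot interrupt a single catastrophic regex match, which is what the worker-thread isolation in the release addresses. The `searchContent` signature, the `FileEntry` shape, and the injected clock are assumptions for illustration.

```typescript
interface FileEntry {
  path: string;
  text: string;
}

interface SearchResult {
  matches: string[];
  stoppedBy: "done" | "deadline" | "cancelled";
}

// Walk files until finished, the deadline passes, or the caller cancels
// (for example when ESC is pressed). Partial matches are returned.
// Pass a non-global RegExp so test() carries no lastIndex state.
function searchContent(
  files: readonly FileEntry[],
  pattern: RegExp,
  opts: { deadlineMs: number; now?: () => number; isCancelled?: () => boolean },
): SearchResult {
  const now = opts.now ?? (() => Date.now());
  const start = now();
  const matches: string[] = [];
  for (const file of files) {
    if (opts.isCancelled?.()) return { matches, stoppedBy: "cancelled" };
    if (now() - start > opts.deadlineMs) return { matches, stoppedBy: "deadline" };
    if (pattern.test(file.text)) matches.push(file.path);
  }
  return { matches, stoppedBy: "done" };
}
```

A call such as `searchContent(files, /TODO/, { deadlineMs: 120_000 })` would mirror the 120s repo-walk bound mentioned in the summary. Reporting why the walk stopped lets the UI distinguish a complete result from a truncated one.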
The update fixes a regression where `/search-engine tavily` was silently routed to Mojeek and often failed with 403s, because Tavily was missing from the dispatch logic and the web-tool engine setting was cached too early.
Why It Matters: Users who change the active search provider with `/search-engine` now get the provider they selected instead of an unexpected fallback, so Tavily-based search workflows avoid silent failures and broken query runs. Technically, this is enforced by adding the Tavily branch in `webSearchEngine` and resolving the endpoint and engine inside dispatch rather than from cached registration options. Continue to watch long-running sessions for regressions in latency or behavior when switching engines frequently.
Final score: 51. Confidence: 96. 1 evidence item. Tags: webSearchEngine, registerWebTools, tavily, mojeek, runtime /search-engine switch
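The difference between caching the engine at registration time and resolving it at dispatch time can be shown with a short sketch. The `registerWebTools` signature here is hypothetical (the real function of that name may look quite different), as are the `Engine` union and the injected per-engine search functions.

```typescript
type Engine = "mojeek" | "tavily";
type SearchFn = (query: string) => string;

interface WebToolConfig {
  engine: Engine;
}

// Register a search tool that reads the engine on EVERY call.
// Reading getConfig().engine once out here, at registration time, would
// freeze the choice and ignore later `/search-engine` switches.
function registerWebTools(
  getConfig: () => WebToolConfig,
  engines: Record<Engine, SearchFn>,
): SearchFn {
  return (query) => {
    const engine = getConfig().engine; // resolved at dispatch time
    return engines[engine](query);
  };
}
```

Typing the handler table as `Record<Engine, SearchFn>` makes the compiler reject a table that lacks an entry for any engine, which guards against the "missing branch falls through to another provider" class of bug.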
Google Research announced ReasoningBank, framing a shift toward agent systems that can reuse prior interaction experience to improve future reasoning behavior, rather than acting only from static model weights at each run.
Why It Matters: Developers and operators of AI agents could use ReasoningBank so that deployed agents improve with continued use, potentially reducing repeated trial-and-error failures on similar tasks without full manual retuning. This is an experiential-learning mechanism for agent systems (collecting and reusing prior traces), so teams should watch trace quality, policy drift from noisy histories, privacy exposure in logged interactions, and the growing storage and compute overhead of maintaining experience data.
Final score: 36. Confidence: 54. 1 evidence item. Tags: ReasoningBank, reasoning agents, experience replay, AI agent learning
This PR fixes a compatibility regression for DeepSeek Flash-style models on `api.deepseek.com`: when a request includes `tools`, the client now auto-removes `reasoning_effort`/`reasoning` for models that route through `deepseek-reasoner` (e.g., IDs ending in `-flash`, containing `deepseek-r1`, or equal to `deepseek-reasoner`), preventing the previous immediate 400 error path. The change is implemented via a new `OpenAICompat` flag, provider-side auto-detection wiring, and request-parameter filtering in `buildParams`, with regression tests added to lock behavior.
Contribution: Introduced an automatic DeepSeek compatibility guard that detects reasoner-routed direct-API models and strips `reasoning_effort`/`reasoning` whenever `tools` is present, resolving a hard failure for tool-calling flows while preserving behavior for `deepseek-v4-pro` and other unaffected paths.
This PR upgrades `sublinear-time-solver` from 1.7.1 to 1.7.2, and the primary behavior change is the added `find_anomalous_rows_in_subset` contrastive primitive, which switches top-k anomaly checks from full scans to scanning only a caller-provided candidate set (cost moving from O(n log k) to O(|candidates| log k) when candidates are sparse).
Contribution: Introduced a candidate-set-aware contrastive anomaly primitive through the updated dependency, changing Ruflo-linked workloads to execute contrastive top-k checks on selected rows rather than the full index.
The pull request updates client payload construction so requests to Azure-hosted DeepSeek v4 endpoints no longer include `extra_body` (including `extra_body.thinking`), which had been rejected with a 400 error, while still sending `reasoning_effort`, a field Azure supports.
Contribution: Added explicit Azure endpoint detection in `src/client.ts` and guarded `buildPayload()` to drop `extra_body` for Azure URLs, resolving the `Bad request (DeepSeek 400): Unrecognized request argument supplied: extra_body` failure while preserving `reasoning_effort`; added a regression test for the Azure URL path in `tests/loop-r1-reasoning.test.ts`.
The update fixes a regression where `/search-engine tavily` was silently routed to Mojeek and often failed with 403s, because Tavily was missing from the dispatch logic and the web-tool engine setting was cached too early.
Contribution: Introduced explicit Tavily handling in the web search dispatcher and removed the stale registration-time options capture, so runtime search-engine changes are honored during tool execution.
This change replaces regex-based automatic lifecycle arming with an explicit host-side EngineeringLifecycleRuntime flow where `engineeringLifecycle.mode` defaults to `off` and strict behavior is enabled only by `/plan strict` or explicit config, with unknown values also falling back to `off`.
Contribution: Added a concrete safety capability by introducing an explicit strict lifecycle mode, limiting the lifecycle prompt-contract cost to opt-in paths, tracking lifecycle state in runtime/plan storage, tokenizing shell commands for more accurate detection of higher-risk commands, and hardening repeated interceptor rejections with clearer stop/replan feedback.
The About modal’s update button was changed to call the existing Tauri updater plugin’s `check()` API instead of performing a WebView `fetch()` to GitHub’s releases endpoint, which was blocked by CSP and produced a `Failed to fetch` error. The old manual release-tag parsing and semver compare path was removed in favor of plugin-managed update versioning and endpoint handling.
Contribution: Replaced `fetch()` plus manual GitHub release-tag filtering and version comparison in `desktop/src/ui/about.tsx` with a single updater `check()` call, then removed the unused `RELEASES_API`/`cmpSemver` path and the unreachable i18n key tied to the removed branch.
DeepSeek-Reasonix now fixes its heap re-exec flow so child processes restart from a stable package entrypoint derived from import.meta.url instead of process.argv-derived paths, while reusing existing execArgv and skipping redundant re-exec for Vitest workers already in re-run state.
Contribution: Updated the CLI heap re-launch logic to use `import.meta.url` as the child entry target and to preserve existing exec arguments during re-exec, and added a guard that treats Vitest workers as already re-executed.
This change updates `AppendContent` and `AppendReasoningContent` to stop scanning after the first matching `TextContent`/`ReasoningContent` entry and return immediately, preventing a streamed delta from being appended to every matching block when a message has multiple parts.
Contribution: Fixed a latent correctness bug by enforcing single-match semantics in two message append helpers, ensuring each incoming delta updates only the first matching content block, and added regression tests that lock this contract.
The release moves regex search into a worker thread and adds explicit, longer-but-bounded deadlines (search worker 5s→60s, repo walk 15s→120s), with `search_content` walk-level limits and ESC-based cancellation, to prevent search requests from stalling the app.
Contribution: Introduced worker-thread isolation for regex matching in desktop search plus strict execution deadlines and interruption support, so pathological or intentionally abusive patterns are canceled instead of locking the main UI flow.
Kilo-Org/kilocode updates its message schema to accept `reasoning` as a valid `interleaved.field` value and sets Cerebras Zai-GLM to use that field by default, replacing the unsupported top-level `reasoning_content` path.
Contribution: Added `reasoning` as a supported `interleaved.field` and changed the Cerebras Zai-GLM default to emit reasoning via this inline-compatible field, fixing request-format incompatibility with the Cerebras API.
The PR makes Perplexity a first-class chat provider in Roo Code by wiring a new handler, provider registry entries, settings/schema updates, UI configuration, and tests so requests can be sent through Perplexity’s OpenAI-compatible endpoint using the `sonar` model family.
Contribution: Introduced a full Perplexity integration path: new provider types and model metadata (`PerplexityModelId`, `perplexityModels`), provider settings and schema registration, a dedicated `PerplexityHandler` with settings-first API key resolution, factory wiring, profile validation updates, settings UI entry points, locale strings, and tests for URL, key precedence, and error behavior.
Google Research announced ReasoningBank, framing a shift toward agent systems that can reuse prior interaction experience to improve future reasoning behavior, rather than acting only from static model weights at each run.
Contribution: Introduces a concrete capability for reasoning agents: a structured loop that captures prior execution traces and uses them as a learning signal for future behavior improvement.