Stage: Expansion

AI Code Review

Track important changes in AI Code Review, including capabilities, product updates, adoption signals, risks, and evidence worth continued monitoring.

AI CODE · TRACKING

Signal Feed

Changes worth continued tracking

16 unique signals

pull request · May 20, 2026, 6:47 PM
Fix Review Style Analyze startup by using configured LangGraph endpoint
What Changed: This PR fixes production failures in Open-SWE's Review Styles → Analyze flow where runs failed to start with `httpx.ConnectError: All connection attempts failed`. It removes a hardcoded `http://localhost:2024` endpoint in `review_style_jobs.py` and instead resolves the LangGraph client URL from `LANGGRAPH_URL`/`LANGGRAPH_URL_PROD` (same logic as `webapp.py`), with a fallback to `get_client()` when not set.
Why It Matters: Developers and operators running PR review style checks can start analyze jobs on Open-SWE without silent startup failures, so automated review quality gates are less likely to get blocked by connection errors. The change replaces the fixed localhost endpoint with environment-aware URL resolution (`LANGGRAPH_URL`/`LANGGRAPH_URL_PROD`) and a `get_client()` fallback, but teams should verify each deployment's LangGraph URL configuration and continue watching for any remaining `Failed to start review style analyzer` logs or analyzer runs that do not complete.
Final score 81 · Confidence 97 · 1 evidence item
Tags: LangGraph SDK, review_style_jobs.py, LANGGRAPH_URL, LANGGRAPH_URL_PROD, get_client, httpx.ConnectError
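The environment-aware resolution described in this signal can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the function name `resolve_langgraph_url` is hypothetical, and the precedence between the two variables (production first) is an assumption, since the signal does not state the order.

```python
import os
from typing import Mapping, Optional


def resolve_langgraph_url(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the configured LangGraph URL, or None when neither variable is set.

    Hypothetical helper. Assumption: LANGGRAPH_URL_PROD wins when both
    variables are present; the actual precedence in Open-SWE may differ.
    """
    for name in ("LANGGRAPH_URL_PROD", "LANGGRAPH_URL"):
        value = env.get(name, "").strip()
        if value:
            return value
    # Nothing configured: the caller falls back to the default client.
    return None
```

A caller would use a non-`None` result as the client URL and otherwise fall back to `get_client()` with no URL, which is the fallback the PR describes.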
pull request · May 21, 2026, 4:17 PM
Open SWE Review now requires explicit repo enablement before reviewer webhooks run
What Changed: The PR introduces an opt-in repository gate for Open SWE Review: a new `enabled_review_repos` list in the LangGraph Store is checked via `_is_repo_enabled_for_review` at reviewer webhook entry points, and only repositories that satisfy both this list and the existing env allowlist can trigger review actions.
Why It Matters: Operators and repository admins will see review automation go quiet in repos that are not explicitly enabled, so they can no longer assume every project gets Open SWE feedback by default after this change. The implementation adds a LangGraph-backed config check (`enabled_review_repos`) to every reviewer webhook chokepoint via `_is_repo_enabled_for_review`, requiring an explicit on-switch in addition to the environment allowlist. Teams should monitor the rollout for unintended gaps in coverage right after deployment and verify that all previously reviewed repos are manually re-enabled before relying on automation.
Final score 80 · Confidence 94 · 1 evidence item
Tags: enabled_review_repos, Open SWE Review, LangGraph Store, _is_repo_enabled_for_review, env allowlist
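The two-condition gate in this signal (store list AND env allowlist) can be illustrated with a short sketch. The function below is a hypothetical stand-in for `_is_repo_enabled_for_review`; its signature, the case-insensitive comparison, and the treatment of an empty allowlist as "nothing allowed" are all assumptions, not details confirmed by the PR.

```python
from typing import Iterable


def is_repo_enabled_for_review(
    repo: str,
    enabled_review_repos: Iterable[str],
    env_allowlist: Iterable[str],
) -> bool:
    """Allow review only when the repo passes BOTH gates.

    Hypothetical sketch: names are compared case-insensitively, and an
    empty allowlist blocks everything. The real implementation may differ.
    """
    key = repo.strip().lower()
    enabled = {name.strip().lower() for name in enabled_review_repos}
    allowed = {name.strip().lower() for name in env_allowlist}
    return key in enabled and key in allowed
```

Under this reading, a repo that was reviewed before the change but never added to `enabled_review_repos` fails the first check, which is the coverage gap the signal asks teams to watch for.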
pull request · May 20, 2026, 9:13 PM
Open-SWE dashboard switched to sidebar settings flow with team-level review configuration
What Changed: The pull request replaces the dashboard's header-based UI with an AppShell/AppSidebar layout and introduces a new team settings path for review controls, moving key reviewer behavior options into dedicated sections. Reviewer settings such as trigger mode, draft-PR review, PR summaries, and autofix behavior (including severity threshold) are now persisted through the new `/dashboard/api/team-settings` flow.
Why It Matters: Operators and maintainers configuring Open-SWE can now manage cloud-agent and review behavior from one consistent dashboard flow, which reduces UI fragmentation and setup mistakes before running PR reviews. The path still needs end-to-end production validation with real GitHub OAuth and a LangGraph backend to confirm there are no navigation or save regressions. The change also replaces a Vite-sensitive account dropdown implementation that could trigger client hook crashes, so the refactor is aimed at keeping dashboard sessions stable during real OAuth and review workflows. Watch for failures in settings migration from prior routes, sidebar navigation regressions, and hidden admin-route discoverability.
Final score 77 · Confidence 90 · 1 evidence item
Tags: AppShell, AppSidebar, /dashboard/api/team-settings, team_settings, LangGraph Store, ProfileUpdate, open-swe dashboard
pull request · May 21, 2026, 5:44 PM
Fix review-style prompts stuck in running state
What Changed: The PR fixes a correctness issue in review-style prompt execution by reconciling LangGraph runtime status with stored prompt state, so prompts that were saved before a status write can no longer remain incorrectly stuck in `running`.
Why It Matters: Operators and users of review-style analysis will see fewer apparent hangs, because prompts that used to stay forever in `running` after a missed status write now move to `completed`, reducing manual retries and workflow blockage. The technical change is a sync-time reconciliation between LangGraph and persisted state, and teams should watch for concurrency or partial-write races that could still recreate stale states after crashes or overlapping sync operations.
Final score 76 · Confidence 95 · 1 evidence item
Tags: LangGraph, review-style prompts, run status, prompt store, status synchronization
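Sync-time reconciliation of the kind this signal describes might look like the sketch below. Everything here is illustrative: `StoredPrompt`, `reconcile`, and the run-status strings are hypothetical, and the `error` to `failed` mapping is an assumption (the signal only mentions the move from `running` to `completed`).

```python
from dataclasses import dataclass, replace

# Hypothetical mapping from runtime run status to stored prompt status.
# Only the "completed" transition is described in the signal.
TERMINAL_STATUS = {"success": "completed", "error": "failed"}


@dataclass(frozen=True)
class StoredPrompt:
    prompt_id: str
    status: str  # e.g. "running", "completed", "failed"


def reconcile(prompt: StoredPrompt, run_status: str) -> StoredPrompt:
    """Repair a stale `running` record when the runtime says the run ended."""
    if prompt.status != "running":
        return prompt  # already terminal; never overwrite a recorded outcome
    repaired = TERMINAL_STATUS.get(run_status)
    if repaired is None:
        return prompt  # run is still in flight (or unknown); leave as-is
    return replace(prompt, status=repaired)
```

Because the function only ever moves a record out of `running`, calling it repeatedly during sync is idempotent, although it does not by itself address the overlapping-sync races the signal flags.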
pull request · May 21, 2026, 8:34 AM
Add inline base-branch and instruction support to /local-review
What Changed: The `/local-review` command now accepts optional trailing text: a single token is parsed as the base branch to compare against, while multi-word input is treated as review instructions (usable via `base -- instructions`), with Kilo-specific parsing moved into review-session helpers.
Why It Matters: Developers using `/local-review` can now target a specific base branch and supply custom review guidance without an extra blocking step, which makes review sessions faster and keeps the workflow less interrupted by follow-up prompts. The parser now maps a single token to base-branch selection and multi-word text to instruction content via `base -- instructions`, while shared runtime paths stay mostly untouched; continue monitoring for parsing edge cases (e.g., odd branch-name patterns) that could direct reviews to the wrong base or drop instruction text.
Final score 74 · Confidence 95 · 1 evidence item
Tags: local-review, review/session helpers, opencode, base branch, instruction parsing
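The argument rules in this signal can be made concrete with a small parser. This is a sketch written in Python for illustration only; it is not Kilo's implementation, the names `LocalReviewArgs` and `parse_local_review_args` are hypothetical, and the handling of edge cases (such as a trailing `--` with no instructions) is an assumption.

```python
from typing import NamedTuple, Optional


class LocalReviewArgs(NamedTuple):
    base: Optional[str]
    instructions: Optional[str]


def parse_local_review_args(text: str) -> LocalReviewArgs:
    """Split the trailing text of `/local-review` into base and instructions."""
    text = text.strip()
    if not text:
        return LocalReviewArgs(None, None)
    # `base -- instructions`: a single token, the separator, then free text.
    head, sep, tail = text.partition(" -- ")
    if sep and len(head.split()) == 1:
        return LocalReviewArgs(head, tail.strip() or None)
    # A lone token is read as the base branch.
    if len(text.split()) == 1:
        return LocalReviewArgs(text, None)
    # Anything else is treated as review instructions.
    return LocalReviewArgs(None, text)
```

The single-token rule is where the signal's warning applies: a one-word instruction would be read as a branch name under this scheme.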
pull request · May 20, 2026, 10:41 PM
Add role-based review setup schema as foundation for `entire review`
What Changed: This PR lays the first-step foundation for role-based review configuration by adding `ReviewConfig.Role` and `fix_after_review` settings, plus a dedicated setup flow to collect and persist per-agent roles and review instructions.
Why It Matters: Operators and teams using `entire review` can now predefine who reviews and who fixes before running review commands, reducing role ambiguity and downstream configuration drift in team workflows. This is implemented through `Role`-based settings, migration helpers, and the new `entire review setup` flow. Continue watching downstream consumers of `.entire/settings.json` for assumptions about the removed legacy field, and check whether the follow-up cutover (PR #2) still holds up against those assumptions once the legacy picker is removed.
Final score 73 · Confidence 94 · 1 evidence item
Tags: ReviewConfig.Role, settings.Load, MigrateLegacyRoles, entire review setup, EntireSettings.fix_after_review
release · May 22, 2026, 12:42 AM
Added pre/post execution hooks with PreToolUse denial for tool calls
What Changed: Goose v1.35.0 introduced a new tool-execution hooks system with PreToolUse denial support, enabling custom logic to run before and after each tool call and block disallowed actions before they execute.
Why It Matters: Developers and operators can now intercept tool calls before execution, so risky or policy-violating actions are more likely to be prevented instead of happening unexpectedly in active agent sessions. This is implemented via a new extensible pre/post tool execution hook pipeline and dedicated PreToolUse denial support, which should reduce unsafe automation fallout; watch for hook failures, ordering/precedence interactions, and whether denial paths surface clear user-visible diagnostics so legitimate workflows are not accidentally blocked.
Final score 73 · Confidence 90 · 1 evidence item
Tags: Goose tool execution, pre/post hooks, PreToolUse denial hook, ACP tool runtime
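As a rough illustration of the pattern (not Goose's API, which this signal does not show), a pre/post hook pipeline with denial can be sketched as below. `ToolRunner`, `Denial`, and the hook signatures are hypothetical; the first-denial-wins ordering is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass
class Denial:
    reason: str


# Hypothetical hook shapes: a pre-hook may return a Denial to block the call.
PreHook = Callable[[str, Dict[str, Any]], Optional[Denial]]
PostHook = Callable[[str, Dict[str, Any], Any], None]


class ToolRunner:
    def __init__(self) -> None:
        self.pre_hooks: List[PreHook] = []
        self.post_hooks: List[PostHook] = []

    def call(self, name: str, args: Dict[str, Any], tool: Callable[..., Any]) -> Any:
        # Pre-hooks run in registration order; the first denial stops the call.
        for pre in self.pre_hooks:
            denial = pre(name, args)
            if denial is not None:
                raise PermissionError(f"{name} denied: {denial.reason}")
        result = tool(**args)
        # Post-hooks only observe calls that actually executed.
        for post in self.post_hooks:
            post(name, args, result)
        return result
```

Raising with the denial reason is one way to address the diagnostics concern in the signal: the caller sees why a tool call was blocked rather than a silent no-op.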
pull request · May 22, 2026, 9:06 AM
Track only accepted review suggestions in telemetry
What Changed: Kilo-Org/kilocode now records review-telemetry only when a user accepts a review suggestion, while excluding dismissed, invalid, and non-review suggestions, and marks the next model response as suggest-tool-driven using sanitized review command names.
Why It Matters: Review workflow teams get cleaner, more actionable telemetry, because acceptance dashboards now reflect real accepted suggestions instead of overcounting dismissed or invalid ones, and they can better measure whether suggestion-based review automation is being used. Operators should watch for any valid but custom-formatted suggestion commands that stop being tracked due to the new command-name allowlist. This reduces uncertainty in rollout decisions and bug triage, while introducing a new edge case where unsupported command variants may lower observability until command coverage is expanded.
Final score 73 · Confidence 95 · 1 evidence item
Tags: review suggestion, telemetry, suggest tool, review command names, kilo-telemetry, opencode
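The acceptance gate plus command-name allowlist can be sketched as follows. This is an illustrative Python sketch, not the Kilo code: the function names, the outcome strings, and the contents of `ALLOWED_REVIEW_COMMANDS` are all hypothetical.

```python
from typing import Optional

# Hypothetical allowlist; the real set of recognized review commands is
# not given in the signal.
ALLOWED_REVIEW_COMMANDS = frozenset({"local-review", "local-review-uncommitted"})


def sanitize_command(raw: str) -> Optional[str]:
    """Normalize a command name and return it only if it is allowlisted."""
    name = raw.strip().lstrip("/").lower()
    return name if name in ALLOWED_REVIEW_COMMANDS else None


def should_record(outcome: str, raw_command: str) -> bool:
    """Record telemetry only for accepted suggestions on recognized commands."""
    return outcome == "accepted" and sanitize_command(raw_command) is not None
```

The second condition is the edge case the signal calls out: an accepted suggestion whose command is not on the allowlist is dropped from telemetry even though the user acted on it.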
release · May 18, 2026, 12:00 PM
entire review gains live JSONL agent-event streaming
What Changed: `entire review` in v0.6.2 now exposes live multi-agent progress and failure signals through JSONL output, making review execution more observable instead of only showing results at the end of a run.
Why It Matters: Developers and automation operators using `entire review` can monitor long review sessions as they execute, so failures and stalls are visible earlier and teams can intervene before wasting time on a full, unproductive cycle. This reduces uncertainty in CI workflows and lowers the cost of debugging long checks. Teams should watch for log parser compatibility, increased event volume in downstream tooling, and whether the expanded default scope (including uncommitted changes) changes review signal-to-noise.
Final score 72 · Confidence 93 · 1 evidence item
Tags: entire review, JSONL output, agent events, multi-agent review, failure diagnostics
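A downstream consumer of a JSONL event stream like this one could be sketched as below. The event schema is not given in the signal, so the `type` field and the `agent_failed` value are assumptions chosen for illustration; only the one-JSON-object-per-line framing comes from the JSONL format itself.

```python
import json
from typing import Iterable, Iterator, Optional


def iter_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per well-formed JSONL line, skipping anything else."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate partial or non-JSON lines in a live stream
        if isinstance(event, dict):
            yield event


def first_failure(lines: Iterable[str]) -> Optional[dict]:
    """Return the first failure event so a caller can stop a run early.

    Assumption: failures are marked with type == "agent_failed".
    """
    for event in iter_events(lines):
        if event.get("type") == "agent_failed":
            return event
    return None
```

Skipping malformed lines rather than raising is a deliberate choice for the parser-compatibility concern above: a consumer written this way keeps working if new event types or stray log lines appear in the stream.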
pull request · May 20, 2026, 2:58 AM
Promote AI suggestion button to primary position in review comment threads
What Changed: This PR changes the code-review comment-thread button layout so the AI suggestion action (`costrict.askReviewSuggestionWithAI`) is the primary action (`navigation@1`), replacing the prior accept action (`costrict.acceptIssue`) and removing the duplicated third-slot action entry.
Why It Matters: Reviewers using the Costrict VS Code extension will immediately see the AI suggestion option as the top action in comment threads, reducing steps to trigger AI help and speeding review interactions; watch whether any teams with existing accept-button habits encounter missed or mis-clicked actions after this reorder. The implementation sets `costrict.askReviewSuggestionWithAI` to the primary navigation slot and removes the duplicated third action, so rollout should be monitored for UI workflow expectations and any workflow shortcuts that assumed the old accept placement.
Final score 71 · Confidence 98 · 1 evidence item
Tags: costrict.askReviewSuggestionWithAI, costrict.acceptIssue, navigation@1, review comment thread, VS Code extension
commit burst · May 21, 2026, 8:23 PM
Restore automatic IDs for ID-less incoming messages
What Changed: Open SWE updated deepagents to 0.6.3 to recover message ID assignment for inbound messages that lack explicit IDs, fixing a correctness regression introduced in 0.6.2.
Why It Matters: Operators and downstream automation that correlate Open SWE events can expect fewer untraceable or inconsistent message entries, because inbound messages without explicit IDs are once again assigned identifiers instead of arriving as anonymous items. Teams should continue monitoring integrations and dashboard traces for any components that assumed null IDs or that still depend on caller-supplied IDs. The patch is a dependency update to deepagents 0.6.3 to restore pre-0.6.2 behavior.
Final score 70 · Confidence 93 · 1 evidence item
Tags: deepagents 0.6.3, incoming message ingestion, message ID assignment
commit burst · May 22, 2026, 9:39 AM
New AI Code Review signal is ready for review
A source-backed change was recorded for AI Code Review. Review the signal detail for evidence and context.
What Changed: AI Code Review recorded a source-backed change that affects how teams should keep watching this topic.
Why It Matters: Repeated evidence-backed changes help separate durable movement from noisy update streams.
Final score 70 · Confidence 89 · 1 evidence item
Tags: dotnet/skills, MSBuild, dotnet-msbuild plugin, target-authoring, property-patterns, item-management, extension-points, SKILL.md, eval prompts
product update · May 20, 2026, 12:00 AM
Ramp adopts Codex with GPT-5.5 for faster code review
What Changed: Ramp integrated Codex powered by GPT-5.5 into its engineering code-review workflow, allowing reviewers to receive substantive feedback in minutes instead of hours before shipping improvements.
Why It Matters: Ramp developers can get meaningful review feedback in minutes instead of hours, so teams can fix issues and release improvements faster with tighter iteration loops. This appears to be driven by AI-assisted review pass integration using Codex and GPT-5.5, so the team should watch for suggestion accuracy drift, missed risks in sensitive code paths, and whether human review rigor declines as trust in the assistant grows.
Final score 67 · Confidence 84 · 1 evidence item
Tags: OpenAI Codex, GPT-5.5, code review workflow, Ramp
commit burst · May 20, 2026, 9:13 PM
Open SWE reviewer precision overhaul removes confidence-based publish gating
What Changed: The commit set's main change is a reviewer quality overhaul that replaces confidence-score-based publishing gates with a stricter prompt discipline, aiming to reduce false positives by enforcing clearer severity/evidence rules and explicit exclusions for speculative or non-actionable findings.
Why It Matters: Engineers using Open SWE reviews should get fewer noisy or speculative findings, so less engineering time is spent triaging false-positive comments and attention shifts to likely real issues. Concretely, this is implemented by removing confidence-score publication gates and relying on a stricter prompt rubric for defensibility at runtime, after an eval audit identified speculative findings and style noise as the dominant false-positive sources. Track next whether this reduces noisy output without increasing missed bugs by comparing precision and recall across subsequent review eval runs, especially on production-like repositories.
Final score 69 · Confidence 86 · 1 evidence item
Tags: Open SWE Reviewer, reviewer system prompt, confidence gate, CONFIDENCE_THRESHOLD, filter_findings_for_publish, Severity enum
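The precision and recall comparison suggested in this signal uses the standard definitions, sketched below. The function name and the idea of counting findings per eval run as true positives, false positives, and false negatives are illustrative assumptions; the signal does not describe how Open SWE's evals are scored.

```python
from typing import Tuple


def precision_recall(true_pos: int, false_pos: int, false_neg: int) -> Tuple[float, float]:
    """Standard precision and recall from per-run finding counts.

    precision = TP / (TP + FP): share of published findings that were real.
    recall    = TP / (TP + FN): share of real issues that were published.
    Returns 0.0 for a ratio whose denominator is zero.
    """
    flagged = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / flagged if flagged else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall
```

Comparing both numbers before and after the overhaul matters because removing noisy findings raises precision by itself; only a stable recall shows that real bugs are not being dropped along with the noise.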
commit burst · May 20, 2026, 2:58 AM
Fix commit message targeting for multi-root workspaces
What Changed: The key change fixes commit message generation in multi-root workspaces by ensuring commit operations resolve the active repository at action time instead of using a globally cached first root, so messages are generated for the repo the developer is actually working in.
Why It Matters: Developers working in VS Code with multiple repositories open can now generate commit messages for the currently active project instead of accidentally targeting another repo, which reduces misfiled review metadata and broken workflow handoffs; teams should still watch mixed-repo edge cases where root detection might still resolve incorrectly after rapid workspace switching. The fix replaces a singleton cached root path in the commit service with active root hints from SourceControl, directly addressing cross-repo ambiguity in multi-root environments and making generated commit context more reliable.
Final score 51 · Confidence 92 · 1 evidence item
Tags: multi-root workspace, commit message generation, CommitService, SourceControl rootUri, workspaceRootHint
release · May 19, 2026, 2:01 PM
Agent Manager supports speech-to-text for inline review comments
What Changed: v7.3.1 introduces voice input in Agent Manager inline review comments, letting reviewers submit feedback by speaking instead of typing.
Why It Matters: Developers doing code review can now dictate inline comments, which can speed up feedback cycles and reduce typing effort, so teams should track whether dictation remains reliable enough for day-to-day review flow before depending on it for routine operations. This change extends the inline review input pipeline in Agent Manager to accept recorded audio and convert it into comment text at send time; watch transcription accuracy on noisy calls, microphone permission edge cases, and how failures are surfaced when voice capture is unavailable.
Final score 62 · Confidence 91 · 1 evidence item
Tags: Kilo Code, Agent Manager, inline review comments, speech-to-text, voice input

Topic Timeline

How the topic has changed over time

17 events

May 22, 2026, 9:39 AM
commit burst
Fix Review Style Analyze startup by using configured LangGraph endpoint
AI Code Review showed a tracked change with evidence attached, making the topic easier to monitor over time.
Contribution: Adds evidence to the topic's change timeline.
Impact: Helps teams decide whether this direction deserves continued tracking.
May 22, 2026, 9:06 AM
pull request
Track only accepted review suggestions in telemetry
Contribution: Added acceptance-gated telemetry for code-review suggestions: only confirmed user-accepted suggestions increment suggestion metrics, and the immediate follow-up model completion is now attributed to the suggest tool with command names constrained to recognized review commands.
May 22, 2026, 12:42 AM
release
Added pre/post execution hooks with PreToolUse denial for tool calls
Contribution: The release adds a concrete extensibility layer in the tool runtime: developers can register hook logic around tool calls and a PreToolUse denial hook to enforce policy decisions at execution time.
May 21, 2026, 8:23 PM
commit burst
Restore automatic IDs for ID-less incoming messages
Contribution: The change explicitly targets the 0.6.2 regression by reintroducing automatic identifier assignment when upstream senders omit message IDs, so each incoming message gets a stable ID for tracking and correlation.
May 21, 2026, 5:44 PM
pull request
Fix review-style prompts stuck in running state
Contribution: Adds status reconciliation logic that cross-checks execution status against stored state during prompt sync, explicitly covering cases where prompt metadata is saved but the status transition is not recorded, so completion transitions are repaired automatically.
May 21, 2026, 4:17 PM
pull request
Open SWE Review now requires explicit repo enablement before reviewer webhooks run
Contribution: Implemented a repository-level activation control for reviewer execution so review automation is no longer implicitly global; the new setting adds an explicit checklist gate in runtime to stop webhook review processing unless a repo is explicitly enabled in dashboard-backed config.
May 21, 2026, 8:34 AM
pull request
Add inline base-branch and instruction support to /local-review
Contribution: Introduced command parsing for optional `/local-review` arguments so developers can set a compare branch and pass extra review instructions in one call; the logic is implemented in Kilo's review/session helpers with a narrow hook change in shared opencode code.
May 20, 2026, 10:41 PM
pull request
Add role-based review setup schema as foundation for `entire review`
Contribution: Adds concrete review-configuration primitives that separate reviewer vs fixer responsibilities in settings, with idempotent migration from the legacy field, a one-fixer validation rule at setup time, and a new interactive command to persist role maps.
May 20, 2026, 9:13 PM
pull request
Open-SWE dashboard switched to sidebar settings flow with team-level review configuration
Contribution: Implemented a concrete dashboard redesign plus settings persistence layer: a new sidebar navigation flow (My Settings, Cloud Agents, Open SWE Review, Integrations), new optional profile/dashboard fields, and API-backed team settings stored in the LangGraph Store for defaults such as trigger mode, draft PR review, PR summaries, autofix mode, and autofix severity threshold.
May 20, 2026, 9:13 PM
commit burst
Open SWE reviewer precision overhaul removes confidence-based publish gating
Contribution: Redesigned the reviewer behavior around precision-first prompt logic (severity ladder, mandatory evidence checks, explicit do-not-file list) and dropped confidence-based filtering paths such as CONFIDENCE_THRESHOLD, CONFIDENCE_ORDER, confidence_filtered mode, and informational severity handling.
May 20, 2026, 6:47 PM
pull request
Fix Review Style Analyze startup by using configured LangGraph endpoint
This PR fixes production failures in Open-SWE's Review Styles → Analyze flow where runs failed to start with `httpx.ConnectError: All connection attempts failed`. It removes a hardcoded `http://localhost:2024` endpoint in `review_style_jobs.py` and instead resolves the LangGraph client URL from `LANGGRAPH_URL`/`LANGGRAPH_URL_PROD` (same logic as `webapp.py`), with a fallback to `get_client()` when not set.
Contribution: Corrected the analyzer job client endpoint selection so Review Styles → Analyze no longer targets a fixed local URL in deployments; it now follows the same environment-based LangGraph URL resolution as the web app, with an in-cluster fallback.
Impact: Developers and operators running PR review style checks can start analyze jobs on Open-SWE without silent startup failures, so automated review quality gates are less likely to get blocked by connection errors. The change replaces the fixed localhost endpoint with environment-aware URL resolution (`LANGGRAPH_URL`/`LANGGRAPH_URL_PROD`) and a `get_client()` fallback, but teams should verify each deployment’s LangGraph URL configuration and continue watching for any remaining `Failed to start review style analyzer` logs or analyzer runs that do not complete.
May 20, 2026, 2:58 AM
pull request
Promote AI suggestion button to primary position in review comment threads
This PR changes the code-review comment-thread button layout so the AI suggestion action (`costrict.askReviewSuggestionWithAI`) is the primary action (`navigation@1`), replacing the prior accept action (`costrict.acceptIssue`) and removing the duplicated third-slot action entry.
Contribution: Introduced a user-visible UI interaction change that makes AI review suggestions the first-class action in code-review comment threads, so reviewers can access AI-assisted fixes more directly during review without searching through lower-priority actions.
Impact: Reviewers using the Costrict VS Code extension will immediately see the AI suggestion option as the top action in comment threads, reducing steps to trigger AI help and speeding review interactions; watch whether any teams with existing accept-button habits encounter missed or mis-clicked actions after this reorder. The implementation sets `costrict.askReviewSuggestionWithAI` to the primary navigation slot and removes the duplicated third action, so rollout should be monitored for UI workflow expectations and any workflow shortcuts that assumed the old accept placement.
May 20, 2026, 2:58 AM
commit burst
Fix commit message targeting for multi-root workspaces
The key change fixes commit message generation in multi-root workspaces by ensuring commit operations resolve the active repository at action time instead of using a globally cached first root, so messages are generated for the repo the developer is actually working in.
Contribution: Introduced per-action workspace-root resolution for commit message generation: removed the singleton CommitService root cache and passed the active SourceControl root URI as a workspaceRootHint so commit workflows can consistently bind to the correct repository in multi-repo sessions.
Impact: Developers working in VS Code with multiple repositories open can now generate commit messages for the currently active project instead of accidentally targeting another repo, which reduces misfiled review metadata and broken workflow handoffs; teams should still watch mixed-repo edge cases where root detection might still resolve incorrectly after rapid workspace switching. The fix replaces a singleton cached root path in the commit service with active root hints from SourceControl, directly addressing cross-repo ambiguity in multi-root environments and making generated commit context more reliable.
May 20, 2026, 12:00 AM
product update
Ramp adopts Codex with GPT-5.5 for faster code review
Ramp integrated Codex powered by GPT-5.5 into its engineering code-review workflow, allowing reviewers to receive substantive feedback in minutes instead of hours before shipping improvements.
Contribution: Added an AI-assisted code-review capability in the engineering workflow by combining Codex with GPT-5.5 to generate substantive review feedback during development.
Impact: Ramp developers can get meaningful review feedback in minutes instead of hours, so teams can fix issues and release improvements faster with tighter iteration loops. This appears to be driven by AI-assisted review pass integration using Codex and GPT-5.5, so the team should watch for suggestion accuracy drift, missed risks in sensitive code paths, and whether human review rigor declines as trust in the assistant grows.
May 19, 2026, 2:01 PM
release
Agent Manager supports speech-to-text for inline review comments
v7.3.1 introduces voice input in Agent Manager inline review comments, letting reviewers submit feedback by speaking instead of typing.
Contribution: Added a voice-to-text capture and send path in the Agent Manager inline review composer, enabling dictated comments to be submitted directly in the review UI.
Impact: Developers doing code review can now dictate inline comments, which can speed up feedback cycles and reduce typing effort, so teams should track whether dictation remains reliable enough for day-to-day review flow before depending on it for routine operations. This change extends the inline review input pipeline in Agent Manager to accept recorded audio and convert it into comment text at send time; watch transcription accuracy on noisy calls, microphone permission edge cases, and how failures are surfaced when voice capture is unavailable.
May 18, 2026, 10:04 PM
pull request
Add AIDLC Code Reviewer CLI for unified technical and business-logic reviews
Introduces a new `scripts/aidlc-codereview` package with the `aidlc-code-reviewer` CLI, enabling one-command review of AI-DLC code through built-in static tools plus Bedrock-powered business-logic analysis, and generating linked HTML/Markdown reports.
Contribution: Added a new CLI-driven review workflow in AIDLC (`aidlc-code-reviewer`) that centralizes execution of multiple analyzers and AI agents (critical code, structural quality, and business logic), outputs both technical and business reports, and supports automatic Bedrock-backed wrapper generation for tools not pre-integrated.
Impact: Developers using AIDLC can now run one command to get consolidated quality and business-logic findings, which can materially reduce manual review effort and catch issues earlier before code reaches later stages; in practice this changes review turnaround and lowers the chance that logic or security mistakes slip through. The implementation relies on Bedrock invocation, so operator risk now includes credentials/permissions (`bedrock:InvokeModel`), output stability of auto-generated wrappers for new tools, and whether severity policy consistently limits HIGH/CRITICAL to security findings. Monitor review coverage and false-positive/false-negative patterns as model output changes and new tools are added.
May 18, 2026, 12:00 PM
release
entire review gains live JSONL agent-event streaming
`entire review` in v0.6.2 now exposes live multi-agent progress and failure signals through JSONL output, making review execution more observable instead of only showing results at the end of a run.
Contribution: Added real-time JSONL event streaming for `entire review` and updated review execution feedback so progress and failure outcomes are surfaced continuously during a run.
Impact: Developers and automation operators using `entire review` can monitor long review sessions as they execute, so failures and stalls are visible earlier and teams can intervene before wasting time on a full, unproductive cycle; this reduces uncertainty in CI workflows and lowers the cost of debugging long checks, but teams should watch for log parser compatibility, increased event volume in downstream tooling, and whether the expanded default scope (including uncommitted changes) changes review signal-to-noise.

Evidence Trail

github_commit_burst
dotnet/skills commit burst: 3 commits in 7 days
AI Code Review has source-backed evidence attached to the latest tracked change.
Open Source
github_pull_request
Kilo-Org/kilocode PR #10502: feat: track accepted review suggestions
The app now records a telemetry event only after a user accepts a review suggestion; dismissed, invalid, and non-review suggestions are no longer counted, and accepted-suggestion follow-up model responses are tagged as coming from the suggest tool with known review command names.
Open Source
github_release
Goose v1.35.0
Hooks system for extensible pre/post tool execution - PreToolUse denial hook support
Open Source
github_commit_burst
langchain-ai/open-swe commit burst: 4 commits in 7 days
chore: bump deepagents to 0.6.3 (#1322) Pick up the patch release that restores message ID assignment for incoming messages without explicit IDs after the 0.6.2 regression.
Open Source

Source Coverage

github pull request: 9 events · 9 evidence items; 2 days ago
github commit burst: 4 events · 4 evidence items; 2 days ago
github release: 3 events · 3 evidence items; 2 days ago
rss feed: 1 event · 1 evidence item; 4 days ago

Signal Feed

Changes worth continued tracking

16 unique signals

pull request · May 20, 2026, 6:47 PM
Fix Review Style Analyze startup by using configured LangGraph endpoint
What Changed: This PR fixes production failures in Open-SWE's Review Styles → Analyze flow where runs failed to start with `httpx.ConnectError: All connection attempts failed`. It removes a hardcoded `http://localhost:2024` endpoint in `review_style_jobs.py` and instead resolves the LangGraph client URL from `LANGGRAPH_URL`/`LANGGRAPH_URL_PROD` (same logic as `webapp.py`), with a fallback to `get_client()` when not set.
Why It Matters: Developers and operators running PR review style checks can start analyze jobs on Open-SWE without silent startup failures, so automated review quality gates are less likely to get blocked by connection errors. The change replaces the fixed localhost endpoint with environment-aware URL resolution (`LANGGRAPH_URL`/`LANGGRAPH_URL_PROD`) and a `get_client()` fallback, but teams should verify each deployment’s LangGraph URL configuration and continue watching for any remaining `Failed to start review style analyzer` logs or analyzer runs that do not complete.
Final score: 81 · Confidence: 97 · 1 evidence item
Tags: LangGraph SDK, review_style_jobs.py, LANGGRAPH_URL, LANGGRAPH_URL_PROD, get_client, httpx.ConnectError
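For operators checking their own deployment, the environment-based URL resolution described in the entry above can be sketched roughly as follows. This is an illustration, not the code from the PR: the function name `resolve_langgraph_url` is hypothetical, and the precedence between the two variables is an assumption, since the source only says both are consulted.

```python
import os
from typing import Mapping, Optional


def resolve_langgraph_url(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the configured LangGraph URL, or None when neither variable is set.

    Hypothetical helper. Assumption: LANGGRAPH_URL_PROD wins when both
    variables are set; the source does not state the actual order.
    """
    for name in ("LANGGRAPH_URL_PROD", "LANGGRAPH_URL"):
        value = env.get(name, "").strip()
        if value:
            return value
    # None signals the caller to fall back to a default client,
    # e.g. calling get_client() without an explicit URL.
    return None


url = resolve_langgraph_url({"LANGGRAPH_URL": "https://langgraph.internal.example"})
print(url or "no URL configured, using default client")
```

A check like this is a quick way to confirm that a deployment is not silently relying on the fallback path.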
pull request · May 21, 2026, 4:17 PM
Open SWE Review now requires explicit repo enablement before reviewer webhooks run
What Changed: The PR introduces an opt-in repository gate for Open SWE Review: a new `enabled_review_repos` list in the LangGraph Store is checked via `_is_repo_enabled_for_review` at reviewer webhook entry points, and only repositories that satisfy both this list and the existing env allowlist can trigger review actions.
Why It Matters: Operators and repository admins will see automatic review automation go quiet in repos that are not explicitly enabled, so they can no longer assume every project gets SWE feedback by default after this change. The implementation adds a LangGraph-backed config check (`enabled_review_repos`) to every reviewer webhook chokepoint via `_is_repo_enabled_for_review`, requiring an explicit on-switch in addition to the environment allowlist; teams should monitor rollout for unintended gaps in coverage right after deployment and verify all previously reviewed repos are manually re-enabled before relying on automation.
Final score: 80 · Confidence: 94 · 1 evidence item
Tags: enabled_review_repos, Open SWE Review, LangGraph Store, _is_repo_enabled_for_review, env allowlist
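The two-condition gate in the entry above (a repo must be in the enabled list and in the env allowlist) can be illustrated with a minimal sketch. The function `repo_passes_review_gate` is a hypothetical stand-in, not the project's `_is_repo_enabled_for_review`, and case-insensitive matching of `owner/name` strings is an assumption.

```python
from typing import Iterable


def repo_passes_review_gate(
    repo: str,
    enabled_review_repos: Iterable[str],
    env_allowlist: Iterable[str],
) -> bool:
    """Return True only when the repo is in BOTH lists.

    Hypothetical sketch. Assumption: repos are compared as lowercase
    "owner/name" strings.
    """
    normalized = repo.strip().lower()
    enabled = {r.strip().lower() for r in enabled_review_repos}
    allowed = {r.strip().lower() for r in env_allowlist}
    return normalized in enabled and normalized in allowed


allowlist = ["acme/api", "acme/web"]
enabled = ["acme/api"]
for repo in allowlist:
    print(repo, repo_passes_review_gate(repo, enabled, allowlist))
```

Running a check of this shape over the list of previously reviewed repos is one way to spot coverage gaps after rollout.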
pull request · May 20, 2026, 9:13 PM
Open-SWE dashboard switched to sidebar settings flow with team-level review configuration
What Changed: The pull request replaces the dashboard's header-based UI with an AppShell/AppSidebar layout and introduces a new team settings path for review controls, moving key reviewer behavior options into dedicated sections. Reviewer settings such as trigger mode, draft-PR review, PR summaries, and autofix behavior (including severity threshold) are now persisted through the new `/dashboard/api/team-settings` flow.
Why It Matters: Operators and maintainers configuring Open-SWE can now manage cloud-agent and review behavior from one consistent dashboard flow, which reduces UI fragmentation and setup mistakes before running PR reviews, but the path still needs production validation end-to-end with real GitHub OAuth and a LangGraph backend to confirm there are no navigation or save regressions. The change also replaces a Vite-sensitive account dropdown implementation that could trigger client hook crashes, so the refactor is aimed at keeping dashboard sessions stable during real OAuth and review workflows. Watch for failures in settings migration from prior routes, sidebar navigation regressions, and hidden admin-route discoverability.
Final score: 77 · Confidence: 90 · 1 evidence item
Tags: AppShell, AppSidebar, /dashboard/api/team-settings, team_settings, LangGraph Store, ProfileUpdate, open-swe dashboard
pull request · May 21, 2026, 5:44 PM
Fix review-style prompts stuck in running state
What Changed: The PR fixes a correctness issue in review-style prompt execution by reconciling LangGraph runtime status with stored prompt state, so prompts that were saved before a status write can no longer remain incorrectly stuck in `running`.
Why It Matters: Operators and users of review-style analysis will see fewer apparent hangs, because prompts that used to stay forever in `running` after a missed status write now move to `completed`, reducing manual retries and workflow blockage. The technical change is a sync-time reconciliation between LangGraph and persisted state, and teams should watch for concurrency or partial-write races that could still recreate stale states after crashes or overlapping sync operations.
Final score: 76 · Confidence: 95 · 1 evidence item
Tags: LangGraph, review-style prompts, run status, prompt store, status synchronization
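The sync-time reconciliation in the entry above amounts to treating a stored `running` status as possibly stale and letting the runtime's terminal status override it. The sketch below shows that idea only; `reconcile_status` and the status mapping are hypothetical, and the runtime status names are assumptions rather than values taken from the PR.

```python
from typing import Optional

# Hypothetical mapping from a runtime run status to the stored prompt status.
RUNTIME_TO_STORED = {"success": "completed", "error": "failed"}


def reconcile_status(stored: str, runtime: Optional[str]) -> str:
    """Return the status that should be persisted after a sync.

    Only a stored "running" status is treated as possibly stale; any other
    stored status is left alone. Unknown or missing runtime statuses keep
    the stored value.
    """
    if stored != "running":
        return stored
    return RUNTIME_TO_STORED.get(runtime or "", stored)


print(reconcile_status("running", "success"))
print(reconcile_status("running", None))
```

A sketch like this does not address the races the entry warns about: two overlapping syncs could still read the same stale record unless the write is guarded.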
pull request · May 21, 2026, 8:34 AM
Add inline base-branch and instruction support to /local-review
What Changed: The `/local-review` command now accepts optional trailing text: a single token is parsed as the base branch to compare against, while multi-word input is treated as review instructions (usable via `base -- instructions`), with Kilo-specific parsing moved into review-session helpers.
Why It Matters: Developers using `/local-review` can now target a specific base branch and supply custom review guidance without an extra blocking step, which makes review sessions faster and keeps the workflow less interrupted by follow-up prompts. The parser now maps a single token to base-branch selection and multi-word text to instruction content via `base -- instructions`, while shared runtime paths stay mostly untouched; continue monitoring for parsing edge cases (e.g., odd branch-name patterns) that could direct reviews to the wrong base or drop instruction text.
Final score: 74 · Confidence: 95 · 1 evidence item
Tags: local-review, review/session helpers, opencode, base branch, instruction parsing
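The parsing rules in the entry above (single token is the base branch, multi-word text is instructions, `base -- instructions` carries both) can be made concrete with a small sketch. It is written in Python for illustration and is not the project's parser; `parse_local_review_args` and `LocalReviewArgs` are hypothetical names, and reading `base -- instructions` as "one token, separator, free text" is an assumption.

```python
from typing import NamedTuple, Optional


class LocalReviewArgs(NamedTuple):
    base: Optional[str]
    instructions: Optional[str]


def parse_local_review_args(text: str) -> LocalReviewArgs:
    """Split trailing /local-review text into a base branch and instructions."""
    text = text.strip()
    if not text:
        return LocalReviewArgs(None, None)
    # Explicit form: "<base> -- <instructions>" with a single-token base.
    head, sep, tail = text.partition(" -- ")
    if sep and len(head.split()) == 1:
        return LocalReviewArgs(head.strip(), tail.strip() or None)
    tokens = text.split()
    if len(tokens) == 1:
        # A lone token is taken as the base branch.
        return LocalReviewArgs(tokens[0], None)
    # Anything else is treated as free-form review instructions.
    return LocalReviewArgs(None, text)


print(parse_local_review_args("develop"))
print(parse_local_review_args("main -- focus on error handling"))
print(parse_local_review_args("check the retry logic"))
```

The sketch also shows where the edge cases the entry mentions come from: a one-word instruction is indistinguishable from a branch name under these rules.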
pull request · May 20, 2026, 10:41 PM
Add role-based review setup schema as foundation for `entire review`
What Changed: This PR lays the first-step foundation for role-based review configuration by adding `ReviewConfig.Role` and `fix_after_review` settings, plus a dedicated setup flow to collect and persist per-agent roles and review instructions.
Why It Matters: Operators and teams using `entire review` can now predefine who reviews and who fixes before running review commands, reducing role ambiguity and downstream configuration drift in team workflows. This is implemented through `Role`-based settings, migration helpers, and the new `entire review setup` flow; continue watching downstream consumers of `.entire/settings.json` for assumptions about the removed legacy field and whether follow-up PR #2 cutover correctly preserves these assumptions after the legacy picker is removed.
Final score: 73 · Confidence: 94 · 1 evidence item
Tags: ReviewConfig.Role, settings.Load, MigrateLegacyRoles, entire review setup, EntireSettings.fix_after_review
release · May 22, 2026, 12:42 AM
Added pre/post execution hooks with PreToolUse denial for tool calls
What Changed: Goose v1.35.0 introduced a new tool-execution hooks system with PreToolUse denial support, enabling custom logic to run before and after each tool call and block disallowed actions before they execute.
Why It Matters: Developers and operators can now intercept tool calls before execution, so risky or policy-violating actions are more likely to be prevented instead of happening unexpectedly in active agent sessions. This is implemented via a new extensible pre/post tool execution hook pipeline and dedicated PreToolUse denial support, which should reduce unsafe automation fallout; watch for hook failures, ordering/precedence interactions, and whether denial paths surface clear user-visible diagnostics so legitimate workflows are not accidentally blocked.
Final score: 73 · Confidence: 90 · 1 evidence item
Tags: Goose tool execution, pre/post hooks, PreToolUse denial hook, ACP tool runtime
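The general shape of a pre/post hook pipeline with pre-execution denial, as described in the entry above, can be sketched generically. This is not Goose's API or configuration format: every name here (`ToolCall`, `ToolDenied`, `run_tool`, the hook signatures) is hypothetical and only illustrates the pattern.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass
class ToolCall:
    name: str
    args: Dict[str, Any]


class ToolDenied(Exception):
    """Raised when a pre-execution hook blocks a tool call."""


# A pre hook returns a denial reason, or None to allow the call.
PreHook = Callable[[ToolCall], Optional[str]]
PostHook = Callable[[ToolCall, Any], None]


def run_tool(call: ToolCall, tool: Callable[..., Any],
             pre_hooks: List[PreHook], post_hooks: List[PostHook]) -> Any:
    # Pre hooks run in order; the first denial stops execution entirely.
    for hook in pre_hooks:
        reason = hook(call)
        if reason is not None:
            raise ToolDenied(f"{call.name} denied: {reason}")
    result = tool(**call.args)
    # Post hooks only observe calls that actually executed.
    for hook in post_hooks:
        hook(call, result)
    return result


def deny_recursive_delete(call: ToolCall) -> Optional[str]:
    command = str(call.args.get("command", ""))
    return "recursive delete is not allowed" if "rm -rf" in command else None


audit_log: List[str] = []


def record(call: ToolCall, result: Any) -> None:
    audit_log.append(f"{call.name} -> {result}")


def fake_shell(command: str) -> str:
    return f"ran: {command}"


result = run_tool(ToolCall("shell", {"command": "ls"}), fake_shell,
                  [deny_recursive_delete], [record])
```

Raising an exception with a reason string is one way to keep denials visible to the user, which is the diagnostic concern the entry flags.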
pull request · May 22, 2026, 9:06 AM
Track only accepted review suggestions in telemetry
What Changed: Kilo-Org/kilocode now records review-telemetry only when a user accepts a review suggestion, while excluding dismissed, invalid, and non-review suggestions, and marks the next model response as suggest-tool-driven using sanitized review command names.
Why It Matters: Review workflow teams get cleaner, more actionable telemetry, because acceptance dashboards now reflect real accepted suggestions instead of overcounting dismissed or invalid ones and can better measure whether suggestion-based review automation is being used; operators should watch for any valid but custom-formatted suggestion commands that stop being tracked due to the new command-name allowlist. This reduces uncertainty in rollout decisions and bug triage, while introducing a new edge case where unsupported command variants may lower observability until command coverage is expanded.
Final score: 73 · Confidence: 95 · 1 evidence item
Tags: review suggestion, telemetry, suggest tool, review command names, kilo-telemetry, opencode
release · May 18, 2026, 12:00 PM
entire review gains live JSONL agent-event streaming
What Changed: `entire review` in v0.6.2 now exposes live multi-agent progress and failure signals through JSONL output, making review execution more observable instead of only showing results at the end of a run.
Why It Matters: Developers and automation operators using `entire review` can monitor long review sessions as they execute, so failures and stalls are visible earlier and teams can intervene before wasting time on a full, unproductive cycle; this reduces uncertainty in CI workflows and lowers the cost of debugging long checks, but teams should watch for log parser compatibility, increased event volume in downstream tooling, and whether the expanded default scope (including uncommitted changes) changes review signal-to-noise.
Final score: 72 · Confidence: 93 · 1 evidence item
Tags: entire review, JSONL output, agent events, multi-agent review, failure diagnostics
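For teams wiring the JSONL stream from the entry above into CI tooling, a tolerant line-by-line consumer might look like the sketch below. The event fields `agent` and `status` are hypothetical, since the source does not describe the schema; only the one-JSON-object-per-line framing is taken from the JSONL format itself.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def iter_events(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield one dict per valid JSONL line, skipping blanks and non-JSON noise."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate interleaved plain-text log lines
        if isinstance(event, dict):
            yield event


def failed_agents(lines: Iterable[str]) -> List[str]:
    # "agent" and "status" are assumed field names for illustration only.
    return [str(e.get("agent")) for e in iter_events(lines) if e.get("status") == "failed"]


sample = [
    '{"agent": "reviewer-a", "status": "started"}',
    "plain log line that is not JSON",
    '{"agent": "reviewer-b", "status": "failed"}',
]
print(failed_agents(sample))
```

Because `iter_events` accepts any iterable of lines, the same function works on a file handle or on a subprocess's stdout, which lets a pipeline react to failures before the run ends.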
pull request · May 20, 2026, 2:58 AM
Promote AI suggestion button to primary position in review comment threads
What Changed: This PR changes the code-review comment-thread button layout so the AI suggestion action (`costrict.askReviewSuggestionWithAI`) is the primary action (`navigation@1`), replacing the prior accept action (`costrict.acceptIssue`) and removing the duplicated third-slot action entry.
Why It Matters: Reviewers using the Costrict VS Code extension will immediately see the AI suggestion option as the top action in comment threads, reducing steps to trigger AI help and speeding review interactions; watch whether any teams with existing accept-button habits encounter missed or mis-clicked actions after this reorder. The implementation sets `costrict.askReviewSuggestionWithAI` to the primary navigation slot and removes the duplicated third action, so rollout should be monitored for UI workflow expectations and any workflow shortcuts that assumed the old accept placement.
Final score: 71 · Confidence: 98 · 1 evidence item
Tags: costrict.askReviewSuggestionWithAI, costrict.acceptIssue, navigation@1, review comment thread, VS Code extension
commit burst · May 21, 2026, 8:23 PM
Restore automatic IDs for ID-less incoming messages
What Changed: Open SWE updated deepagents to 0.6.3 to recover message ID assignment for inbound messages that lack explicit IDs, fixing a correctness regression introduced in 0.6.2.
Why It Matters: Operators and downstream automation that correlate Open SWE events can expect fewer untraceable/inconsistent message entries, because inbound messages without explicit IDs are once again assigned identifiers instead of arriving as anonymous items; teams should continue monitoring integrations and dashboard traces for any components that assumed null IDs or that still depend on caller-supplied IDs. The patch is a dependency update to deepagents 0.6.3 to restore pre-0.6.2 behavior.
Final score: 70 · Confidence: 93 · 1 evidence item
Tags: deepagents 0.6.3, incoming message ingestion, message ID assignment
commit burst · May 22, 2026, 9:39 AM
New AI Code Review signal is ready for review
A source-backed change was recorded for AI Code Review. Review the signal detail for evidence and context.
What Changed: AI Code Review recorded a source-backed change that affects how teams should keep watching this topic.
Why It Matters: It matters because repeated evidence-backed changes help separate durable movement from noisy update streams.
Final score: 70 · Confidence: 89 · 1 evidence item
Tags: dotnet/skills, MSBuild, dotnet-msbuild plugin, target-authoring, property-patterns, item-management, extension-points, SKILL.md, eval prompts
product update · May 20, 2026, 12:00 AM
Ramp adopts Codex with GPT-5.5 for faster code review
What Changed: Ramp integrated Codex powered by GPT-5.5 into its engineering code-review workflow, allowing reviewers to receive substantive feedback in minutes instead of hours before shipping improvements.
Why It Matters: Ramp developers can get meaningful review feedback in minutes instead of hours, so teams can fix issues and release improvements faster with tighter iteration loops. This appears to be driven by AI-assisted review pass integration using Codex and GPT-5.5, so the team should watch for suggestion accuracy drift, missed risks in sensitive code paths, and whether human review rigor declines as trust in the assistant grows.
Final score: 67 · Confidence: 84 · 1 evidence item
Tags: OpenAI Codex, GPT-5.5, code review workflow, Ramp
commit burst · May 20, 2026, 9:13 PM
Open SWE reviewer precision overhaul removes confidence-based publish gating
What Changed: The commit set’s main change is a reviewer quality overhaul that replaces confidence-score-based publishing gates with a stricter prompt discipline, aiming to reduce false positives by enforcing clearer severity/evidence rules and explicit exclusions for speculative or non-actionable findings.
Why It Matters: Engineers using Open SWE reviews will get fewer noisy or speculative findings, so less engineering time is spent triaging false-positive comments and attention shifts to likely real issues. Concretely, this is implemented by removing confidence-score publication gates and relying on a stricter prompt rubric for defensibility at runtime, after an eval audit identified speculative findings and style noise as the dominant false-positive sources. Track next whether this reduces noisy output without increasing missed bugs by comparing precision and recall across subsequent review eval runs, especially on production-like repositories.
Final score: 69 · Confidence: 86 · 1 evidence item
Tags: Open SWE Reviewer, reviewer system prompt, confidence gate, CONFIDENCE_THRESHOLD, filter_findings_for_publish, Severity enum
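The precision and recall comparison suggested in the entry above is simple arithmetic over labeled eval findings. The sketch below shows the calculation; the counts for the two runs are invented purely for illustration and are not results from any Open SWE eval.

```python
from typing import Tuple


def precision_recall(true_pos: int, false_pos: int, false_neg: int) -> Tuple[float, float]:
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    reported = true_pos + false_pos   # everything the reviewer filed
    actual = true_pos + false_neg     # every real issue in the eval set
    precision = true_pos / reported if reported else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


# Hypothetical counts: a noisier run versus a stricter run.
before = precision_recall(true_pos=40, false_pos=60, false_neg=10)
after = precision_recall(true_pos=36, false_pos=12, false_neg=14)
print(f"before: precision={before[0]:.2f} recall={before[1]:.2f}")
print(f"after:  precision={after[0]:.2f} recall={after[1]:.2f}")
```

In this made-up example precision rises from 0.40 to 0.75 while recall slips from 0.80 to 0.72, which is the kind of trade-off the entry recommends tracking.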
commit burst · May 20, 2026, 2:58 AM
Fix commit message targeting for multi-root workspaces
What Changed: The key change fixes commit message generation in multi-root workspaces by ensuring commit operations resolve the active repository at action time instead of using a globally cached first root, so messages are generated for the repo the developer is actually working in.
Why It Matters: Developers working in VS Code with multiple repositories open can now generate commit messages for the currently active project instead of accidentally targeting another repo, which reduces misfiled review metadata and broken workflow handoffs; teams should still watch mixed-repo edge cases where root detection might still resolve incorrectly after rapid workspace switching. The fix replaces a singleton cached root path in the commit service with active root hints from SourceControl, directly addressing cross-repo ambiguity in multi-root environments and making generated commit context more reliable.
Final score: 51 · Confidence: 92 · 1 evidence item
Tags: multi-root workspace, commit message generation, CommitService, SourceControl rootUri, workspaceRootHint
release · May 19, 2026, 2:01 PM
Agent Manager supports speech-to-text for inline review comments
What Changed: v7.3.1 introduces voice input in Agent Manager inline review comments, letting reviewers submit feedback by speaking instead of typing.
Why It Matters: Developers doing code review can now dictate inline comments, which can speed up feedback cycles and reduce typing effort, so teams should track whether dictation remains reliable enough for day-to-day review flow before depending on it for routine operations. This change extends the inline review input pipeline in Agent Manager to accept recorded audio and convert it into comment text at send time; watch transcription accuracy on noisy calls, microphone permission edge cases, and how failures are surfaced when voice capture is unavailable.
Final score: 62 · Confidence: 91 · 1 evidence item
Tags: Kilo Code, Agent Manager, inline review comments, speech-to-text, voice input

Topic Timeline

How the topic has changed over time

17 events

May 22, 2026, 9:39 AM
commit burst
Fix Review Style Analyze startup by using configured LangGraph endpoint
AI Code Review showed a tracked change with evidence attached, making the topic easier to monitor over time.
Contribution: Adds evidence to the topic's change timeline.
Impact: Helps teams decide whether this direction deserves continued tracking.
May 22, 2026, 9:06 AM
pull request
Track only accepted review suggestions in telemetry
Kilo-Org/kilocode now records review-telemetry only when a user accepts a review suggestion, while excluding dismissed, invalid, and non-review suggestions, and marks the next model response as suggest-tool-driven using sanitized review command names.
Contribution: Added acceptance-gated telemetry for code-review suggestions: only confirmed user-accepted suggestions increment suggestion metrics, and the immediate follow-up model completion is now attributed to the suggest tool with command names constrained to recognized review commands.
Impact: Review workflow teams get cleaner, more actionable telemetry, because acceptance dashboards now reflect real accepted suggestions instead of overcounting dismissed or invalid ones and can better measure whether suggestion-based review automation is being used; operators should watch for any valid but custom-formatted suggestion commands that stop being tracked due to the new command-name allowlist. This reduces uncertainty in rollout decisions and bug triage, while introducing a new edge case where unsupported command variants may lower observability until command coverage is expanded.
May 22, 2026, 12:42 AM
release
Added pre/post execution hooks with PreToolUse denial for tool calls
Goose v1.35.0 introduced a new tool-execution hooks system with PreToolUse denial support, enabling custom logic to run before and after each tool call and block disallowed actions before they execute.
Contribution: The release adds an extensibility layer in the tool runtime: developers can register hook logic around tool calls, plus a PreToolUse denial hook that enforces policy decisions at execution time.
Impact: Developers and operators can now intercept tool calls before execution, so risky or policy-violating actions can be blocked instead of running unchecked in active agent sessions. The mechanism is a new pre/post tool execution hook pipeline with dedicated PreToolUse denial support, which should reduce unsafe automation fallout. Watch for hook failures, ordering and precedence interactions, and whether denial paths surface clear user-visible diagnostics so legitimate workflows are not blocked by accident.
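The general shape of a pre/post hook pipeline with pre-call denial can be sketched as follows. This is a generic Python illustration of the pattern and does not reflect Goose's actual hook interface; `run_tool`, `ToolDenied`, and `deny_destructive_shell` are hypothetical names.

```python
from typing import Any, Callable, Optional, Sequence

# A pre-hook returns a denial reason to block the call, or None to allow it.
PreHook = Callable[[str, dict], Optional[str]]
# A post-hook observes the tool name, arguments, and result.
PostHook = Callable[[str, dict, Any], None]


class ToolDenied(Exception):
    """Raised when a pre-hook blocks a tool call."""


def run_tool(
    name: str,
    args: dict,
    tool: Callable[..., Any],
    pre_hooks: Sequence[PreHook] = (),
    post_hooks: Sequence[PostHook] = (),
) -> Any:
    # Pre-hooks run in registration order; the first denial wins and the
    # tool never executes.
    for hook in pre_hooks:
        reason = hook(name, args)
        if reason is not None:
            raise ToolDenied(f"{name} denied: {reason}")
    result = tool(**args)
    # Post-hooks run only after a successful call.
    for hook in post_hooks:
        hook(name, args, result)
    return result


def deny_destructive_shell(name: str, args: dict) -> Optional[str]:
    """Example policy (hypothetical): block shell calls containing 'rm -rf'."""
    if name == "shell" and "rm -rf" in args.get("command", ""):
        return "destructive command"
    return None
```

Carrying the denial reason in the exception is one way to address the diagnostics concern above: the caller can show the user why a call was blocked.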
May 21, 2026, 8:23 PM
commit burst
Restore automatic IDs for ID-less incoming messages
Open SWE updated deepagents to 0.6.3 to recover message ID assignment for inbound messages that lack explicit IDs, fixing a correctness regression introduced in 0.6.2.
Contribution: The change targets the 0.6.2 regression by reintroducing automatic identifier assignment when upstream senders omit message IDs, so each incoming message gets a stable ID for tracking and correlation.
Impact: Operators and downstream automation that correlate Open SWE events can expect fewer untraceable or inconsistent message entries, because inbound messages without explicit IDs are again assigned identifiers instead of arriving as anonymous items. The patch is a dependency update to deepagents 0.6.3 that restores pre-0.6.2 behavior. Teams should keep monitoring integrations and dashboard traces for components that assumed null IDs or that still depend on caller-supplied IDs.
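The behavior being restored, assigning an ID to any inbound message that lacks one, can be illustrated with a short sketch. The source does not describe how deepagents generates IDs; the dict-shaped messages and the use of `uuid4` here are assumptions for illustration only.

```python
import uuid
from typing import Iterable, List


def ensure_message_ids(messages: Iterable[dict]) -> List[dict]:
    """Return copies of the messages, generating an ID where none was supplied."""
    result = []
    for message in messages:
        if message.get("id"):
            # Caller-supplied IDs are kept unchanged.
            result.append(dict(message))
        else:
            # Missing or empty ID: assign a fresh one (uuid4 is an assumption).
            result.append({**message, "id": str(uuid.uuid4())})
    return result
```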
May 21, 2026, 5:44 PM
pull request
Fix review-style prompts stuck in running state
The PR fixes a correctness issue in review-style prompt execution by reconciling LangGraph runtime status with stored prompt state, so prompts that were saved before a status write can no longer remain incorrectly stuck in `running`.
Contribution: Adds status reconciliation that cross-checks execution status against stored state during prompt sync. It covers cases where prompt metadata is saved but the status transition is not recorded, so completion transitions are repaired automatically.
Impact: Operators and users of review-style analysis will see fewer apparent hangs: prompts that used to stay in `running` indefinitely after a missed status write now move to `completed`, reducing manual retries and workflow blockage. The change is a sync-time reconciliation between LangGraph and persisted state. Teams should watch for concurrency or partial-write races that could still produce stale states after crashes or overlapping sync operations.
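A sync-time reconciliation of this kind can be sketched as a pure function over the two status values. The function name and the status mapping are hypothetical; only the `running` to `completed` repair is described in the source, and the runtime status strings are assumptions.

```python
# Assumed mapping from runtime status to stored prompt status. Only the
# running -> completed repair comes from the source; "error" -> "failed"
# is a hypothetical addition.
TERMINAL_RUN_STATUS = {"success": "completed", "error": "failed"}


def reconcile_status(stored_status: str, run_status: str) -> str:
    """Repair a stored 'running' status when the runtime reports a terminal state."""
    if stored_status == "running" and run_status in TERMINAL_RUN_STATUS:
        return TERMINAL_RUN_STATUS[run_status]
    # Anything else is left alone so a sync never overwrites a settled status.
    return stored_status
```

Keeping the repair limited to rows still marked `running` makes the operation idempotent, which matters for the overlapping-sync case flagged above.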
May 21, 2026, 4:17 PM
pull request
Open SWE Review now requires explicit repo enablement before reviewer webhooks run
The PR introduces an opt-in repository gate for Open SWE Review: a new `enabled_review_repos` list in the LangGraph Store is checked via `_is_repo_enabled_for_review` at reviewer webhook entry points, and only repositories that satisfy both this list and the existing env allowlist can trigger review actions.
Contribution: Implemented a repository-level activation control for reviewer execution, so review automation is no longer implicitly global. The new runtime gate stops webhook review processing unless a repo is explicitly enabled in dashboard-backed config.
Impact: Review automation will go quiet in repos that are not explicitly enabled, so operators and repository admins can no longer assume every project gets Open SWE feedback by default. The implementation adds a LangGraph-backed config check (`enabled_review_repos`) to every reviewer webhook chokepoint via `_is_repo_enabled_for_review`, requiring an explicit on-switch in addition to the environment allowlist. Teams should monitor for coverage gaps right after deployment and verify that all previously reviewed repos are manually re-enabled before relying on automation.
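The two-condition gate can be sketched as a set-membership check. This is not the body of the PR's `_is_repo_enabled_for_review`; the signature below, the plain-list inputs, and the case-insensitive comparison are assumptions made for illustration.

```python
from typing import Iterable


def is_repo_enabled_for_review(
    repo: str,
    enabled_review_repos: Iterable[str],
    env_allowlist: Iterable[str],
) -> bool:
    """Allow review only when the repo passes BOTH the opt-in list and the allowlist."""
    # Case-insensitive matching is an assumption, not confirmed by the source.
    normalized = repo.strip().lower()
    enabled = {r.strip().lower() for r in enabled_review_repos}
    allowed = {r.strip().lower() for r in env_allowlist}
    return normalized in enabled and normalized in allowed
```

With an empty opt-in list every repo is rejected, which matches the default-off behavior described above.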
May 21, 2026, 8:34 AM
pull request
Add inline base-branch and instruction support to /local-review
The `/local-review` command now accepts optional trailing text: a single token is parsed as the base branch to compare against, while multi-word input is treated as review instructions (usable via `base -- instructions`), with Kilo-specific parsing moved into review-session helpers.
Contribution: Introduced parsing for optional `/local-review` arguments so developers can set a compare branch and pass extra review instructions in one call. The logic lives in Kilo’s review/session helpers, with a narrow hook change in shared opencode code.
Impact: Developers using `/local-review` can now target a specific base branch and supply custom review guidance without an extra blocking step, which keeps review sessions faster and less interrupted by follow-up prompts. The parser maps a single token to base-branch selection and multi-word text to instruction content, with `base -- instructions` available to pass both, while shared runtime paths stay mostly untouched. Continue monitoring for parsing edge cases (e.g., odd branch-name patterns) that could direct reviews to the wrong base or drop instruction text.
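The parsing rules in this entry can be sketched as below. This is one reading of the described behavior, written in Python for illustration; the names `LocalReviewArgs` and `parse_local_review_args` are hypothetical, and the handling of the `base -- instructions` form (a single-token base before the separator) is an assumption.

```python
from typing import NamedTuple, Optional


class LocalReviewArgs(NamedTuple):
    base: Optional[str]          # branch to compare against, if given
    instructions: Optional[str]  # free-form review guidance, if given


def parse_local_review_args(text: str) -> LocalReviewArgs:
    """Parse the optional trailing text of a /local-review call."""
    text = text.strip()
    if not text:
        return LocalReviewArgs(None, None)
    # Explicit form: "base -- instructions" (base must be a single token).
    head, separator, tail = text.partition(" -- ")
    if separator and len(head.split()) == 1:
        return LocalReviewArgs(head, tail.strip() or None)
    tokens = text.split()
    if len(tokens) == 1:
        # A single token is treated as the base branch.
        return LocalReviewArgs(tokens[0], None)
    # Anything longer is treated as review instructions.
    return LocalReviewArgs(None, text)
```

The single-token rule is where the edge cases noted above come from: a one-word instruction such as "security" would be read as a branch name under this scheme.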
May 20, 2026, 10:41 PM
pull request
Add role-based review setup schema as foundation for `entire review`
This PR lays the first-step foundation for role-based review configuration by adding `ReviewConfig.Role` and `fix_after_review` settings, plus a dedicated setup flow to collect and persist per-agent roles and review instructions.
Contribution: Adds review-configuration primitives that separate reviewer and fixer responsibilities in settings, with idempotent migration from the legacy field, a one-fixer validation rule at setup time, and a new interactive command to persist role maps.
Impact: Operators and teams using `entire review` can now define who reviews and who fixes before running review commands, reducing role ambiguity and downstream configuration drift. This is implemented through `Role`-based settings, migration helpers, and the new `entire review setup` flow. Continue watching downstream consumers of `.entire/settings.json` for assumptions about the removed legacy field, and whether the follow-up PR #2 cutover correctly preserves these assumptions after the legacy picker is removed.
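A one-fixer validation rule over a role map might look like the sketch below. The source does not show the settings schema, so the agent-to-role dict, the role strings, and the reading of the rule as "at most one fixer" are all assumptions.

```python
from typing import Dict

VALID_ROLES = {"reviewer", "fixer"}  # assumed role names


def validate_roles(roles: Dict[str, str]) -> None:
    """Raise ValueError if a role is unknown or more than one agent is a fixer."""
    unknown = sorted(agent for agent, role in roles.items() if role not in VALID_ROLES)
    if unknown:
        raise ValueError(f"unknown role for: {', '.join(unknown)}")
    fixers = sorted(agent for agent, role in roles.items() if role == "fixer")
    if len(fixers) > 1:
        raise ValueError(f"only one fixer is allowed, got: {', '.join(fixers)}")
```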
May 20, 2026, 9:13 PM
pull request
Open-SWE dashboard switched to sidebar settings flow with team-level review configuration
The pull request replaces the dashboard's header-based UI with an AppShell/AppSidebar layout and introduces a new team settings path for review controls, moving key reviewer behavior options into dedicated sections. Reviewer settings such as trigger mode, draft-PR review, PR summaries, and autofix behavior (including severity threshold) are now persisted through the new `/dashboard/api/team-settings` flow.
Contribution: Implemented a dashboard redesign plus a settings persistence layer: a new sidebar navigation flow (My Settings, Cloud Agents, Open SWE Review, Integrations), new optional profile/dashboard fields, and API-backed team settings stored in the LangGraph Store for defaults such as trigger mode, draft PR review, PR summaries, autofix mode, and autofix severity threshold.
Impact: Operators and maintainers configuring Open-SWE can now manage cloud-agent and review behavior from one consistent dashboard flow, which reduces UI fragmentation and setup mistakes before running PR reviews. The path still needs end-to-end production validation with real GitHub OAuth and a LangGraph backend to confirm there are no navigation or save regressions. The change also replaces a Vite-sensitive account dropdown implementation that could trigger client hook crashes, with the aim of keeping dashboard sessions stable during real OAuth and review workflows. Watch for failures in settings migration from prior routes, sidebar navigation regressions, and hidden admin-route discoverability.
May 20, 2026, 9:13 PM
commit burst
Open SWE reviewer precision overhaul removes confidence-based publish gating
The commit set’s main change is a reviewer quality overhaul that replaces confidence-score-based publishing gates with a stricter prompt discipline, aiming to reduce false positives by enforcing clearer severity/evidence rules and explicit exclusions for speculative or non-actionable findings.
Contribution: Redesigned the reviewer around precision-first prompt logic (severity ladder, mandatory evidence checks, explicit do-not-file list) and dropped confidence-based filtering paths such as CONFIDENCE_THRESHOLD, CONFIDENCE_ORDER, confidence_filtered mode, and informational severity handling.
Impact: Engineers using Open SWE reviews should get fewer noisy or speculative findings, so less time goes to triaging false-positive comments and more attention goes to likely real issues. The change removes confidence-score publication gates and relies on a stricter prompt rubric for defensibility at runtime, after an eval audit identified speculative findings and style noise as the dominant false-positive sources. Track whether this reduces noisy output without increasing missed bugs by comparing precision and recall across subsequent review eval runs, especially on production-like repositories.
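The precision and recall comparison suggested above uses the standard definitions. A minimal helper for computing both from labeled eval counts is sketched here; how an eval run labels findings as true or false positives is outside this sketch and would depend on the eval harness.

```python
from typing import Tuple


def precision_recall(true_pos: int, false_pos: int, false_neg: int) -> Tuple[float, float]:
    """Compute (precision, recall) for one review eval run.

    true_pos:  findings filed that were real issues
    false_pos: findings filed that were noise or speculative
    false_neg: real issues the reviewer did not file
    """
    filed = true_pos + false_pos
    real = true_pos + false_neg
    precision = true_pos / filed if filed else 0.0
    recall = true_pos / real if real else 0.0
    return precision, recall
```

A precision gain that comes with a recall drop between runs would indicate that the stricter rubric is suppressing real bugs along with the noise.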
May 20, 2026, 6:47 PM
pull request
Fix Review Style Analyze startup by using configured LangGraph endpoint
This PR fixes production failures in Open-SWE's Review Styles → Analyze flow where runs failed to start with `httpx.ConnectError: All connection attempts failed`. It removes a hardcoded `http://localhost:2024` endpoint in `review_style_jobs.py` and instead resolves the LangGraph client URL from `LANGGRAPH_URL`/`LANGGRAPH_URL_PROD` (same logic as `webapp.py`), with a fallback to `get_client()` when not set.
Contribution: Corrected the analyzer job's client endpoint selection so Review Styles → Analyze no longer targets a fixed local URL in deployments. It now follows the same environment-based LangGraph URL resolution as the web app, with an in-cluster fallback.
Impact: Developers and operators running PR review style checks can start analyze jobs on Open-SWE without silent startup failures, so automated review quality gates are less likely to be blocked by connection errors. The change replaces the fixed localhost endpoint with environment-aware URL resolution (`LANGGRAPH_URL`/`LANGGRAPH_URL_PROD`) and a `get_client()` fallback. Teams should verify each deployment’s LangGraph URL configuration and keep watching for `Failed to start review style analyzer` logs or analyzer runs that do not complete.
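Environment-based URL resolution of this kind can be sketched as follows. The helper name `resolve_langgraph_url` is hypothetical, and the precedence between `LANGGRAPH_URL` and `LANGGRAPH_URL_PROD` is an assumption; the source only says both variables are consulted.

```python
import os
from typing import Mapping, Optional


def resolve_langgraph_url(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the configured LangGraph URL, or None if neither variable is set.

    Checking LANGGRAPH_URL before LANGGRAPH_URL_PROD is an assumed order.
    """
    for key in ("LANGGRAPH_URL", "LANGGRAPH_URL_PROD"):
        value = env.get(key, "").strip()
        if value:
            return value.rstrip("/")
    # None tells the caller to use the default client instead of a fixed URL.
    return None
```

A caller would pass a non-None result as the client URL and otherwise call `get_client()` with no URL, which is the fallback the PR describes.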
May 20, 2026, 2:58 AM
pull request
Promote AI suggestion button to primary position in review comment threads
This PR changes the code-review comment-thread button layout so the AI suggestion action (`costrict.askReviewSuggestionWithAI`) is the primary action (`navigation@1`), replacing the prior accept action (`costrict.acceptIssue`) and removing the duplicated third-slot action entry.
Contribution: Introduced a user-visible UI change that makes AI review suggestions the first-class action in code-review comment threads, so reviewers can reach AI-assisted fixes directly without searching through lower-priority actions.
Impact: Reviewers using the Costrict VS Code extension will see the AI suggestion option as the top action in comment threads, reducing the steps needed to trigger AI help. The implementation sets `costrict.askReviewSuggestionWithAI` to the primary navigation slot and removes the duplicated third action. Monitor the rollout for missed or mis-clicked actions among teams with existing accept-button habits, and for workflow shortcuts that assumed the old accept placement.
May 20, 2026, 2:58 AM
commit burst
Fix commit message targeting for multi-root workspaces
The key change fixes commit message generation in multi-root workspaces by ensuring commit operations resolve the active repository at action time instead of using a globally cached first root, so messages are generated for the repo the developer is actually working in.
Contribution: Introduced per-action workspace-root resolution for commit message generation: removed the singleton CommitService root cache and passed the active SourceControl root URI as a workspaceRootHint, so commit workflows bind to the correct repository in multi-repo sessions.
Impact: Developers working in VS Code with multiple repositories open can now generate commit messages for the active project instead of accidentally targeting another repo, which reduces misfiled review metadata and broken workflow handoffs. The fix replaces a singleton cached root path in the commit service with active root hints from SourceControl, addressing cross-repo ambiguity in multi-root environments. Teams should still watch mixed-repo edge cases where root detection might resolve incorrectly after rapid workspace switching.
May 20, 2026, 12:00 AM
product update
Ramp adopts Codex with GPT-5.5 for faster code review
Ramp integrated Codex powered by GPT-5.5 into its engineering code-review workflow, allowing reviewers to receive substantive feedback in minutes instead of hours before shipping improvements.
Contribution: Added an AI-assisted code-review capability to the engineering workflow by combining Codex with GPT-5.5 to generate substantive review feedback during development.
Impact: Ramp developers can get meaningful review feedback in minutes instead of hours, so teams can fix issues and release improvements faster. This appears to be driven by an AI-assisted review pass using Codex and GPT-5.5. The team should watch for suggestion accuracy drift, missed risks in sensitive code paths, and whether human review rigor declines as trust in the assistant grows.
May 19, 2026, 2:01 PM
release
Agent Manager supports speech-to-text for inline review comments
v7.3.1 introduces voice input in Agent Manager inline review comments, letting reviewers submit feedback by speaking instead of typing.
Contribution: Added a voice-to-text capture and send path in the Agent Manager inline review composer, so dictated comments can be submitted directly in the review UI.
Impact: Developers doing code review can now dictate inline comments, which can speed up feedback cycles and reduce typing effort. Teams should confirm that dictation is reliable enough for day-to-day review before depending on it. The change extends the inline review input pipeline in Agent Manager to accept recorded audio and convert it into comment text at send time; watch transcription accuracy on noisy calls, microphone permission edge cases, and how failures are surfaced when voice capture is unavailable.
May 18, 2026, 10:04 PM
pull request
Add AIDLC Code Reviewer CLI for unified technical and business-logic reviews
Introduces a new `scripts/aidlc-codereview` package with the `aidlc-code-reviewer` CLI, enabling one-command review of AI-DLC code through built-in static tools plus Bedrock-powered business-logic analysis, and generating linked HTML/Markdown reports.
Contribution: Added a new CLI-driven review workflow in AIDLC (`aidlc-code-reviewer`) that centralizes execution of multiple analyzers and AI agents (critical code, structural quality, and business logic), outputs both technical and business reports, and supports automatic Bedrock-backed wrapper generation for tools not pre-integrated.
Impact: Developers using AIDLC can now run one command to get consolidated quality and business-logic findings, which can reduce manual review effort and catch logic or security mistakes before code reaches later stages. The implementation relies on Bedrock invocation, so operator risk now includes credentials and permissions (`bedrock:InvokeModel`), output stability of auto-generated wrappers for new tools, and whether severity policy consistently limits HIGH/CRITICAL to security findings. Monitor review coverage and false-positive/false-negative patterns as model output changes and new tools are added.
May 18, 2026, 12:00 PM
release
entire review gains live JSONL agent-event streaming
`entire review` in v0.6.2 now exposes live multi-agent progress and failure signals through JSONL output, making review execution more observable instead of only showing results at the end of a run.
Contribution: Added real-time JSONL event streaming for `entire review` and updated review execution feedback so progress and failure outcomes are surfaced continuously during a run.
Impact: Developers and automation operators using `entire review` can monitor long review sessions as they execute, so failures and stalls are visible earlier and teams can intervene before wasting a full, unproductive cycle. This reduces uncertainty in CI workflows and lowers the cost of debugging long checks. Teams should watch for log parser compatibility, increased event volume in downstream tooling, and whether the expanded default scope (including uncommitted changes) changes review signal-to-noise.
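A downstream consumer of a JSONL event stream can be sketched as below. The event schema of `entire review` is not given in the source, so the `type` and `agent` fields and the `agent_failed` value are hypothetical; the tolerant line-by-line parsing is the part that carries over.

```python
import json
from typing import Iterable, Iterator, List, Optional


def read_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per valid JSONL line, skipping blanks and malformed lines."""
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            # Tolerate non-JSON output interleaved with the stream.
            continue
        if isinstance(event, dict):
            yield event


def failed_agents(events: Iterable[dict]) -> List[Optional[str]]:
    """Collect agent names from failure events (field names are hypothetical)."""
    return [e.get("agent") for e in events if e.get("type") == "agent_failed"]
```

Because `read_events` is a generator, it can be fed a live stream such as a subprocess's stdout and will surface each event as soon as its line arrives. Skipping malformed lines rather than raising is one way to handle the parser-compatibility risk noted above.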

Evidence Trail

github_commit_burst
dotnet/skills commit burst: 3 commits in 7 days
AI Code Review has source-backed evidence attached to the latest tracked change.
Open Source
github_pull_request
Kilo-Org/kilocode PR #10502: feat: track accepted review suggestions
The app now records a telemetry event only after a user accepts a review suggestion; dismissed, invalid, and non-review suggestions are no longer counted, and accepted-suggestion follow-up model responses are tagged as coming from the suggest tool with known review command names.
Open Source
github_release
v1.35.0
Hooks system for extensible pre/post tool execution - PreToolUse denial hook support
Open Source
github_commit_burst
langchain-ai/open-swe commit burst: 4 commits in 7 days
chore: bump deepagents to 0.6.3 (#1322) Pick up the patch release that restores message ID assignment for incoming messages without explicit IDs after the 0.6.2 regression.
Open Source

Source Coverage

github pull request: 9 events · 9 evidence items; 2 days ago
github commit burst: 4 events · 4 evidence items; 2 days ago
github release: 3 events · 3 evidence items; 2 days ago
rss feed: 1 event · 1 evidence item; 4 days ago
