Code / Tracked since May 22, 2026

Default Bedrock maxTokens to model limits to prevent silent 4k truncation

The change updates Bedrock requests so that when callers do not supply `maxTokens`, the provider sends `inferenceConfig.maxTokens` from `model.maxTokens` instead of relying on the provider default, removing a silent 4096-token output cap that triggered `stopReason: "length"` on long Anthropic Claude generations.

Amazon Bedrock / inferenceConfig.maxTokens / model.maxTokens / Anthropic Claude Opus 4.7

What Happened

The change updates Bedrock requests so that when callers do not supply `maxTokens`, the provider sends `inferenceConfig.maxTokens` from `model.maxTokens` instead of relying on the provider default, removing a silent 4096-token output cap that triggered `stopReason: "length"` on long Anthropic Claude generations.
1 evidence item attached for review.

What is Different

Before

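When callers omitted `maxTokens`, the request relied on the Bedrock default, which silently capped output at 4096 tokens and ended long Anthropic Claude generations with `stopReason: "length"`.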

Now

Added a concrete fallback in the Bedrock provider request builder to populate `inferenceConfig.maxTokens` from `model.maxTokens` when user options omit `maxTokens`, then validated the fix with a real end-to-end test that reproduces and confirms removal of the length-stop truncation.

Why Track This

Why It Matters

Developers and operators using this SDK to call Bedrock no longer have long responses cut off mid-task at about 4096 tokens when they omit `maxTokens`, so multi-thousand-token outputs (for coding, writing, and other long-form tasks) can complete more reliably. Previously, a missing `maxTokens` let Bedrock apply a server-side default cap, which ended generation with `stopReason: "length"`. The fix applies the model’s declared token limit by default. Teams should watch for growth in output cost and latency on long prompts, and verify that new or updated models still expose correct `maxTokens` values.
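Because the fallback depends on each model declaring a usable `maxTokens`, a small registry check can catch entries that would still fall through to the provider default. The sketch below is illustrative only: `ModelEntry`, `findModelsWithoutLimit`, and the sample entries are hypothetical names and values, not part of the SDK described here.

```typescript
// Hypothetical shape of a model registry entry; the real SDK type is not shown in the source.
interface ModelEntry {
  id: string;
  maxTokens?: number; // declared output limit; missing or 0 is treated as "unknown"
}

// Returns the ids of models whose declared limit is missing, zero, negative, or not finite.
// For these models the fallback would send no maxTokens and the provider default would apply.
function findModelsWithoutLimit(models: ModelEntry[]): string[] {
  return models
    .filter((m) => !(typeof m.maxTokens === "number" && Number.isFinite(m.maxTokens) && m.maxTokens > 0))
    .map((m) => m.id);
}

// Illustrative entries only; the limits below are placeholders, not real model limits.
const sampleModels: ModelEntry[] = [
  { id: "example-model-a", maxTokens: 32000 },
  { id: "example-model-b", maxTokens: 0 },
  { id: "example-model-c" },
];

const flagged = findModelsWithoutLimit(sampleModels);
console.log(flagged); // ids that need a declared maxTokens
```

A check like this could run in CI whenever the model list changes, so a new provider model without a declared limit is noticed before it reaches users.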

What To Watch Next

Watch whether Amazon Bedrock becomes a repeated pattern.
Track follow-up changes around AI Coding Agents.
Compare future signals against this evidence trail.
Re-check risk flags: watch_output_cost_growth_for_long_prompts, verify_model_maxTokens_values_for_new_provider_models.

Risk flags: watch_output_cost_growth_for_long_prompts / verify_model_maxTokens_values_for_new_provider_models / monitor_stopReason_length_regressions

Supporting Evidence

GITHUB PULL REQUEST / High Trust

earendil-works/pi PR #4871: fix(ai): default Bedrock `inferenceConfig.maxTokens` to `model.maxTokens`

An e2e Bedrock test for `global.anthropic.claude-opus-4-7` failed with `stopReason="length"` and 4096-token output before the fix, then passed after adding `maxTokens: options.maxTokens ?? (model.maxTokens || undefined)` in `streamBedrock`, with completion ending naturally and output exceeding 4096 tokens.
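The quoted expression can be read in isolation to see which value wins in each case. The following is a minimal sketch, assuming simplified shapes: `ModelInfo`, `CallOptions`, `resolveMaxTokens`, and `buildInferenceConfig` are hypothetical names for illustration, and only the expression itself and the `inferenceConfig.maxTokens` field name come from the PR summary.

```typescript
// Hypothetical minimal shapes; the real types used by streamBedrock are not shown in the source.
interface ModelInfo {
  id: string;
  maxTokens: number; // declared output limit; 0 is assumed to mean "not declared"
}

interface CallOptions {
  maxTokens?: number;
}

// Same expression as quoted from the PR:
//   options.maxTokens ?? (model.maxTokens || undefined)
// `??` keeps any caller value that is not null/undefined;
// `||` turns a falsy model limit (such as 0) into undefined.
function resolveMaxTokens(model: ModelInfo, options: CallOptions): number | undefined {
  return options.maxTokens ?? (model.maxTokens || undefined);
}

// Hypothetical request fragment: the field is omitted entirely when no limit is known,
// in which case the provider default would still apply.
function buildInferenceConfig(model: ModelInfo, options: CallOptions): { maxTokens?: number } {
  const maxTokens = resolveMaxTokens(model, options);
  return maxTokens === undefined ? {} : { maxTokens };
}

// The limit below is a placeholder, not the real limit of any model.
const exampleModel: ModelInfo = { id: "example-model", maxTokens: 32000 };

console.log(buildInferenceConfig(exampleModel, {}));                  // uses the model limit
console.log(buildInferenceConfig(exampleModel, { maxTokens: 1024 })); // caller value wins
console.log(buildInferenceConfig({ id: "no-limit", maxTokens: 0 }, {})); // field omitted
```

One consequence of using `??` rather than `||` on the caller side is that an explicit `maxTokens: 0` from the caller is passed through unchanged instead of being replaced by the model limit.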