Code / Tracked since May 22, 2026

Keep provider raw responses when using StructuredLLM structured chat

This pull request fixes structured-output chat in run-llama/llama_index by preserving provider-native raw responses and metadata for `StructuredLLM.chat()` and `StructuredLLM.achat()`. Previously these paths rebuilt `ChatResponse` from only the parsed Pydantic payload, which dropped details like token usage and the original provider response object; the update adds an internal `StructuredPredictionResult` wrapper so `structured_predict()` and `astructured_predict()` keep returning parsed models while raw details are retained in the response path for OpenAI chat-completions and responses.parse structured outputs.

`StructuredLLM` / `chat()` / `achat()` / `structured_predict()`

What Happened

This pull request fixes structured-output chat in run-llama/llama_index by preserving provider-native raw responses and metadata for `StructuredLLM.chat()` and `StructuredLLM.achat()`. Previously, these paths rebuilt `ChatResponse` from only the parsed Pydantic payload, which dropped details such as token usage and the original provider response object. The update adds an internal `StructuredPredictionResult` wrapper: `structured_predict()` and `astructured_predict()` keep returning parsed models, while the raw details are retained in the response path for OpenAI chat-completions and `responses.parse` structured outputs.
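
The difference between the old and new chat paths can be illustrated with a dependency-free sketch. Everything here is a stand-in: `ChatResponseSketch` is not the library's `ChatResponse`, and the fields shown on `StructuredPredictionResult` (`value`, `raw`, `additional_kwargs`) are assumptions about its shape, not the class as merged in the PR.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Invoice:
    """Stand-in for the parsed Pydantic model (a dataclass keeps this dependency-free)."""
    total: float


@dataclass
class ChatResponseSketch:
    """Hypothetical stand-in for ChatResponse; only the fields needed here."""
    content: str
    raw: Optional[Any] = None
    additional_kwargs: Dict[str, Any] = field(default_factory=dict)


@dataclass
class StructuredPredictionResult:
    """Assumed shape: the parsed value plus whatever the provider returned."""
    value: Invoice
    raw: Optional[Any] = None
    additional_kwargs: Dict[str, Any] = field(default_factory=dict)


def chat_before(parsed: Invoice) -> ChatResponseSketch:
    # Old behavior as described: only the parsed payload survives.
    return ChatResponseSketch(content=repr(parsed))


def chat_after(result: StructuredPredictionResult) -> ChatResponseSketch:
    # New behavior as described: raw response and metadata ride along.
    return ChatResponseSketch(
        content=repr(result.value),
        raw=result.raw,
        additional_kwargs=dict(result.additional_kwargs),
    )


# Made-up provider payload, for illustration only.
provider_raw = {"id": "resp-1", "usage": {"total_tokens": 42}}
result = StructuredPredictionResult(
    Invoice(total=9.5), raw=provider_raw, additional_kwargs={"total_tokens": 42}
)
before = chat_before(result.value)
after = chat_after(result)
```

In this sketch `before.raw` is `None` while `after.raw` is the provider payload. The parsed content is identical in both, which is the point of the fix: nothing changes for the parsed output, only the metadata stops being discarded.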
1 evidence item attached for review.

What is Different

Before

`StructuredLLM.chat()` and `StructuredLLM.achat()` rebuilt `ChatResponse` from only the parsed Pydantic payload, dropping token usage and the original provider response object.

Now

Added an internal response wrapper in structured LLM chat paths that preserves provider-native raw response objects and metadata without changing the external parsed-model return contract for `structured_predict`/`astructured_predict`.
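
One way to picture "internal wrapper, unchanged external contract" is the sketch below. `SketchLLM` and the helper `_structured_predict_with_raw` are hypothetical names chosen for illustration, and the wrapper's fields are assumed. The actual PR may organize this differently.

```python
from dataclasses import dataclass
from typing import Any, Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass
class StructuredPredictionResult(Generic[T]):
    """Assumed wrapper shape: parsed value plus the provider's raw payload."""
    value: T
    raw: Optional[Any] = None


@dataclass
class Person:
    """Stand-in for a parsed output model."""
    name: str


class SketchLLM:
    def _structured_predict_with_raw(
        self, prompt: str
    ) -> StructuredPredictionResult[Person]:
        # Hypothetical internal helper: pretend the provider returned this payload.
        raw = {"model": "example-model", "usage": {"total_tokens": 7}}
        return StructuredPredictionResult(value=Person(name=prompt.strip()), raw=raw)

    def structured_predict(self, prompt: str) -> Person:
        # Public contract unchanged: callers still get the parsed model only.
        return self._structured_predict_with_raw(prompt).value

    def chat(self, prompt: str) -> dict:
        # Chat path keeps the raw payload next to the parsed value.
        result = self._structured_predict_with_raw(prompt)
        return {"parsed": result.value, "raw": result.raw}


llm = SketchLLM()
person = llm.structured_predict(" Ada ")
chat_out = llm.chat("Ada")
```

Because the unwrapping happens inside `structured_predict`, existing callers that expect a bare model keep working, and only the chat path exposes the extra data.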

Why Track This

Why It Matters

Developers using `StructuredLLM` with OpenAI structured chat no longer lose response metadata on successful calls, so token-usage tracking, billing checks, and debug logs keep the context they need. The fix replaces the prior `ChatResponse` reconstruction from parsed output alone with a `StructuredPredictionResult` flow across the sync and async structured chat and structured prediction paths. Teams should next watch for downstream consumers that rely on the old stripped-response shape, and for extension to other providers where similar metadata could still be dropped.
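
For token-usage tracking, a consumer might read usage defensively from whatever the response exposes as its raw payload. The helper below is a sketch and not part of llama_index: `extract_token_usage` is a hypothetical name, and it assumes the raw object carries a `usage` member using either the Chat Completions field names (`prompt_tokens`, `completion_tokens`) or the Responses API names (`input_tokens`, `output_tokens`).

```python
from types import SimpleNamespace
from typing import Any, Dict, Optional


def _get(obj: Any, key: str) -> Any:
    """Read `key` from a dict or an attribute-style object; None if absent."""
    if isinstance(obj, dict):
        return obj.get(key)
    return getattr(obj, key, None)


def extract_token_usage(raw: Any) -> Optional[Dict[str, int]]:
    """Return prompt/completion/total counts, or None when raw carries no usage."""
    usage = _get(raw, "usage") if raw is not None else None
    if usage is None:
        return None
    # Chat Completions names the counts prompt_/completion_tokens;
    # the Responses API names them input_/output_tokens.
    prompt = _get(usage, "prompt_tokens")
    if prompt is None:
        prompt = _get(usage, "input_tokens")
    completion = _get(usage, "completion_tokens")
    if completion is None:
        completion = _get(usage, "output_tokens")
    if prompt is None or completion is None:
        return None
    total = _get(usage, "total_tokens")
    return {
        "prompt": prompt,
        "completion": completion,
        "total": total if total is not None else prompt + completion,
    }


# Stand-in payloads, for illustration only.
chat_raw = SimpleNamespace(
    usage=SimpleNamespace(prompt_tokens=10, completion_tokens=5, total_tokens=15)
)
responses_raw = {"usage": {"input_tokens": 3, "output_tokens": 4}}
chat_usage = extract_token_usage(chat_raw)
responses_usage = extract_token_usage(responses_raw)
stripped_usage = extract_token_usage(None)
```

Returning `None` instead of raising keeps the helper usable against responses from code paths or providers that still omit the raw payload.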

Impact

What To Watch Next

Watch whether StructuredLLM becomes a repeated pattern.
Track follow-up changes around Structured Outputs.
Compare future signals against this evidence trail.
Re-check risk flags: consumer_code_relying_on_missing_raw_fields, non_openai_providers_still_drop_structured_metadata.
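
The second risk flag could be re-checked with a small audit that calls each provider's structured chat path and reports which ones return no raw payload. This is a sketch under assumptions: the provider callables are stubs, `providers_dropping_raw` is a hypothetical helper, and the only property checked is that the response exposes a non-`None` `raw` attribute.

```python
from types import SimpleNamespace
from typing import Any, Callable, Dict, List


def providers_dropping_raw(providers: Dict[str, Callable[[], Any]]) -> List[str]:
    """Return names of providers whose structured chat response has no raw payload."""
    dropped = []
    for name, call in providers.items():
        response = call()
        # A missing attribute and an explicit None are treated the same way.
        if getattr(response, "raw", None) is None:
            dropped.append(name)
    return sorted(dropped)


# Stub callables standing in for real structured chat calls.
stub_providers = {
    "keeps_raw": lambda: SimpleNamespace(raw={"usage": {"total_tokens": 3}}),
    "drops_raw": lambda: SimpleNamespace(raw=None),
    "no_raw_attr": lambda: SimpleNamespace(),
}
dropped = providers_dropping_raw(stub_providers)
```

In a real check, each callable would wrap an actual structured chat call against one provider integration.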

Risk flags: consumer_code_relying_on_missing_raw_fields / non_openai_providers_still_drop_structured_metadata / downstream_metrics_pipeline_shape_validation

Supporting Evidence

GITHUB PULL REQUEST / High Trust

run-llama/llama_index PR #21754: Preserve raw provider responses in StructuredLLM chat

Fixes #19845 by preventing raw provider metadata loss in structured LLM chat paths and retaining it via an internal `StructuredPredictionResult` wrapper.
