What Happened
- The PR adds support for running evals directly against a deployed agent, so model quality checks can target the exact service endpoint that users interact with.
- 1 evidence item attached for review.