What Happened
- The PR converts Databricks AI skill ground-truth definitions from MCP-prescriptive response checks to tool-agnostic natural-language checks, preserving metadata and restructuring cases so the same benchmarks can compare MCP-enabled and CLI-only workflows consistently in A/B evals.
- The PR converts Databricks AI skill ground-truth definitions from MCP-prescriptive response checks to tool-agnostic natural-language checks, preserving metadata and restructuring cases so the same benchmarks can compare MCP-enabled and CLI-only workflows consistently in A/B evals.
- 1 evidence item attached for review.