ci: implement multi-model evaluation support #169

Merged
omkargaikwad23 merged 8 commits into main from claude-ci-evals
May 8, 2026

@omkargaikwad23
Contributor

@omkargaikwad23 omkargaikwad23 commented May 7, 2026

Description:

Updated the CI pipeline to support evaluating multiple models dynamically.

Key changes include:

  • Added specific evaluation configurations for Claude (model, dataset, and run configs)

  • Renamed existing configuration files to be explicitly Gemini-specific

  • Updated cloudbuild.yaml and substitute_env.py to automatically process and launch evaluations for all run configurations present in the workspace
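
For readers who want a feel for the "process all run configurations present in the workspace" step, here is a minimal sketch of how such discovery and substitution could look. It is not the actual substitute_env.py from this PR: the `*_run_config.yaml` naming pattern, the `${VAR}` placeholder syntax, and the function names `discover_run_configs`, `substitute_env`, and `render_all` are all assumptions made for illustration.

```python
from pathlib import Path
from string import Template


def discover_run_configs(root: Path, pattern: str = "*_run_config.yaml") -> list[Path]:
    """Find every run config under root (hypothetical naming pattern).

    Sorting keeps the launch order stable from one CI run to the next.
    """
    return sorted(root.rglob(pattern))


def substitute_env(text: str, env: dict[str, str]) -> str:
    """Fill ${VAR} placeholders from env; unknown placeholders are left as-is."""
    return Template(text).safe_substitute(env)


def render_all(root: Path, env: dict[str, str]) -> dict[str, str]:
    """Map each discovered config's file name to its rendered contents."""
    return {
        path.name: substitute_env(path.read_text(), env)
        for path in discover_run_configs(root)
    }
```

With a layout like this, adding a run config for another model only requires dropping a new file that matches the pattern into the workspace; the pipeline step that iterates over `render_all(...)` would pick it up without further edits.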

…configs with dynamic discovery and adding Claude/Gemini-specific configurations.
@omkargaikwad23 omkargaikwad23 requested a review from a team as a code owner May 7, 2026 06:02
@omkargaikwad23 omkargaikwad23 added the ci:run-evals Manually trigger the evaluation CI pipeline on a PR. label May 7, 2026
@omkargaikwad23 omkargaikwad23 requested a review from a team as a code owner May 7, 2026 06:02
Comment thread evals/claude_code_model.yaml Outdated
@prernakakkar-google
Contributor

Add detailed description before merging

@omkargaikwad23 omkargaikwad23 merged commit 24b2db3 into main May 8, 2026
11 checks passed
@omkargaikwad23 omkargaikwad23 deleted the claude-ci-evals branch May 8, 2026 07:10
