3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -4,10 +4,11 @@ All notable changes to `@tangle-network/agent-eval` and its sibling `agent-eval-

---

-## [Unreleased] — eval fixture UX + manifest-correct campaign resume
+## [0.100.3] — 2026-07-01 — product benchmark contract + eval fixture UX

### Added

+- Published the `@tangle-network/agent-eval/product-benchmark` subpath so product agents can share one strict product-benchmark manifest, record, artifact, and integrity validator instead of copying Agent Lab or product-local schema code.
- **Vercel-style eval fixture loading in `/campaign`.** `discoverEvalFixtures`, `loadEvalFixture`, `loadEvalFixtureScenarios`, and `planEvalFixtureRun` let agents use the simple `evals/<name>/PROMPT.md + EVAL.ts + package.json` shape while still executing through the existing `runCampaign` primitive.
- **Dry-run planning for campaigns.** `planCampaignRun` reports `totalCells`, `cellsCached`, `cellsToRun`, per-cell cache paths, and miss reasons before any agent work starts. This is the cheap proof before spending tokens.
- **`dispatchRef` on `runCampaign`.** Callers can include the model/tool/prompt/runtime identity in the manifest when the same dispatch function name can run different behavior.
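
As a rough illustration of the dry-run planning and `dispatchRef` entries above, the following self-contained TypeScript sketch shows how a plan might count cached and uncached cells before any agent work starts. It is not the `planCampaignRun` implementation or signature: `planDryRun`, `GridCell`, `cachePathFor`, the `.cache/...` layout, and the miss-reason strings are all hypothetical names chosen for this example.

```typescript
// Hypothetical sketch: none of these names are the agent-eval API.
type GridCell = { scenario: string; candidate: string };

// Assumed shape of what a previous run stored for a cell.
type CachedManifest = { dispatchRef: string };

type PlannedCell = GridCell & {
  cachePath: string;
  cached: boolean;
  missReason?: "no-cache-entry" | "dispatch-ref-changed";
};

type DryRunPlan = {
  totalCells: number;
  cellsCached: number;
  cellsToRun: number;
  cells: PlannedCell[];
};

// Assumed cache layout: one manifest file per scenario/candidate pair.
function cachePathFor(cell: GridCell): string {
  return `.cache/${cell.scenario}/${cell.candidate}.json`;
}

function planDryRun(
  scenarios: string[],
  candidates: string[],
  dispatchRef: string,
  resumeCache: Map<string, CachedManifest>,
): DryRunPlan {
  const cells: PlannedCell[] = [];
  for (const scenario of scenarios) {
    for (const candidate of candidates) {
      const cell: GridCell = { scenario, candidate };
      const cachePath = cachePathFor(cell);
      const hit = resumeCache.get(cachePath);
      if (hit === undefined) {
        // Nothing stored for this cell yet.
        cells.push({ ...cell, cachePath, cached: false, missReason: "no-cache-entry" });
      } else if (hit.dispatchRef !== dispatchRef) {
        // Stored result came from a different model/tool/prompt identity.
        cells.push({ ...cell, cachePath, cached: false, missReason: "dispatch-ref-changed" });
      } else {
        cells.push({ ...cell, cachePath, cached: true });
      }
    }
  }
  const cellsCached = cells.filter((c) => c.cached).length;
  return {
    totalCells: cells.length,
    cellsCached,
    cellsToRun: cells.length - cellsCached,
    cells,
  };
}

// Sample data: two cached cells, one of them recorded under an older dispatchRef.
const resumeCache = new Map<string, CachedManifest>([
  [".cache/login/agent-a.json", { dispatchRef: "model-x@prompt-v2" }],
  [".cache/login/agent-b.json", { dispatchRef: "model-x@prompt-v1" }],
]);

const dryRun = planDryRun(
  ["login", "checkout"],
  ["agent-a", "agent-b"],
  "model-x@prompt-v2",
  resumeCache,
);

console.log(dryRun.totalCells, dryRun.cellsCached, dryRun.cellsToRun);
```

With the sample cache, the plan reports 4 total cells, 1 cached, and 3 to run. The `login/agent-b` cell counts as a miss only because its stored `dispatchRef` differs, which is the kind of case where recording the dispatch identity in the manifest would matter.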
2 changes: 1 addition & 1 deletion clients/python/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"

[project]
name = "agent-eval-rpc"
-version = "0.100.2"
+version = "0.100.3"
description = "Python RPC client for @tangle-network/agent-eval — judge content against rubrics over HTTP or stdio RPC. Eval logic runs in the Node runtime; this package is a thin wire client."
readme = "README.md"
requires-python = ">=3.10"
2 changes: 1 addition & 1 deletion clients/python/src/agent_eval_rpc/__init__.py
@@ -58,7 +58,7 @@
try:
    __version__ = version("agent-eval-rpc")
except PackageNotFoundError:
-    __version__ = "0.100.2"
+    __version__ = "0.100.3"

__all__ = [
    "Client",
2 changes: 1 addition & 1 deletion package.json
@@ -1,6 +1,6 @@
{
  "name": "@tangle-network/agent-eval",
-  "version": "0.100.2",
+  "version": "0.100.3",
  "description": "Evaluate and improve AI agents from runs, traces, judges, and feedback. Compare candidates, cluster failures, measure lift, and gate releases.",
  "homepage": "https://github.com/tangle-network/agent-eval#readme",
  "repository": {