chore(release): agent-eval 0.100.3 #295
tangletools
left a comment
✅ Auto-approved drewstone PR — 4b2fd5a9
This PR was opened by the trusted drewstone account.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.
tangletools · auto-approval · reason: drewstone_author · 2026-07-01T15:09:51Z
tangletools
left a comment
🔴 Value Audit — redundant-or-flawed
| Verdict | redundant-or-flawed |
| Concerns | 2 (2 strong-concern) |
| Heuristic | 0.0s |
| Duplication | 0.0s |
| Interrogation | 244.0s (2 bridge agents) |
| Total | 244.0s |
💰 Value — redundant-or-flawed
Bumps package versions to 0.100.3 consistently, but mishandles CHANGELOG.md by replacing the [Unreleased] section instead of moving its entries, causing a merge conflict with origin/main and dropping release notes for changes that will ship in the same release.
- What it does: Bumps `@tangle-network/agent-eval` and `agent-eval-rpc` from 0.100.2 to 0.100.3 in `package.json:3`, `clients/python/pyproject.toml:7`, and the `PackageNotFoundError` fallback in `clients/python/src/agent_eval_rpc/__init__.py:61`; adds a `[0.100.3]` entry to `CHANGELOG.md:7` for the product-benchmark subpath.
- Goals it achieves: Publishes the already-merged `product-benchmark` subpath (`package.json:117-120`, `tsup.config.ts:26`) under a new tag and keeps the npm and PyPI package versions locked, correcting the 0.100.2 release where `__init__.py` was left behind and needed a follow-up fix (`1aaa5f6`).
- Assessment: The version bumps are correct and complete (all three version locations are now in sync). However, the `CHANGELOG.md` edit replaces the `[Unreleased]` section that existed on `origin/main` rather than moving those entries into `[0.100.3]`. As a result, release notes for the eval-fixture/campaign changes (`47d2a4e`) are lost even though that code is on main and will ship with 0.100.3. Additionally, …
- Better / existing approach: Preserve the existing `[Unreleased]` entries by moving them under the new `[0.100.3]` section alongside the product-benchmark note, then resolve the merge with `origin/main`. This follows the Keep a Changelog convention already used in the file and ensures the release notes cover everything in the tag.
- Model: opencode/kimi-for-coding/k2p7
- Bridge attempts: 1
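The "move, don't replace" approach suggested above can be sketched roughly as follows. This is a minimal illustration, not the repository's release tooling: `promote_unreleased` is a hypothetical helper, and it assumes the Keep a Changelog layout of `## [Unreleased]` followed by `## [x.y.z] - YYYY-MM-DD` headings.

```python
import re

# Assumption: headings follow Keep a Changelog, e.g. "## [Unreleased]".
UNRELEASED = re.compile(r"^## \[Unreleased\][^\n]*\n?", re.MULTILINE)
NEXT_SECTION = re.compile(r"^## \[", re.MULTILINE)


def promote_unreleased(changelog, version, date, extra_notes=()):
    """Move the [Unreleased] body under a new release heading.

    Existing entries are kept, `extra_notes` are appended as bullets,
    and an empty [Unreleased] section is left at the top.
    """
    match = UNRELEASED.search(changelog)
    if match is None:
        raise ValueError("no [Unreleased] section found")

    # The body runs from the end of the heading to the next "## [" heading.
    following = NEXT_SECTION.search(changelog, match.end())
    body_end = following.start() if following else len(changelog)
    body = changelog[match.end():body_end].strip("\n")

    lines = [body] if body else []
    lines.extend(f"- {note}" for note in extra_notes)

    release = f"## [{version}] - {date}\n\n" + "\n".join(lines) + "\n\n"
    return (
        changelog[:match.start()]
        + "## [Unreleased]\n\n"
        + release
        + changelog[body_end:]
    )
```

Running it on the merged `origin/main` changelog, with the product-benchmark note passed as `extra_notes`, would keep the eval-fixture/campaign entries in the `[0.100.3]` section instead of dropping them.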
🎯 Usefulness — sound
A clean release-mechanics bump to 0.100.3 that publishes the already-merged product-benchmark subpath; versions are in sync across all three files and the new module is correctly wired into exports, build, and the verify script.
- Integration: Fully reachable. The product-benchmark subpath being published is wired into package.json exports (line 117), tsup build config (tsup.config.ts:26), and the verify-package-exports guard script (lines 37, 112). All three version literals (package.json:3, pyproject.toml:7, agent_eval_rpc/`__init__.py`:61) are synchronized at 0.100.3. Imminent caller per CHANGELOG: product agents consuming @tangle-netw
- Fit with existing patterns: Follows the established release pattern exactly — identical shape to the 0.100.2 (3ba4252) and 0.100.1 (223b7f8) releases, and the subpath-export + re-export-from-src/index.ts + verify-package-exports triplet is the same pattern used by ./perf, ./multishot, ./campaign, and every other substrate subpath. No competing or duplicate surface.
- Real-world viability: Version-literal bump plus CHANGELOG entry; no runtime logic changed. The `__init__.py` fallback literal at line 61 only fires when package metadata is absent (dev/egg install), which is the intended robustness behavior established by fix #294. No concurrency, error-path, or edge-input surface to evaluate.
- Model: opencode/zai-coding-plan/glm-5.2
- Bridge attempts: 1
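For readers unfamiliar with the fallback pattern mentioned above, here is a rough sketch of how a `PackageNotFoundError` fallback typically looks, plus a small drift check across version literals. The actual code at `__init__.py:61` is not shown in this PR conversation, so `resolve_version`, `version_drift`, and `FALLBACK_VERSION` are hypothetical names; only `importlib.metadata.version` and `PackageNotFoundError` are standard-library APIs.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: stands in for the hard-coded literal the review refers to;
# a literal like this has to be bumped by hand on every release.
FALLBACK_VERSION = "0.100.3"


def resolve_version(dist_name: str, fallback: str = FALLBACK_VERSION) -> str:
    """Return the installed distribution's version, or the fallback literal."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # Package metadata is absent, e.g. when running from a source checkout.
        return fallback


def version_drift(literals: dict[str, str], expected: str) -> dict[str, str]:
    """Return the locations whose version literal differs from `expected`."""
    return {where: found for where, found in literals.items() if found != expected}
```

A check like `version_drift` over the three locations named in the review would have flagged the 0.100.2 release, where the `__init__.py` literal was left behind.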
💰 Value Audit
🔴 Branch has a CHANGELOG.md merge conflict with origin/main [against-grain]
`git merge-tree --write-tree origin/main HEAD` exits 1 with `CONFLICT (content): Merge conflict in CHANGELOG.md`. The release commit cannot land cleanly as-is. Evidence: merge-tree output shows three CHANGELOG.md versions (base, main, HEAD) and auto-merge fails.
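The mergeability check cited in this finding could be scripted along these lines. This is a sketch under assumptions: `conflicted_paths` and `check_mergeable` are hypothetical helpers, the parser only recognizes the `Merge conflict in <path>` wording quoted above (other conflict types are worded differently), and `git merge-tree --write-tree` requires a reasonably recent Git.

```python
import subprocess


def conflicted_paths(merge_tree_output: str) -> list[str]:
    """Extract paths from 'CONFLICT (...): Merge conflict in <path>' lines."""
    marker = "Merge conflict in "
    paths = []
    for line in merge_tree_output.splitlines():
        if line.startswith("CONFLICT") and marker in line:
            paths.append(line.split(marker, 1)[1].strip())
    return paths


def check_mergeable(base: str = "origin/main", head: str = "HEAD") -> list[str]:
    """Return conflicted paths for a trial merge; an empty list means clean."""
    result = subprocess.run(
        ["git", "merge-tree", "--write-tree", base, head],
        capture_output=True,
        text=True,
    )
    if result.returncode == 0:
        return []  # merge would succeed without conflicts
    if result.returncode == 1:
        return conflicted_paths(result.stdout)  # merge has conflicts
    # Any other exit status means the merge could not be attempted at all.
    raise RuntimeError(result.stderr.strip())
```

Because `merge-tree` does not touch the working tree or index, a check like this can run before a release commit is pushed.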
🔴 CHANGELOG replaces [Unreleased] instead of moving its entries into [0.100.3] [against-grain]
`origin/main` has an `[Unreleased]` section documenting eval-fixture UX, `planCampaignRun`, `dispatchRef`, and a `manifestHash` resumability fix (`47d2a4e`, `CHANGELOG.md:7-22` on main). The release commit removes all of those notes and replaces them with a single product-benchmark line (`CHANGELOG.md:7-12` on HEAD). Since the code for those changes is on main and will be in the 0.100.3 tag, the release notes are now incomplete.
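One way to catch this kind of loss mechanically is to compare the bullets in main's `[Unreleased]` section against the release branch's changelog. The sketch below is illustrative only: `section_bullets` and `dropped_notes` are hypothetical helpers, and the comparison is an exact line match, so reworded entries would be reported as dropped.

```python
import re


def section_bullets(changelog: str, name: str) -> list[str]:
    """Return the '- ' bullet lines found under the '## [name]' heading."""
    heading = re.compile(rf"^## \[{re.escape(name)}\][^\n]*\n?", re.MULTILINE)
    match = heading.search(changelog)
    if match is None:
        return []
    # The section ends at the next "## [" heading, or at end of file.
    following = re.compile(r"^## \[", re.MULTILINE).search(changelog, match.end())
    body = changelog[match.end(): following.start() if following else len(changelog)]
    return [line.strip() for line in body.splitlines() if line.lstrip().startswith("- ")]


def dropped_notes(main_changelog: str, head_changelog: str) -> list[str]:
    """List main's [Unreleased] bullets that appear nowhere in HEAD's changelog."""
    head_lines = {line.strip() for line in head_changelog.splitlines()}
    return [
        bullet
        for bullet in section_bullets(main_changelog, "Unreleased")
        if bullet not in head_lines
    ]
```

A non-empty result from `dropped_notes` would indicate that entries were removed rather than moved into the new release section.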
What this audit checks
It judges the change on its merits — not whether it was tasked out in an issue. Unticketed, fast-moving work is fine; the question is whether the change is good and whether a better or existing approach should be used instead.
| Pass | What it asks |
|---|---|
| Heuristic | Vague title? Whitespace-only or cruft-bearing diff? (content signals only) |
| Duplication | Do added function/class names already exist elsewhere in the repo? |
| Value Audit | What does it do? What goal does it achieve? Is it good? Better architecture or already-exists? |
| Usefulness Audit | Does it integrate and fit? Will it hold up in real use and actually get used? |
Findings are concerns, not blocks — the human reviewer decides what to do with them.
tangletools
left a comment
✅ Auto-approved drewstone PR — 63d7e3e9
This PR was opened by the trusted drewstone account.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.
tangletools · auto-approval · reason: drewstone_author · 2026-07-01T15:17:22Z
tangletools
left a comment
✅ Auto-approved drewstone PR — 95a61cb5
This PR was opened by the trusted drewstone account.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.
tangletools · auto-approval · reason: drewstone_author · 2026-07-01T15:21:20Z