feat(runtime): reasoning-aware routerChatWithUsage by drewstone · Pull Request #434 · tangle-network/agent-runtime

drewstone · 2026-07-02T02:38:16Z

What

Makes routerChatWithUsage safe for thinking models across providers:

opts.reasoningEffort?: 'none'|'low'|'medium'|'high' → forwarded as reasoning_effort. 'none' is the load-bearing value: binary/single-token decisions (routing, gating) on thinking models otherwise spend the entire token budget inside the think block — on CPU-local backends that turns into a client timeout, not just waste.
RouterChatResult.reasoning?: string + always-clean content: normalizes the two provider shapes (OpenRouter/DeepSeek-style separate reasoning/reasoning_content field; Groq/local-style inline <think> block). An unclosed think block (budget exhausted mid-thought) yields empty content — honest, since no final answer was emitted.
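
The normalization described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `RawMessage`, `SplitResult`, and `splitReasoning` are hypothetical names, and only the `reasoning`, `reasoning_content`, and `content` fields and the `<think>` block come from the description.

```typescript
// Hypothetical message shape; only these three field names come from the PR text.
interface RawMessage {
  content?: string | null;
  reasoning?: string | null;         // separate field (OpenRouter-style)
  reasoning_content?: string | null; // separate field (DeepSeek-style)
}

// Hypothetical stand-in for the relevant part of RouterChatResult.
interface SplitResult {
  content: string;
  reasoning?: string;
}

const OPEN_TAG = "<think>";
const CLOSE_TAG = "</think>";

function splitReasoning(msg: RawMessage): SplitResult {
  const raw = msg.content ?? "";
  const separate = msg.reasoning ?? msg.reasoning_content ?? undefined;

  // Shape 1: the provider already separated reasoning; content is left untouched.
  if (separate) {
    return { content: raw, reasoning: separate };
  }

  // Non-thinking response: returned unchanged, with no reasoning field.
  const open = raw.indexOf(OPEN_TAG);
  if (open === -1) {
    return { content: raw };
  }

  // Shape 2: inline think block inside content.
  const bodyStart = open + OPEN_TAG.length;
  const close = raw.indexOf(CLOSE_TAG, bodyStart);
  if (close === -1) {
    // Unclosed block: the budget ran out mid-thought, so no final answer exists.
    return { content: "", reasoning: raw.slice(bodyStart).trim() };
  }

  const reasoning = raw.slice(bodyStart, close).trim();
  const content = (raw.slice(0, open) + raw.slice(close + CLOSE_TAG.length)).trim();
  return { content, reasoning };
}
```

Returning empty `content` for the unclosed case keeps downstream parsers from mistaking half-finished reasoning for a decision.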

Why (twice-hit, per the lab's lift rule)

Two real experiment casualties in agent-lab:

R230 (local routing): 20/20 audit cells died as fetch failed — the model thought past the client timeout because the thinking-off flag had no path through the body.
R231 (provider invariance): the same qwen3-32b weights scored "10/20 on Groq, 20/20 on OpenRouter" — a pure serialization artifact; single-token parsers were reading Groq's inlined reasoning prose. Raw-content autopsy confirmed the model's reasoning was correct on both providers.
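
To make the R231 failure mode concrete, here is a small sketch of how a single-token parser can misread inlined reasoning. The parser, the option tokens `ROUTE_A`/`ROUTE_B`, and the sample strings are all hypothetical; the actual agent-lab parsers are not shown in this PR.

```typescript
// Hypothetical single-token parser: returns whichever option token appears first.
function firstToken(text: string, options: string[]): string | undefined {
  let best: { token: string; at: number } | undefined;
  for (const token of options) {
    const at = text.indexOf(token);
    if (at !== -1 && (best === undefined || at < best.at)) {
      best = { token, at };
    }
  }
  return best?.token;
}

const routeOptions = ["ROUTE_A", "ROUTE_B"];

// Inline shape: the reasoning prose quotes both option tokens before the answer.
const inlineContent =
  "<think>ROUTE_A is cheaper, but the task needs tools, so ROUTE_B.</think>\nROUTE_B";
// Clean shape: only the final answer remains in content.
const cleanContent = "ROUTE_B";

const misread = firstToken(inlineContent, routeOptions); // "ROUTE_A" (wrong)
const correct = firstToken(cleanContent, routeOptions);  // "ROUTE_B"
```

The model's answer is the same in both strings; only the serialization differs, which is why the same weights can appear to score differently per provider.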

Compatibility

Additive only. No call-site changes; non-thinking responses byte-identical; reasoning_effort omitted from the body unless explicitly set. 9 new tests; runtime suite green; typecheck green.
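
A minimal sketch of the "omitted unless explicitly set" behavior, assuming a request-body builder shaped like the one below. `ChatOpts` and `buildBody` are hypothetical names; the body fields `model`, `messages`, `temperature`, `max_tokens`, and `reasoning_effort` are the ones named in this PR.

```typescript
type ReasoningEffort = "none" | "low" | "medium" | "high";

// Hypothetical options type; only reasoningEffort is named in the PR.
interface ChatOpts {
  model: string;
  messages: { role: string; content: string }[];
  temperature?: number;
  maxTokens?: number;
  reasoningEffort?: ReasoningEffort;
}

function buildBody(opts: ChatOpts): Record<string, unknown> {
  const body: Record<string, unknown> = {
    model: opts.model,
    messages: opts.messages,
  };
  if (opts.temperature !== undefined) body["temperature"] = opts.temperature;
  if (opts.maxTokens !== undefined) body["max_tokens"] = opts.maxTokens;
  // Only add the key when the caller set it, so existing requests serialize
  // exactly as before.
  if (opts.reasoningEffort !== undefined) {
    body["reasoning_effort"] = opts.reasoningEffort;
  }
  return body;
}
```

A routing or gating call would pass `reasoningEffort: 'none'`; every other call site stays as it is and sends an unchanged body.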

🤖 Generated with Claude Code

Two gaps, each hit twice by real experiment runs in agent-lab (R230 local routing, R231 provider-invariance grid):

1. reasoning controls were droppable: the request body forwarded only model/messages/temperature/max_tokens, so a thinking model on a binary decision burned its whole budget inside the think block; on a slow (CPU-local) backend that becomes a client timeout, and 20/20 audit cells died as "fetch failed". opts.reasoningEffort now forwards as reasoning_effort ('none' is the load-bearing value for routing/gating).

2. reasoning and content were conflated: OpenRouter returns reasoning in a separate field with clean content; Groq inlines a <think> block into content. Downstream single-token parsers read the reasoning prose (which quotes both option tokens) and misread decisions, making the SAME weights look broken on one provider and fine on another (R231: groq qwen3-32b "10/20" vs openrouter "20/20", both actually correct). parseChatResult now splits both shapes into RouterChatResult.reasoning and always-clean content; an unclosed think block (budget exhausted mid-thought) yields empty content, which is honest: no answer was emitted.

Additive: no call-site changes required; non-thinking responses are byte-identical. 9 new tests cover effort forwarding, both provider shapes, reasoning_content (DeepSeek/Kimi), unclosed think, and the unchanged non-thinking path.

tangletools

✅ Auto-approved drewstone PR — `a178dde1`

This PR was opened by the trusted drewstone account.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.

tangletools · auto-approval · reason: drewstone_author · 2026-07-02T02:38:26Z

tangletools approved these changes Jul 2, 2026

drewstone merged commit a08877a into main Jul 2, 2026
1 check failed

drewstone deleted the feat/reasoning-aware-router-client branch July 2, 2026 02:50

drewstone merged 1 commit into main from feat/reasoning-aware-router-client
