Job_chat: Implement tool-use for code edits#501
Open
hanna-paasivirta wants to merge 12 commits into
Collaborator
This PR just does a single tool call turn - which means:
I suggest we add the full loop later; a single step is the quickest win. Let's defer the Opus update: I want a clean fix first, then we can consider the Opus update in isolation.
josephjclark
approved these changes
Jun 4, 2026
I'll give this a final test and hopefully release tomorrow. Thanks @hanna-paasivirta
Short Description
This PR adds tool use for code edits to job_chat. This replaces the existing structured output format, where the model was forced to answer entirely in JSON, with code edits first, then a text explanation. The model can now answer in natural language and call a tool to request code changes. This results in far fewer refusals to answer (where the model ends the turn early) and far fewer empty responses. Fixes #497
We also need a tweak in Lightning to add a stream status between the text answer and the code edits. See: OpenFn/lightning#4833
Implementation details
The job code assistant can only call the code edit tools once for all changes. The conversation turn ends there. We picked this because it makes the turn a little faster and simpler than a fully agentic tool-use loop, as we constrain the assistant to fewer API calls.
As our use cases become more complex and our global assistant architecture evolves, we might decide to allow any number of tool calls. The Lightning-side change should be ready for different orders of streaming statuses.
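To illustrate the single-call turn described above, here is a minimal sketch, not the code in this PR. It assumes the model response arrives as a list of content blocks with `type` set to `text` or `tool_use`. The tool name `apply_code_edits`, the `old`/`new` edit shape, and the helpers `apply_edits` and `handle_single_turn` are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Any

# Hypothetical tool name, used only for this sketch.
EDIT_TOOL_NAME = "apply_code_edits"


@dataclass
class TurnResult:
    text: str           # natural-language answer collected before the tool call
    code: str           # job code after any edits
    edits_applied: int  # number of edits that matched and were applied


def apply_edits(code: str, edits: list[dict[str, str]]) -> tuple[str, int]:
    """Apply each edit as a first-occurrence string replacement."""
    applied = 0
    for edit in edits:
        old, new = edit["old"], edit["new"]
        if old in code:
            code = code.replace(old, new, 1)
            applied += 1
    return code, applied


def handle_single_turn(blocks: list[dict[str, Any]], code: str) -> TurnResult:
    """Collect text, apply the first code-edit tool call, then end the turn.

    Any blocks after the first edit tool call are ignored, which mirrors
    the "one tool call for all changes" constraint.
    """
    text_parts: list[str] = []
    for block in blocks:
        if block["type"] == "text":
            text_parts.append(block["text"])
        elif block["type"] == "tool_use" and block["name"] == EDIT_TOOL_NAME:
            new_code, applied = apply_edits(code, block["input"]["edits"])
            return TurnResult("".join(text_parts), new_code, applied)
    # No tool call: the turn is a plain text answer and the code is unchanged.
    return TurnResult("".join(text_parts), code, 0)
```

A fully agentic loop would instead send a tool result back to the model and keep iterating until the model stops calling tools; returning on the first tool call is what keeps this variant to a single API call per turn.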
Experiments
I tried the following to fix the answer refusals. Failed attempts resulted in more answer refusals, answers containing only '...', garbled or repetitive tokens, or malformed JSON:
I added the structured outputs to the code edit tool, where it doesn’t seem to cause strange behaviour and so should provide the occasional guardrail it’s intended to be (instead of manhandling the output into a low-probability sequence).
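As a rough illustration of constraining only the tool input rather than the whole answer, the sketch below defines a tool with a JSON Schema for its input and a small hand-written check of the payload. The tool name, description, the `edits`/`old`/`new` fields, and `validate_edit_input` are assumptions made for the example; the actual schema used in this PR is not shown here.

```python
from typing import Any

# Hypothetical tool definition: only the tool input is schema-constrained,
# while the surrounding answer stays free-form natural language.
CODE_EDIT_TOOL = {
    "name": "apply_code_edits",
    "description": "Request all code changes for this turn in one call.",
    "input_schema": {
        "type": "object",
        "properties": {
            "edits": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "old": {"type": "string"},
                        "new": {"type": "string"},
                    },
                    "required": ["old", "new"],
                    "additionalProperties": False,
                },
            }
        },
        "required": ["edits"],
        "additionalProperties": False,
    },
}


def validate_edit_input(payload: Any) -> list[str]:
    """Return a list of problems; an empty list means the payload is valid.

    This is a hand-written check that mirrors the schema above, so the
    example does not depend on a third-party validator.
    """
    if not isinstance(payload, dict):
        return ["input must be an object"]
    errors: list[str] = []
    extra = set(payload) - {"edits"}
    if extra:
        errors.append(f"unexpected keys: {sorted(extra)}")
    edits = payload.get("edits")
    if not isinstance(edits, list):
        errors.append("'edits' must be an array")
        return errors
    for i, edit in enumerate(edits):
        if not isinstance(edit, dict) or set(edit) != {"old", "new"}:
            errors.append(f"edits[{i}] must have exactly 'old' and 'new'")
            continue
        for key in ("old", "new"):
            if not isinstance(edit[key], str):
                errors.append(f"edits[{i}].{key} must be a string")
    return errors
```

Because the schema applies only to the tool call, a malformed edit can be rejected without discarding the text answer that preceded it.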
AI Usage
Please disclose how you've used AI in this work (it's cool, we just want to know!):
You can read more details in our Responsible AI Policy