fix(gastown): distinguish null causes in PR status polling (#3149) #3156
jrf0110 wants to merge 1 commit into
Replace PRStatusResult | null return type with discriminated PRStatusOutcome union in checkPRStatus. Each null cause (no token, HTTP error, invalid response, unrecognized URL, host mismatch) now surfaces a structured PRStatusError with actionable failure messages.

- resolveGitHubToken returns GitHubTokenResolution with resolution chain
- no_token and non-transient HTTP errors (401/403/404) fail immediately
- invalid_response/unrecognized_url/host_mismatch fail after 3 strikes
- Transient HTTP errors (5xx/429) keep existing 10-strike behavior
- poll_null_count resets to 0 on successful poll at both call sites
- failureKind persisted to bead metadata for analytics
- AE event pr.poll_failed emitted on terminal failure
- Unit tests for checkPRStatus, resolveGitHubToken, failureMessageFor, and threshold logic
- Integration test for no_token immediate-fail path
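The failure rules listed above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: the field shapes of PRStatusError and the strikesAllowed helper are hypothetical, and placing HTTP statuses other than 401/403/404/5xx/429 in the 3-strike bucket is a guess that the commit message does not confirm.

```typescript
// Hypothetical field shapes: the PR names the error kinds, not their fields.
type PRStatusError =
  | { kind: 'no_token' }
  | { kind: 'http_error'; status: number }
  | { kind: 'invalid_response' }
  | { kind: 'unrecognized_url' }
  | { kind: 'host_mismatch' };

const PR_POLL_NULL_THRESHOLD = 10; // transient bucket (5xx/429)
const PR_POLL_NON_TRANSIENT_THRESHOLD = 3; // invalid_response, unrecognized_url, host_mismatch

/** no_token and HTTP 401/403/404 are terminal on the first occurrence. */
function shouldFailImmediately(error: PRStatusError): boolean {
  if (error.kind === 'no_token') return true;
  return (
    error.kind === 'http_error' &&
    (error.status === 401 || error.status === 403 || error.status === 404)
  );
}

/** HTTP 5xx and 429 keep the existing 10-strike behavior. */
function shouldCountAsTransient(error: PRStatusError): boolean {
  return error.kind === 'http_error' && (error.status >= 500 || error.status === 429);
}

/** Hypothetical helper: consecutive failures tolerated before the bead fails. */
function strikesAllowed(error: PRStatusError): number {
  if (shouldFailImmediately(error)) return 1;
  // Assumption: anything that is neither immediate nor transient uses the 3-strike bucket.
  return shouldCountAsTransient(error)
    ? PR_POLL_NULL_THRESHOLD
    : PR_POLL_NON_TRANSIENT_THRESHOLD;
}
```

Under these assumptions, strikesAllowed({ kind: 'http_error', status: 503 }) gives 10, while a host_mismatch gives 3 and a 404 gives 1.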
      err
    );
  }
} else if (!integrationId) {
WARNING: Missing tried entry when integrationId is set but GIT_TOKEN_SERVICE binding is absent
The condition on line 62 is else if (!integrationId), so when integrationId is non-empty but ctx.env.GIT_TOKEN_SERVICE is falsy (e.g. the binding isn't wired in a test/staging env), neither branch executes and the source label is silently dropped from tried. The no_token failure message shown to the user will omit that integration step, making the resolution chain incomplete and potentially misleading.
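A small sketch of the gap described above, assuming the branch structure the comment implies. The function name recordIntegrationAttempt, the hasTokenService flag (standing in for whether ctx.env.GIT_TOKEN_SERVICE is bound), and the sample label are all hypothetical.

```typescript
// Hypothetical stand-in for the two-branch condition under review.
// `hasTokenService` models whether ctx.env.GIT_TOKEN_SERVICE is bound.
function recordIntegrationAttempt(
  integrationId: string | undefined,
  hasTokenService: boolean,
  sourceLabel: string,
): string[] {
  const tried: string[] = [];
  if (integrationId && hasTokenService) {
    tried.push(sourceLabel); // the token service lookup would happen here
  } else if (!integrationId) {
    tried.push('platform integration (none configured)');
  }
  // integrationId set but service missing: neither branch runs.
  return tried;
}

const gap = recordIntegrationAttempt('int_123', false, 'rig integration');
console.log(gap.length); // 0: the integration step is absent from the chain
```

With an integration id and no service binding, the returned chain is empty, which is the case the failure message would then omit.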
Consider:
} else if (integrationId) {
  // GIT_TOKEN_SERVICE binding unavailable: record that we couldn't try it
  tried.push(`${sourceLabel} (service unavailable)`);
} else {
  tried.push('platform integration (none configured)');
}

const PR_POLL_NULL_THRESHOLD = 10;

/** Fail MR bead after this many consecutive non-transient errors (invalid_response, unrecognized_url, host_mismatch). */
const PR_POLL_NON_TRANSIENT_THRESHOLD = 3;
WARNING: poll_null_count counter is shared between the two error-kind buckets (transient 10-strike vs non-transient 3-strike)
Both the shouldCountAsTransient branch (lines ~1291-1349) and the else branch (lines ~1350-1409) increment and read the same $.poll_null_count key. If error kinds alternate between poll cycles — e.g. a transient http_error (5xx) followed by an invalid_response — the counter from the first bucket bleeds into the second bucket's threshold check. An invalid_response that arrives after 2 transient errors will immediately fail on the first non-transient occurrence (counter is already 2, threshold is 3, so it fails after just one more), while also not correctly applying the 10-strike rule for the transient errors.
Consider using separate counter keys ($.poll_transient_count and $.poll_non_transient_count) so each bucket's threshold is independent and the PR bead doesn't fail prematurely due to cross-bucket counter accumulation.
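One way the separate-counter suggestion might look, as a sketch only. The PollCounters object is a hypothetical in-memory stand-in for the bead metadata keys, and the helper names are assumptions. Whether a failure in one bucket should also reset the other is left open here.

```typescript
const PR_POLL_NULL_THRESHOLD = 10;
const PR_POLL_NON_TRANSIENT_THRESHOLD = 3;

// Hypothetical in-memory stand-in for the metadata keys
// $.poll_transient_count and $.poll_non_transient_count.
interface PollCounters {
  poll_transient_count: number;
  poll_non_transient_count: number;
}

/** Increments only the matching bucket and reports whether it reached its threshold. */
function recordPollFailure(counters: PollCounters, transient: boolean): boolean {
  if (transient) {
    counters.poll_transient_count += 1;
    return counters.poll_transient_count >= PR_POLL_NULL_THRESHOLD;
  }
  counters.poll_non_transient_count += 1;
  return counters.poll_non_transient_count >= PR_POLL_NON_TRANSIENT_THRESHOLD;
}

/** A successful poll clears both buckets. */
function recordPollSuccess(counters: PollCounters): void {
  counters.poll_transient_count = 0;
  counters.poll_non_transient_count = 0;
}
```

With this layout, two transient errors followed by one invalid_response leave the non-transient bucket at 1, so the bead does not fail on that poll.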
Code Review Summary
Status: 2 Issues Found | Recommendation: Address before merge
Files Reviewed (6 files)
Reviewed by claude-4.6-sonnet-20260217 · 651,027 tokens
Superseded by #3160. Closing as duplicate from convoy retry loop.
Summary
Replaces the `null` return from `checkPRStatus` with a discriminated `PRStatusOutcome` union (`{ ok: true, result }` | `{ ok: false, error }`), and similarly updates `resolveGitHubToken` to return `GitHubTokenResolution` that captures the resolution chain. This ensures each distinct failure mode (no token, HTTP errors, invalid response, unrecognized URL, host mismatch) surfaces a specific, actionable failure message instead of a generic "null" after 10 polls.

Key changes:
- `checkPRStatus` returns `PRStatusOutcome` with structured `PRStatusError` variants (no_token, http_error, invalid_response, unrecognized_url, host_mismatch)
- `resolveGitHubToken` returns `GitHubTokenResolution` tracking which sources were tried
- `poll_null_count` resets to 0 on successful poll
- `pr.poll_failed` event emitted on terminal failure
- `refresh-git-token.handler.ts` and `Town.do.ts`

Fixes #3149.
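As a rough illustration of how a caller might consume the union described above: only the type names, the `ok` discriminant, and the error kinds come from this PR. The field shapes, the message wording, and the summarizeOutcome helper are assumptions.

```typescript
// Hypothetical field shapes; the PR's real types may differ.
interface PRStatusResult {
  state: string;
}

type PRStatusError =
  | { kind: 'no_token'; tried: string[] }
  | { kind: 'http_error'; status: number }
  | { kind: 'invalid_response' | 'unrecognized_url' | 'host_mismatch' };

type PRStatusOutcome =
  | { ok: true; result: PRStatusResult }
  | { ok: false; error: PRStatusError };

/** Illustrative wording only; the PR's actual messages are not shown here. */
function failureMessageFor(error: PRStatusError): string {
  switch (error.kind) {
    case 'no_token':
      return `No GitHub token available (tried: ${error.tried.join(', ')})`;
    case 'http_error':
      return `PR status request failed with HTTP ${error.status}`;
    default:
      return `PR status check failed: ${error.kind}`;
  }
}

/** Hypothetical caller: checking `ok` narrows the union in each branch. */
function summarizeOutcome(outcome: PRStatusOutcome): string {
  return outcome.ok
    ? `PR is ${outcome.result.state}`
    : failureMessageFor(outcome.error);
}
```

Because `result` only exists on the `ok: true` variant and `error` only on the `ok: false` variant, the compiler rejects code that reads either one without first checking `ok`, which is what a bare `null` return could not enforce.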
Verification
- `checkPRStatus` error discrimination (no_token, 401/403/404, 5xx/429 transient, invalid_response, unrecognized_url, host_mismatch)
- `failureMessageFor` producing actionable messages for each error kind
- `shouldFailImmediately` and `shouldCountAsTransient` threshold logic

Visual Changes
N/A
Reviewer Notes
The `platformIntegrationId` is not passed in the `Town.do.ts` `checkPRStatus` context; this matches the pre-existing behavior. The `refresh-git-token.handler.ts` does pass it, which is the intended pattern for rig-level token resolution.