ssh: actionable error when binary upload is reset by network proxy #5204

Open
anton-107 wants to merge 2 commits into main from
ssh/actionable-error-on-upload-stream-reset

@anton-107
Contributor

Summary

  • When `databricks ssh connect` uploads the CLI binary to the workspace, a network intermediary (corporate egress proxy, VPN, firewall/WAF) can close the HTTP/2 stream mid-upload. The Go HTTP/2 client surfaces this as a cryptic `stream error: stream ID N; NO_ERROR; received from peer`, which is hard for end users to interpret.
  • This change detects that transport-level reset (typed `http2.StreamError` and the wrapped string form) and wraps the error with a clear, actionable message pointing at network-side restrictions.

Reported by a customer on *.cloud.databricks.com whose 50 MB CLI binary upload was reset within ~2 s — every step before the upload (cluster check, secrets, GitHub download) succeeded; only the final POST to /api/2.0/workspace-files/import-file/... failed with the stream reset.

New error message

failed to upload file ... to workspace: ...: stream error: stream ID N; NO_ERROR; received from peer

The connection was closed before the upload finished. This is usually caused by a network intermediary (corporate egress proxy, VPN, or firewall/WAF) enforcing a request-body size limit on POSTs to *.cloud.databricks.com. Try running this command from a network without such restrictions.
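As a rough illustration of how the message above could be assembled, the sketch below appends the hint to the original error only when its text looks like an HTTP/2 stream reset. The function name `withUploadHint`, the matched substring, the sample errors, and the blank-line layout are assumptions made for this example; the PR's actual code is not shown on this page.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// uploadResetHint is the hint text quoted in the PR description.
const uploadResetHint = "The connection was closed before the upload finished. " +
	"This is usually caused by a network intermediary (corporate egress proxy, VPN, " +
	"or firewall/WAF) enforcing a request-body size limit on POSTs to *.cloud.databricks.com. " +
	"Try running this command from a network without such restrictions."

// Sample errors used only for this illustration.
var (
	errSampleReset = errors.New("stream error: stream ID 7; NO_ERROR; received from peer")
	errSampleOther = errors.New("permission denied")
)

// withUploadHint is a hypothetical helper, not the PR's function.
// It returns err unchanged unless its text contains the assumed
// stream-reset prefix, in which case the hint is appended.
func withUploadHint(err error) error {
	if err == nil || !strings.Contains(err.Error(), "stream error: stream ID") {
		return err
	}
	// %w keeps the original error reachable through errors.Is and errors.As.
	return fmt.Errorf("%w\n\n%s", err, uploadResetHint)
}

func main() {
	fmt.Println(withUploadHint(errSampleReset)) // original error, blank line, hint
	fmt.Println(withUploadHint(errSampleOther)) // unrelated error passes through
}
```

Wrapping with `%w` rather than replacing the error keeps the raw transport message visible for support cases while still giving the user something to act on.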

Test plan

  • go test ./experimental/ssh/internal/client/ -run TestIsStreamResetError -v — table-driven cases cover typed http2.StreamError, wrapped variant, raw string match, and unrelated errors
  • go test ./experimental/ssh/internal/client/ -count=1 — full client package green
  • go build ./experimental/ssh/... — clean
  • Manual smoke: bin/databricks ssh connect --cluster <id> against a working workspace still succeeds (no spurious detection on the success path)
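The table-driven cases mentioned in the test plan might look roughly like the sketch below. It is written as a standalone program rather than a `_test.go` file so it can run on its own; `isStreamResetError` here is a stand-in with an assumed substring check, and the case names and the `failingCases` helper are illustrative, not copied from the PR.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// isStreamResetError is a stand-in for the function under test;
// the substring it matches is an assumption.
func isStreamResetError(err error) bool {
	return err != nil && strings.Contains(err.Error(), "stream error: stream ID")
}

// resetCase is one row of the table: an input error and the expected verdict.
type resetCase struct {
	name string
	err  error
	want bool
}

// resetCases builds the table of illustrative inputs.
func resetCases() []resetCase {
	raw := errors.New("stream error: stream ID 7; NO_ERROR; received from peer")
	return []resetCase{
		{"raw string match", raw, true},
		{"wrapped variant", fmt.Errorf("failed to upload file: %w", raw), true},
		{"unrelated error", errors.New("permission denied"), false},
		{"nil error", nil, false},
	}
}

// failingCases returns the names of rows whose result differs from want.
func failingCases() []string {
	var failed []string
	for _, tc := range resetCases() {
		if got := isStreamResetError(tc.err); got != tc.want {
			failed = append(failed, tc.name)
		}
	}
	return failed
}

func main() {
	fmt.Println("failing cases:", failingCases())
}
```

In a real test file each row would typically run under `t.Run(tc.name, ...)` so a failure names the offending case.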

This pull request and its description were written by Isaac.

When `databricks ssh connect` uploads the CLI binary to the workspace,
a network intermediary (corporate egress proxy, VPN, firewall/WAF) may
close the HTTP/2 stream mid-upload, surfacing as a cryptic
`stream error: stream ID N; NO_ERROR; received from peer`.

Detect this transport-level reset and wrap the error with a hint that
the connection was closed by an intermediary and the user should try
from a network without such restrictions.

Co-authored-by: Isaac
@anton-107 anton-107 temporarily deployed to test-trigger-is May 7, 2026 09:56 — with GitHub Actions Inactive
@github-actions

github-actions Bot commented May 7, 2026

Waiting for approval

Based on git history, these people are best suited to review:

  • @ilia-db -- recent work in experimental/ssh/internal/client/

Eligible reviewers: @andrewnester, @denik, @pietern, @renaudhartert-db, @shreyas-goenka, @simonfaltum

Suggestions based on git history. See OWNERS for ownership rules.

The typed http2.StreamError check promoted golang.org/x/net to a direct
dependency, failing the lint check that go.mod is unchanged. Rely on
string match alone — http2.StreamError.Error() formats as
"stream error: stream ID N; ..." which the existing string match catches.
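To make the reasoning in this commit message concrete, the sketch below shows why a string match can stand in for the typed check: any error whose `Error()` text starts with the stream-error prefix is caught, whether or not its concrete type is known. `fakeStreamError` is a local stand-in that only imitates the message format described above; it is not the real `golang.org/x/net/http2` type, and the matched substring is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// fakeStreamError imitates the message format of an HTTP/2 stream error.
// It is a stand-in so the example needs no dependency outside the standard library.
type fakeStreamError struct {
	StreamID uint32
	Code     string
}

func (e fakeStreamError) Error() string {
	return fmt.Sprintf("stream error: stream ID %d; %s; received from peer", e.StreamID, e.Code)
}

// isStreamResetError matches on the error text only (assumed substring),
// so it needs no import of the HTTP/2 package.
func isStreamResetError(err error) bool {
	return err != nil && strings.Contains(err.Error(), "stream error: stream ID")
}

func main() {
	typed := fakeStreamError{StreamID: 7, Code: "NO_ERROR"}
	wrapped := fmt.Errorf("failed to upload file: %w", typed)

	fmt.Println(isStreamResetError(typed))                           // true
	fmt.Println(isStreamResetError(wrapped))                         // true
	fmt.Println(isStreamResetError(errors.New("permission denied"))) // false
}
```

The trade-off is that a string match depends on the message format staying stable, which is the price paid here for keeping `go.mod` unchanged.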

Co-authored-by: Isaac
@anton-107 anton-107 temporarily deployed to test-trigger-is May 7, 2026 09:59 — with GitHub Actions Inactive
1 participant