[superseded] Route only apt-packages/container legs to GitHub-hosted #96
Closed
ChrisRackauckas-Claude wants to merge 1 commit into
The SciML self-hosted pool (demeter*/arctic*, ephemeral *-cxnps-*) is registered with the custom label `ubuntu-latest`, the same label GitHub-hosted runners answer. So every default test leg that requests `ubuntu-latest` is a coin-flip between GitHub-hosted and self-hosted scheduling. The persistent demeter*/arctic* runners lack passwordless sudo, so the apt-packages provisioning step (`sudo apt-get`) intermittently fails with "sudo: a terminal is required to read the password" whenever a leg lands on one of them (ChrisRackauckas/InternalJunk#52).

Evidence: across 205 jobs in 4 recent OrdinaryDiffEq.jl runs (CI / Sublibrary CI / Downgrade / Downgrade Sublibraries), `ubuntu-latest` was answered by BOTH github-hosted runners (runner group "GitHub Actions") AND self-hosted runners (demeter*/arctic*/*-cxnps-*, group "default"). The self-hosted runners' only assigned labels are {self-hosted, Linux, X64, gpu, high-memory, ubuntu-latest} -- none carry the pinned `ubuntu-24.04`/`ubuntu-22.04` labels. A github-hosted job's setup log shows `ubuntu-latest` currently resolves to image `ubuntu-24.04`, so pinning to `ubuntu-24.04` keeps the identical environment while removing the self-hosted pool from the candidate set.

Change: set the DEFAULT test-leg runner to the pinned `ubuntu-24.04` label, which only GitHub-hosted runners answer, forcing default legs onto GitHub-hosted (where `sudo apt-get` works via passwordless sudo). Applied to the per-group `runner` default in compute_affected_sublibraries.jl (the source for grouped-tests.yml / sublibrary-project-tests.yml matrices), the `os` default in tests.yml and downgrade.yml, and the hardcoded `runs-on` in sublibrary-downgrade.yml plus the detect/discover helper jobs.

GPU / self-hosted groups are intentionally PRESERVED: any group that sets an explicit `runner` (e.g. ["self-hosted","Linux","X64","gpu"]) in its test_groups.toml overrides the default and is left untouched; the OS-axis overrides (os = ["ubuntu-latest", ...]) are likewise unchanged.

Needs a v1 retag to take effect fleet-wide.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
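The label overlap described in the commit message above can be illustrated with a small Python sketch. It assumes a simplified model in which a runner pool is a candidate for a job when it carries every label the job requests; the pool names, the `candidate_pools` helper, and the label set given to GitHub-hosted runners are hypothetical, and real scheduling involves more than label matching.

```python
# Simplified, hypothetical model of runner selection by label.
# The self-hosted label set is the one quoted in the commit message; the
# GitHub-hosted set is an assumption that the hosted Ubuntu 24.04 image
# answers both `ubuntu-latest` and `ubuntu-24.04`.
SELF_HOSTED_LABELS = frozenset(
    {"self-hosted", "Linux", "X64", "gpu", "high-memory", "ubuntu-latest"}
)
GITHUB_HOSTED_LABELS = frozenset({"ubuntu-latest", "ubuntu-24.04"})

POOLS = {
    "self-hosted": SELF_HOSTED_LABELS,
    "github-hosted": GITHUB_HOSTED_LABELS,
}


def candidate_pools(runs_on):
    """Return the pools whose labels cover every label requested by runs_on."""
    requested = {runs_on} if isinstance(runs_on, str) else set(runs_on)
    return sorted(name for name, labels in POOLS.items() if requested <= labels)


# `ubuntu-latest` is answered by both pools: the coin-flip.
print(candidate_pools("ubuntu-latest"))   # ['github-hosted', 'self-hosted']
# The pinned label removes the self-hosted pool from the candidate set.
print(candidate_pools("ubuntu-24.04"))    # ['github-hosted']
# An explicit GPU runner list only matches the self-hosted pool.
print(candidate_pools(["self-hosted", "Linux", "X64", "gpu"]))  # ['self-hosted']
```

Under this model, pinning a leg to `ubuntu-24.04` is what takes the self-hosted pool out of contention, while an explicit GPU `runner` list is unaffected by either default.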
Author
This PR's original approach (pinning the default runner to |
Scoped: route only apt-packages/container legs to GitHub-hosted
Runner facts (unchanged)
The SciML self-hosted pool (`demeter*`/`arctic*`, ephemeral `*-cxnps-*`) is registered with the custom label `ubuntu-latest`, the same label GitHub-hosted runners answer. Its full label set is `{self-hosted, Linux, X64, gpu, high-memory, ubuntu-latest}`; it does not carry the pinned `ubuntu-24.04` label. So:

- `ubuntu-latest` → self-hosted-capable (kept as the default for throughput).
- `ubuntu-24.04` → GitHub-hosted only (has passwordless sudo + docker).

What this changes
Keep `ubuntu-latest` as the default for normal test/downgrade legs, and force GitHub-hosted only for the legs that genuinely need passwordless sudo / docker, i.e. exactly the legs where the caller passes `apt-packages` or a `container`. The persistent `demeter*`/`arctic*` runners lack passwordless sudo, so `sudo apt-get` (apt provisioning) intermittently fails with `sudo: a terminal is required to read the password` whenever such a leg lands there (ChrisRackauckas/InternalJunk#52); containers likewise need a Docker host.

Conditional added at the reusable job's `runs-on`:

`fromJSON('["ubuntu-24.04"]')` is a non-empty (truthy) array, so the GitHub Actions `&&`/`||` ternary does not fall through; the default branch returns the existing value (an array via `fromJson(inputs.runner)`, or a string `'self-hosted'`/`inputs.os`/`'ubuntu-latest'`). Both branches are valid `runs-on` forms (array or string), the same idiom `tests.yml` already used for the `runner`/`os` selection. `actionlint` validates it (exit 0).

Where it landed
The reusables that accept `apt-packages`/`container` and have a direct job-level `runs-on`:

| Reusable | Job | Default branch of `runs-on` |
| --- | --- | --- |
| `tests.yml` | `tests` (leaf) | `fromJson(inputs.runner)` / `self-hosted` / `inputs.os` (`ubuntu-latest`) |
| `downgrade.yml` | `downgrade` | `self-hosted` / `inputs.os` (`ubuntu-latest`) |
| `sublibrary-downgrade.yml` | `test` | `ubuntu-latest` |

`grouped-tests.yml` and `sublibrary-project-tests.yml` route their matrices through `tests.yml` (passing `apt-packages`/`container` through), so the single conditional in `tests.yml` covers them.

Reverted from #96 / left unchanged on purpose
- `scripts/compute_affected_sublibraries.jl` default runner and the `os` defaults in `tests.yml`/`downgrade.yml` are back to `ubuntu-latest`.
- `detect`/`discover` helper jobs (`grouped-tests.yml`, `sublibrary-project-tests.yml`, `sublibrary-downgrade.yml` `discover`) stay `ubuntu-latest`.
- `sublibrary-project-tests.yml` exposes no `apt-packages`/`container` input and passes none through, so its matrix legs never need the override.
- Groups that set an explicit `runner` (GPU) or an OS-axis override set no `apt-packages`/`container`, so they stay self-hosted / on their OS.

Affected repos
The only repos passing `apt-packages`/`container` today:

- `apt-packages: "python3-scipy"`
- `apt-packages: "r-base-dev r-cran-desolve"`
- `container: "cmhyett/julia-fenics:latest"`

Tests
`test/runtests.jl` re-asserts the default matrix runner is `ubuntu-latest`, and a new `runs-on conditional` testset confirms the real expression is present in each reusable and (emulating GitHub Actions truthiness) that `apt-packages`/`container` resolve to `ubuntu-24.04` while the default (incl. GPU self-hosted override) is preserved. Passing on Julia 1.10 and 1.12; `actionlint` clean. Live routing itself can only be proven by a retagged run.

Deploy
Needs a `v1` retag to take effect fleet-wide.

🤖 Generated with Claude Code
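The `&&`/`||` selection described under the `runs-on` conditional above can be emulated in a few lines of Python. This is only a sketch under stated assumptions: the condition shape (`apt-packages != '' || container != ''`), the helper names, and the empty-string input defaults are hypothetical and are not taken from the workflow files; the truthiness rule (false, 0, empty string, and null are falsy, arrays are truthy) follows GitHub's documented expression semantics.

```python
def gha_truthy(value):
    """Assumed GitHub Actions truthiness: false, 0, '' and null are falsy."""
    if value is None or value is False:
        return False
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return value != 0
    if isinstance(value, str):
        return value != ""
    return True  # arrays and objects are truthy


def gha_and(a, b):
    """`a && b` returns b when a is truthy, otherwise a."""
    return b if gha_truthy(a) else a


def gha_or(a, b):
    """`a || b` returns a when a is truthy, otherwise b."""
    return a if gha_truthy(a) else b


def select_runs_on(apt_packages, container, default):
    """Hypothetical form: (needs_hosted && ["ubuntu-24.04"]) || default."""
    needs_hosted = gha_or(apt_packages != "", container != "")
    return gha_or(gha_and(needs_hosted, ["ubuntu-24.04"]), default)


# Legs that pass apt-packages or a container are forced onto the pinned label.
print(select_runs_on("python3-scipy", "", "ubuntu-latest"))
# Legs that pass neither keep their existing default, string or array.
print(select_runs_on("", "", "ubuntu-latest"))
print(select_runs_on("", "", ["self-hosted", "Linux", "X64", "gpu"]))
# The pitfall the idiom avoids: a falsy middle operand would fall through.
print(gha_or(gha_and(True, ""), "fallback"))
```

Because the middle operand is a non-empty array, the true branch can never be mistaken for a false condition, which is why the idiom behaves like a ternary here.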