A GitHub repository template for HPC and cloud workloads with Docker, Singularity, and Slurm
A GitHub repository template for cloud computing and HPC workloads. Combines Docker (for local/cloud machines) with Singularity/Apptainer (for HPC clusters where Docker is unavailable), plus Slurm job scripts.
TLDR: Search for `todo` and update all occurrences to your desired name. Docker and Singularity are optional if all dependencies can be installed directly on the HPC shell.
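As a rough aid for the search step, the sketch below lists every line that still contains a `todo` placeholder. The function name `list_todo_placeholders` is made up for this example and is not part of the template.

```shell
# Hypothetical helper (not shipped with the template): list every remaining
# "todo" placeholder under a directory, with file name and line number.
# The .git directory is skipped so repository internals do not show up.
list_todo_placeholders() {
  root="${1:-.}"
  grep -rn --exclude-dir=.git 'todo' "$root"
}
```

Calling `list_todo_placeholders .` from the repository root prints one `file:line:text` entry per occurrence. The function returns a non-zero status once nothing is left to rename, because `grep` found no match.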
- Docker — for local development and building images
- Singularity/Apptainer — for running on HPC clusters
- Change LICENSE if necessary
- Modify .pre-commit-config.yaml according to your needs
- Modify/add GitHub workflow status badges in README.md
Continue on a machine where you have Docker permission — HPC clusters usually restrict Docker access for security reasons.
- Fill in all `todo-*` placeholders directly in .env.example and commit. These are project-level constants, not secrets.

  | Placeholder | Description |
  | --- | --- |
  | `todo-docker-user` | Your Docker Hub account username |
  | `todo-base-image` | Base image the Dockerfile builds from (e.g. `nvidia/cuda:13.0.0-cudnn-devel-ubuntu24.04`) |
  | `todo-image-name` | Name of the image you are building |
  | `todo-image-user` | Default user inside the image, used to determine the home folder |

- Copy .env.example to `.env` and add any user-specific secrets or local overrides: `cp .env.example .env`. `.env` is gitignored and will NOT be committed, so it is the right place for secrets and per-user values. It is loaded automatically by docker compose.
- Modify the service name from `todo-service-name` to your service name in docker-compose.yml, and add additional volume mounts such as dataset directories.
- Update Dockerfile and .dockerignore. The existing Dockerfile includes screen and tmux config, oh-my-zsh, cmake, and other basic tools.
- Run scripts to build, test, and push:

  | Script | Action |
  | --- | --- |
  | build_docker_image.sh | Build and test the image locally (uses `buildx` for multi-arch) |
  | run_docker_container.sh | Run and test a built image (`docker compose up -d` also works) |
  | push_docker_image.sh | Push the multi-arch image to Docker Hub |

  The service mounts the entire repository onto `CODE_FOLDER` inside the container, so modifications made inside are reflected outside, which is useful for VS Code remote development.
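To give a feel for what a multi-arch build and push involves, here is a minimal dry-run sketch. It only prints a `docker buildx build` command for review and does not reproduce the template's build_docker_image.sh or push_docker_image.sh. The variable names (`DOCKER_USER`, `IMAGE_NAME`, `BASE_IMAGE`, `PLATFORMS`), the platform list, and the idea that the Dockerfile accepts a `BASE_IMAGE` build argument are all assumptions made for this example.

```shell
# Hypothetical variable names; the template's real keys are defined in .env.example.
DOCKER_USER="${DOCKER_USER:-todo-docker-user}"
IMAGE_NAME="${IMAGE_NAME:-todo-image-name}"
BASE_IMAGE="${BASE_IMAGE:-nvidia/cuda:13.0.0-cudnn-devel-ubuntu24.04}"
PLATFORMS="${PLATFORMS:-linux/amd64,linux/arm64}"   # assumed target platforms

# Compose the Docker Hub reference <user>/<image>:<tag> (tag defaults to "latest").
image_ref() {
  printf '%s/%s:%s\n' "$DOCKER_USER" "$IMAGE_NAME" "${1:-latest}"
}

# Print the buildx invocation instead of running it, so it can be reviewed first.
# Assumes the Dockerfile declares "ARG BASE_IMAGE"; adjust if yours does not.
print_build_command() {
  printf 'docker buildx build --platform %s --build-arg BASE_IMAGE=%s --tag %s --push .\n' \
    "$PLATFORMS" "$BASE_IMAGE" "$(image_ref "${1:-latest}")"
}
```

Running `print_build_command v1` prints a single command line that can be copied into a terminal on a machine with Docker permission once the values look right.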
Continue on the actual HPC cluster environment.
- Run pull_singularity_image.sh to build the Singularity image locally from the Docker image you pushed. You should see `todo-image-name_latest.def` after a successful build.
- Run run_singularity_instance.sh to test the image.
  - Add additional volume bind options (e.g. dataset directories): define them in `.env`, then export them via variables.sh, using `resolve_host_path` to convert relative paths to absolute paths.
  - Singularity instances have less environment isolation than Docker containers by default unless you pass the additional flags shown in the script.
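The relative-to-absolute conversion mentioned for bind paths could look roughly like the sketch below. This is a guess at the behaviour, not the template's own `resolve_host_path` from variables.sh. The optional second argument (a base directory) and the `DATASET_DIR` variable in the comment are assumptions for illustration.

```shell
# Hypothetical re-implementation for illustration only; the template's
# variables.sh may behave differently.
# Usage: resolve_host_path <path> [base_dir]
#   - absolute paths are returned unchanged
#   - relative paths are anchored at base_dir (default: current directory)
# Note: "." and ".." segments are not normalised in this sketch.
resolve_host_path() {
  case "$1" in
    /*) printf '%s\n' "$1" ;;
    *)  printf '%s/%s\n' "${2:-$PWD}" "$1" ;;
  esac
}

# Example of feeding the result into a bind option (DATASET_DIR is an assumed
# .env key):
#   singularity instance start \
#     --bind "$(resolve_host_path "$DATASET_DIR"):/data" image.sif my_instance
```

Anchoring relative paths before they reach `--bind` keeps the mount independent of the directory from which the script or job happens to be launched.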
- Modify job specifications under `jobs/`.

  Slurm tips:
  - Query your cluster's partition layout with `sinfo`.
  - Tie resources to tasks for easy scaling: `--ntasks-per-node`, `--gpus-per-task`, `--cpus-per-task`, `--mem-per-gpu`.
  - All jobs use `-l` (login) in the shebang, so any command available in your login shell also works in a job.
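The per-task options from the Slurm tips can be sketched as a small job file. The resource values, the job name, and the `job_totals` helper are illustrative assumptions and are not taken from the template's `jobs/` directory. `%x` and `%j` are Slurm's filename patterns for the job name and job ID.

```shell
#!/bin/bash -l
#SBATCH --job-name=todo_your_job_name
#SBATCH --output=%x_%j.out        # <job name>_<job id>.out in the submit directory
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=2       # everything below is requested per task / per GPU
#SBATCH --gpus-per-task=1
#SBATCH --cpus-per-task=8
#SBATCH --mem-per-gpu=32G
#SBATCH --time=01:00:00

# Hypothetical helper: totals implied by the per-task requests above.
# Arguments: nodes, tasks per node, GPUs per task, CPUs per task, GB per GPU.
job_totals() {
  local nodes=$1 tasks_per_node=$2 gpus_per_task=$3 cpus_per_task=$4 mem_per_gpu_gb=$5
  local tasks=$(( nodes * tasks_per_node ))
  local gpus=$(( tasks * gpus_per_task ))
  printf 'tasks=%d gpus=%d cpus=%d mem=%dG\n' \
    "$tasks" "$gpus" "$(( tasks * cpus_per_task ))" "$(( gpus * mem_per_gpu_gb ))"
}

# Log the totals for this allocation (falls back to 1 node outside Slurm).
job_totals "${SLURM_JOB_NUM_NODES:-1}" 2 1 8 32

# Launch one process per task; replace "hostname" with the real workload.
if command -v srun >/dev/null 2>&1; then
  srun hostname
fi
```

With these values a single node yields `tasks=2 gpus=2 cpus=16 mem=64G`. Scaling to two nodes with four tasks each only requires changing `--nodes` and `--ntasks-per-node`, and `job_totals 2 4 1 8 32` then reports `tasks=8 gpus=8 cpus=64 mem=256G`.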
- Submit and monitor jobs: `sbatch jobs/your-cluster/your-job.job`. Output logs appear as `todo_your_job_name_<slurm_job_id>.out` in the repository root.
- turm is recommended for job monitoring: `turm -u your-slurm-user`
bash scripts/dev_setup.sh