feat(templates): add bev-model-training-robotics template + test by maxpumperla · Pull Request #719 · anyscale/templates

maxpumperla · 2026-05-28T14:51:23Z

Add a BEV (Bird's-Eye View) model training template for robotics applications. This template demonstrates an end-to-end distributed training pipeline using Ray Data for multi-camera preprocessing and Ray Train for DDP training.

Template includes:

- README.ipynb with complete walkthrough of BEV training pipeline
- Ray Data pipelines for NuScenes dataset preprocessing
- Ray Train configuration with 2-worker DDP training on L4 GPUs
- Three architecture diagrams (training pipeline, camera transform, distributed arch)
- GPU compute configs for AWS (g6.4xlarge) and GCE (g2-standard-4)
- Test script for validation

Family: Robotics
Target libraries: Ray Data, Ray Train
Workload types: Distributed training, Vision training

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

maxpumperla · 2026-05-28T14:54:13Z

/test-template bev-model-training-robotics

Copilot

Pull request overview

Adds a new “BEV Model Training for Robotics” template to the templates catalog, intended to demonstrate an end-to-end distributed BEV training pipeline using Ray Data (CPU preprocessing) and Ray Train (DDP training), along with configs and a CI validation test.

Changes:

- Introduces a new bev-model-training-robotics template (README.md/README.ipynb, requirements, metadata, diagrams).
- Adds AWS/GCP compute configs for a 2-worker GPU setup.
- Registers the template in BUILD.yaml and adds a papermill-based test runner.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 6 comments.


File	Description
tests/bev-model-training-robotics/tests.sh	Adds papermill-based execution of the template notebook for validation.
templates/bev-model-training-robotics/requirements.txt	Declares template-specific Python dependencies.
templates/bev-model-training-robotics/README.md	Markdown walkthrough for dataset staging, Ray Data preprocessing, and Ray Train DDP training.
templates/bev-model-training-robotics/README.ipynb	Notebook version of the walkthrough intended to be executed in the workspace.
templates/bev-model-training-robotics/metadata.json	Adds template metadata (intent, structure, dependencies, diagrams).
templates/bev-model-training-robotics/diagrams/*.xml	Adds architecture/flow diagrams for the template.
configs/bev-model-training-robotics/aws.yaml	Adds AWS cluster sizing for the template.
configs/bev-model-training-robotics/gce.yaml	Adds GCP cluster sizing for the template.
BUILD.yaml	Registers the new template with image, compute configs, and test command.


+  },
+  "structure": {
+    "archetype": "single-notebook",
+    "primary_notebook": "bev_model_training_robotics.ipynb"


+   "name": "python",
+   "version": "3.12.0"


@@ -0,0 +1,4 @@
+#!/usr/bin/env bash
+set -euo pipefail
+pip install papermill


+        # Block entries that would escape target directory
+        if not str(member_path).startswith(str(path)):
+            raise RuntimeError(f"Blocked path traversal in tar member: {member.name}")
+
+    tar.extractall(path)
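A note on the guard in the hunk above: a plain string `startswith` comparison also accepts sibling directories that merely share a prefix (for example `/data/nuscenes_evil` against `/data/nuscenes`). The following is a minimal sketch of a stricter check that compares path components instead, assuming Python 3.9+ for `Path.is_relative_to`. The helper name `safe_extract` is hypothetical and not taken from the PR, and how the PR computes `member_path` is not visible in the hunk.

```python
import tarfile
from pathlib import Path


def safe_extract(tar: tarfile.TarFile, dest: str) -> None:
    """Extract `tar` into `dest`, rejecting members that resolve outside it.

    Hypothetical helper for illustration; not the PR's implementation.
    """
    root = Path(dest).resolve()
    for member in tar.getmembers():
        target = (root / member.name).resolve()
        # Compare path components rather than string prefixes, so that
        # "<root>_evil/x" is not mistaken for a child of "<root>".
        if not target.is_relative_to(root):
            raise RuntimeError(f"Blocked path traversal in tar member: {member.name}")
    tar.extractall(root)
```

This sketch only validates member names; it does not inspect symlink or hardlink targets. On Python 3.12 and later, `tar.extractall(path, filter="data")` is another option worth considering.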


+            with torch.cuda.amp.autocast(enabled=(device.type == "cuda"), dtype=torch.float16):
+                logits = model(images)
+                loss = F.cross_entropy(logits, labels)


+            if train_mode:
+                # Gradient accumulation
+                scaler.scale(loss / grad_accum).backward()
+                step += 1
+                if step % grad_accum == 0:
+                    scaler.step(optimizer)
+                    scaler.update()
+                    optimizer.zero_grad(set_to_none=True)
+
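In the hunk above the optimizer steps only when `step % grad_accum == 0`. If the number of batches is not a multiple of `grad_accum`, the trailing micro-batches accumulate gradients that this loop never applies (whether the PR flushes them elsewhere is not visible here). A small pure-Python sketch of a step schedule with a final flush follows; `optimizer_step_batches` is a hypothetical name used only for illustration.

```python
def optimizer_step_batches(num_batches: int, grad_accum: int) -> list[int]:
    """Return the 1-based batch indices after which the optimizer should step."""
    if grad_accum < 1:
        raise ValueError("grad_accum must be >= 1")
    # Regular steps: every grad_accum-th micro-batch.
    steps = [i for i in range(1, num_batches + 1) if i % grad_accum == 0]
    # Flush a trailing partial window so its gradients are not dropped.
    if num_batches > 0 and num_batches % grad_accum != 0:
        steps.append(num_batches)
    return steps


print(optimizer_step_batches(10, 4))  # [4, 8, 10]
```

In a real training loop the flush would correspond to one extra `scaler.step(optimizer)` / `scaler.update()` / `optimizer.zero_grad()` after the batch loop; note that the partial window is still scaled by `1 / grad_accum`, so it is weighted slightly lower than a full window.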


Per repo convention, add completion time estimate to first notebook cell. Regenerated README.md to match.

Leftover from the template-generating skill — not read by tmpl-publish or any tooling, not part of the canonical template layout, and no other template carries one. Signed-off-by: Aydin Abiar <aydin@anyscale.com>

#311 failed `ModuleNotFoundError: nuscenes` (reqs never installed); full extract incl. unused sweeps/ overran 1800s. Adds pip-install cell; selective extract (skip sweeps/, guard kept, drop from idempotency check, nsweeps 5→1); parametrize epochs/subset (defaults 3/200, CI 1/16); materialize datasets; sweep (no_grad eval, torch.amp, worker-side alloc-conf, ~4 GB doc). Signed-off-by: Aydin Abiar <aydin@anyscale.com>
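The selective extract described in this commit (skip `sweeps/`, keep the path-traversal guard, and leave `sweeps/` out of the idempotency check) could look roughly like the sketch below. It assumes the archive stores `sweeps/` as a top-level directory and uses hypothetical helper names (`extract_without_sweeps`, `is_extracted`) and a hypothetical `required` directory list; none of these are taken from the PR.

```python
import tarfile
from pathlib import Path


def is_extracted(dest: str, required=("samples", "maps")) -> bool:
    """Idempotency check: True if every required directory already exists.

    `required` is an assumed list; sweeps/ is deliberately not part of it.
    """
    root = Path(dest)
    return all((root / name).is_dir() for name in required)


def extract_without_sweeps(archive: str, dest: str, skip_prefix: str = "sweeps/") -> int:
    """Extract all members except those under `skip_prefix`; return how many were extracted."""
    root = Path(dest).resolve()
    root.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive) as tar:
        wanted = []
        for member in tar.getmembers():
            name = member.name.removeprefix("./")
            # Skip the unused sweeps/ tree (the directory entry and its contents).
            if name == skip_prefix.rstrip("/") or name.startswith(skip_prefix):
                continue
            # Keep the traversal guard for everything that is extracted.
            target = (root / member.name).resolve()
            if not target.is_relative_to(root):
                raise RuntimeError(f"Blocked path traversal in tar member: {member.name}")
            wanted.append(member)
        tar.extractall(root, members=wanted)
    return len(wanted)
```

A caller would typically run `extract_without_sweeps(...)` only when `is_extracted(...)` returns `False`, so repeated notebook runs skip the extraction entirely.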

Aydin-ab · 2026-05-28T20:05:16Z

/test-template bev-model-training-robotics

nuscenes-devkit transitively pulls GUI opencv-python (needs libGL, absent on the headless ray-llm image); pin opencv-python-headless and force-reinstall it so the headless cv2 owns the namespace, fixing ImportError: libGL.so.1 at NuScenes init (build #333). Signed-off-by: Aydin Abiar <aydin@anyscale.com>

Aydin-ab · 2026-05-28T22:05:39Z

/test-template bev-model-training-robotics

nuscenes-devkit==1.1.11 pins matplotlib<3.6, which has no py311 wheel, so pip source-builds it (flaky freetype/sourceforge fetch) and the whole `pip install` aborts before nuscenes installs. Install nuscenes with --no-deps and bring its runtime deps as py311 wheels (pyquaternion, pycocotools, shapely<2.0, descartes, fire). Supersedes the prior commit's opencv handling: the ray-llm image already ships opencv-python-headless 4.13 (vllm requires >=4.13), so the earlier ==4.10.0.84 pin + --force-reinstall downgraded the image and broke vllm. Pin >=4.13 instead; --no-deps also keeps GUI opencv-python out, so cv2 stays headless without the force-reinstall. Signed-off-by: Aydin Abiar <aydin@anyscale.com>

Aydin-ab · 2026-05-29T00:22:31Z

/test-template bev-model-training-robotics

Listing opencv-python-headless>=4.13.0 in requirements.txt made pip honor its declared numpy>=2 during `pip install -r requirements.txt`, upgrading the base image's numpy 1.26 -> 2.x and corrupting the numpy-1.x stack (vllm/scipy/torch) — build #341. The base image already ships opencv-python-headless 4.13, and installing nuscenes-devkit --no-deps keeps GUI opencv-python out, so cv2 stays importable (no libGL) without listing opencv at all. Drop the pin and lock numpy<2 so nothing upgrades the base's numpy. Signed-off-by: Aydin Abiar <aydin@anyscale.com>

Aydin-ab · 2026-05-29T01:30:11Z

/test-template bev-model-training-robotics

nuscenes-devkit 1.1.11 targets matplotlib<3.6 and calls FigureCanvas.set_window_title in its render helpers; matplotlib removed it in 3.6 (the image has 3.7.4), so the lidar->camera projection render (render_pointcloud_in_image) raised AttributeError at build #342. Add a no-op set_window_title shim as a code cell before the first render so all visualization cells run headless. The camera and lidar+underlay_map renders already passed on 3.7.4, so this is the isolated remaining break. Signed-off-by: Aydin Abiar <aydin@anyscale.com>
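A no-op shim of the kind this commit describes might look like the sketch below. It assumes the devkit calls `set_window_title` on a canvas that derives from `matplotlib.backend_bases.FigureCanvasBase`; the helper name `add_noop_method` is hypothetical, and the actual notebook cell in the PR may differ.

```python
def add_noop_method(cls, name: str) -> bool:
    """Attach a do-nothing method `name` to `cls` if it is missing.

    Returns True if the shim was installed, False if the attribute already existed.
    """
    if hasattr(cls, name):
        return False
    setattr(cls, name, lambda self, *args, **kwargs: None)
    return True


try:
    from matplotlib.backend_bases import FigureCanvasBase
except ImportError:  # matplotlib is not installed, so there is nothing to patch
    FigureCanvasBase = None

if FigureCanvasBase is not None:
    # matplotlib removed FigureCanvasBase.set_window_title in 3.6; restore it as a no-op.
    add_noop_method(FigureCanvasBase, "set_window_title")
```

Because the helper checks `hasattr` first, the cell is harmless on older matplotlib versions that still provide the method.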

Aydin-ab · 2026-05-29T02:08:49Z

/test-template bev-model-training-robotics

Ray Data preprocessing failed on the WORKER nodes with libGL.so.1 (#343): the notebook's runtime pip cell only ran on the head, and --no-deps isn't honored cluster-wide, so workers lacked nuscenes / libgl1. BYOD bakes the verified deps + libgl1 into every node instead. Switch BUILD.yaml to cluster_env.byod (us-docker.pkg.dev/anyscale-workspace-templates/workspace-templates/bev-model-training-robotics:2.55.1, digest sha256:ccb792b940460a20b72e37a4cc9097f1d8dacdc9689fb6c1d1f21cebd447eb36); drop the notebook pip cell + requirements.txt; keep the set_window_title shim. Signed-off-by: Aydin Abiar <aydin@anyscale.com>

Aydin-ab · 2026-05-29T03:36:01Z

/test-template bev-model-training-robotics

maxpumperla requested a review from Aydin-ab May 28, 2026 14:51

maxpumperla requested a review from a team as a code owner May 28, 2026 14:51

maxpumperla requested a review from Copilot May 28, 2026 14:54

Copilot started reviewing on behalf of maxpumperla May 28, 2026 14:54

Copilot AI reviewed May 28, 2026

maxpumperla and others added 3 commits May 28, 2026 17:01

Add 'Time to complete: 60 min' marker to BEV template

f278994

maxpumperla wants to merge 9 commits into main from mp_bev_template


3 participants
