Skip to content

feat(producer): auto low-memory safe render profile#1225

Merged
jrusso1020 merged 2 commits into
mainfrom
feat/low-memory-safe-mode
Jun 5, 2026
Merged

feat(producer): auto low-memory safe render profile#1225
jrusso1020 merged 2 commits into
mainfrom
feat/low-memory-safe-mode

Conversation

@jrusso1020
Copy link
Copy Markdown
Collaborator

What

Adds an auto-detected low-memory safe render profile. On hosts at or below 8 GB total RAM, the render pipeline collapses to its cheapest shape instead of running multiple concurrent Chrome instances.

When lowMemoryMode is active and the user hasn't passed --workers, the orchestrator:

  • skips auto-worker calibration — no throwaway second Chrome just to time 5 frames;
  • pins to a single worker — so the probe Chrome is reused for capture, never N concurrent;
  • prefers screenshot capture over BeginFrame — avoids the BeginFrame protocol-timeout → relaunch churn on slow hardware;
  • logs a one-line explanation of what it did and how to override.

Builds on #1221 (merged), which fixed the calibration timeout cap, the <= 8192 boundary, and added the CLI timeout flags.

Why

Reported in #1218 / #1219: renders on 8 GB laptops sit at low progress for minutes or stall. Root cause (per the triage thread) is architectural — the default pipeline launches up to 4 Chrome instances sequentially/overlapping (probe, calibration, capture, screenshot-fallback), each ~256 MB+, on machines with ~3 GB free. The concurrent browsers drive memory pressure that makes every CDP call slow and spikes V8 GC pauses.

#1221 made the timeouts and memory flags apply correctly; this PR removes the expensive shape entirely on the machines that can't afford it, rather than tuning it. "Smarter by default."

How

  • packages/engine/src/services/systemMemory.ts (new): one shared isLowMemorySystem() / getSystemTotalMb(), de-duplicating the totalmem() reads previously copied in config.ts and browserManager.ts. Threshold is inclusive (<= 8192 MB) — real "8 GB" hardware reports ~7600–8192 MB after firmware/iGPU reservations, so a strict < would skip the optimisation on the very hardware that needs it.
  • config.ts: new lowMemoryMode field on EngineConfig, resolved tri-state — explicit override → PRODUCER_LOW_MEMORY_MODE (on/off) → auto-detect from total RAM.
  • renderOrchestrator.ts: gate calibration off, pin workers to 1, force screenshot capture, and emit a safe-mode log line when lowMemoryMode is set and --workers is absent.
  • render.ts: --low-memory-mode / --no-low-memory-mode override (sets the env var the producer's resolveConfig reads) + docs table entry.

Fully overridable: an explicit --workers N restores calibration-free parallelism; --no-low-memory-mode / PRODUCER_LOW_MEMORY_MODE=false restores the full default shape.

Deliberately deferred (separate PRs)

  • Reuse the probe session for calibration: only executes on the tier above 8 GB (safe-mode skips calibration on the target boxes). A correct BeginFrame-mode reuse would lose calibration's fast-fail-to-screenshot timeout — real risk on a path the reported scenario never hits. Better scoped on its own.
  • Retuning calculateOptimalWorkers's totalmem*0.5/256 memory model: hot path for all renders incl. servers/Lambda, outside this PR's local-laptop scope.

Test plan

  • Unit tests added/updated — systemMemory.test.ts (8192 boundary cases), config.test.ts (tri-state env resolution + explicit-override precedence). Engine suite passes (25 relevant tests).
  • tsc clean across engine/producer/cli; oxlint + oxfmt clean; removed an unused export so the fallow --fail-on-issues dead-code gate stays green.
  • Documentation updated — docs/packages/cli.mdx render-flags table.
  • Manual testing on a real ≤ 8 GB host — not yet run; behaviour is unit-covered and the safe path (1 worker + screenshot) is already a supported render shape.

Note: one pre-existing producer test (rejects a maliciously crafted key…) fails identically on main — environment-specific path test, unrelated to this change.

On memory-constrained hosts (<= 8 GB RAM) the default render shape runs
multiple concurrent Chrome instances — a probe browser, a throwaway
auto-worker calibration browser, and N capture workers — which drives
memory pressure that slows every CDP call and spikes V8 GC, surfacing as
the slow/stuck renders in #1218 / #1219.

Add an auto-detected low-memory safe profile that collapses the pipeline
to its cheapest shape: skip auto-worker calibration (no second browser),
pin to a single worker so the probe session is reused for capture, and
prefer screenshot capture over BeginFrame (avoiding the BeginFrame
protocol-timeout -> relaunch churn on slow hardware).

- engine: shared systemMemory helper (isLowMemorySystem / getSystemTotalMb),
  de-duplicating the totalmem reads previously copied in config.ts and
  browserManager.ts; inclusive <= 8192 MB threshold.
- engine/config: lowMemoryMode field, resolved tri-state
  (explicit override -> PRODUCER_LOW_MEMORY_MODE on/off -> auto-detect).
- producer/orchestrator: skip calibration, pin 1 worker, force screenshot
  capture when lowMemoryMode is active and no explicit --workers; log it.
- cli: --low-memory-mode / --no-low-memory-mode override + docs.

Fully overridable: explicit --workers restores calibration-free
parallelism; --no-low-memory-mode restores the full default shape.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mintlify
Copy link
Copy Markdown

mintlify Bot commented Jun 5, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

Project Status Preview Updated (UTC)
hyperframes 🟢 Ready View Preview Jun 5, 2026, 8:28 PM

💡 Tip: Enable Workflows to automatically generate PRs for you.

Copy link
Copy Markdown

@james-russo-rames-d-jusso james-russo-rames-d-jusso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary: auto low-memory safe render profile — skips throwaway calibration Chrome, pins to 1 worker, prefers screenshot capture when total RAM ≤ 8 GB.

Ships Magi's third architectural direction (memory-aware mode) and explicitly defers the other two with sound rationale. CI fully green incl. cross-platform regression shards. No blockers — two concerns and a few nits below.

Concerns

  • services/systemMemory.ts (isLowMemorySystem) — auto-detect uses os.totalmem(), which reports the host's physical RAM, not cgroup/container limits. A Docker container with a 4 GB cgroup limit on a 32 GB host will NOT auto-flag as low-memory; conversely, a 4 GB container on a 64 GB host won't either. The PR scope is explicitly "local-laptop" (per the deferred section) and PRODUCER_LOW_MEMORY_MODE=true is the documented escape hatch — but the same render flags table also documents --docker, so a one-line callout in the threshold docstring or the docs row ("auto-detect reads host RAM; containerised callers should set the env var explicitly") will save the next operator a confused bug report.
  • Lambda / server-render contexts hit the gate too. A Lambda allocated 8 GB will auto-detect as low-memory → force screenshot + skip calibration + (if --workers unset) pin to 1. May actually be the right behaviour (single-tenant, fixed RAM, calibration overhead unjustified for a one-shot invocation) but the "smarter by default" framing was about laptops. Worth either (a) saying server contexts are in scope and this is intentional, or (b) gating the auto-detect on !process.env.AWS_LAMBDA_FUNCTION_NAME (and friends) so server callers opt in via env var. Right now the behaviour change is silent for them.

Nits

  • services/browserManager.ts (getGpuMemBudgetMb, getLowMemoryFlags) — now that LOW_MEMORY_TOTAL_MB_THRESHOLD is exported, these two still hardcode the bare 8192. Same value, so not a bug — but centralising the constant is half the point of the new module. (nit)
  • renderOrchestrator.ts worker ternary — cfg.lowMemoryMode && workers === undefined ? 1 : resolveRenderWorkerCount(...) bypasses resolveRenderWorkerCount entirely, so any "why workers=N" logging inside it silently doesn't fire on the safe-mode path. Consider threading a forceSingleWorker: true flag into the resolver so its log line stays coherent across all paths. (nit)
  • [Render] Low-memory system detected … — the log line reads as auto-detect framing but the same code path fires for explicit PRODUCER_LOW_MEMORY_MODE=true / --low-memory-mode. Tiny — "Low-memory profile active" covers both. (nit)

Context (not a finding for this PR)

The <= 8192 boundary is coherent with #1221's fixes across memoryAdaptiveCacheLimit / getGpuMemBudgetMb / getLowMemoryFlags — verified. The 4096 boundary asymmetry flagged on #1221's review still exists indirectly (a 4096-exact box auto-flags safe-mode here, but gets mid-tier cache/GPU values over there). Not regressed, just unchanged — carrying it forward.

Questions

  • PR body notes the rejects a maliciously crafted key… producer test "fails identically on main". gh pr checks is fully green so CI doesn't surface it — is that test quarantined/skipped in CI, or only failing in your local environment? Worth a tracker so it doesn't ride along as "expected red" forever.
  • Intent on Lambda/Docker server auto-detect (concern #2 above) — in scope, or accidental side-effect of host-only RAM reading?

What I didn't verify

  • Manual run on a real ≤ 8 GB host (PR body marks this as not-yet-run; safe shape is already a supported path and unit-covered).
  • Whether job.config.workers === undefined is the correct "user didn't pass --workers" signal across every entry into executeRenderJob (only traced through render.ts).
  • Asset-flicker lens: clean — this PR is producer/capture only, no player React surface.
  • Hardware benchmarking (no access).

— Rames D Jusso

…rent worker logging

Review follow-ups on the low-memory safe profile:

- Move the safe-mode single-worker pin into resolveRenderWorkerCount (it
  already receives cfg + log) so the "why workers=N" log line fires on the
  safe-mode path too; the orchestrator ternary collapses to a plain call.
- Centralise the 8192 MB cutoff: config.ts + browserManager.ts now use the
  exported LOW_MEMORY_TOTAL_MB_THRESHOLD instead of bare literals.
- Reword the orchestrator log "Low-memory system detected" ->
  "Low-memory render profile active" (the path also fires for explicit
  PRODUCER_LOW_MEMORY_MODE / --low-memory-mode, not just auto-detect).
- Document the host-RAM vs cgroup/container caveat in isLowMemorySystem's
  docstring and the CLI docs row: auto-detect reads os.totalmem() (host),
  so containerised/serverless callers should set PRODUCER_LOW_MEMORY_MODE
  explicitly. Server/Lambda auto-detect at <= 8 GB is intentional and
  overridable (no vendor env-var sniffing).
- Tests: low-memory pin + explicit-workers-bypass in resolveRenderWorkerCount.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@jrusso1020
Copy link
Copy Markdown
Collaborator Author

Thanks for the thorough pass @james-russo-rames-d-jusso — all addressed in b75b2064.

Concerns

  • Host RAM vs cgroup/container limits — agreed, os.totalmem() reads host RAM, not the cgroup limit. Documented the caveat in both isLowMemorySystem's docstring and the CLI docs row: auto-detect reads host RAM, so containerised/serverless callers (incl. --docker) should set PRODUCER_LOW_MEMORY_MODE explicitly. (Reading the cgroup v2/v1 limit is a reasonable future enhancement but out of scope for this laptop-targeted PR.)
  • Lambda / server contexts — intentional. Any context whose total visible RAM is ≤ 8 GB gets the safe shape, and that's the right default for a constrained single render: a one-shot invocation doesn't justify spinning up a throwaway calibration Chrome, and pinning conservatively avoids OOM in a fixed-RAM sandbox. It's not silent — the [Render] Low-memory render profile active … log line announces it — and it's overridable via env/flag. I deliberately did not gate on AWS_LAMBDA_FUNCTION_NAME/etc.: vendor env-var sniffing by naming convention is exactly the kind of guesswork that fails silently, and ≤ 8 GB → safe is a cleaner, vendor-agnostic rule.

Nits — all fixed

  • Centralised 8192config.ts and browserManager.ts now use the exported LOW_MEMORY_TOTAL_MB_THRESHOLD (left the 4096 tier as a literal, per your "carry forward unchanged" note).
  • Worker logging coherence — moved the single-worker pin into resolveRenderWorkerCount (it already gets cfg + log), so it now logs Low-memory profile — pinning to 1 capture worker on that path. Orchestrator ternary is gone; it's a plain call again. Added two unit tests (pin when no --workers; bypass pin but keep safe mode when --workers N given).
  • Log wordingLow-memory system detectedLow-memory render profile active (covers explicit env/flag too).

Questions

  • The rejects a maliciously crafted key… test — not quarantined; it's green in CI and only red on my local box. The test asserts existsSync(join(workDir, "..", "..", "etc", "passwd")) is false; on my dev host that path resolves somewhere a real file exists, so it trips locally. CI's sandboxed FS doesn't hit it. So there's nothing riding along red in CI — no tracker needed, but I'll flag it to the test's owner as environment-fragile.
  • job.config.workers === undefined as the "no --workers" signal — traced the other entry points (server.ts, studio adapter, benchmark, regression harness): all build the job via createRenderJob, and workers is only ever set when explicitly provided, so undefined is the correct "user didn't pin workers" signal across paths.

Copy link
Copy Markdown

@james-russo-rames-d-jusso james-russo-rames-d-jusso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Round 2 follow-up on b75b2064.

Refactor cleanly addresses every round-1 item — verified end-to-end against the new diff and the response comment:

  • LOW_MEMORY_TOTAL_MB_THRESHOLD is now consumed at all four call sites in config.ts (memoryAdaptiveCacheLimit, memoryAdaptiveCacheBytesMb) and browserManager.ts (getGpuMemBudgetMb, getLowMemoryFlags); the bare 8192 literals are gone. The 4096 tier is intentionally left as-is, per the "carry-forward unchanged" note from round 1.
  • The single-worker pin moved into resolveRenderWorkerCount (captureCost.ts:137–142), so the "why workers=N" log line fires coherently on the safe-mode path now. The orchestrator ternary collapses to a plain call. Two new unit tests in renderOrchestrator.test.ts lock the contract: workers=1 on lowMemoryMode without --workers, workers=4 when --workers 4 is explicit (only the pin is bypassed; safe mode still active). Exactly the shape I asked for.
  • Orchestrator log reworded from "Low-memory system detected" → "Low-memory render profile active" — covers explicit env/flag, not just auto-detect.
  • isLowMemorySystem's docstring now carries the host-RAM-vs-cgroup caveat (nine lines, clear), and the CLI docs row repeats the operator-facing version (incl. an explicit nod to --docker). Together these close the cgroup blind spot at both the developer-reading-the-code and operator-reading-the-docs surfaces.
  • The Lambda push-back is well-reasoned: vendor env-var sniffing IS the heuristic that fails silently, and "≤ 8 GB total RAM → safe" is a cleaner vendor-agnostic rule. The log line + override knob keeps it non-silent. Convinced — withdrawing the gate-on-AWS_LAMBDA_FUNCTION_NAME suggestion.
  • The rejects a maliciously crafted key… test footnote is the right disposition — green in CI, local-FS-shape artefact, flagged to the test owner. No tracker needed.
  • workers === undefined correctness — taken at face value; you traced the entry points (server, studio adapter, benchmark, regression harness) and I trust the trace.

Nothing blocking. Two tiny observations, neither needs action:

  • Both the orchestrator banner ("…screenshot capture, auto-worker calibration skipped, pinned to 1 worker. Override with…") and the new resolver log ("Low-memory profile — pinning to 1 capture worker (auto-worker calibration skipped).") fire on the auto-pin path. They're telling the same story in different framings — one-shot profile summary + per-decision breadcrumb — and the two-stage shape is actually nice for grep/debug. Just noting the surface-area overlap in case the lines drift in future copy edits.
  • The manual ≤ 8 GB host test-plan item is still unchecked. Same disposition as round 1 — unit-covered, safe shape is already a supported render path. Re-flagging for symmetry; not blocking the merge.

Looks good to me.

Review by Rames D Jusso

@jrusso1020 jrusso1020 merged commit bacfb17 into main Jun 5, 2026
47 checks passed
@jrusso1020 jrusso1020 deleted the feat/low-memory-safe-mode branch June 5, 2026 23:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants