
Recover from Jython interpreter init failure during constant folding #439

Open
khatchad wants to merge 8 commits into wala:master from ponder-lab:fix/jython-init-catch-upstream


@khatchad
Collaborator

Summary

Two-file change to keep module load alive when new PythonInterpreter() fails during constant folding. The current Python3Interpreter.getInterp() propagates the failure through every recursive ConstantFoldingRewriter.copyNodes frame and aborts the entire module load, leaving the class hierarchy with no .py entries and makeDefaultEntrypoints returning empty.

Symptom

java.io.FileNotFoundException: src/resources/frozen_importlib/_frozen_importlib.class
java.lang.NullPointerException: Cannot invoke "...PyObject.invoke(...)" because "sys.importlib" is null
    at org.python.core.imp.import_next(imp.java:735)
    at org.python.core.Py.importSiteIfSelected(Py.java:1922)
    at org.python.util.PythonInterpreter.<init>(PythonInterpreter.java:114)
    at com.ibm.wala.cast.python.util.Python3Interpreter.getInterp(Python3Interpreter.java:19)
    at com.ibm.wala.cast.python.loader.Python3Loader$4$1.eval(Python3Loader.java:112)
    at com.ibm.wala.cast.ir.translator.ConstantFoldingRewriter.copyNodes(...)
    [... recursive frames ...]

The downstream symptom is IllegalStateException: Could not create a entrypoint callsites: (with empty Warnings) at PropagationCallGraphBuilder.makeCallGraph:238, because the failed module load leaves the CHA without any .py entries.

Why It Surfaces Now

Both branches in this commit pair silently swallowed the failure until a recent change in the ponder-lab/ML fork removed a try { ... } catch (Throwable e) {} block in getInterp() and uncommented the previously commented-out PySystemState.initialize() call. Removing the silent swallow exposed an environment-dependent Jython bootstrap failure to consumers whose classloader or working-directory setup can't satisfy Jython's _frozen_importlib lookup.

Change

Two coupled edits, ~35 lines total:

  • Python3Interpreter.getInterp() — wrap new PythonInterpreter() in try/catch, log once at WARNING, memoize the failure via volatile boolean initFailed, return null instead of throwing. Memoizing matters because const-fold is invoked recursively per AST node; on a module with N const-eligible expressions, retrying the failing constructor each time would be O(N) Jython init attempts.
  • Python3Loader's ConstantFoldingRewriter.eval callback — treat a null interpreter as a folding miss (return null). Analysis remains correct, just less precise — constant folding is a precision-only feature.
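
The memoization described in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `MemoizedInterpreterHolder`, the `Supplier<Object>` factory (standing in for `new PythonInterpreter()`), and the `attempts` counter are all hypothetical, and thread-safety is not addressed here.

```java
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical stand-in for Python3Interpreter; the factory models `new PythonInterpreter()`. */
final class MemoizedInterpreterHolder {
  private static final Logger LOGGER =
      Logger.getLogger(MemoizedInterpreterHolder.class.getName());

  private final Supplier<Object> factory; // assumption: replaces the Jython constructor
  private Object interp; // created lazily on the first successful call
  private volatile boolean initFailed; // set once; later calls skip the constructor
  int attempts; // demo bookkeeping only, not part of the described change

  MemoizedInterpreterHolder(Supplier<Object> factory) {
    this.factory = factory;
  }

  /** Returns the interpreter, or null if initialization failed now or on an earlier call. */
  Object getInterp() {
    if (initFailed) return null; // memoized failure: no retry, no second log line
    if (interp == null) {
      attempts++;
      try {
        interp = factory.get();
      } catch (Exception e) {
        initFailed = true;
        LOGGER.log(Level.WARNING, "Interpreter init failed; constant folding disabled.", e);
        return null;
      }
    }
    return interp;
  }

  public static void main(String[] args) {
    MemoizedInterpreterHolder holder =
        new MemoizedInterpreterHolder(
            () -> {
              throw new NullPointerException("sys.importlib is null");
            });
    // Simulate one const-fold attempt per AST node.
    for (int i = 0; i < 1000; i++) holder.getInterp();
    System.out.println("constructor attempts: " + holder.attempts);
  }
}
```

With the flag in place, the failing constructor runs once no matter how many const-eligible expressions the module contains.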

Validation

mvn -pl com.ibm.wala.cast.python.ml.test -am test (with this fix on the ponder-lab fork): 610 tests, 0 failures, 0 errors, 3 skipped. No regression on the existing test surface.

Tycho-OSGi reproducer (Hybridize-Functions-Refactoring upgrade-tycho-5-java-25 branch consuming a SNAPSHOT with this fix): the 3 IllegalStateException failures (testAutoEncoder, testTensorFlowEagerExecution, testDatasetIteration4) no longer surface — const-fold is a no-op for those modules under OSGi but module load proceeds and downstream tensor analysis runs.

Branch

This branch is rooted at wala/ML:master directly (not ponder-lab fork master) so the diff is exactly the two-file change here, with no fork-only commits riding along.

`Python3Interpreter.getInterp()` calls `new PythonInterpreter()`, which
walks Jython's bootstrap path to set up `sys.importlib`. In some
environments (e.g., OSGi consumers running under Tycho-surefire) the
bootstrap resources aren't reachable from the current classloader and
the constructor throws `NullPointerException` from
`org.python.core.Py.importSiteIfSelected`. The exception was uncaught —
it propagated through every recursive `ConstantFoldingRewriter.copyNodes`
frame and aborted the entire module load, leaving the class hierarchy
with no `.py` entries.

Fix: catch the failure where it originates, memoize via
`volatile boolean initFailed`, log once at WARNING, and return `null`.
Update `Python3Loader`'s const-fold callback to treat a `null`
interpreter as a folding miss. Memoizing avoids re-running the failing
constructor on every const-fold attempt.

Constant folding is a precision-only feature (it shrinks symbolic
expressions to literals when possible); analysis remains correct
without it, just less precise.
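
A minimal sketch of what "treat a `null` interpreter as a folding miss" means for a caller. The `Interp` interface and the `fold`/`rewrite` helpers are hypothetical names for illustration and do not mirror the real `ConstantFoldingRewriter` API.

```java
/** Illustrative only: models a const-fold callback that tolerates a missing interpreter. */
final class FoldingMissSketch {
  /** Hypothetical minimal interpreter interface standing in for Jython's PythonInterpreter. */
  interface Interp {
    Object eval(String expr);
  }

  /** Returns the folded constant, or null (a folding miss) when no interpreter is available. */
  static Object fold(Interp ip, String expr) {
    if (ip == null) return null; // interpreter unavailable: skip folding, do not throw
    return ip.eval(expr);
  }

  /** On a miss the original expression is kept, so the result is less precise but still valid. */
  static String rewrite(Interp ip, String expr) {
    Object folded = fold(ip, expr);
    return folded == null ? expr : String.valueOf(folded);
  }

  public static void main(String[] args) {
    System.out.println(rewrite(null, "1 + 2")); // expression left symbolic
    System.out.println(rewrite(e -> 3, "1 + 2")); // expression replaced by a literal
  }
}
```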

## Reproducer

The downstream symptom is `IllegalStateException: Could not create a
entrypoint callsites:` (with empty `Warnings`) at
`PropagationCallGraphBuilder.makeCallGraph:238`. It surfaces in
Hybridize-Functions-Refactoring's testAutoEncoder /
testTensorFlowEagerExecution / testDatasetIteration4 against any
Ariadne version after `30c15e58` (which removed the silently-swallowing
`try { ... } catch (Throwable e) {}` block around the same code path).
Copilot AI review requested due to automatic review settings April 29, 2026 19:27
Contributor

Copilot AI left a comment


Pull request overview

This PR makes Jython-based constant folding resilient to Jython interpreter bootstrap failures so Python module loading can proceed (with reduced precision) instead of aborting the entire analysis pipeline.

Changes:

  • Memoize Jython interpreter initialization failure in Python3Interpreter.getInterp() and return null on subsequent calls, logging once.
  • Update Python3Loader’s constant-folding eval callback to treat a null interpreter as a folding miss.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
|---|---|
| jython/com.ibm.wala.cast.python.jython3/source/com/ibm/wala/cast/python/util/Python3Interpreter.java | Adds failure memoization + logging; changes getInterp() to return null on init failure. |
| jython/com.ibm.wala.cast.python.jython3/source/com/ibm/wala/cast/python/loader/Python3Loader.java | Skips constant folding when getInterp() returns null. |

khatchad and others added 4 commits April 29, 2026 15:42
…Integer`

Three concerns raised by the auto-review on the previous commit:

1. **Thread-safety**: `initFailed` and `interp` were checked/assigned
   without synchronization. Make `getInterp()` `synchronized` so the
   check and the constructor run as a unit through the static class
   monitor.

2. **`Throwable` is too broad**: catching `Throwable` swallows
   `Error` types (OOM, StackOverflow, LinkageError) and silently
   continues with `initFailed=true`, hiding genuine VM problems.
   Narrow to `Exception`. The Jython failure we're defending against
   is a `RuntimeException` (NPE from `Py.importSiteIfSelected`), which
   is a subclass of `Exception`, so the defensive intent still holds.

3. **`evalAsInteger` didn't null-check `getInterp()`**: with the new
   contract that `getInterp()` may return null after a memoized init
   failure, `evalAsInteger().eval(...)` would NPE. Add an
   `IllegalStateException` for that case so callers get a clear
   "interpreter unavailable" signal.

No behavior change on the success path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
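
The first two points of this commit (a `synchronized` accessor, and catching `Exception` rather than `Throwable`) can be sketched as below. This is a hedged illustration under stated assumptions: `SynchronizedInit`, its swappable `Supplier<Object>` factory, and the `reset`/`attemptsUnderContention` helpers are hypothetical and exist only to make the behavior observable.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for Python3Interpreter with a swappable constructor. */
final class SynchronizedInit {
  private static Supplier<Object> factory = Object::new; // models `new PythonInterpreter()`
  private static Object interp;
  private static boolean initFailed; // guarded by the class monitor
  private static int attempts; // demo bookkeeping, also guarded by the monitor

  /** The check and the construction run as one unit under the static class monitor. */
  static synchronized Object getInterp() {
    if (initFailed) return null;
    if (interp == null) {
      attempts++;
      try {
        interp = factory.get();
      } catch (Exception e) { // Error subclasses are deliberately not caught
        initFailed = true;
        return null;
      }
    }
    return interp;
  }

  static synchronized boolean initFailed() {
    return initFailed;
  }

  static synchronized int attempts() {
    return attempts;
  }

  /** Demo-only helper: installs a new factory and clears all state. */
  static synchronized void reset(Supplier<Object> newFactory) {
    factory = newFactory;
    interp = null;
    initFailed = false;
    attempts = 0;
  }

  /** Runs several threads against a failing factory and returns the constructor attempt count. */
  static int attemptsUnderContention(int threads) {
    reset(
        () -> {
          throw new IllegalStateException("simulated bootstrap failure");
        });
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(SynchronizedInit::getInterp);
      workers[i].start();
    }
    for (Thread t : workers) {
      try {
        t.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
    return attempts();
  }

  public static void main(String[] args) {
    System.out.println("attempts with 8 threads: " + attemptsUnderContention(8));
  }
}
```

Because an `Error` is not caught, it reaches the caller and leaves the failure flag unset, so a genuine VM problem is not mistaken for a Jython bootstrap failure.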
Mirror of the fix-branch change applied to the fork PR (per review feedback
on #191).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…eger`

Mirror of the fork-side change (per review on #191). The
previous `IllegalStateException` would abort callers that expect the
nullable-`Integer` contract; null lets them degrade gracefully.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirror of the fork-side polish (per #191). Three items:

1. Memoize per-call WARNING in `evalAsInteger` (first call WARNING, subsequent FINE).
2. Broaden init-failure message to mention all interpreter-based evaluation, not just const-fold.
3. Move `logger.info("Evaluating: ...")` after the `getInterp()` null check in `Python3Loader.eval`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@khatchad
Collaborator Author

One note here, @msridhar @juliandolby, the ML code is still not using JEP. So, although @juliandolby transitioned the core modules to JEP, the ML modules still use Jython for the moment.

Contributor

Copilot AI left a comment


Pull request overview

This PR makes constant folding and interpreter-based evaluation resilient to environment-dependent Jython initialization failures, so Python module loading (and downstream call graph construction) can proceed even when new PythonInterpreter() cannot be created (e.g., under Tycho/OSGi).

Changes:

  • Add failure memoization to Python3Interpreter.getInterp() so a Jython init failure is logged once and subsequent calls return null cheaply.
  • Update Python3Loader constant folding evaluation to treat a null interpreter as a folding miss (return null) rather than aborting module load.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
|---|---|
| jython/com.ibm.wala.cast.python.jython3/source/com/ibm/wala/cast/python/util/Python3Interpreter.java | Memoize interpreter init failure, avoid repeated expensive init attempts, and degrade evalAsInteger behavior when interpreter is unavailable. |
| jython/com.ibm.wala.cast.python.jython3/source/com/ibm/wala/cast/python/loader/Python3Loader.java | Skip constant folding evaluation when the interpreter is unavailable, preventing module-load aborts due to init failures. |

Comments suppressed due to low confidence (2)

jython/com.ibm.wala.cast.python.jython3/source/com/ibm/wala/cast/python/util/Python3Interpreter.java:100

  • evalAsInteger() can still throw IllegalArgumentException when the expression isn’t an integer or when Jython evaluation raises PyException. Call sites like PythonInterpreter.interpretAsInt(expr) (e.g., in TensorType.shapeArg and PythonTurtleAnalysisEngine) don’t catch and instead rely on a null return to fall back gracefully, so these exceptions can still abort analysis on non-constant/invalid expressions. Consider returning null for “cannot evaluate” cases (and logging at FINE/WARNING as desired), or updating the call sites to handle the exception explicitly.
    try {
      PyObject val = ip.eval(expr);
      if (val.isInteger()) {
        return val.asInt();
      } else
        throw new IllegalArgumentException(
            "Python expression: " + expr + " cannot be evaluated to an integer.");
    } catch (PyException e) {
      LOGGER.log(Level.SEVERE, "Unable to interpret Python expression: " + expr, e);
      throw new IllegalArgumentException("Can't interpret Python expression: " + expr + ".", e);
    }

jython/com.ibm.wala.cast.python.jython3/source/com/ibm/wala/cast/python/loader/Python3Loader.java:127

  • eval(...) only catches PySyntaxError. Other runtime evaluation failures from Jython (e.g., PyException like ZeroDivisionError, TypeError, etc.) will propagate out of constant folding and can still abort module loading/call graph construction, even though constant folding is intended to be best-effort. Consider catching broader evaluation exceptions here (at least org.python.core.PyException, or Exception like the Python2Loader implementation) and returning null to treat it as a folding miss (optionally log at FINE).
                try {
                  x = ip.eval(unicode);
                } catch (PySyntaxError e) {
                  // Handle syntax errors gracefully.
                  logger.log(WARNING, e, () -> "Syntax error in expression: " + unicode);
                  return null;
                }
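
For illustration only, the broader catch suggested in this suppressed comment might look something like the sketch below. It is not part of the PR as described: `BestEffortEval`, `Interp`, and `tryFold` are hypothetical names, and `RuntimeException` is used on the assumption that Jython's `PyException` is unchecked.

```java
/** Illustrative only: best-effort folding that treats any evaluation failure as a miss. */
final class BestEffortEval {
  /** Hypothetical minimal interpreter interface standing in for Jython's PythonInterpreter. */
  interface Interp {
    Object eval(String expr);
  }

  /** Returns the evaluated constant, or null when the interpreter is missing or evaluation fails. */
  static Object tryFold(Interp ip, String expr) {
    if (ip == null) return null; // interpreter unavailable
    try {
      return ip.eval(expr);
    } catch (RuntimeException e) { // assumption: covers PyException and its subclasses
      return null; // folding miss; the caller keeps the original expression
    }
  }

  public static void main(String[] args) {
    Interp failing =
        e -> {
          throw new ArithmeticException("division by zero");
        };
    System.out.println(tryFold(failing, "1 / 0"));
    System.out.println(tryFold(e -> 42, "42"));
  }
}
```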

khatchad and others added 2 commits April 29, 2026 21:51
…lableWarned`

Mirror of the fork-side fix (per #191). The previous
`if (!unavailableWarned) { unavailableWarned = true; ... }` sequence
isn't atomic; switch to `AtomicBoolean.compareAndSet(false, true)` so
only the thread that flips the flag enters the WARNING branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
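
A minimal sketch of the warn-once pattern this commit describes, combined with the nullable-`Integer` fallback from the earlier commits. `WarnOnce`, `levelForUnavailable`, and the `Function<String, Integer>` evaluator are hypothetical stand-ins, not the actual `Python3Interpreter` members.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Function;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Illustrative only: log at WARNING exactly once, then at FINE. */
final class WarnOnce {
  private static final Logger LOGGER = Logger.getLogger(WarnOnce.class.getName());
  private static final AtomicBoolean unavailableWarned = new AtomicBoolean(false);

  /** Only the thread that flips the flag from false to true gets WARNING; all others get FINE. */
  static Level levelForUnavailable() {
    return unavailableWarned.compareAndSet(false, true) ? Level.WARNING : Level.FINE;
  }

  /** Returns null when the evaluator (a stand-in for the interpreter) is unavailable. */
  static Integer evalAsInteger(Function<String, Integer> interp, String expr) {
    if (interp == null) {
      LOGGER.log(levelForUnavailable(), "Interpreter unavailable; cannot evaluate: {0}", expr);
      return null; // callers relying on the nullable-Integer contract can fall back
    }
    return interp.apply(expr);
  }

  public static void main(String[] args) {
    System.out.println(evalAsInteger(null, "1 + 2")); // first miss logs at WARNING
    System.out.println(evalAsInteger(null, "3 + 4")); // later misses log at FINE
  }
}
```

A plain `if (!flag) { flag = true; ... }` sequence lets two threads both observe `false` before either writes, whereas `compareAndSet` makes the test and the update a single atomic step.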
Mirror of the fork-side change (per #191 review feedback).
The "uses AtomicBoolean.compareAndSet for atomicity" prose is an
implementation detail, not contract; move it under `@implNote`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@khatchad
Collaborator Author

The failing Build with Maven step is upstream of this PR's content — it's the --remote flag in CI's submodule-update step pulling WALA/ past 1.6.13-SNAPSHOT (now 1.7.2-SNAPSHOT on wala/WALA master), so mvn verify can't resolve the 1.6.13-SNAPSHOT artifacts that wala/ML's pom.xml requests:

[ERROR] Could not find artifact com.ibm.wala:com.ibm.wala.util:jar:1.6.13-SNAPSHOT

#443 removes that workflow step (root cause documented in #433). Once #443 lands and this branch is rebased on master, CI here should go green.

@codecov

codecov Bot commented Apr 30, 2026

Codecov Report

❌ Patch coverage is 30.43478% with 16 lines in your changes missing coverage. Please review.
✅ Project coverage is 57.82%. Comparing base (e9f72e6) to head (5957155).
⚠️ Report is 1 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| .../ibm/wala/cast/python/util/Python3Interpreter.java | 26.31% | 13 Missing and 1 partial ⚠️ |
| ...com/ibm/wala/cast/python/loader/Python3Loader.java | 50.00% | 1 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##             master     #439      +/-   ##
============================================
- Coverage     57.91%   57.82%   -0.10%     
  Complexity      624      624              
============================================
  Files           111      111              
  Lines          7671     7690      +19     
  Branches        856      860       +4     
============================================
+ Hits           4443     4447       +4     
- Misses         3050     3063      +13     
- Partials        178      180       +2     
