Extend JEP migration to the ml/ packages #441

@khatchad

Summary

PR #300 ("Jumpstart jep merge") introduced JEP-based Python parsing for the plain Python analysis path (core/com.ibm.wala.cast.python consumed via jep/com.ibm.wala.cast.python.cpython). The ML packages (ml/com.ibm.wala.cast.python.ml/, ml/com.ibm.wala.cast.python.ml.test/) still depend on jython3. This issue tracks extending the JEP migration to the ML side.

Current State

ml/com.ibm.wala.cast.python.ml/pom.xml (head at 810fe5d5):

<dependencies>
  <dependency>
    <groupId>com.ibm.wala</groupId>
    <artifactId>jython3</artifactId>            <!-- Jython parser path -->
  </dependency>
  …
</dependencies>

jep/com.ibm.wala.cast.python.cpython/pom.xml:

<dependencies>
  <dependency>
    <artifactId>com.ibm.wala.cast.python</artifactId>   <!-- core, no Jython -->
  </dependency>
  <dependency>
    <groupId>black.ninia</groupId>
    <artifactId>jep</artifactId>
    <version>4.2.2</version>
  </dependency>
  …
</dependencies>

Files such as JepPythonLoader.java, JepPythonLoaderFactory.java, CPythonAstToCAstTranslator.java, JepAstVisitor.java, and CPythonInterpreter.java all live in the cpython module under jep/ (jep/com.ibm.wala.cast.python.cpython) and provide an alternative loader path. They are not currently reachable from ml/ because ml/ declares no dependency on that module.
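To make the "alternative loader path" idea concrete, here is a minimal sketch of how a caller could pick between the two parser paths by name instead of hard-wiring one. Everything in it is hypothetical: the class ParserBackendSelector, the Backend enum, and the python.parser.backend property are illustration-only names and do not exist in the repository. The enum constants only carry descriptive labels; they do not reference the real factory classes.

```java
import java.util.Locale;

/** Hypothetical sketch: none of these names exist in the repository. */
final class ParserBackendSelector {

  /** Stand-ins for the two parser paths discussed in this issue. */
  enum Backend {
    JYTHON("jython3 parser path"),
    JEP("JEP parser path (e.g. via JepPythonLoaderFactory)");

    final String description;

    Backend(String description) {
      this.description = description;
    }
  }

  /** Hypothetical system property name, not an existing setting. */
  static final String PROPERTY = "python.parser.backend";

  /** Maps a user-supplied name to a backend; a missing value keeps the current Jython default. */
  static Backend parse(String value) {
    if (value == null || value.trim().isEmpty()) {
      return Backend.JYTHON;
    }
    try {
      return Backend.valueOf(value.trim().toUpperCase(Locale.ROOT));
    } catch (IllegalArgumentException e) {
      throw new IllegalArgumentException(
          "Unknown parser backend '" + value + "' (expected 'jython' or 'jep')", e);
    }
  }

  static Backend fromSystemProperty() {
    return parse(System.getProperty(PROPERTY));
  }

  public static void main(String[] args) {
    Backend backend = fromSystemProperty();
    System.out.println("Selected " + backend + ": " + backend.description);
  }
}
```

A switch like this would let the ML tests run against both parsers during the transition, for example with -Dpython.parser.backend=jep, before the Jython path is removed. Whether such a toggle is worth keeping is a design decision this issue leaves open.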

What This Issue Tracks

The work to:

  1. Repoint ml/com.ibm.wala.cast.python.ml/pom.xml (and ml/com.ibm.wala.cast.python.ml.test/pom.xml) from the jython3 dependency to whatever exposes the JEP-based parser (cpython directly, or a new shared module if a refactor is preferred).
  2. Update PythonTensorAnalysisEngine and any Engine factory wiring on the ML side to use JepPythonLoaderFactory instead of Python3LoaderFactory (or whatever the equivalent indirection is post-migration).
  3. Update the ML test fixtures so the test runner can drive the JEP-based parser. This may include CI changes: JEP needs the CPython native library and the jep Python package on the test runner (see the CI workflow changes in #300, "Jumpstart jep merge", for the precedent).
  4. Run the ml/com.ibm.wala.cast.python.ml.test suite with the JEP parser. Expect divergences from the Jython-parser behavior (the parsers produce slightly different CAst trees in edge cases — string handling, slice semantics, etc.). Triage and fix.
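For step 3, a small preflight check can make a missing native library show up as a clear message rather than an obscure load failure in the middle of the suite. The sketch below is an assumption-laden illustration, not part of the codebase: the class JepNativeLibraryCheck is hypothetical, it only searches java.library.path, and it relies on System.mapLibraryName("jep") to guess the file name, which may not match how jep is packaged on every platform. JEP can also be pointed at its library by other means, which this check does not cover.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

/** Hypothetical preflight helper; not part of the repository or of jep itself. */
final class JepNativeLibraryCheck {

  /** Platform-specific file name for a native library called "jep". */
  static String libraryFileName() {
    return System.mapLibraryName("jep");
  }

  /** Returns the first matching library file found in the directories of the given search path. */
  static Optional<Path> find(String searchPath) {
    if (searchPath == null || searchPath.isEmpty()) {
      return Optional.empty();
    }
    String fileName = libraryFileName();
    for (String dir : searchPath.split(File.pathSeparator)) {
      if (dir.isEmpty()) {
        continue;
      }
      try {
        Path candidate = Paths.get(dir, fileName);
        if (Files.isRegularFile(candidate)) {
          return Optional.of(candidate);
        }
      } catch (InvalidPathException e) {
        // Skip entries that are not valid paths on this platform.
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    Optional<Path> hit = find(System.getProperty("java.library.path"));
    System.out.println(
        hit.map(p -> "Found " + p)
            .orElse("No " + libraryFileName() + " on java.library.path"));
  }
}
```

A test fixture could call find(...) once during setup and skip or fail fast with an explanatory message when the result is empty, which keeps genuine parser divergences in step 4 separate from environment problems.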

Why This Matters

This is the durable resolution for two open issues:

  • #433 (1.6.13-SNAPSHOT resolution race) — once ml/ no longer depends on jython3, it's no longer locked to a Jython-supporting WALA tip. Released WALA artifacts become viable again.
  • #436 (Jython interpreter init NPE) — JEP doesn't have the _frozen_importlib bootstrap problem that Jython's PythonInterpreter is fragile around in modern JDK / OSGi contexts.

It's also tracked as the long-term path in #440.

Effort Estimate

Not yet scoped concretely. As a rough lower bound: the pom-dependency change is one line, but the Engine/factory wiring changes are non-trivial and the test-divergence triage is unknown until run. Probably days of focused work plus iteration on test failures, similar in scope to #300 but for the more behavior-heavy ML side.

Cross-Refs

  • #300 — "Jumpstart jep merge" (the precedent for plain-Python).
  • #433 — SNAPSHOT race (resolved as a side-effect of completing this).
  • #436 — Jython interp NPE (resolved as a side-effect of completing this).
  • #440 — JEP migration tracker (this issue is part of the broader work).
