Summary
While investigating #437, three design issues in TensorGeneratorFactory.getFunction() and the surrounding dispatch chain surfaced. They're related: together they make resolution failures hard to recognize and hard to fix in a targeted way, and they were the proximate cause of #437 slipping through Ariadne's CI.
(1) getFunction() Silently Returns The Declared Class On Resolution Failure
getFunction() handles the LCodeBody/LRoot generic-dispatch case by inspecting the points-to set on the call's use(0):
    if (declaredClass.getName().toString().equals("LCodeBody")
        || declaredClass.getName().toString().equals("LRoot")) {
      int funcVn = call.getUse(0);
      PointerKey funcKey = ...
      for (InstanceKey ik : pointerAnalysis.getPointsToSet(funcKey)) {
        if (ik instanceof ConcreteTypeKey) return ((ConcreteTypeKey) ik).getType().getReference();
        if (ik instanceof AllocationSiteInNode) return ((AllocationSiteInNode) ik).getConcreteType().getReference();
      }
    }
    return declaredClass; // ← silent fallthrough
When the loop yields zero matching keys (which happens when the function object is a PythonPropertyRead whose pointer-analysis (PA) points-to set is empty), the method silently returns the declared LCodeBody/LRoot class. The dispatch table at line 888+ then sees this generic class and throws IllegalArgumentException("Unknown call: LCodeBody"), which is caught silently in processInstruction with no diagnostic.
Net effect: resolution failures look identical to "genuinely a generic call" to the caller, and the source is silently dropped from the analysis.
Suggested fix: change the return type to Optional<TypeReference> (or throw a typed UnresolvedCallTargetException), and force the caller to handle the unresolved case explicitly. That gives us a place to plug in fallbacks (per #437 / per (2) below).
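A minimal sketch of the Optional-returning shape, assuming simplified stand-in types: ResolutionSketch, Key, and resolveTarget are hypothetical names, and plain strings stand in for TypeReference. It is not the real getFunction() signature, only an illustration of how the empty case becomes visible to the caller.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical stand-ins for illustration; these are not the real WALA/Ariadne types.
final class ResolutionSketch {
    /** Stand-in for an InstanceKey; a null concreteType models a key kind the resolver does not handle. */
    static final class Key {
        final String concreteType;
        Key(String concreteType) { this.concreteType = concreteType; }
    }

    static boolean isGeneric(String declaredClass) {
        return declaredClass.equals("LCodeBody") || declaredClass.equals("LRoot");
    }

    /**
     * Returns the call target, or Optional.empty() when the declared class is generic
     * and no key in the points-to set yields a concrete type.
     */
    static Optional<String> resolveTarget(String declaredClass, List<Key> pointsTo) {
        if (!isGeneric(declaredClass)) {
            return Optional.of(declaredClass);
        }
        for (Key key : pointsTo) {
            if (key.concreteType != null) {
                return Optional.of(key.concreteType);
            }
        }
        // Explicit failure signal instead of silently returning declaredClass.
        return Optional.empty();
    }
}
```

With this shape the caller has to decide what an empty result means (try a fallback, log, or skip the call) rather than discovering it later through an exception.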
(2) Two Parallel Dispatch Mechanisms That Aren't Bridged
dispatchByPropertyName() implements property-name dispatch via the PROPERTY_NAME_GENERATORS registry, which currently has one entry: astype.
Meanwhile, the class-hierarchy dispatch (line 888+) has 50+ entries keyed off MethodReference.getDeclaringClass().
For property-read invocations like tf.rank(...) — which are exactly the case dispatchByPropertyName is designed for — the property name is correctly resolved to a ConstantKey:rank but PROPERTY_NAME_GENERATORS has no entry, so dispatch falls through. There's no bridge between the two registries: an XML class registered for the class-hierarchy dispatch is not automatically reachable via property-name dispatch.
Suggested fix: unify the two dispatch paths. Either (a) the property-name registry is auto-populated from the XML class registry (so any class in the dispatch table is reachable both ways), or (b) dispatchByPropertyName falls back to the class-hierarchy dispatch when the property name corresponds to a registered class.
Note that XML-registered helper methods (like read_data) appear to live in a separate registry from getDeclaredMethods() — so a naive class-hierarchy scan for read_data doesn't work. They become trampoline classes nested under the parent (per the /class/-namespace pattern). Whatever the bridge looks like, it has to query the right registry.
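Option (b) could look roughly like the sketch below. Everything in it is an assumption made for illustration: DispatchBridgeSketch, the two maps, and in particular classNameFor(), which guesses at how a property name might map to a registered class name. Generators are represented by plain strings.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of option (b); none of these names are Ariadne's real API.
final class DispatchBridgeSketch {
    // Stand-in for PROPERTY_NAME_GENERATORS: property name -> generator name.
    private final Map<String, String> byPropertyName = new HashMap<>();
    // Stand-in for the class-hierarchy dispatch table: class name -> generator name.
    private final Map<String, String> byClassName = new HashMap<>();

    void registerProperty(String propertyName, String generator) {
        byPropertyName.put(propertyName, generator);
    }

    void registerClass(String className, String generator) {
        byClassName.put(className, generator);
    }

    /** Assumed naming scheme that maps a property read on a module to a registered class. */
    static String classNameFor(String module, String propertyName) {
        return "L" + module + "/" + propertyName;
    }

    /** Tries the property-name registry first, then falls back to the class-keyed table. */
    Optional<String> dispatchByPropertyName(String module, String propertyName) {
        String direct = byPropertyName.get(propertyName);
        if (direct != null) {
            return Optional.of(direct);
        }
        return Optional.ofNullable(byClassName.get(classNameFor(module, propertyName)));
    }
}
```

The real lookup would need to consult whichever registry actually holds the XML-registered trampoline classes; the single map here only marks where that query would go.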
(3) No Diagnostic Signal On Resolution Failure
Related to (1): when the PA loop in getFunction's LCodeBody/LRoot branch yields zero matching InstanceKey types, the loop just exits. No log, no diagnostic, no signal that resolution failed. The caller sees IllegalArgumentException("Unknown call: LCodeBody") thrown later (sometimes much later, through several recursive walk paths) and has to reverse-engineer what went wrong.
Suggested fix: log at INFO (or at least FINE) when:
- the LCodeBody/LRoot resolver loop exits empty
- the points-to set on the function-object reference is empty (independent of loop result)
- the InstanceKey types in the points-to set are all unknown to the existing if-cases
The information is cheap to log and would have shaved hours off the #437 investigation.
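As a rough illustration of the logging suggested above, the sketch below uses java.util.logging and distinguishes an empty points-to set from a set that contains only unhandled key kinds. ResolutionDiagnosticsSketch, Outcome, and classify are hypothetical names, and key kinds are modelled as strings rather than real InstanceKey classes.

```java
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch; the real resolver works on InstanceKey objects, not strings.
final class ResolutionDiagnosticsSketch {
    private static final Logger LOGGER =
        Logger.getLogger(ResolutionDiagnosticsSketch.class.getName());

    enum Outcome { RESOLVED, EMPTY_POINTS_TO_SET, ONLY_UNKNOWN_KEY_KINDS }

    /** Classifies the resolver loop result and logs the two failure cases at INFO. */
    static Outcome classify(String callSite, List<String> keyKinds, List<String> handledKinds) {
        if (keyKinds.isEmpty()) {
            LOGGER.log(Level.INFO,
                "Unresolved call target at {0}: points-to set is empty", callSite);
            return Outcome.EMPTY_POINTS_TO_SET;
        }
        for (String kind : keyKinds) {
            if (handledKinds.contains(kind)) {
                return Outcome.RESOLVED;
            }
        }
        LOGGER.log(Level.INFO,
            "Unresolved call target at {0}: no handled InstanceKey kind among {1}",
            new Object[] {callSite, keyKinds});
        return Outcome.ONLY_UNKNOWN_KEY_KINDS;
    }
}
```

Logging the call site together with the key kinds actually seen would let a reader tell the two failure modes apart without stepping through the recursive walk.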
Why This Matters Now
#437's root cause is exactly the (1)+(2) combination: tf.rank(...) is a property read whose function object has an empty PA points-to set, so getFunction silently returns LCodeBody, dispatch fails, and the source is dropped. Fixing those design issues unblocks #437 (and similar future cases) at the right layer instead of papering over them per-op.
Investigation logs and a reproducer test are on branch investigate/437-rank-classification (commit 99161500).
Workaround Attempts For #437
A dispatchByPropertyName fallback was attempted (the design (2) angle): scan the class hierarchy for trampoline classes ending in /<propertyName>/read_data and dispatch them as Constant. This recovers 10 of #437's 33 tests when applied to Hybridize, but it is too aggressive: it shadows specific generators (Eye, Stack, etc.) that already have dispatch-table entries, causing 100+ regressions on Ariadne's own test suite.
The right fix needs (1)'s explicit-resolution-failure signaling so the fallback only fires when the dispatch table genuinely has nothing — i.e., (1) is a prerequisite for (2)'s fix. That's why this issue exists.
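The ordering constraint described here can be sketched as follows, under the assumption that resolution reports failure as an empty Optional and that the trampoline scan is reduced to a boolean. FallbackOrderSketch and chooseGenerator are hypothetical names, and generators are plain strings.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of the precedence rule; not the actual dispatch code.
final class FallbackOrderSketch {
    /**
     * resolvedClass is empty when call-target resolution failed.
     * dispatchTable maps a class name to its specific generator.
     * hasReadDataTrampoline models whether the trampoline scan found a match.
     */
    static Optional<String> chooseGenerator(
            Optional<String> resolvedClass,
            Map<String, String> dispatchTable,
            boolean hasReadDataTrampoline) {
        if (resolvedClass.isPresent()) {
            String specific = dispatchTable.get(resolvedClass.get());
            if (specific != null) {
                // A specific generator (Eye, Stack, ...) always wins over the fallback.
                return Optional.of(specific);
            }
        }
        // Reached only when the dispatch table has nothing for this call.
        return hasReadDataTrampoline ? Optional.of("Constant") : Optional.empty();
    }
}
```

Because the table is consulted first, a call that already has a specific generator never reaches the Constant fallback, which is the shadowing that caused the regressions.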
Cross-Refs
- #437 (tf.rank/tf.stack/tf.linspace/tf.range/etc.: TensorTypeAnalysis lacks entries despite unchanged XML): the immediate regression these design gaps caused.
- #380 (read_data/read_dataset marker allocations in tensorflow.xml): the in-progress read_data-pattern migration that creates the population of XML classes that need property-name-dispatch reachability.
- #422 (tensorflow.xml extensions with the branch-267 rewrite): the bigger tensorflow.xml reconciliation workstream.