[Bug]: tuple() does not validate __len__ / __length_hint__ before iterating #756

@jseop-lim

Description

tuple(obj) starts iterating before validating __len__ / __length_hint__, so any exception raised by __getitem__ propagates instead of the TypeError / ValueError that CPython raises when the length hint is invalid. This is a CPython compatibility issue.

Root Cause

ConstructTupleNode.generic() obtains an iterator via PyObjectGetIter and immediately hands it to CreateStorageFromIteratorNode without passing a length hint. __len__ / __length_hint__ are never validated, so the Exception raised by __getitem__ during iteration propagates unchanged.

CPython's PySequence_Tuple instead calls PyObject_LengthHint(v, 10) before it consumes any item from the iterator. That call validates the return value of __len__ / __length_hint__ and raises TypeError / ValueError first, before __getitem__ is ever invoked.

CPython reference: Objects/abstract.c#L91-L146, Objects/abstract.c#L2053

// PyObject_LengthHint validates __len__ / __length_hint__ before any iteration.
// __len__ path — a negative return value raises ValueError("__len__() should return >= 0").
if (_PyObject_HasLen(o)) {
    res = PyObject_Length(o);
    if (res < 0) {
        if (!_PyErr_ExceptionMatches(tstate, PyExc_TypeError)) {
            return -1;   // propagate non-TypeError (e.g. ValueError) as-is
        }
        _PyErr_Clear(tstate);
    } else {
        return res;
    }
}
// ... (lookup and call of __length_hint__ omitted from this excerpt) ...
// __length_hint__ path: a non-int return value raises TypeError.
if (!PyLong_Check(result)) {
    PyErr_Format(PyExc_TypeError, "__length_hint__ must be an integer, not %.100s",
        Py_TYPE(result)->tp_name);
    return -1;
}
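
The validation quoted above can also be observed from Python, because operator.length_hint is the Python-level counterpart of PyObject_LengthHint. The following is a minimal sketch, not part of the original report: the classes NegativeLen and NoneLengthHint and the helper exception_name are hypothetical names introduced only for illustration.

```python
import operator


class NegativeLen:
    """Hypothetical class whose __len__ returns an invalid (negative) value."""

    def __len__(self):
        return -1


class NoneLengthHint:
    """Mirrors the reproduction's __length_hint__, which returns a non-integer."""

    def __length_hint__(self):
        return None


def exception_name(func, *args):
    """Return the name of the exception type raised by func(*args), or None."""
    try:
        func(*args)
    except Exception as exc:
        return type(exc).__name__
    return None


# __len__ path: the ValueError from the negative length is propagated as-is.
print(exception_name(operator.length_hint, NegativeLen(), 10))     # ValueError on CPython
# __length_hint__ path: a non-int result is rejected with TypeError.
print(exception_name(operator.length_hint, NoneLengthHint(), 10))  # TypeError on CPython
# Neither method defined: the default value is returned unchanged.
print(operator.length_hint(object(), 10))                          # 10
```

Neither class defines __getitem__ or __iter__, so both errors come purely from length validation, independent of any iteration.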

Reproduction

class BadLengthHint:
    def __getitem__(self, index):
        raise Exception

    def __length_hint__(self):
        return None

tuple(BadLengthHint())

Output

GraalPy:

Exception

CPython:

TypeError: __length_hint__ must be an integer, not NoneType
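
To make the expected ordering concrete, here is a small pure-Python model of "validate the length hint, then iterate". It is only a sketch under stated assumptions: tuple_with_prevalidation, the getitem_calls counter, and the ThreeItems class are hypothetical and do not correspond to GraalPy or CPython source. The model relies on operator.length_hint for the validation step.

```python
import operator


def tuple_with_prevalidation(obj):
    """Hypothetical model: validate the length hint before consuming any item."""
    it = iter(obj)                 # obtaining the iterator does not call __getitem__
    operator.length_hint(obj, 10)  # raises TypeError / ValueError for an invalid hint
    items = []
    for item in it:                # only now is __getitem__ invoked
        items.append(item)
    return tuple(items)


class BadLengthHint:
    """The reproduction class, plus a counter for __getitem__ calls."""

    def __init__(self):
        self.getitem_calls = 0

    def __getitem__(self, index):
        self.getitem_calls += 1
        raise Exception

    def __length_hint__(self):
        return None


class ThreeItems:
    """A well-behaved sequence-protocol object with no length information."""

    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return index * 10


bad = BadLengthHint()
try:
    tuple_with_prevalidation(bad)
except TypeError as exc:
    print("TypeError:", exc)
print(bad.getitem_calls)                       # 0: iteration never started
print(tuple_with_prevalidation(ThreeItems()))  # (0, 10, 20)
```

In this model the counter stays at zero for the failing object, which is the property the issue asks tuple() to have: the invalid hint is reported before the first __getitem__ call.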

Environment

  • GraalPy 25.0.2 (Python 3.12.8)
  • CPython v3.12.13
  • OS: Debian 12

Additional context

GraalPy already handles this correctly on the list(iterable) path (ListBuiltins.listIterable), which uses IteratorNodes.GetLength to obtain and validate the length before building the storage. Only the tuple() path is missing this pre-validation.

static PNone listIterable(VirtualFrame frame, PList list, Object iterable,
                @Bind Node inliningTarget,
                // exclusive for truffle-interpreted-performance
                @Exclusive @Cached ClearListStorageNode clearStorageNode,
                @Cached IteratorNodes.GetLength lenNode,
                @Cached PyObjectGetIter getIter,
                @Cached CreateStorageFromIteratorNode storageNode) {
    clearStorageNode.execute(inliningTarget, list);
    int len = lenNode.execute(frame, inliningTarget, iterable);
    Object iterObj = getIter.execute(frame, inliningTarget, iterable);
    list.setSequenceStorage(storageNode.execute(frame, iterObj, len));
    return PNone.NONE;
}

Applying the list()-style fix directly did not work because of the @Fallback(excludeForUncached = true) annotation on ConstructTupleNode.generic(), so this is being filed as an issue only.
