Pipeline FFI types: missing Protocols, ABCs, and capsule helpers #1577

@timsaucer

Background

A v54 upstream coverage audit identified six FFI-pipeline gaps where DataFusion's FFI types are already imported on the Rust side but the Python surface is incomplete -- either missing a typed Protocol / ABC for users to implement against, or missing a from_pycapsule helper that the rest of the FFI surface uses. These are not v54-specific; they predate the v54 release. Filing this umbrella issue so the gaps are tracked together rather than disappearing into the audit report.

Items

  1. FFI_TableFunction -- no TableFunctionExportable(Protocol). The duck-typed hasattr check at user_defined.py:1161,1205 should be replaced by a typed protocol mirroring TableProviderExportable.

  2. FFI_TableProvider -- no Python TableProvider(ABC) analogous to CatalogProvider / SchemaProvider. Users implementing a custom table provider in Python have no abstract base class to subclass; the TableProviderExportable protocol is the only entry point.

  3. FFI_ExtensionOptions -- no ExtensionOptionsExportable(Protocol). Currently consumed only at the SessionConfig.with_extension call site without a typed protocol describing the expected capsule export.

  4. FFI_TaskContextProvider -- no Python Protocol and no example. It is currently producer-only, exported at PySessionContext.__datafusion_task_context_provider__; no code path imports one.

  5. FFI_TableProviderFactory -- no from_pycapsule helper on RustWrappedPyTableProviderFactory. The capsule decode is inlined at context.rs:728-749 rather than going through the from_pycapsule! macro the other FFI capsule importers use.

  6. WindowUDF -- no Python ABC equivalent to Accumulator for native Python window UDFs. Users can register a Rust-defined window UDF over FFI but cannot define one in pure Python.

Why deferred

Each item is small in isolation but they form a self-contained cleanup track. The audit grouped them so the work can be picked up together by a contributor focused on FFI ergonomics, rather than scattered across the v54 gap-closure PRs which focus on net-new surface area. Picking up any one item from this list is welcome.
