
Add Workflow Streams library #1423

Open

jssmith wants to merge 109 commits into main from contrib/pubsub

Conversation

Contributor

@jssmith commented Apr 7, 2026

What was changed

Adds temporalio.contrib.workflow_stream, a reusable primitive for streaming data through Temporal workflows. The module and its integrations with the OpenAI Agents and Google ADK plugins are marked experimental. Plugin streaming is opt-in: callers must set streaming_event_topic to enable publishing.

Why?

Streaming incremental results from long-running workflows (e.g., AI agent token streams, progress updates) is a common need with no built-in solution. This module provides a correct, reusable implementation so users don't have to roll their own poll/signal/dedup logic.

Checklist

  1. Closes — N/A (new contrib module, no existing issue)

  2. How was this tested:

    • 29 pytest tests in tests/contrib/workflow_stream/test_workflow_stream.py covering batching, flush safety, CAN serialization, replay guards, dedup (TTL pruning, truncation), offset-based resumption, max_batch_size, drain, and error handling, plus a payload round-trip prototype test
    • Demo application
    • Shared with prospective users
    • 8-hour load test
  3. Any docs updates needed?

    • Module includes README.md with usage examples and API reference
    • Design doc: DESIGN.md (covers CAN, dedup, and topic semantics)
    • docs.temporal.io updates are prepared on a separate branch and will land soon

jssmith and others added 15 commits April 5, 2026 21:33
A workflow mixin (PubSubMixin) that turns any workflow into a pub/sub
broker. Activities and starters publish via batched signals; external
clients subscribe via long-poll updates exposed as an async iterator.

Key design decisions:
- Payloads are opaque bytes for cross-language compatibility
- Topics are plain strings, no hierarchy or prefix matching
- Global monotonic offsets (not per-topic) for simple continuation
- Batching built into PubSubClient with Nagle-like timer + priority flush
- Structured concurrency: no fire-and-forget tasks, trio-compatible
- Continue-as-new support: drain_pubsub() + get_pubsub_state() + validator
  to cleanly drain polls, plus follow_continues on the subscriber side

Module layout:
  _types.py  — PubSubItem, PublishInput, PollInput, PollResult, PubSubState
  _mixin.py  — PubSubMixin (signal, update, query handlers)
  _client.py — PubSubClient (batcher, async iterator, CAN resilience)

9 E2E integration tests covering: activity publish + subscribe, topic
filtering, offset-based replay, interleaved workflow/activity publish,
priority flush, iterator cancellation, context manager flush, concurrent
subscribers, and mixin coexistence with application signals/queries.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
PubSubState is now a Pydantic model so it survives serialization through
Pydantic-based data converters when embedded in Any-typed fields. Without
this, continue-as-new would fail with "'dict' object has no attribute 'log'"
because Pydantic deserializes Any fields as plain dicts.

Added two CAN tests:
- test_continue_as_new_any_typed_fails: documents that Any-typed fields
  lose PubSubState type information (negative test)
- test_continue_as_new_properly_typed: verifies CAN works with properly
  typed PubSubState | None fields

Simplified subscribe() exception handling: removed the broad except
Exception clause that tried _follow_continue_as_new() on every error.
Now only catches WorkflowUpdateRPCTimeoutOrCancelledError for CAN follow.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
README.md: usage-oriented documentation covering workflow mixin, activity
publishing, subscribing, continue-as-new, and cross-language protocol.

flush() safety: items are now removed from the buffer only after the
signal succeeds. Previously, buffer.clear() ran before the signal,
losing items on failure. Added test_flush_retains_items_on_signal_failure.
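The flush-safety fix described above can be sketched independently of the actual module (`Batcher` and `send` are hypothetical stand-ins; the real client wraps a Temporal signal call):

```python
import asyncio

class Batcher:
    """Failure-safe flush sketch: the buffer is cleared only after the
    send succeeds, so a failed send loses nothing."""

    def __init__(self, send):
        self._send = send                 # async callable; may raise
        self._buffer: list[bytes] = []

    def publish(self, item: bytes) -> None:
        self._buffer.append(item)

    async def flush(self) -> None:
        if not self._buffer:
            return
        batch = list(self._buffer)        # snapshot; do NOT clear yet
        await self._send(batch)           # on failure, buffer is intact
        del self._buffer[: len(batch)]    # drop only what was sent

async def demo() -> None:
    sent: list[bytes] = []
    calls = {"n": 0}

    async def send(batch: list[bytes]) -> None:
        calls["n"] += 1
        if calls["n"] == 1:
            raise RuntimeError("signal failed")
        sent.extend(batch)

    b = Batcher(send)
    b.publish(b"a")
    try:
        await b.flush()
    except RuntimeError:
        pass
    assert b._buffer == [b"a"]            # retained after failure
    await b.flush()                       # retry succeeds
    assert sent == [b"a"] and b._buffer == []

asyncio.run(demo())
```

The old ordering (`buffer.clear()` before the signal) drops `b"a"` permanently on the first failure; snapshotting first makes the retry path trivial.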

init_pubsub() guard: publish() and _pubsub_publish signal handler now
check for initialization and raise a clear RuntimeError instead of a
cryptic AttributeError.

PubSubClient.for_workflow() factory: preferred constructor that takes a
Client + workflow_id. Enables follow_continues in subscribe() without
accessing private WorkflowHandle._client. The handle-based constructor
remains for simple cases that don't need CAN following.

activity_pubsub_client() now uses for_workflow() internally with proper
keyword-only typed arguments instead of **kwargs: object.

CAN test timing: replaced asyncio.sleep(2) with assert_eq_eventually
polling for a different run_id, matching sdk-python test patterns.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
_pubsub_poll and _pubsub_offset now call _check_initialized() for a
clear RuntimeError instead of cryptic AttributeError when init_pubsub()
is forgotten.

README CAN example now includes the required imports (@dataclass,
workflow) and @workflow.init decorator.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The poll validator accesses _pubsub_draining, which would AttributeError
if init_pubsub() was never called. Added _check_initialized() guard.

Fixed PubSubState docstring: the field must be typed as PubSubState | None,
not Any. The old docstring incorrectly implied Any-typed fields would work.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
get_pubsub_state() and drain_pubsub() now call _check_initialized().
Previously drain_pubsub() could silently set _pubsub_draining on an
uninitialized instance, which init_pubsub() would then reset to False.

New tests:
- test_max_batch_size: verifies auto-flush when buffer reaches limit,
  using max_cached_workflows=0 to also test replay safety
- test_replay_safety: interleaved workflow/activity publish with
  max_cached_workflows=0, proving the mixin is determinism-safe

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Review comments (#@agent: annotations) capture design questions on:
- Topic offset model and information leakage (resolved: global offsets
  with BFF-layer containment, per NATS JetStream model)
- Exactly-once publish delivery (resolved: publisher ID + sequence number
  dedup, per Kafka producer model)
- Flush concurrency (resolved: asyncio.Lock with buffer swap)
- CAN follow behavior, poll rate limiting, activity context detection,
  validator purpose, pyright errors, API ergonomics

DESIGN-ADDENDUM-TOPICS.md: full exploration of per-topic vs global offsets
with industry survey (Kafka, Redis, NATS, PubNub, Google Pub/Sub,
RabbitMQ). Concludes global offsets are correct for workflow-scoped
pub/sub; leakage contained at BFF trust boundary.

DESIGN-ADDENDUM-DEDUP.md: exactly-once delivery via publisher ID +
monotonic sequence number. Workflow dedup state is dict[str, int],
bounded by publisher count. Buffer swap pattern with sequence reuse
on failure. PubSubState carries publisher_sequences through CAN.
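The workflow-side half of the dedup protocol described in the addendum reduces to a small rule; this sketch (hypothetical names) shows the `dict[str, int]` state and the reject-if-not-newer check:

```python
class DedupLog:
    """Sketch of workflow-side exactly-once ingest: one monotonic
    sequence per publisher; redelivered signal batches are dropped."""

    def __init__(self) -> None:
        self.log: list[bytes] = []
        self.last_seq: dict[str, int] = {}   # publisher_id -> last seen seq

    def ingest(self, publisher_id: str, sequence: int,
               items: list[bytes]) -> bool:
        if sequence <= self.last_seq.get(publisher_id, 0):
            return False                     # duplicate batch: drop
        self.last_seq[publisher_id] = sequence
        self.log.extend(items)
        return True

log = DedupLog()
assert log.ingest("p1", 1, [b"a"])
assert not log.ingest("p1", 1, [b"a"])       # redelivery rejected
assert log.ingest("p1", 2, [b"b"])
assert log.ingest("p2", 1, [b"c"])           # independent per publisher
assert log.log == [b"a", b"b", b"c"]
```

State is bounded by publisher count, as the addendum notes, because only the latest sequence per publisher is retained.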

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Types:
- Remove offset from PubSubItem (global offset is now derived)
- Add publisher_id + sequence to PublishInput for exactly-once dedup
- Add base_offset + publisher_sequences to PubSubState for CAN
- Use Field(default_factory=...) for Pydantic mutable defaults

Mixin:
- Add _pubsub_base_offset for future log truncation support
- Add _pubsub_publisher_sequences for signal deduplication
- Dedup in signal handler: reject if sequence <= last seen
- Poll uses base_offset arithmetic for offset translation
- Class-body type declarations for basedpyright compatibility
- Validator docstring explaining drain/CAN interaction
- Module docstring gives specific init_pubsub() guidance

Client:
- asyncio.Lock + buffer swap for flush concurrency safety
- Publisher ID (uuid) + monotonic sequence for exactly-once delivery
- Sequence advances on failure to prevent data loss when new items
  merge with retry batch (found via Codex review)
- Remove follow_continues param — always follow CAN via describe()
- Configurable poll_interval (default 0.1s) for rate limiting
- Merge activity_pubsub_client() into for_workflow() with auto-detect
- _follow_continue_as_new is async with describe() check

Tests:
- New test_dedup_rejects_duplicate_signal
- Updated flush failure test for new sequence semantics
- All activities use PubSubClient.for_workflow()
- Remove PubSubItem.offset assertions
- poll_interval=0 in test helper for speed

Docs:
- DESIGN-v2.md: consolidated design doc superseding original + addenda
- README.md: updated API reference
- DESIGN-ADDENDUM-DEDUP.md: corrected flush failure semantics

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rewrite the client-side dedup algorithm to match the formally verified
TLA+ protocol: failed flushes keep a separate _pending batch and retry
with the same sequence number. Only advance the confirmed sequence on
success. TLC proves NoDuplicates and OrderPreserved for the correct
algorithm, and finds duplicates in the old algorithm.
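The verified client-side rule can be sketched as a tiny state machine (names approximate the commit's `_pending`/`_pending_seq` description, not the real API): a failed batch is parked and retried under the *same* sequence, and the confirmed sequence advances only on success.

```python
class Publisher:
    """Sketch of the TLA+-verified flush algorithm: retrying with the
    same sequence lets the workflow dedup a delivered-but-timed-out
    signal instead of duplicating it."""

    def __init__(self) -> None:
        self._seq = 0                         # last confirmed sequence
        self._buffer: list[bytes] = []
        self._pending: list[bytes] | None = None
        self._pending_seq = 0

    def publish(self, item: bytes) -> None:
        self._buffer.append(item)

    def next_batch(self) -> tuple[int, list[bytes]]:
        if self._pending is None:             # start a fresh batch
            self._pending, self._buffer = self._buffer, []
            self._pending_seq = self._seq + 1
        return self._pending_seq, list(self._pending)  # retry reuses seq

    def on_success(self) -> None:
        self._seq = self._pending_seq         # advance only on success
        self._pending = None

p = Publisher()
p.publish(b"x")
s1, _ = p.next_batch()
s2, b2 = p.next_batch()                       # simulate a retry
assert s1 == s2 == 1 and b2 == [b"x"]         # same seq, same items
p.on_success()
p.publish(b"y")
s3, _ = p.next_batch()
assert s3 == 2                                # fresh seq after confirm
```

The broken variant the commit mentions advanced the sequence on failure, so a retry looked like a new batch and TLC could produce duplicates.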

Add TTL-based pruning of publisher dedup entries during continue-as-new
(default 15 min). Add max_retry_duration (default 600s) to bound client
retries — must be less than publisher_ttl for safety. Both constraints
are formally verified in PubSubDedupTTL.tla.
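The TTL-pruning rule and its safety constraint can be illustrated with a pure function (a sketch with assumed names; the real module prunes during continue-as-new):

```python
def prune_publishers(last_seen: dict[str, float], now: float,
                     publisher_ttl: float = 15 * 60) -> dict[str, float]:
    """Sketch: drop dedup state for publishers idle longer than the
    TTL. This is safe only when every client's max_retry_duration is
    less than publisher_ttl, so no pruned publisher can still be
    retrying a failed flush with an old sequence number."""
    return {pid: ts for pid, ts in last_seen.items()
            if now - ts <= publisher_ttl}

seen = {"a": 50.0, "b": 950.0}        # last-seen timestamps (seconds)
kept = prune_publishers(seen, now=1000.0)
assert kept == {"b": 950.0}           # "a" idle 950s > 900s TTL: pruned
```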

Add truncate_pubsub() for explicit log prefix truncation. Add
publisher_last_seen timestamps for TTL tracking. Preserve legacy state
without timestamps during upgrade.

API changes: for_workflow→create, flush removed (use priority=True),
poll_interval→poll_cooldown, publisher ID shortened to 16 hex chars.

Includes TLA+ specs (correct, broken, inductive, multi-publisher TTL),
PROOF.md with per-action preservation arguments, scope and limitations.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New analysis document evaluates whether publishing should use signals
or updates, examining Temporal's native dedup (Update ID per-run,
request_id for RPCs) vs the application-level (publisher_id, sequence)
protocol. Conclusion: app-level dedup is permanent for signals but
could be dropped for updates once temporal/temporal#6375 is fixed.
Non-blocking flush keeps signals as the right choice for streaming.

Updates DESIGN-v2.md section 6 to be precise about the two Temporal
guarantees that signal ordering relies on: sequential send order and
history-order handler invocation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Analyzes deduplication through the end-to-end principle lens. Three
types of duplicates exist in the pipeline, each handled at the layer
that introduces them:

- Type A (duplicate LLM work): belongs at application layer — data
  escapes to consumers before the duplicate exists, so only the
  application can resolve it
- Type B (duplicate signal batches): belongs in pub/sub workflow —
  encapsulates transport details and is the only layer that can
  detect them correctly
- Type C (duplicate SSE delivery): belongs at BFF/browser layer

Concludes the (publisher_id, sequence) protocol is correctly placed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… design

Fill gaps identified during design review:
- Document why per-topic offsets were rejected (trust model, cursor
  portability, unjustified complexity) inline rather than only in historical
  addendum
- Expand BFF section with the four reconnection options considered and
  the decision to use SSE Last-Event-ID with BFF-assigned gapless IDs
- Add poll efficiency characteristics (O(new items) common case)
- Document BFF restart fallback (replay from turn start)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Wire types (PublishEntry, _WireItem, PollResult, PubSubState) encode
data as base64 strings for cross-language compatibility across all
Temporal SDKs. User-facing types (PubSubItem) use native bytes.

Conversion happens inside handlers:
- Signal handler decodes base64 → bytes on ingest
- Poll handler encodes bytes → base64 on response
- Client publish() accepts bytes, encodes for signal
- Client subscribe() decodes poll response, yields bytes

This means Go/Java/.NET ports get cross-language compat for free since
their JSON serializers encode byte[] as base64 by default.
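The base64-over-JSON conversion described above round-trips arbitrary bytes; a stdlib-only sketch (the field names here are illustrative, not the exact wire schema):

```python
import base64
import json

def encode_wire_item(topic: str, data: bytes) -> str:
    """Sketch of the wire shape: opaque payload bytes travel as base64
    text inside JSON, which any SDK's JSON serializer can carry."""
    return json.dumps({
        "topic": topic,
        "data": base64.b64encode(data).decode("ascii"),
    })

def decode_wire_item(raw: str) -> tuple[str, bytes]:
    obj = json.loads(raw)
    return obj["topic"], base64.b64decode(obj["data"])

# Non-UTF-8 bytes survive the round trip untouched:
topic, data = decode_wire_item(encode_wire_item("tokens", b"\x00\xffhi"))
assert topic == "tokens" and data == b"\x00\xffhi"
```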

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment thread on temporalio/contrib/pubsub/_types.py (outdated)
jssmith and others added 14 commits April 7, 2026 20:10
Remove the bounded poll wait from PubSubMixin and trim trailing
whitespace from types. Update DESIGN-v2.md with streaming plugin
rationale (no fencing needed, UI handles repeat delivery).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add opt-in streaming code path to both agent framework plugins.
When enabled, the model activity calls the streaming LLM endpoint,
publishes TEXT_DELTA/THINKING_DELTA/TOOL_CALL_START events via
PubSubClient as a side channel, and returns the complete response
for the workflow to process (unchanged interface).

OpenAI Agents SDK:
- ModelActivityParameters.enable_streaming flag
- New invoke_model_activity_streaming method on ModelActivity
- ModelResponse reconstructed from ResponseCompletedEvent
- Uses @_auto_heartbeater for periodic heartbeats
- Routing in _temporal_model_stub (rejects local activities)

Google ADK:
- TemporalModel(streaming=True) constructor parameter
- New invoke_model_streaming activity using stream=True
- Registered in GoogleAdkPlugin

Both use batch_interval=0.1s for near-real-time token delivery.
No pubsub module changes needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Pydantic BaseModel was introduced as a workaround for Any-typed fields
losing type information during continue-as-new serialization. The actual fix
is using concrete type annotations (PubSubState | None), which the default
data converter handles correctly for dataclasses — no Pydantic dependency
needed.

This removes the pydantic import from the pubsub contrib module entirely,
making it work out of the box with the default data converter. All 18 tests
pass, including both continue-as-new tests.
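The underlying failure mode generalizes beyond Pydantic and can be shown with the stdlib alone (a sketch; `PubSubStateSketch` is a stand-in, and `json` stands in for the data converter's serialization step):

```python
import dataclasses
import json

@dataclasses.dataclass
class PubSubStateSketch:          # stand-in for the real PubSubState
    base_offset: int = 0

state = PubSubStateSketch(base_offset=7)

# Any serialization round-trip yields plain JSON data:
restored = json.loads(json.dumps(dataclasses.asdict(state)))

# With an Any-typed field there is no hint to rebuild the dataclass,
# so the workflow would see a dict (hence "'dict' object has no
# attribute ..." style errors after continue-as-new):
assert isinstance(restored, dict)

# A concrete annotation (PubSubState | None) tells the converter what
# to reconstruct; here we do it by hand:
typed = PubSubStateSketch(**restored)
assert typed.base_offset == 7
```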

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Implements DESIGN-ADDENDUM-ITEM-OFFSET.md. The poll handler now annotates
each item with its global offset (base_offset + position in log), enabling
subscribers to track fine-grained consumption progress for truncation.
This is needed for the voice-terminal agent where audio chunks must not be
truncated until actually played, not merely received.

- Add offset field to PubSubItem and _WireItem (default 0)
- Poll handler computes offset from base_offset + log_offset + enumerate index
- subscribe() passes wire_item.offset through to yielded PubSubItem
- Tests: per-item offsets, offsets with topic filtering, offsets after truncation
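The offset arithmetic in the second bullet can be sketched as a pure function over the retained log (hypothetical names; offsets before `base_offset` have been truncated away):

```python
def annotate_offsets(log: list[bytes], base_offset: int,
                     from_offset: int) -> list[tuple[int, bytes]]:
    """Sketch of the poll-side computation: each returned item carries
    its global offset = base_offset + its position in the retained log."""
    log_offset = max(from_offset - base_offset, 0)  # index into the log
    return [(base_offset + log_offset + i, item)
            for i, item in enumerate(log[log_offset:])]

# Log holds global offsets 2, 3, 4 after truncating items 0 and 1:
items = annotate_offsets([b"c", b"d", b"e"], base_offset=2, from_offset=3)
assert items == [(3, b"d"), (4, b"e")]
```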

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Documents the motivation and design for adding offset fields to
PubSubItem and _WireItem, enabling subscribers to track consumption
at item granularity rather than batch boundaries. Driven by the
voice-terminal agent's need to truncate only after audio playback,
not just after receipt.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three changes:

1. Poll handler: replace ValueError with ApplicationError(non_retryable=True)
   when requested offset has been truncated. This fails the UPDATE (client
   gets the error) without crashing the WORKFLOW TASK — avoids the poison
   pill during replay that caused permanent workflow failures.

2. Poll handler: treat from_offset=0 as "from the beginning of whatever
   exists" (i.e., from base_offset). This lets subscribers recover from
   truncation by resubscribing from 0 without knowing the current base.

3. PubSubClient.subscribe(): catch WorkflowUpdateFailedError with type
   TruncatedOffset and retry from offset 0, auto-recovering.

New tests:
- test_poll_truncated_offset_returns_application_error
- test_poll_offset_zero_after_truncation
- test_subscribe_recovers_from_truncation
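The three rules compose into a small recovery protocol; a sketch with stand-in names (`Truncated` models the non-retryable `ApplicationError`, not the real exception class):

```python
class Truncated(Exception):
    """Stand-in for ApplicationError('TruncatedOffset', non_retryable)."""

def resolve_offset(from_offset: int, base_offset: int) -> int:
    """Poll-side sketch: offset 0 means 'start of whatever survives
    truncation'; an offset below base_offset is gone and fails the
    update (not the workflow task)."""
    if from_offset == 0:
        return base_offset          # rule 2: recover without knowing base
    if from_offset < base_offset:
        raise Truncated()           # rule 1: explicit, non-poisonous error
    return from_offset

def subscribe_from(from_offset: int, base_offset: int) -> int:
    """Client-side sketch of rule 3: retry from 0 on truncation."""
    try:
        return resolve_offset(from_offset, base_offset)
    except Truncated:
        return resolve_offset(0, base_offset)

assert resolve_offset(0, 10) == 10      # fresh subscriber after truncation
assert resolve_offset(12, 10) == 12     # live offset passes through
assert subscribe_from(5, 10) == 10      # stale cursor auto-recovers
```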

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Verify that PubSubClient can subscribe to events from a different
workflow (same namespace) and that Nexus operations can start pub/sub
broker workflows in a separate namespace with cross-namespace
subscription working end-to-end. No library changes needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Poll responses now estimate wire size (base64 data + topic) and stop
adding items once the response exceeds 1MB. The new `more_ready` flag
on PollResult tells the subscriber that more data is available, so it
skips the poll_cooldown sleep and immediately re-polls. This avoids
unnecessary latency during big reloads or catch-up scenarios while
keeping individual update payloads within Temporal's recommended limits.
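The size cap plus `more_ready` loop can be sketched as follows (illustrative names; the estimate counts base64-expanded data plus topic bytes, as described above):

```python
import base64

def build_poll_response(items: list[tuple[str, bytes]],
                        max_bytes: int = 1_000_000):
    """Sketch: stop adding items once the estimated wire size would
    exceed the cap; more_ready=True tells the subscriber to skip its
    poll cooldown and re-poll immediately for the remainder."""
    out: list[tuple[str, bytes]] = []
    size = 0
    for topic, data in items:
        wire = len(base64.b64encode(data)) + len(topic.encode())
        if out and size + wire > max_bytes:
            return out, True               # more data is ready right now
        out.append((topic, data))
        size += wire
    return out, False

items = [("t", b"0123456789")] * 3         # ~17 estimated bytes each
batch, more_ready = build_poll_response(items, max_bytes=40)
assert len(batch) == 2 and more_ready      # third item deferred

batch, more_ready = build_poll_response(items)  # default 1MB cap
assert len(batch) == 3 and not more_ready
```

Note the first item is always admitted even if oversized, so a single huge payload cannot stall the stream.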

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Codify the four wire evolution rules that have been followed implicitly
through four addenda: additive-only fields with defaults, immutable
handler names, forward-compatible PubSubState, and no application-level
version negotiation. Includes a precedent table showing all past changes
and reasoning for why version fields in payloads would cause silent data
loss on signals.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After max_retry_duration expires, the client dropped the pending batch
without advancing _sequence. The next batch reused the same sequence
number, which could be silently deduplicated by the workflow if the
timed-out signal was actually delivered — causing permanent data loss
for those items.

The fix advances _sequence to _pending_seq before clearing _pending,
ensuring subsequent batches always get a fresh sequence number.
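The bug and fix compress to a one-line difference; a sketch of both variants (hypothetical function, mirroring the commit's `_sequence`/`_pending_seq` description):

```python
def next_seq_after_drop(confirmed_seq: int, pending_seq: int,
                        fixed: bool) -> int:
    """Sketch: after abandoning a timed-out pending batch, the next
    batch needs a fresh sequence. The buggy variant reuses the
    abandoned number, which the workflow silently dedups if the
    timed-out signal was in fact delivered: permanent data loss."""
    dropped_to = pending_seq if fixed else confirmed_seq  # fix: advance first
    return dropped_to + 1

# The workflow already recorded seq 1 (the "timed-out" signal arrived):
wf_last_seq = 1
buggy = next_seq_after_drop(0, 1, fixed=False)
fixed = next_seq_after_drop(0, 1, fixed=True)
assert buggy == 1 and buggy <= wf_last_seq   # reused: rejected as duplicate
assert fixed == 2 and fixed > wf_last_seq    # fresh: accepted
```

This is exactly the SequenceFreshness invariant: with no pending batch, the confirmed sequence must not lag the workflow's last-seen sequence.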

TLA+ verification:
- Added DropPendingBuggy/DropPendingFixed actions to PubSubDedup.tla
- Added SequenceFreshness invariant: (pending=<<>>) => (confirmed_seq >= wf_last_seq)
- BuggyDropSpec FAILS SequenceFreshness (confirmed_seq=0 < wf_last_seq=1)
- FixedDropSpec PASSES all invariants (489 distinct states)
- NoDuplicates passes for both — the bug causes data loss, not duplicates

Python test:
- test_retry_timeout_sequence_reuse_causes_data_loss demonstrates the
  end-to-end consequence: reused seq=1 is rejected, fresh seq=2 accepted

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
# Conflicts:
#	temporalio/contrib/google_adk_agents/_model.py
This is a new release with no legacy to support. Changes:

- _mixin.py: Remove ts-is-None fallback that retained publishers without
  timestamps. All publishers always have timestamps, so this was dead code.
- _types.py: Clean up docstrings referencing addendum docs
- DESIGN-v2.md: Remove backward-compat framing, addendum references, and
  historical file listing. Keep the actual evolution rules.
- PROOF.md: "Legacy publisher_id" → "Empty publisher_id"
- README.md: Reference DESIGN-v2.md instead of deleted addendum
- Delete DESIGN.md and 4 DESIGN-ADDENDUM-*.md files (preserved in
  the top-level streaming-comparisons repo)
- Delete stale TLA+ trace .bin files

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Simplify the README to focus on essential API patterns. Rename
for_workflow() to create() throughout, condense the topics section,
remove the exactly-once and type-warning sections (these details
belong in DESIGN-v2.md), and update the API reference table with
current parameter signatures. Also fix whitespace alignment in
DESIGN-v2.md diagram.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…de pubsub state

The CAN example only showed pubsub_state being passed through, which could
mislead readers into thinking that's all that's needed. Updated to include
a representative application field (items_processed) to make it clear that
your own workflow state must also be carried across the CAN boundary.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,1403 @@
# Temporal Workflow Streams — Design Document
Contributor:

Do we want to check this whole document in? It seems like keeping it up to date could be a burden.

Contributor Author:

I've been debating that—I think we can remove it.

Contributor Author:

Removed in 22ad024. The canonical guide stays on docs.temporal.io; the long-form design notes are preserved out-of-tree for future reference. README and _types.py no longer reference the file.

activity is configured with ``streaming_event_topic=None``.

Registers a :class:`WorkflowStream` so the test can subscribe and
verify the activity did not publish anything.
Contributor:

I guess we may not want to fail because we would have to do so at runtime, potentially well after the workflow has started, but I think at least a warning is in order. It doesn't really make a lot of sense to do this.

Contributor Author:

Went stricter than a warning — TemporalOpenAIRunner.run_streamed now raises AgentsWorkflowError before delegating to the agents framework when streaming_event_topic is unset, and same for use_local_activity=True (1f4099a). I'd originally put the check inside _TemporalModelStub.stream_response, but the framework runs the model in a background task and stuffs errors into RunResultStreaming._stored_exception, which gets dropped if the queue completion sentinel is read before the task is observed as done — so the error never surfaced and the workflow silently returned final_output=None. Validating in the runner short-circuits before the framework starts the task. ADK uses ApplicationError(non_retryable=True) in TemporalModel.generate_content_async for the same reason (7c910de). Two new regression tests cover both cases. If you'd rather it be a warning, easy to soften.

normalization."""
async with AgentEnvironment(
model=StreamingTestModel(),
model_params=ModelActivityParameters(
Contributor:

I think we should consider a way to push these parameters down to the specific model usage, or at least the runner. At least as an option. @JasonSteving99 - That's not a thing we should address here though.

Contributor Author:

Acknowledged as a follow-up — out of scope for this PR.

jssmith and others added 2 commits April 29, 2026 09:29
Adds a Future Work pointer to docs/pubsub-design-analysis/final-flag-prune.md,
which proposes a `final: bool` field on PublishInput so cleanly-exited
publishers can have their dedup PublisherState pruned on a tighter schedule
than the full publisher_ttl. Deferred for now.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Workflow Streams user guide now lives at
https://docs.temporal.io/develop/python/workflows/workflow-stream
and is the primary reference. Replace the long quick-start /
API-reference README with a short motivating summary, key
technical highlights, and a prominent link to the docs site.
DESIGN.md gets one extra sentence at the top reframing it as the
contributor/internals doc and pointing readers at the user docs.
Member @Sushisource left a comment:

Overall looking good to me, I only focused on API not impl

Comment on lines +85 to +88
update/signal handlers that read ``WorkflowStream`` state can
observe pre-publish state when both land in the same activation.
Make such handlers ``async`` and ``await asyncio.sleep(0)`` before
reading state. See the "Gotcha" section of this module's
Member:

I find this wording very confusing

Contributor Author:

will address

Contributor Author:

Rewrote the Note in 4646f6d to lead with the dynamic-registration cause and broaden to both class-level signal and update handlers; dropped the stale README "Gotcha" pointer.

Comment on lines +99 to +100
The check inspects the immediate caller's frame and requires the
function name to be ``__init__``.
Member:

Imo rather than this it might make more sense for us to just set a local around our own invocation of __init__ from the SDK, but, that involves touching the core SDK so maybe we don't wanna bother for now

Comment on lines +177 to +179
Prunes publisher dedup entries older than ``publisher_ttl``. The
TTL must exceed the ``max_retry_duration`` of any client that
may still be retrying a failed flush.
Member:

Not immediately clear what "publisher dedup entries" means

Contributor Author:

will address

Contributor Author:

Reworded in 4646f6d — "dedup state for publishers idle longer than publisher_ttl".

Comment on lines +204 to +205
def drain(self) -> None:
"""Unblock all waiting poll handlers and reject new polls.
Member:

Still not a huge fan of this name. Maybe finalize? Not blocking since I don't really have a much better idea

Contributor Author:

Renamed drain → stop_polling (19d2e18), then again to detach_pollers (553a83b) — closer to what it actually does.

Comment thread on temporalio/contrib/workflow_streams/_stream.py (outdated)
"""

topic: str
data: Any
Member:

I think we might want this to be Payload | Decoded(T), there is an (admittedly very niche, but totally possible) edge case where the user wants T to be Payload, and in that case we might do the wrong thing because of the type confusion?

Contributor Author:

Closed the ambiguity by construction in 1b46541: subscribe(result_type=Payload) now raises (mirroring the existing stream.topic(name, type=Payload) rejection on both workflow and client sides). Payload was already not bindable on a topic handle; the direct-subscribe path now matches. Users who want raw payloads pass result_type=RawValue. Also filled a test gap — the workflow-side type=Payload rejection wasn't covered (only the client-side was).

@@ -0,0 +1,1419 @@
# Temporal Workflow Streams — Design Document

Consolidated design document reflecting the current implementation. This
Member:

Do we want to check this whole thing in?

Contributor Author:

Removed in 22ad024 (same as tconley's thread).

"""Create a stream client from a Temporal client and workflow ID.

Use this when the caller has an explicit ``Client`` and
``workflow_id`` in hand (starters, BFFs, other workflows'
Member:

Best friends forever?

Contributor:

Backend-for-frontend (but also best friends forever works just as well 😆)

Alternatively sometimes called a "frontend server"

Member:

Ah, k, if I wasn't familiar with it our users may not be too

)

@classmethod
def from_activity(
Member:

This is maybe more like from_within_activity?

Contributor Author:

Done in 40d3bff.

while self._pending is not None or self._buffer:
await self._flush()

def publish(self, topic: str, value: Any, force_flush: bool = False) -> None:
Member:

The fact that you can publish different Ts to the same topic is something we could rectify with topic handles.

Like:

topic = client.topic(topic, type=T)
topic.publish("hi")

Internally this could enforce that you can't create multiple handles to the same topic with different T.

This is somewhere between the choices we discussed earlier about topics-as-streams or topics-in-streams.

Kind of a big change late in the game, but, I think it'd be nice. We don't have to do it now, but, maybe worth considering iterating on.

Contributor Author:

Done in 908bc9b: WorkflowStream.topic(name, type=T) binds a topic to one type per stream and rejects mismatches.

jssmith and others added 11 commits April 29, 2026 11:09
Reviewer flagged the 1419-line design doc as a maintenance burden — the
implementation has already drifted from a few sections, and keeping it
synchronized in-tree adds churn for every refactor. Move the canonical
design notes out of the SDK; the README continues to point at the
docs.temporal.io guide for users, and the design file is preserved in
the streaming-comparisons project for future reference.

Strip the ``DESIGN.md`` references from README.md and _types.py so no
in-tree pointer remains.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address tconley feedback that (a) the non-streaming activity should not
carry inputs only meaningful to the streaming path, and (b) invoking
streaming with topic=None is a footgun with no real benefit (the
workflow gets the chunked list batched at activity completion either
way; "no-publish streaming" doesn't deliver real-time value to anyone).

Changes:

- Split ``ActivityModelInput`` (TypedDict) into the base shape used by
  ``invoke_model_activity`` and a ``StreamingActivityModelInput`` subclass
  with ``streaming_event_topic: Required[str]`` and the batch interval
  used only by ``invoke_model_activity_streaming``. The streaming
  activity now always opens a ``WorkflowStreamClient``; the
  ``topic is None`` branch and its docstring caveat are removed.

- Validate at the runner before delegating to the agents framework.
  ``TemporalOpenAIRunner.run_streamed`` raises ``AgentsWorkflowError``
  when ``model_params.streaming_event_topic`` is unset, or when
  ``use_local_activity=True`` (local activities have no heartbeat or
  signal channel). Both checks must happen here rather than inside the
  stub's ``stream_response``: the agents framework runs the model in a
  background task and silently captures errors into
  ``RunResultStreaming._stored_exception``, which can be lost when the
  queue completion sentinel is read before the task is observed as
  done — failing in the runner short-circuits before the framework
  starts the task. The stub keeps a defensive guard for direct callers.

- Update ``ModelActivityParameters`` and the integration README so the
  documented contract matches the runtime behavior.

- Replace the ``StreamingWithoutStreamTopicWorkflow`` test with
  ``StreamingRequiresTopicWorkflow`` covering the topic-missing path,
  and add ``test_streaming_rejects_local_activity`` for the
  use_local_activity case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
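The "errors captured into `RunResultStreaming._stored_exception` can be lost" rationale above can be shown with a standalone asyncio sketch (illustrative only, not the agents framework's actual code): when a background task signals completion on a queue and then fails, a consumer that stops reading at the sentinel never sees the exception unless it explicitly inspects the task.

```python
import asyncio


async def demo() -> str:
    q: asyncio.Queue[object] = asyncio.Queue()
    DONE = object()  # completion sentinel, standing in for the framework's queue marker

    async def model_task() -> None:
        # The background model task: it signals completion on the queue,
        # then fails. The error lands on the task object only.
        await q.put(DONE)
        raise RuntimeError("streaming misconfigured")

    task = asyncio.create_task(model_task())
    item = await q.get()    # the consumer sees the sentinel and stops reading
    assert item is DONE
    await asyncio.sleep(0)  # let the background task finish
    # Nothing was raised in the consumer; the failure sits silently on the task:
    return type(task.exception()).__name__
```

Failing in the runner, before the task is ever started, avoids this class of silent loss entirely.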
Address tconley feedback on the ADK streaming activity — activities
should take a single dataclass input, and invoking streaming without a
topic is a footgun for the same reasons as on the OpenAI side (the
workflow only sees chunks batched at activity completion, so the
"streaming without publishing" path delivers no real-time value).

Changes:

- Wrap ``invoke_model_streaming`` inputs in
  ``StreamingInvokeInput`` (llm_request + streaming_event_topic +
  streaming_event_batch_interval). Drop the ``topic is None`` branch;
  the activity always opens a ``WorkflowStreamClient`` and publishes
  each chunk. ``invoke_model`` (non-streaming) keeps its existing
  ``LlmRequest`` argument since that already satisfies the
  single-input convention.

- Validate in ``TemporalModel.generate_content_async`` before
  scheduling the streaming activity. Raise
  ``ApplicationError(non_retryable=True)`` so the failure surfaces as
  a terminal workflow failure without needing plugin-level
  ``workflow_failure_exception_types`` registration.

- Update the constructor docstring to reflect the now-required topic.

- Add ``StreamingAdkRequiresTopicWorkflow`` plus
  ``test_streaming_requires_topic`` covering the no-topic path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per PR review feedback (Sushisource), rename the
WorkflowStreamClient classmethod and update all call sites,
references, error messages, comments, and tests.

The new name reads more clearly at the call site — it documents
that the method must be invoked from inside an activity rather
than that it builds something derived from one — and matches how
we already describe it in the docstring ("must be called from
within an activity").
Per PR review feedback, drop the special case where
subscribe() with no result_type yields a raw Payload, and instead
delegate to the payload converter's default Any decoding (the
same behavior as signal/update/query handlers without a type
hint). Callers that want the original Payload pass
result_type=temporalio.common.RawValue, mirroring the standard
Temporal convention.

The only caller-visible change for typed callers is the
no-result_type path: a JSON-converter consumer that previously
got back a Payload now gets back a Python dict/list/scalar (or
bytes for binary payloads). Heterogeneous-topic dispatchers
relying on Payload.metadata should switch to result_type=RawValue
and read item.data.payload.metadata.

Adds a regression test covering both default decode (dict) and
RawValue passthrough (Payload bytes preserved).
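The decode convention described above can be sketched standalone (the `Payload` and `RawValue` names here are local stand-ins for the temporalio types, and the dispatch is simplified): no `result_type` means default decoding to plain Python values, while `result_type=RawValue` passes the raw payload through untouched.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass
class Payload:
    metadata: dict[str, bytes]
    data: bytes


class RawValue:
    """Marker type: 'give me the undecoded payload' (mirrors temporalio.common.RawValue)."""


def decode(payload: Payload, result_type: Any = None) -> Any:
    # result_type=RawValue: raw passthrough, metadata preserved.
    if result_type is RawValue:
        return payload
    encoding = payload.metadata.get("encoding", b"")
    # No result_type: default Any decoding -- JSON payloads become
    # dict/list/scalars, binary payloads stay bytes.
    if encoding == b"json/plain":
        return json.loads(payload.data)
    return payload.data
```

A heterogeneous-topic dispatcher that needs `Payload.metadata` takes the `RawValue` branch; everyone else gets ordinary Python values.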
Public surface moves from temporalio.contrib.workflow_stream to
temporalio.contrib.workflow_streams. Internal signal/update/query
names (__temporal_workflow_stream_*) are wire identifiers and stay
unchanged so existing histories and mixed-version clients keep working.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add TopicHandle[T] (client-side: publish + subscribe) and
WorkflowTopicHandle[T] (workflow-side: publish only), constructed
via WorkflowStreamClient.topic(name, type=T) and
WorkflowStream.topic(name, type=T) respectively.

Each publisher instance binds a topic name to exactly one T;
re-binding to a different type raises RuntimeError, and re-binding
to the same type is idempotent. The check uses Python equality on
the type object — primitives, dataclasses, generic aliases, and
unions all compare structurally. No subtype/union-superset
recognition; no cross-process coordination.

Pre-built Payload values still pass through topic.publish
regardless of the bound type (zero-copy fast path preserved); the
signature accepts T | Payload to model this. typing.Any is the
documented escape hatch for heterogeneous topics.

The existing publish() methods are kept for now (delegated to a
shared _publish_to_topic) and marked as preferred-replaced-by
TopicHandle in their docstrings. A follow-up commit removes them
and migrates call sites.

Tests cover: workflow-side publish + client-side typed subscribe,
client-side type-uniqueness and idempotency, in-workflow-init
type-uniqueness via TopicHandleUniquenessWorkflow, and pre-built
Payload passthrough on a typed handle.
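The binding invariant above (one type per topic name per instance, equality on the type object, idempotent re-binding) can be sketched standalone; this is an illustrative model, not the contrib module's actual implementation.

```python
from typing import Any, Generic, TypeVar

T = TypeVar("T")


class TopicHandle(Generic[T]):
    """Typed publish handle bound to one topic name (illustrative)."""

    def __init__(self, name: str, bound_type: Any) -> None:
        self.name = name
        self.bound_type = bound_type


class StreamSketch:
    """Models the per-instance topic-name -> type binding invariant."""

    def __init__(self) -> None:
        self._bindings: dict[str, Any] = {}

    def topic(self, name: str, *, type: Any) -> TopicHandle[Any]:
        existing = self._bindings.get(name)
        if existing is None:
            self._bindings[name] = type
        elif existing != type:  # plain equality on the type object, no subtyping
            raise RuntimeError(
                f"topic {name!r} already bound to {existing!r}, not {type!r}"
            )
        return TopicHandle(name, type)
```

Because the check is plain equality, generic aliases and unions compare structurally, and re-binding with the identical type object is a no-op.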
Per Codex review feedback on the topic-handle introduction
commit: binding a topic to type=Payload would let publish work
(via the existing isinstance(value, Payload) zero-copy path) but
quietly break TopicHandle.subscribe — the payload converter has
no Payload decode path, so JSON items would fail conversion and
binary items would decode to bytes rather than Payload.

Reject type=Payload at WorkflowStreamClient.topic() and
WorkflowStream.topic() with a clear error pointing callers at
the right idioms: type=typing.Any for heterogeneous topics,
pre-built Payload values published via any-typed handle (zero-
copy fast path), and result_type=RawValue on
WorkflowStreamClient.subscribe for raw payload access.

Adds a regression assertion to test_topic_handle_client_uniqueness
and pulls a stray Payload-recommendation out of the topic()
docstrings.
…opic handles

Drops the un-typed publish(topic, value) entry points on
WorkflowStream and WorkflowStreamClient. Publishers now go through
WorkflowStream.topic(name, type=T) and
WorkflowStreamClient.topic(name, type=T) which return typed
TopicHandle / WorkflowTopicHandle objects.

Topic handles are the only supported publish API. Per Codex
review on PR #1423: per-instance type uniqueness is a
factory-level invariant — handle-based publishing on a single
publisher cannot mix Ts on a topic, while still allowing escape
hatches (type=typing.Any for heterogeneous topics, pre-built
Payload values via the zero-copy fast path on any-typed handle).

Migrates the internal users:
- temporalio/contrib/openai_agents/_invoke_model_activity.py uses
  type=Any (TResponseStreamEvent is an annotated union, not a class)
- temporalio/contrib/google_adk_agents/_model.py uses
  type=LlmResponse
- All workflow_streams tests migrated to the handle form,
  preserving existing topic names and types
CI's pydoctor step on Python 3.14 ubuntu-latest failed because
the previous commit removed the public publish() methods on
WorkflowStream and WorkflowStreamClient but left
:meth:`WorkflowStream.publish`, :meth:`WorkflowStreamClient.publish`,
and an unqualified :py:meth:`publish` reference in module/method
docstrings. Repoint them at the surviving topic-handle methods.
Splits the openai_agents streaming integration and the
google_adk_agents streaming integration out of PR #1423 — they will
land in their own follow-up PRs that depend on the workflow_streams
contrib module shipping first.

Reverts to origin/main:
- temporalio/contrib/openai_agents/_invoke_model_activity.py
- temporalio/contrib/openai_agents/_mcp.py
- temporalio/contrib/openai_agents/_model_parameters.py
- temporalio/contrib/openai_agents/_openai_runner.py
- temporalio/contrib/openai_agents/_temporal_model_stub.py
- temporalio/contrib/openai_agents/_temporal_openai_agents.py
- temporalio/contrib/openai_agents/README.md
- temporalio/contrib/google_adk_agents/_model.py
- temporalio/contrib/google_adk_agents/_plugin.py

Removes the streaming-specific test files added on this branch:
- tests/contrib/openai_agents/test_openai_streaming.py
- tests/contrib/google_adk_agents/test_adk_streaming.py

This PR is now scoped to the workflow_streams contrib module only.
Comment on lines +376 to +380
# TResponseStreamEvent is a typing.Annotated[Union[...]] — a typing
# special form, not a class — so it cannot be passed as type[T].
# Use Any here; subscribers that want typed decode can pass
# result_type=TResponseStreamEvent on their own subscribe call.
events_topic = stream.topic(topic, type=cast("type[Any]", cast(object, Any)))
Contributor

this is pretty gnarly. In a case like this it'd presumably be better to recommend users just use stream.publish(topic, event) instead (since the topic is just a convenience anyway, right?). I could honestly see people tripping up on figuring out this incantation

Contributor Author

Moving this to a separate PR.

while self._pending is not None or self._buffer:
    await self._flush()

def publish(self, topic: str, value: Any, force_flush: bool = False) -> None:
Contributor

Made a corresponding comment on the events_topic = stream.topic(topic, type=cast("type[Any]", cast(object, Any))) line.

I think we should probably either find a better way to create a handle for a topic for which we don't have a great type class, or we should continue exposing WorkflowStreamClient::publish publicly

Contributor Author

Made topic(type=...) optional, defaults to Any in 1a19ef5 — no more cast incantation.

jssmith and others added 5 commits April 29, 2026 15:02
Per PR review feedback, rename the workflow-side state-transition
method that releases waiting subscribers and rejects new poll
updates. The previous name implied "wait for buffered items to
flush"; the operation actually evicts pollers and refuses new
ones, while keeping publishes and get_state/continue_as_new
valid for the rest of the run. The new name describes that
precisely.

Updates the continue_as_new helper, the explicit-recipe docstring,
and the one test that drives the explicit recipe. Internal
_draining state flag stays as-is (private implementation detail).
Per discussion: heterogeneous topics and dynamic-topic forwarders
previously had to write
type=cast(type[Any], cast(object, Any)) at the call site to
satisfy pyright (typing.Any is a special form, not a class). Make
the type kwarg optional and default to typing.Any so the natural
form is just client.topic("name") / stream.topic("name").

The type-uniformity invariant is unchanged: each instance binds a
topic name to exactly one type; mixing untyped (= Any) with a
specific type still raises. typing.Any can also be passed
explicitly, with the cast still required for type-strict callers.

Adds overloads so callers that pass type=T still get a typed
TopicHandle[T] from pyright; the no-type form returns
TopicHandle[Any] / WorkflowTopicHandle[Any]. Updates the existing
test to exercise both the omitted-type and explicit-type=Any
paths.
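The overload shape described above can be sketched in isolation (names here are illustrative, not the contrib module's code): one runtime implementation, with `typing.overload` stubs so pyright infers `TopicHandle[T]` when `type=T` is passed and `TopicHandle[Any]` when it is omitted.

```python
from typing import Any, Generic, TypeVar, overload

T = TypeVar("T")


class TopicHandle(Generic[T]):
    def __init__(self, name: str) -> None:
        self.name = name


class ClientSketch:
    # The overloads only drive static inference; runtime behavior is one path.
    @overload
    def topic(self, name: str) -> "TopicHandle[Any]": ...

    @overload
    def topic(self, name: str, *, type: "type[T]") -> "TopicHandle[T]": ...

    def topic(self, name: str, *, type: Any = Any) -> "TopicHandle[Any]":
        return TopicHandle(name)
```

Callers wanting an explicitly `Any`-typed handle can still pass `type=typing.Any`, with a cast only required under strict type checking.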
Better captures the operation. The method releases the in-flight
__temporal_workflow_stream_poll update handlers (subscribers were
"attached" to the stream's drain signal; this detaches them so
they return to the caller, who can then follow continue-as-new
or stop) and rejects new poll attachments. "stop_polling"
ambiguously suggested the stream itself was the one polling;
"detach_pollers" names the actor (the pollers / subscribers) and
captures the relationship.

Updates the continue_as_new helper and the explicit-recipe
docstring/test accordingly.
- Class Note: lead with the dynamic-registration cause and broaden
  to cover both signal and update class-level handlers; drop stale
  pointer to a removed README "Gotcha" section.
- get_state: reword "publisher dedup entries" → "dedup state for
  publishers idle longer than publisher_ttl".
- continue_as_new: replace the three-line arrow recipe with a
  proper code-block.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Member

@Sushisource Sushisource left a comment

Looking good!

class-level signal/update handlers are scheduled. Define
such handlers as ``async`` and ``await asyncio.sleep(0)``
before reading stream state, so the publish signal is
processed first.
Member

Ah, OK, I get what this means now. I still don't like the advice of telling people to await on an otherwise-forbidden API. We can probably just provide a workflow_stream.wait_for_init() helper that does the right thing.
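The `await asyncio.sleep(0)` advice boils down to yielding the event loop once so that already-queued work runs before the handler reads state. A standalone asyncio sketch of the ordering (not Temporal workflow code):

```python
import asyncio


async def handler_reads_after_yield() -> list[str]:
    events: list[str] = []

    async def publish_signal() -> None:
        events.append("published")

    # The publish arrives as an already-scheduled task...
    asyncio.get_running_loop().create_task(publish_signal())

    # ...so a handler reading state immediately would still see [].
    # One zero-sleep yields to the loop, letting the queued publish
    # run before we read.
    await asyncio.sleep(0)
    return events
```

A `wait_for_init()` helper, as suggested, would wrap this yield (or a real readiness condition) behind a supported API instead of asking users to call `asyncio.sleep` directly.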

- WorkflowStreamClient.subscribe now raises RuntimeError when called
  with result_type=Payload, matching the topic-handle layer's
  type=Payload rejection. Closes the direct-subscribe escape hatch
  for the Payload-vs-decoded-T ambiguity flagged in PR #1423 review.
- New regression test test_subscribe_with_payload_result_type_rejected.
- Extend TopicHandleUniquenessWorkflow probe to also cover the
  workflow-side WorkflowStream.topic(type=Payload) rejection — the
  client-side equivalent was already covered, the workflow side was
  not.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jssmith added a commit that referenced this pull request Apr 30, 2026
Re-applies the openai_agents streaming integration originally split out
of PR #1423 on commit 59c7582, updated for the post-PR API:
TResponseStreamEvent is a typing-special form, not a class, so the
topic stays untyped (default Any) and subscribers pass
result_type=TResponseStreamEvent on their own subscribe call.

Opt in via `OpenAIAgentsPlugin(model_params=ModelActivityParameters(
streaming_event_topic="..."))`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Make WorkflowStreamItem generic in T so subscribers get a typed
data field that matches the result_type passed to subscribe:
  - subscribe(result_type=T)        -> WorkflowStreamItem[T]
  - subscribe()                     -> WorkflowStreamItem[Any]
  - subscribe(result_type=RawValue) -> WorkflowStreamItem[RawValue]

Adds def-style overloads to WorkflowStreamClient.subscribe (matching
the existing TopicHandle/WorkflowTopicHandle generic style) and
tightens TopicHandle.subscribe to AsyncIterator[WorkflowStreamItem[T]].
The internal workflow-side _log is annotated as
list[WorkflowStreamItem[Payload]] since the workflow does not decode.

No runtime behavior change; existing tests (which use unparameterized
WorkflowStreamItem) continue to type-check as WorkflowStreamItem[Any].

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
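The generic-item pattern above can be sketched minimally (illustrative stand-in, not the module's actual dataclass): parameterizing the item in `T` is what lets `subscribe` overloads hand back items whose `data` field matches the `result_type` the caller passed.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar("T")


@dataclass
class StreamItemSketch(Generic[T]):
    offset: int
    topic: str
    data: T  # typed to match the result_type passed to subscribe


def first_data(items: "list[StreamItemSketch[T]]") -> T:
    # A consumer written against StreamItemSketch[str] gets str back;
    # unparameterized use type-checks as StreamItemSketch[Any].
    return items[0].data
```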
jssmith added a commit that referenced this pull request Apr 30, 2026
Re-applies the google_adk_agents streaming integration originally split
out of PR #1423 on commit 59c7582. The bridge honors `stream=True`
and publishes raw `LlmResponse` chunks through a typed topic handle.

Opt in via the plugin's `streaming_event_topic` parameter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>