Fix HunyuanVideo 1.5 modular glyph regex for curly quotes by Ricardo-M-L · Pull Request #13523 · huggingface/diffusers

Ricardo-M-L · 2026-04-21T04:02:54Z

What does this PR do?

extract_glyph_texts in the HunyuanVideo 1.5 modular pipeline had a copy-paste typo in its regex:

# src/diffusers/modular_pipelines/hunyuan_video1_5/encoders.py
pattern = r"\"(.*?)\"|\"(.*?)\""

Both alternatives match ASCII double quotes. The second branch is redundant, and any text quoted with Chinese / smart curly quotes (“…”) is silently dropped — no glyph extraction happens, so the model never receives the formatted Text "..." instruction.

The canonical, non-modular pipeline has the intended pattern (https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/hunyuan_video1_5/pipeline_hunyuan_video1_5.py#L93):

pattern = r"\"(.*?)\"|“(.*?)”"

HunyuanVideo 1.5 is a bilingual Tencent model, so Chinese prompts that use “…” to mark on-screen text are a realistic and documented input shape. Under the modular path they currently skip glyph extraction entirely and produce mis-rendered poster/caption text.

Reproduction

import re

buggy = r"\"(.*?)\"|\"(.*?)\""          # modular path today
fixed = r"\"(.*?)\"|“(.*?)”"            # non-modular path / this fix

prompt = 'A poster with “你好”'
print(re.findall(buggy, prompt))  # -> []      (silently ignored)
print(re.findall(fixed, prompt))  # -> [('', '你好')]

Fix

Align the modular pipeline's pattern with the existing, correct non-modular pattern. One-character change.

Before submitting

This PR fixes a typo or improves the docs.
Did you read the contributor guideline?
Did you make sure to update the documentation with your changes? (N/A — internal helper, no doc references)
Did you write any new necessary tests? (N/A — trivially verifiable regex constant)

Who can review?

@DN6 @yiyixuxu

The `extract_glyph_texts` helper in the HunyuanVideo 1.5 modular pipeline had a copy-paste typo: both alternatives of the pattern matched ASCII double quotes (`\"(.*?)\"|\"(.*?)\"`), making the second branch redundant and silently dropping any text wrapped in the Chinese/smart curly quotes `“…”`. The non-modular counterpart in `src/diffusers/pipelines/hunyuan_video1_5/pipeline_hunyuan_video1_5.py` uses the correct pattern `\"(.*?)\"|“(.*?)”`, which is what the model expects for glyph rendering — HunyuanVideo 1.5 is a bilingual model, so Chinese prompts that quote text with `“…”` would skip glyph extraction entirely under the modular path and produce mis-rendered poster text. This aligns the modular pipeline's pattern with the canonical one.

github-actions Bot added modular-pipelines size/S PR with diff < 50 LOC labels Apr 21, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix HunyuanVideo 1.5 modular glyph regex for curly quotes#13523

Fix HunyuanVideo 1.5 modular glyph regex for curly quotes#13523
Ricardo-M-L wants to merge 1 commit intohuggingface:mainfrom
Ricardo-M-L:fix-hunyuan-video15-modular-glyph-regex

Ricardo-M-L commented Apr 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Ricardo-M-L commented Apr 21, 2026

What does this PR do?

Reproduction

Fix

Before submitting

Who can review?

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant