Improve TorchAO quantization test coverage and XPU support #13530
jiqing-feng wants to merge 11 commits into huggingface:main
Conversation
Hi @sayakpaul. Would you please review this PR? Thanks!
There are a bunch of things going on in this PR. I would suggest breaking the PR into smaller PRs.
Hi @sayakpaul. Thanks for the review! I've split this PR into 5 smaller independent PRs as suggested:

Each PR is independent and can be reviewed/merged separately. Will close this PR once the split PRs are up.
```python
    def is_compileable(self) -> bool:
        return True

    def _dequantize(self, model):
```
We shouldn't have `_dequantize` here in this PR, right?
Yes, please review the change here: #13538
```diff
 return {
     "hidden_states": randn_tensor(
-        (1, 36, 21, 64, 64), generator=self.generator, device=torch_device, dtype=self.torch_dtype
+        (1, 36, 5, 16, 16), generator=self.generator, device=torch_device, dtype=self.torch_dtype
```
It's to avoid OOM; see #13541 for details. Please let me know if you want a comment added in the code.
```python
def _get_dummy_inputs_for_model(self, model):
    inputs = self.get_dummy_inputs()
    model_dtype = next(model.parameters()).dtype
    return {
        k: v.to(model_dtype) if isinstance(v, torch.Tensor) and v.is_floating_point() else v
        for k, v in inputs.items()
    }
```
`QuantizationCompileTesterMixin` is an independent mixin that doesn't inherit from `QuantizationTesterMixin`. Test classes may use either one or both, so the method needs to be defined in both places.
Alternatively, I can extract it into a shared base class or a standalone utility function to avoid the code duplication; a rough sketch of the standalone-function option is below. Let me know which approach you prefer. Please review this change in #13539.
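For what it's worth, a minimal sketch of the standalone-utility option (the function name `cast_dummy_inputs_to_model_dtype` is hypothetical, not part of the PR):

```python
import torch


def cast_dummy_inputs_to_model_dtype(inputs: dict, model: torch.nn.Module) -> dict:
    # Hypothetical free function that both mixins could call instead of each
    # defining its own _get_dummy_inputs_for_model method.
    model_dtype = next(model.parameters()).dtype
    return {
        k: v.to(model_dtype) if isinstance(v, torch.Tensor) and v.is_floating_point() else v
        for k, v in inputs.items()
    }
```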
Hi @sayakpaul. I have separated this PR into 5 small PRs; please review them one by one if that is easier for you. Thanks!
What does this PR do?

This PR improves the TorchAO quantization testing infrastructure with several fixes: enabling `int4wo` tests on Intel XPU, implementing `_dequantize` for TorchAO, fixing input dtype mismatches, and fixing training gradient underflow.

Changes
- **Enable int4wo tests on XPU**: Removed the `_int4wo_skip` marker that restricted `int4wo` tests to CUDA only, allowing them to run on all accelerator backends.
- **XPU-specific int4 packing format**: Added XPU-specific handling in `_get_quant_config()`, since Intel XPU requires `int4_packing_format="plain_int32"` for `Int4WeightOnlyConfig` (see the sketch after this list).
- **Fix input dtype casting**: Introduced a `_get_dummy_inputs_for_model(model)` helper in `QuantizationTesterMixin` to automatically cast floating-point input tensors to the model's parameter dtype, preventing dtype mismatches during quantized model inference.
- **Implement `_dequantize` for TorchAO**: Added a `_dequantize()` method in `TorchAoHfQuantizer` that iterates all `nn.Linear` modules, calls `weight.dequantize()` on `TorchAOBaseTensor` weights, and replaces them with standard `nn.Parameter`. Also fixed `_verify_if_layer_quantized` to check `isinstance(module.weight, TorchAOBaseTensor)` so dequantized layers are correctly detected as non-quantized (see the sketch after this list).
- **Fix training gradient underflow**: Changed the autocast dtype from `float16` to `bfloat16` in `_test_quantization_training`. Float16's limited dynamic range (max ~65504, min subnormal ~5.96e-8) causes gradients to underflow to zero when passing through quantized tensor subclass operations; bfloat16 shares float32's exponent range and avoids this issue (see the sketch after this list).
- **Reduce WanAnimate TorchAO test input sizes**: Shrunk dummy inputs in `TestWanAnimateTransformer3DTorchAo` to avoid OOM on devices without FlashAttention (e.g. XPU, which falls back to math SDPA and materializes the full O(S²) attention matrix). Reduced `hidden_states` from (1, 36, 21, 64, 64) to (1, 36, 5, 16, 16) and `face_pixel_values` from (1, 3, 77, 512, 512) to (1, 3, 13, 512, 512), bringing the self-attention sequence length from 21,504 down to 320 and peak attention memory from ~74 GiB to ~16 MB. The face frame count (13) is chosen so that the face encoder's two stride-2 convolutions produce a temporal output of 4, which plus 1 padding frame gives 5, matching the `hidden_states` temporal dim (arithmetic sketch after this list).
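A few sketches of the changes above, for reviewers' convenience. First, the XPU branch in `_get_quant_config()`. The `group_size` value and the function shape are illustrative assumptions; only the `int4_packing_format="plain_int32"` requirement comes from this PR:

```python
from torchao.quantization import Int4WeightOnlyConfig


def _get_quant_config(device_type: str):
    # Intel XPU only supports the plain int32 packing format for int4
    # weight-only quantization; other backends use the torchao default.
    # group_size=128 is an assumed value for illustration.
    if device_type == "xpu":
        return Int4WeightOnlyConfig(group_size=128, int4_packing_format="plain_int32")
    return Int4WeightOnlyConfig(group_size=128)
```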
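Next, the `_dequantize()` behavior, written here as a free function for brevity (the PR implements it as a method on `TorchAoHfQuantizer`; the `torchao.utils` import path is an assumption):

```python
import torch.nn as nn
from torchao.utils import TorchAOBaseTensor


def dequantize_torchao_linears(model: nn.Module) -> nn.Module:
    # Walk all nn.Linear modules and replace TorchAO tensor-subclass weights
    # with plain dequantized tensors, so the model behaves as unquantized and
    # _verify_if_layer_quantized's isinstance check no longer matches.
    for module in model.modules():
        if isinstance(module, nn.Linear) and isinstance(module.weight, TorchAOBaseTensor):
            module.weight = nn.Parameter(module.weight.dequantize())
    return model
```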
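The gradient-underflow fix is essentially a one-line autocast change. A self-contained toy illustration (the tiny linear model and mean loss are stand-ins, not the PR's actual test code):

```python
import torch
import torch.nn as nn

device = "cuda"  # or "xpu"
model = nn.Linear(16, 16).to(device)
x = torch.randn(2, 16, device=device)

# bfloat16 keeps float32's 8-bit exponent, so tiny gradients that would
# underflow to zero in float16 (min subnormal ~5.96e-8) survive backward.
with torch.autocast(device_type=device, dtype=torch.bfloat16):  # previously torch.float16
    loss = model(x).float().mean()
loss.backward()
```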
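Finally, the WanAnimate arithmetic, assuming a (1, 2, 2) patch size (inferred from the quoted sequence lengths, not stated in the PR):

```python
# Sequence length after (1, 2, 2) patchification: T * (H // 2) * (W // 2).
old_seq = 21 * (64 // 2) * (64 // 2)  # 21504
new_seq = 5 * (16 // 2) * (16 // 2)   # 320

# Math SDPA materializes an S x S score matrix, so memory scales with S**2.
ratio = (old_seq / new_seq) ** 2      # ~4516x
print(74 * 1024 / ratio)              # ~16.8 -> consistent with "~74 GiB to ~16 MB"

# Face frames: 13 -> two stride-2 convolutions -> 7 -> 4; plus 1 padding
# frame gives 5, matching the hidden_states temporal dimension.
```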