fix: fp8 group_fuse_moe #1323

Merged
shihaobai merged 2 commits into main from fuse_moe_fp8
May 27, 2026

Conversation

@shihaobai
Collaborator

No description provided.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request updates the Triton kernel for grouped fused MoE so that the load from `a_scale_ptrs` applies a mask and a default value, preventing potential out-of-bounds memory accesses. It also adds and updates several autotune kernel configuration JSON files for Triton 3.5.1 on NVIDIA H100 GPUs, covering operations such as grouped matmul, fused mrope, and `silu_and_mul`. There are no review comments to address, and I have no further feedback to provide.
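
The diff itself is not shown here, so the following is only a sketch of the general masked-load pattern the review describes, not the actual change. In Triton, `tl.load(pointer, mask=..., other=...)` skips lanes whose mask is false and fills them with `other`. The plain-Python emulation below mirrors that behavior so it runs without a GPU; `masked_load`, `load_scale_block`, `a_scale`, `num_valid_tokens`, `BLOCK_SIZE_M`, and the default value `0.0` are all hypothetical names and values chosen for illustration.

```python
def masked_load(buffer, offsets, mask, other=0.0):
    """Emulate tl.load(ptr + offsets, mask=mask, other=other) on a Python list.

    Lanes with a false mask never touch the buffer and receive `other`.
    """
    return [buffer[o] if m else other for o, m in zip(offsets, mask)]


# Hypothetical per-token activation scales: only 5 tokens are valid.
a_scale = [0.5, 0.25, 1.0, 2.0, 0.125]
num_valid_tokens = len(a_scale)
BLOCK_SIZE_M = 4  # assumed tile height; the last tile overhangs the buffer


def load_scale_block(block_id):
    # Row offsets covered by this tile, like block_id * BLOCK + tl.arange(0, BLOCK).
    offsets = [block_id * BLOCK_SIZE_M + i for i in range(BLOCK_SIZE_M)]
    # Mask off rows past the end so they are never read.
    mask = [o < num_valid_tokens for o in offsets]
    return masked_load(a_scale, offsets, mask, other=0.0)


first_block = load_scale_block(0)   # all four rows are in bounds
second_block = load_scale_block(1)  # only row 4 is in bounds; rows 5..7 are masked
print(first_block)
print(second_block)
```

Without the mask, the second tile would index rows 5 through 7, which do not exist. In this emulation that raises an `IndexError`; on a GPU the equivalent unmasked load would read memory outside the scale tensor.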

@shihaobai shihaobai merged commit 375ad57 into main May 27, 2026
1 check passed
@shihaobai shihaobai deleted the fuse_moe_fp8 branch May 27, 2026 07:51
1 participant