fix: fp8 group_fuse_moe by shihaobai · Pull Request #1323 · ModelTC/LightLLM

shihaobai · 2026-05-27T07:44:59Z

No description provided.

gemini-code-assist

Code Review

This pull request updates the Triton kernel for grouped fused MoE to apply a mask and default value when loading a_scale_ptrs, preventing potential out-of-bounds memory accesses. Additionally, it introduces and updates several autotune kernel configuration JSON files for Triton 3.5.1 on NVIDIA H100 GPUs, covering operations like grouped matmul, fused mrope, and silu_and_mul. There are no review comments to address, and I have no further feedback to provide.
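For readers without the diff open, here is a minimal sketch of the masked-load idea the review describes. It is a pure-Python emulation and not the kernel code from this PR. The helper names (`masked_load`, `load_scale_block`), the per-row scale layout, and the sample values are all assumptions made for illustration. In Triton the equivalent call would be roughly `tl.load(ptrs, mask=mask, other=0.0)`.

```python
# Pure-Python emulation of a masked block load (no GPU or Triton needed).
# Hypothetical helpers for illustration; not the code changed in this PR.

def masked_load(buffer, offsets, mask, other=0.0):
    """Return buffer[off] for lanes where mask is True, else `other`.

    Masked-off lanes never index into `buffer`, so an out-of-range
    offset in such a lane cannot fault.
    """
    return [buffer[off] if keep else other for off, keep in zip(offsets, mask)]


def load_scale_block(a_scale, block_start, block_m):
    """Load one block of per-row activation scales (assumed layout)."""
    num_rows = len(a_scale)
    offsets = [block_start + i for i in range(block_m)]
    # Lanes that fall past the last valid row are disabled.
    mask = [off < num_rows for off in offsets]
    return masked_load(a_scale, offsets, mask, other=0.0)


a_scale = [0.5, 0.25, 0.125, 1.0, 2.0]  # 5 valid rows, made-up values
tail = load_scale_block(a_scale, block_start=4, block_m=4)
print(tail)  # [2.0, 0.0, 0.0, 0.0]
```

The last block starts at row 4 but spans four lanes, so three lanes point past the end of the buffer. Without the mask those lanes would read out of bounds; with it they receive the default value instead.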

shihaobai added 2 commits May 27, 2026 05:28

fix fp8 group moe

db98f40

H100 122b fp8 tuning config

eeec627

gemini-code-assist Bot reviewed May 27, 2026

shihaobai merged commit 375ad57 into main May 27, 2026
1 check passed

shihaobai deleted the fuse_moe_fp8 branch May 27, 2026 07:51
