forked from NVIDIA/cutlass
Description
Summary
Some models use FP32 scales even for FP8 weights. FP8 ScaledMM should be able to support FP32 scales.
Details
Currently, FP16 scales are used in the FP8 scaledMM kernel.
I tried to switch to FP32 scales by defining an FP32 copy-atom for the scales, analogous to the FP16 copy-atom currently used to copy them, but that did not work.
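For concreteness, here is a minimal NumPy sketch of the numerics the kernel should reproduce once FP32 scales are supported. This is only a reference for the intended semantics, not the CUTLASS/CuTe implementation; the function name `scaled_mm_ref` and the use of float32 arrays to stand in for dequantized FP8 operands are my own assumptions for illustration.

```python
import numpy as np

def scaled_mm_ref(a_q: np.ndarray, b_q: np.ndarray,
                  scale_a: np.ndarray, scale_b: np.ndarray) -> np.ndarray:
    """Reference for FP8 scaledMM semantics with FP32 scales.

    a_q, b_q : quantized operands (FP8 values emulated here as float32)
    scale_a, scale_b : FP32 scales; scalars for per-tensor scaling, or
        broadcastable arrays (e.g. per-row / per-column) for finer granularity.
    Computes C = (a_q * scale_a) @ (b_q * scale_b), accumulating in FP32.
    """
    a = a_q.astype(np.float32) * np.float32(scale_a)
    b = b_q.astype(np.float32) * np.float32(scale_b)
    return a @ b

# Per-tensor example: every element of the 2x2 result is 3 * 0.5 * 2.0 = 3.0
c = scaled_mm_ref(np.ones((2, 3)), np.ones((3, 2)),
                  np.float32(0.5), np.float32(2.0))
```

The point of the sketch is that the scales enter the epilogue in full FP32 precision; downcasting them to FP16 (the current behavior) loses range/precision for models that ship FP32 scale tensors.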
Please advise how to go about adding support for FP32 scales in the FP8 scaledMM.
Thank you!