helion.language.dot_scaled

helion.language.dot_scaled(mat1, mat1_scale, mat1_format, mat2, mat2_scale, mat2_format, acc=None, out_dtype=None)

Performs a block-scaled matrix multiplication using Triton’s tl.dot_scaled.

Inputs may be in FP4 (e2m1), FP8 (e4m3, e5m2), BF16, or FP16 format. Each input tensor is paired with a scale-factor tensor and a format string that names its element format.
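The arithmetic behind a block-scaled product can be sketched in plain Python. This is only an illustrative reference, not Helion's or Triton's implementation. It assumes the operands are already decoded to floats, that one scale factor covers `block` consecutive elements along the reduction dimension, and that the scale tensors are laid out as `[M, K // block]` and `[N, K // block]`. The function name `block_scaled_matmul` and the `block` parameter are hypothetical.

```python
def block_scaled_matmul(a, a_scale, b, b_scale, block=32):
    """Reference block-scaled matmul on nested lists of floats.

    a:       M x K values
    a_scale: M x (K // block) scale factors (assumed layout)
    b:       K x N values
    b_scale: N x (K // block) scale factors (assumed layout)
    """
    m, k = len(a), len(a[0])
    n = len(b[0])
    out = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            total = 0.0
            for kk in range(k):
                g = kk // block  # index of the scale block covering element kk
                total += (a[i][kk] * a_scale[i][g]) * (b[kk][j] * b_scale[j][g])
            out[i][j] = total
    return out


# Tiny example with block=2 so the scaling is easy to follow by hand.
result = block_scaled_matmul(
    [[1.0, 2.0, 3.0, 4.0]],        # a: 1 x 4
    [[2.0, 0.5]],                  # a_scale: 1 x 2
    [[1.0], [1.0], [1.0], [1.0]],  # b: 4 x 1
    [[1.0, 4.0]],                  # b_scale: 1 x 2
    block=2,
)
print(result)  # [[20.0]]
```

The first block contributes (1 + 2) · 2 · 1 = 6 and the second (3 + 4) · 0.5 · 4 = 14, which gives the total of 20.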

Parameters:

mat1 (Tensor) – First matrix (2D tensor of packed data)
mat1_scale (Tensor) – Scale factors for mat1 (2D tensor)
mat1_format (str) – Format string for mat1 (one of “e2m1”, “e4m3”, “e5m2”, “bf16”, “fp16”)
mat2 (Tensor) – Second matrix (2D tensor of packed data)
mat2_scale (Tensor) – Scale factors for mat2 (2D tensor)
mat2_format (str) – Format string for mat2 (one of “e2m1”, “e4m3”, “e5m2”, “bf16”, “fp16”)
acc (Tensor | None) – Optional accumulator tensor (2D, float32 or float16)
out_dtype (dtype | None) – Optional output dtype for the multiplication
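To make "packed data" and the "e2m1" format string more concrete, the following sketch decodes FP4 (e2m1) codes in plain Python. The eight non-negative e2m1 magnitudes are fixed by the format (1 sign bit, 2 exponent bits, 1 mantissa bit). The packing order (two codes per byte, low nibble first) and the e8m0 scale encoding (a power of two with bias 127) are assumptions made for illustration, so check the Triton documentation for the layout `tl.dot_scaled` actually expects. All function names here are hypothetical helpers, not part of Helion.

```python
# Magnitudes representable by the 3 low bits of an e2m1 code.
E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def decode_e2m1(code):
    """Decode one 4-bit e2m1 code (0..15) to a Python float."""
    sign = -1.0 if code & 0x8 else 1.0
    return sign * E2M1_VALUES[code & 0x7]


def unpack_e2m1(packed):
    """Unpack bytes holding two e2m1 codes each.

    Assumption: the low nibble holds the first element.
    """
    out = []
    for byte in packed:
        out.append(decode_e2m1(byte & 0xF))
        out.append(decode_e2m1(byte >> 4))
    return out


def decode_e8m0(byte):
    """Decode an assumed e8m0 scale byte: a pure power of two, bias 127."""
    return 2.0 ** (byte - 127)


values = unpack_e2m1([0x21, 0xF7])
print(values)            # [0.5, 1.0, 6.0, -6.0]
print(decode_e8m0(128))  # 2.0
```

Under these assumptions, a packed FP4 operand stores half as many bytes along the packed dimension as it has logical elements.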

Return type:

Tensor

Returns:

Result of block-scaled matrix multiplication.