[mlir][ROCDL] Remove unneeded bf16 expansion in LowerGPUToROCDL #139603

Merged
merged 1 commit into from
May 12, 2025

krzysz00
Contributor

The umbrella pass for lowering GPU ops to ROCDL (aka lowering to LLVM + the AMDGPU-specific setup) would call the arith patterns that manually implemented extf and truncf on bfloat because the LLVM AMDGPU backend used to not support those operations.

Since the backend does now support these operations and has for quite some time, remove these patterns from the default lowering flow.
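
For readers unfamiliar with what a manual bf16 expansion involves, the following is a minimal standalone sketch of the kind of bit manipulation a software implementation of extf and truncf on bfloat performs. It is not the MLIR pattern code itself: the function names `bf16_to_f32` and `f32_to_bf16` are hypothetical, and the round-to-nearest-even behavior and NaN handling shown here are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: widen a bf16 value (given as its raw 16 bits) to f32.
// bf16 has the same sign and exponent layout as the upper half of an f32,
// so widening is a 16-bit left shift followed by a bit reinterpretation.
inline float bf16_to_f32(std::uint16_t bits) {
  std::uint32_t wide = static_cast<std::uint32_t>(bits) << 16;
  float out;
  std::memcpy(&out, &wide, sizeof(out));
  return out;
}

// Hypothetical helper: narrow an f32 to raw bf16 bits.
// Assumption: round to nearest, ties to even, with NaNs kept as quiet NaNs.
inline std::uint16_t f32_to_bf16(float value) {
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof(bits));
  if (std::isnan(value)) {
    // Truncate the payload and set the top mantissa bit so the result
    // cannot collapse into an infinity encoding.
    return static_cast<std::uint16_t>((bits >> 16) | 0x0040u);
  }
  // Add 0x7FFF plus the lowest kept bit: values above the halfway point
  // round up, and exact ties round toward an even lowest kept bit.
  std::uint32_t lsb = (bits >> 16) & 1u;
  bits += 0x7FFFu + lsb;
  return static_cast<std::uint16_t>(bits >> 16);
}
```

With native backend support for these conversions, a sequence like this no longer needs to be emitted at the MLIR level, which is what makes the expansion patterns unnecessary in the default flow.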

@krzysz00 krzysz00 changed the title [mlir][ROCDL] Remove unneeded bf16 expansion in LowerGPUToRocdl [mlir][ROCDL] Remove unneeded bf16 expansion in LowerGPUToROCDL May 12, 2025
@krzysz00 krzysz00 requested a review from kuhar May 12, 2025 18:32
@llvmbot
Member

llvmbot commented May 12, 2025

@llvm/pr-subscribers-mlir

Author: Krzysztof Drewniak (krzysz00)

Changes

The umbrella pass for lowering GPU ops to ROCDL (aka lowering to LLVM + the AMDGPU-specific setup) would call the arith patterns that manually implemented extf and truncf on bfloat because the LLVM AMDGPU backend used to not support those operations.

Since the backend does now support these operations and has for quite some time, remove these patterns from the default lowering flow.


Full diff: https://github.com/llvm/llvm-project/pull/139603.diff

1 Files Affected:

  • (modified) mlir/lib/Conversion/GPUToROCDL/LowerGpuOpsToROCDLOps.cpp (-1)
diff --git a/mlir/lib/Conversion/GPUToROCDL/LowerGpuOpsToROCDLOps.cpp b/mlir/lib/Conversion/GPUToROCDL/LowerGpuOpsToROCDLOps.cpp
index dd16ec4b73e9f..c52bf505de4a5 100644
--- a/mlir/lib/Conversion/GPUToROCDL/LowerGpuOpsToROCDLOps.cpp
+++ b/mlir/lib/Conversion/GPUToROCDL/LowerGpuOpsToROCDLOps.cpp
@@ -319,7 +319,6 @@ struct LowerGpuOpsToROCDLOpsPass final
     {
       RewritePatternSet patterns(ctx);
       populateGpuRewritePatterns(patterns);
-      arith::populateExpandBFloat16Patterns(patterns);
       (void)applyPatternsGreedily(m, std::move(patterns));
     }
 

@llvmbot
Member

llvmbot commented May 12, 2025

@llvm/pr-subscribers-mlir-gpu

Author: Krzysztof Drewniak (krzysz00)

@@ -319,7 +319,6 @@ struct LowerGpuOpsToROCDLOpsPass final
     {
       RewritePatternSet patterns(ctx);
       populateGpuRewritePatterns(patterns);
-      arith::populateExpandBFloat16Patterns(patterns);
Contributor

Looks good to me, but it looks like I don't have "approval" access.

@krzysz00 krzysz00 merged commit 2880859 into llvm:main May 12, 2025
12 of 13 checks passed
4 participants