[SimplifyLibCalls] Shrink sin, cos to sinf, cosf when allowed #139082

Merged: 1 commit merged into main on May 9, 2025

@guy-david guy-david commented May 8, 2025

This optimization already exists for the libcall versions of these functions, but not for their intrinsic forms.
Solves #139044.

There are probably more opportunities for other intrinsics, because the switch-case in LibCallSimplifier::optimizeCall covers only pow, exp2, log, log2, log10, sqrt, memset, memcpy and memmove.
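
To illustrate what the shrink means numerically, here is a minimal standalone C++ sketch (plain `<cmath>`, not LLVM code). The helper names `cos_via_double`, `cos_direct`, `max_abs_difference`, and `demo` are hypothetical, and the sample grid and the `1e-6` tolerance are arbitrary choices made for illustration. It compares the widen, call, narrow form with a direct single-precision call. The two forms are not guaranteed to agree bit for bit, which is presumably why the transform sits behind `UnsafeFPShrink`.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>

// Unshrunk form: widen to double, call the double-precision routine,
// then narrow the result back to float.
float cos_via_double(float x) {
  return static_cast<float>(std::cos(static_cast<double>(x)));
}

// Shrunk form: call the single-precision overload directly.
float cos_direct(float x) {
  return std::cos(x);
}

// Largest absolute difference between the two forms on a small sample grid
// covering [-4, 4] in steps of 0.01 (grid chosen arbitrarily).
float max_abs_difference() {
  float worst = 0.0f;
  for (int i = -400; i <= 400; ++i) {
    float x = static_cast<float>(i) / 100.0f;
    float diff = std::fabs(cos_via_double(x) - cos_direct(x));
    if (diff > worst)
      worst = diff;
  }
  return worst;
}

// Prints the observed gap and checks it against a loose, assumed tolerance.
void demo() {
  float worst = max_abs_difference();
  std::printf("max |cos_via_double - cos_direct| = %g\n", worst);
  assert(worst <= 1e-6f);
}
```

Calling `demo()` from a `main` function prints the largest gap seen on the grid; the exact value depends on the math library in use.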


llvmbot commented May 8, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Guy David (guy-david)

Full diff: https://github.com/llvm/llvm-project/pull/139082.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp (+5)
  • (added) llvm/test/Transforms/InstCombine/simplify-intrinsics.ll (+61)
diff --git a/llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp b/llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
index 941e787f91eff..94a79ad824370 100644
--- a/llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
@@ -4136,6 +4136,11 @@ Value *LibCallSimplifier::optimizeCall(CallInst *CI, IRBuilderBase &Builder) {
       return optimizeMemCpy(CI, Builder);
     case Intrinsic::memmove:
       return optimizeMemMove(CI, Builder);
+    case Intrinsic::sin:
+    case Intrinsic::cos:
+      if (UnsafeFPShrink)
+        return optimizeUnaryDoubleFP(CI, Builder, TLI, /*isPrecise=*/true);
+      return nullptr;
     default:
       return nullptr;
     }
diff --git a/llvm/test/Transforms/InstCombine/simplify-intrinsics.ll b/llvm/test/Transforms/InstCombine/simplify-intrinsics.ll
new file mode 100644
index 0000000000000..8d7f70b6d1a61
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/simplify-intrinsics.ll
@@ -0,0 +1,61 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt < %s -passes=instcombine -S                             | FileCheck %s --check-prefixes=ANY,NO-FLOAT-SHRINK
+; RUN: opt < %s -passes=instcombine -enable-double-float-shrink -S | FileCheck %s --check-prefixes=ANY,DO-FLOAT-SHRINK
+
+target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
+
+declare double @llvm.cos.f64(double)
+declare float @llvm.cos.f32(float)
+
+declare double @llvm.sin.f64(double)
+declare float @llvm.sin.f32(float)
+
+; cos -> cosf
+
+; ANY-LABEL: define float @cos_no_fastmath(float %f) {
+; NO-FLOAT-SHRINK: call double @llvm.cos.f64
+; DO-FLOAT-SHRINK: %[[RES:[0-9]+]] = call float @llvm.cos.f32
+; DO-FLOAT-SHRINK: ret float %[[RES]]
+; ANY: }
+define float @cos_no_fastmath(float %f) {
+  %d = fpext float %f to double
+  %result = call double @llvm.cos.f64(double %d)
+  %truncated_result = fptrunc double %result to float
+  ret float %truncated_result
+}
+
+; ANY-LABEL: define float @cos_fastmath(float %f) {
+; ANY: %[[RES:[0-9]+]] = call fast float @llvm.cos.f32
+; ANY: ret float %[[RES]]
+; ANY: }
+define float @cos_fastmath(float %f) {
+  %d = fpext fast float %f to double
+  %result = call fast double @llvm.cos.f64(double %d)
+  %truncated_result = fptrunc fast double %result to float
+  ret float %truncated_result
+}
+
+; sin -> sinf
+
+; ANY-LABEL: define float @sin_no_fastmath(float %f) {
+; NO-FLOAT-SHRINK: call double @llvm.sin.f64
+; DO-FLOAT-SHRINK: %[[RES:[0-9]+]] = call float @llvm.sin.f32
+; DO-FLOAT-SHRINK: ret float %[[RES]]
+; ANY: }
+define float @sin_no_fastmath(float %f) {
+  %d = fpext float %f to double
+  %result = call double @llvm.sin.f64(double %d)
+  %truncated_result = fptrunc double %result to float
+  ret float %truncated_result
+}
+
+; ANY-LABEL: define float @sin_fastmath(float %f) {
+; ANY: %[[RES:[0-9]+]] = call fast float @llvm.sin.f32
+; ANY: ret float %[[RES]]
+; ANY: }
+define float @sin_fastmath(float %f) {
+  %d = fpext fast float %f to double
+  %result = call fast double @llvm.sin.f64(double %d)
+  %truncated_result = fptrunc fast double %result to float
+  ret float %truncated_result
+}

@fhahn fhahn requested review from arsenm and davemgreen May 8, 2025 13:41
; ANY: }
define float @sin_fastmath(float %f) {
%d = fpext fast float %f to double
%result = call fast double @llvm.sin.f64(double %d)
Contributor

Is this valid only with all fast-math flags, or with some subset?

@guy-david guy-david May 8, 2025

The existing optimization for libcalls requires all fast-math flags to be present.

Contributor

We should probably fix that; this shouldn't require all of them (in particular, not nsz or contract).
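
The difference between requiring every fast-math flag and requiring only a subset can be sketched with a toy bitmask model. This is not LLVM's `FastMathFlags` class: the enum, `allowsShrinkStrict`, `allowsShrinkRelaxed`, and `demoGates` are hypothetical names. The relaxed rule shown here (everything except nsz and contract) is only an assumption drawn from the comment above; the thread does not settle which flags are actually needed.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of the seven per-instruction fast-math flags.
enum FMF : std::uint8_t {
  Reassoc = 1 << 0,
  NoNaNs = 1 << 1,
  NoInfs = 1 << 2,
  NoSignedZeros = 1 << 3, // nsz
  AllowReciprocal = 1 << 4,
  AllowContract = 1 << 5, // contract
  ApproxFunc = 1 << 6,
  AllFlags = (1 << 7) - 1 // what the `fast` keyword implies
};

// Gate as described in the thread: every flag must be present.
bool allowsShrinkStrict(std::uint8_t flags) {
  return (flags & AllFlags) == AllFlags;
}

// Hypothetical relaxed gate: ignore nsz and contract, require the rest.
// Assumption for illustration only, not a decision made in this PR.
bool allowsShrinkRelaxed(std::uint8_t flags) {
  const std::uint8_t required =
      static_cast<std::uint8_t>(AllFlags & ~(NoSignedZeros | AllowContract));
  return (flags & required) == required;
}

// Shows a flag set that only the relaxed gate accepts.
void demoGates() {
  const std::uint8_t withoutNsz =
      static_cast<std::uint8_t>(AllFlags & ~NoSignedZeros);
  assert(!allowsShrinkStrict(withoutNsz));
  assert(allowsShrinkRelaxed(withoutNsz));
}
```

Under this model a call carrying every flag except nsz is rejected by the strict gate and accepted by the relaxed one, which is the behavior change the comment asks for.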

@guy-david guy-david force-pushed the users/guy-david/shrink-sin-cos-f64-to-f32 branch from 04e11b7 to 31c76dd Compare May 8, 2025 13:47
@guy-david guy-david added the floating-point Floating-point math label May 8, 2025
@guy-david guy-david force-pushed the users/guy-david/shrink-sin-cos-f64-to-f32 branch from 31c76dd to 5ce64df Compare May 8, 2025 14:16
@guy-david guy-david force-pushed the users/guy-david/shrink-sin-cos-f64-to-f32 branch from 5ce64df to 76489c3 Compare May 8, 2025 17:11
@arsenm arsenm left a comment

LGTM but this can handle the vector case as well

%result = call fast double @llvm.sin.f64(double %d)
%truncated_result = fptrunc double %result to float
ret float %truncated_result
}
Contributor

Also test vector case

@guy-david guy-david May 8, 2025

This doesn't work, I think because of this check:

if (!CI->getType()->isDoubleTy() || !CalleeFn)

Let's build on this in a follow-up PR.
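
Why a scalar-only type check would reject the vector case can be sketched with a toy type model. This is not LLVM's `Type` API: `ToyType`, `isScalarDouble`, `isDoubleOrDoubleVector`, and `demoTypeChecks` are hypothetical names, and the relaxed predicate is only one possible shape a follow-up could take.

```cpp
#include <cassert>

// Toy stand-in for an IR type: a scalar kind plus an optional vector length.
struct ToyType {
  enum Kind { Float, Double } scalar;
  unsigned lanes; // 0 means scalar, N > 0 means <N x scalar>
};

// Scalar-only test, analogous in spirit to the quoted isDoubleTy() check:
// a vector of doubles does not count as "double".
bool isScalarDouble(const ToyType &t) {
  return t.lanes == 0u && t.scalar == ToyType::Double;
}

// Hypothetical relaxed test that looks at the element type, so both
// `double` and `<N x double>` pass.
bool isDoubleOrDoubleVector(const ToyType &t) {
  return t.scalar == ToyType::Double;
}

// Shows that only the relaxed test accepts a 4-lane vector of doubles.
void demoTypeChecks() {
  const ToyType vec4Double{ToyType::Double, 4u};
  assert(!isScalarDouble(vec4Double));
  assert(isDoubleOrDoubleVector(vec4Double));
}
```

In this model the early return fires for `<4 x double>` under the scalar-only test, so the shrink is never attempted for vectors.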

@fhahn fhahn left a comment

LGTM, thanks

Would be good to do this for all relevant FP functions

@guy-david guy-david merged commit a1beb61 into main May 9, 2025
9 of 11 checks passed
@guy-david guy-david deleted the users/guy-david/shrink-sin-cos-f64-to-f32 branch May 9, 2025 04:25