Calls to sin() don't get converted to sinf, leading to inefficient use of vector variants with -fveclib=ArmPL #139044

willlovett-arm · 2025-05-08T08:15:24Z

#include <math.h>

void foo(float *p)
{
  for (int x=0; x<999; ++x) {
    p[x] = sin(p[x]);
  }
}

With -fveclib=ArmPL, this generates calls to armpl_vsinq_f64 instead of armpl_vsinq_f32.

If we replace that code with sinf it works fine:

#include <math.h>

void foo(float *p)
{
  for (int x=0; x<999; ++x) {
    p[x] = sinf(p[x]);
  }
}

I'm reasonably sure it's legitimate to implicitly switch to sinf(), and this would lead to double the vector throughput.


llvmbot · 2025-05-08T08:15:41Z

@llvm/issue-subscribers-backend-aarch64

Author: Will Lovett (willlovett-arm)

https://godbolt.org/z/fE16nGPxP


davemgreen · 2025-05-08T09:24:18Z

We already do this when the call is to the @sin library function, but not when it is the @llvm.sin intrinsic.
https://godbolt.org/z/MTKefx5Yr

This optimization already exists, but for the libcall versions of these functions and not for their intrinsic form. Solves #139044. There are probably more opportunities for other intrinsics, because the switch-case in `LibCallSimplifier::optimizeCall` covers only `pow`, `exp2`, `log`, `log2`, `log10`, `sqrt`, `memset`, `memcpy` and `memmove`.


fhahn · 2025-05-09T09:32:50Z

Should be fixed by #139082

willlovett-arm · 2025-05-09T15:00:55Z

Amazing turnaround! Thanks @guy-david, all 🙏

willlovett-arm added the backend:AArch64 and SVE (ARM Scalable Vector Extensions) labels May 8, 2025

davemgreen added the llvm:instcombine and floating-point (Floating-point math) labels and removed the backend:AArch64 and SVE (ARM Scalable Vector Extensions) labels May 8, 2025

MacDue added the missed-optimization label May 8, 2025

guy-david mentioned this issue May 8, 2025

[SimplifyLibCalls] Shrink sin, cos to sinf, cosf when allowed #139082

Merged

fhahn closed this as completed May 9, 2025

EugeneZelenko added llvm:transforms and removed llvm:instcombine labels May 9, 2025
