Skip to content

[AArch64] can use XAR for vector rotates where possible #137162

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
k-arrows opened this issue Apr 24, 2025 · 14 comments · Fixed by #137629
Closed

[AArch64] can use XAR for vector rotates where possible #137162

k-arrows opened this issue Apr 24, 2025 · 14 comments · Fixed by #137629
Labels
backend:AArch64 good first issue https://github.com/llvm/llvm-project/contribute missed-optimization

Comments

@k-arrows
Copy link

Here is the code from gcc testsuite.
https://godbolt.org/z/fejTchexj

typedef char __attribute__ ((vector_size (16))) v16qi;
typedef unsigned short __attribute__ ((vector_size (16))) v8hi;
typedef unsigned int __attribute__ ((vector_size (16))) v4si;
typedef unsigned long long __attribute__ ((vector_size (16))) v2di;
typedef char __attribute__ ((vector_size (8))) v8qi;
typedef unsigned short __attribute__ ((vector_size (8))) v4hi;
typedef unsigned int __attribute__ ((vector_size (8))) v2si;

v2di
G1 (v2di r) {
    return (r >> 39) | (r << 25);
}

v4si
G2 (v4si r) {
    return (r >> 23) | (r << 9);
}

v8hi
G3 (v8hi r) {
    return (r >> 5) | (r << 11);
}

v16qi
G4 (v16qi r)
{
  return (r << 2) | (r >> 6);
}

v2si
G5 (v2si r) {
    return (r >> 22) | (r << 10);
}

v4hi
G6 (v4hi r) {
    return (r >> 7) | (r << 9);
}

v8qi
G7 (v8qi r)
{
  return (r << 3) | (r >> 5);
}
@llvmbot
Copy link
Member

llvmbot commented Apr 24, 2025

@llvm/issue-subscribers-backend-aarch64

Author: None (k-arrows)

Here is the code from gcc testsuite. https://godbolt.org/z/fejTchexj ```c typedef char __attribute__ ((vector_size (16))) v16qi; typedef unsigned short __attribute__ ((vector_size (16))) v8hi; typedef unsigned int __attribute__ ((vector_size (16))) v4si; typedef unsigned long long __attribute__ ((vector_size (16))) v2di; typedef char __attribute__ ((vector_size (8))) v8qi; typedef unsigned short __attribute__ ((vector_size (8))) v4hi; typedef unsigned int __attribute__ ((vector_size (8))) v2si;

v2di
G1 (v2di r) {
return (r >> 39) | (r << 25);
}

v4si
G2 (v4si r) {
return (r >> 23) | (r << 9);
}

v8hi
G3 (v8hi r) {
return (r >> 5) | (r << 11);
}

v16qi
G4 (v16qi r)
{
return (r << 2) | (r >> 6);
}

v2si
G5 (v2si r) {
return (r >> 22) | (r << 10);
}

v4hi
G6 (v4hi r) {
return (r >> 7) | (r << 9);
}

v8qi
G7 (v8qi r)
{
return (r << 3) | (r >> 5);
}

</details>

@davemgreen
Copy link
Collaborator

AArch64DAGToDAGISel::trySelectXAR could be updated where the xor is not present, generating a mov 0 instead.

@davemgreen davemgreen added the good first issue https://github.com/llvm/llvm-project/contribute label Apr 24, 2025
@llvmbot
Copy link
Member

llvmbot commented Apr 24, 2025

Hi!

This issue may be a good introductory issue for people new to working on LLVM. If you would like to work on this issue, your first steps are:

  1. Check that no other contributor has already been assigned to this issue. If you believe that no one is actually working on it despite an assignment, ping the person. After one week without a response, the assignee may be changed.
  2. In the comments of this issue, request for it to be assigned to you, or just create a pull request after following the steps below. Mention this issue in the description of the pull request.
  3. Fix the issue locally.
  4. Run the test suite locally. Remember that the subdirectories under test/ create fine-grained testing targets, so you can e.g. use make check-clang-ast to only run Clang's AST tests.
  5. Create a Git commit.
  6. Run git clang-format HEAD~1 to format your changes.
  7. Open a pull request to the upstream repository on GitHub. Detailed instructions can be found in GitHub's documentation. Mention this issue in the description of the pull request.

If you have any further questions about this issue, don't hesitate to ask via a comment in the thread below.

@llvmbot
Copy link
Member

llvmbot commented Apr 24, 2025

@llvm/issue-subscribers-good-first-issue

Author: None (k-arrows)

Here is the code from gcc testsuite. https://godbolt.org/z/fejTchexj ```c typedef char __attribute__ ((vector_size (16))) v16qi; typedef unsigned short __attribute__ ((vector_size (16))) v8hi; typedef unsigned int __attribute__ ((vector_size (16))) v4si; typedef unsigned long long __attribute__ ((vector_size (16))) v2di; typedef char __attribute__ ((vector_size (8))) v8qi; typedef unsigned short __attribute__ ((vector_size (8))) v4hi; typedef unsigned int __attribute__ ((vector_size (8))) v2si;

v2di
G1 (v2di r) {
return (r >> 39) | (r << 25);
}

v4si
G2 (v4si r) {
return (r >> 23) | (r << 9);
}

v8hi
G3 (v8hi r) {
return (r >> 5) | (r << 11);
}

v16qi
G4 (v16qi r)
{
return (r << 2) | (r >> 6);
}

v2si
G5 (v2si r) {
return (r >> 22) | (r << 10);
}

v4hi
G6 (v4hi r) {
return (r >> 7) | (r << 9);
}

v8qi
G7 (v8qi r)
{
return (r << 3) | (r >> 5);
}

</details>

@gxyd
Copy link

gxyd commented Apr 24, 2025

Hi, I'm new to LLVM community, do you think if I can try fixing this issue? ✋

@davemgreen
Copy link
Collaborator

Sure thing - let us know if you run into any issues.

@Rajveer100
Copy link
Contributor

Rajveer100 commented Apr 28, 2025

@davemgreen
Consider this example:

$$X = 0 0 1 0 0 0 1 0 0 1$$ $$Y = 0 0 1 1 1 1 0 0 0 1$$ $$V = XOR(X, Y) = 0 0 0 1 1 1 1 0 0 0$$ $$BITS = 10$$ $$IMM = 7$$ $$ROTR(V, IMM) = 1 1 1 1 0 0 0 0 0 0$$ $$N0 = SHL(V, BITS-IMM) = 1 1 1 1 0 0 0 0 0 0$$ $$N1 = SRL(V, IMM) = 0 0 0 0 0 0 0 0 0 0$$ $$OR(N0, N1) = 1 1 1 1 0 0 0 0 0 0$$

@davemgreen
Copy link
Collaborator

Can you explain what you mean?

@Rajveer100
Copy link
Contributor

Rajveer100 commented Apr 28, 2025

I was facing an assertion error so was just making sure I understood the transformation correctly.

I presume the OR result should be equal to the ROTR result?

More info in the changes I made.

Rajveer100 added a commit to Rajveer100/llvm-project that referenced this issue Apr 28, 2025
Resolves llvm#137162

For cases when there isn't any `XOR` in the transformation,
replace with a zero register.
@gxyd
Copy link

gxyd commented Apr 28, 2025

I'm sorry, I was a little busy with my work priorities, seems like someone has already sent a patch fixing the issue?

Rajveer100 added a commit to Rajveer100/llvm-project that referenced this issue Apr 28, 2025
Resolves llvm#137162

For cases when there isn't any `XOR` in the transformation,
replace with a zero register.
Rajveer100 added a commit to Rajveer100/llvm-project that referenced this issue Apr 28, 2025
Resolves llvm#137162

For cases when there isn't any `XOR` in the transformation,
replace with a zero register.
@gxyd
Copy link

gxyd commented Apr 29, 2025

@davemgreen , kindly let me know if there is a guideline about a timeline within which I've to complete an issue, otherwise the issue can be taken up by someone else in llvm project, even though the issue is assigned to me?

Also, let me know if I now should be looking at some other issue instead maybe?

Rajveer100 added a commit to Rajveer100/llvm-project that referenced this issue Apr 30, 2025
Resolves llvm#137162

For cases when there isn't any `XOR` in the transformation,
replace with a zero register.
Rajveer100 added a commit to Rajveer100/llvm-project that referenced this issue Apr 30, 2025
Resolves llvm#137162

For cases when there isn't any `XOR` in the transformation,
replace with a zero register.
@davemgreen
Copy link
Collaborator

@gxyd Sorry, I'm not sure I'm afraid. The only details I found were the brief details in https://www.llvm.org/docs/Contributing.html#bug-fixes. As you might imagine, we get some people sign up for first time issues only to disappear into the void never to be heard from again. The usual process is that if someone else is interested they ping the issue and checks if it is still being working on, or if the person has become busy with other things. If you want something else to look into, perhaps #119921 is a good one that requires some new tablegen patterns.

@Rajveer100 I wasn't sure what the output meant exactly, perhaps explain more about what you are trying to prove if you still need to. I'm pretty sure the transform is OK. It looks like your patch is a good start, I'll comment there.

@gxyd
Copy link

gxyd commented May 1, 2025

@gxyd Sorry, I'm not sure I'm afraid. The only details I found were the brief details in https://www.llvm.org/docs/Contributing.html#bug-fixes. As you might imagine, we get some people sign up for first time issues only to disappear into the void never to be heard from again. The usual process is that if someone else is interested they ping the issue and checks if it is still being working on, or if the person has become busy with other things. If you want something else to look into, perhaps #119921 is a good one that requires some new tablegen patterns.

Understood, really appreciate the response. Then I'll un-assign myself from this issue, and once I've more bandwidth I'll work more actively.

Rajveer100 added a commit to Rajveer100/llvm-project that referenced this issue May 1, 2025
Resolves llvm#137162

For cases when there isn't any `XOR` in the transformation,
replace with a zero register.
Rajveer100 added a commit to Rajveer100/llvm-project that referenced this issue May 3, 2025
Resolves llvm#137162

For cases when there isn't any `XOR` in the transformation,
replace with a zero register.
Rajveer100 added a commit to Rajveer100/llvm-project that referenced this issue May 6, 2025
Resolves llvm#137162

For cases when there isn't any `XOR` in the transformation,
replace with a zero register.
Rajveer100 added a commit to Rajveer100/llvm-project that referenced this issue May 7, 2025
Resolves llvm#137162

For cases when there isn't any `XOR` in the transformation,
replace with a zero register.
Rajveer100 added a commit to Rajveer100/llvm-project that referenced this issue May 7, 2025
Resolves llvm#137162

For cases when there isn't any `XOR` in the transformation,
replace with a zero register.
Rajveer100 added a commit to Rajveer100/llvm-project that referenced this issue May 7, 2025
Resolves llvm#137162

For cases when there isn't any `XOR` in the transformation,
replace with a zero register.
Rajveer100 added a commit to Rajveer100/llvm-project that referenced this issue May 8, 2025
Resolves llvm#137162

For cases when there isn't any `XOR` in the transformation,
replace with a zero register.
Rajveer100 added a commit to Rajveer100/llvm-project that referenced this issue May 8, 2025
Resolves llvm#137162

For cases when there isn't any `XOR` in the transformation,
replace with a zero register.
Rajveer100 added a commit to Rajveer100/llvm-project that referenced this issue May 8, 2025
Resolves llvm#137162

For cases when there isn't any `XOR` in the transformation,
replace with a zero register.
davemgreen pushed a commit that referenced this issue May 9, 2025
Resolves #137162

For cases when there isn't any `XOR` in the transformation, replace with
a zero register.
@davemgreen
Copy link
Collaborator

I created #139229 as a followup if you are interested in making more of the fixed-width types work too. Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AArch64 good first issue https://github.com/llvm/llvm-project/contribute missed-optimization
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants