[AArch64] Use SVE XAR for fixed-length operations. #139229
Comments
@llvm/issue-subscribers-backend-aarch64 Author: David Green (davemgreen)
There is a Neon SHA3 v2i64 XAR operation, but not for v4i32, v8i16 and v16i8. If sve2-sha3 is available we can use the SVE instructions instead.
https://godbolt.org/z/9hdqKoWMx (G1 and F1 are already OK). https://godbolt.org/z/fejTchexj
See #137162, this is an extension to that issue.
The third link needs all architectural features to be enabled, else we don't see the transformation for G1:
The first two links also need correction:
Godbolt uses a cache and sometimes needs a refresh when changes happen to trunk. There is a little refresh button on the bottom left of the compilation window (it should not be cached to the updated version).
There is a Neon SHA3 v2i64 XAR operation, but not for v4i32, v8i16 and v16i8. If sve2-sha3 is available we can use the SVE instructions instead.
https://godbolt.org/z/9hdqKoWMx (G1 and F1 are already OK).
vs with scalable vectors: https://godbolt.org/z/GhazeoaWY
https://godbolt.org/z/fejTchexj
See #137162, this is an extension to that issue.
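For readers without the Godbolt links open, here is a rough scalar sketch of the pattern in question for the v4i32 case: XOR two vectors, then rotate each lane right by an immediate. The names `rotr32`, `xar_v4i32` and `XAR_IMM`, and the rotate amount of 7, are made up for illustration and are not taken from the linked examples. Whether a given compiler actually selects XAR for code like this depends on the target features and the LLVM version.

```c
#include <stdint.h>

/* Hypothetical rotate amount, chosen only for illustration. */
#define XAR_IMM 7u

/* Rotate a 32-bit value right by n bits (n is taken modulo 32). */
static inline uint32_t rotr32(uint32_t x, unsigned n) {
    n &= 31u;
    return (x >> n) | (x << ((32u - n) & 31u));
}

/* Fixed-length v4i32 form of the pattern: each output lane is
 * (a ^ b) rotated right by XAR_IMM. This is the operation that a
 * single XAR instruction performs per lane. */
void xar_v4i32(uint32_t out[4], const uint32_t a[4], const uint32_t b[4]) {
    for (int i = 0; i < 4; ++i)
        out[i] = rotr32(a[i] ^ b[i], XAR_IMM);
}
```

The v8i16 and v16i8 cases would follow the same shape with 16-bit and 8-bit lanes.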