map in +/- for Arrays
#59961
Are these representative? The arrays being passed in are exactly the same array after all, so it's not unlikely that there is some special casing going on with |
That's a good point! I've re-run the benchmarks, and some of these do hold up in more general cases:
julia> @btime A + B setup=(A = rand(3,3); B = rand(3,3));
  39.452 ns (2 allocations: 144 bytes) # v"1.13.0-DEV.1387"
  27.789 ns (2 allocations: 144 bytes) # this PR
julia> @btime A + B setup=(A = rand(3,3000); B = rand(3,3000));
  10.130 μs (3 allocations: 70.40 KiB) # v"1.13.0-DEV.1387"
  5.026 μs (3 allocations: 70.40 KiB) # this PR
The difference in the
The main benefit comes in the wide-matrix case, where the first dimension is too small for vectorization to kick in. Using linear indexing offers a significant speed-up. This was suggested in #47873 (comment).
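To make the wide-matrix point concrete, here is a minimal sketch contrasting a Cartesian-style loop with a flat linear-index loop. The function names `add_cartesian` and `add_linear` are hypothetical and exist only for illustration; this is not the code from the PR, and whether the compiler vectorizes either loop will depend on the Julia version and hardware.

```julia
# Hypothetical helpers for illustration only; not the implementation in this PR.

# Cartesian-style loop: the inner loop runs over the first dimension,
# so for a 3×3000 matrix each inner loop has only 3 iterations.
function add_cartesian(A::Matrix{Float64}, B::Matrix{Float64})
    size(A) == size(B) || throw(DimensionMismatch("sizes must match"))
    C = similar(A)
    for j in axes(A, 2), i in axes(A, 1)   # i is the innermost index
        @inbounds C[i, j] = A[i, j] + B[i, j]
    end
    return C
end

# Linear-index loop: a single flat loop over all elements,
# independent of how the elements are split across dimensions.
function add_linear(A::Array{Float64}, B::Array{Float64})
    size(A) == size(B) || throw(DimensionMismatch("sizes must match"))
    C = similar(A)
    @inbounds @simd for k in eachindex(A, B, C)
        C[k] = A[k] + B[k]
    end
    return C
end

A = rand(3, 3000); B = rand(3, 3000)
@assert add_cartesian(A, B) == add_linear(A, B) == A + B
```

Both functions compute the same result; timing them (for example with `@btime` from BenchmarkTools) on a wide matrix is one way to check whether the flat loop helps on a given machine.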
0ecb681 to 6cd702b
This seems to have broken |
maybe |
map is a simpler operation and uses linear indexing for Arrays. This often improves performance (occasionally enabling vectorization) and improves TTFX in common cases. It also automatically returns the correct result for 0-D arrays, unlike broadcasting, which returns a scalar.
Performance:
Similarly for -.
TTFX:
These are measured on
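As a small illustration of the 0-D behavior mentioned in the description, the sketch below compares `map` with plain broadcasting on zero-dimensional arrays. The variable names are arbitrary, and this only demonstrates the general behavior of `map` and `.+`, not the PR's internals.

```julia
# Zero-dimensional arrays, each holding a single Int.
a = fill(1)
b = fill(2)

# map preserves the container: the result is again a 0-D array.
m = map(+, a, b)
@assert m isa Array{Int,0}
@assert m[] == 3

# Plain broadcasting over 0-D arrays unwraps to a scalar.
s = a .+ b
@assert s isa Int
@assert s == 3
```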