-
Notifications
You must be signed in to change notification settings - Fork 6.1k
8266431: Dual-Pivot Quicksort improvements (Radix sort) #13568
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
👋 Welcome back lbourges! A progress list of the required criteria for merging this PR into |
Webrevs
|
Here are JMH ArraysSort benchmark results on openjdk20 (23.03.27): I will run again benchmarks on latest master vs PR tomorrow |
The changes look fine for me, approve this PR. |
@AlanBateman @rose00 @mbreinhold Could any core-libs reviewer have a look ? |
|
@bourgesl This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration! |
Please @AlanBateman could you review this PR? |
Hello, please see an implementation of x86 SIMD sort for Java here: #14227 |
These benchmarks are all small arrays. We need to check for possible regressions here. |
Keeping this PR alive |
|
Sorry, all large arrays. I suspect that the average-length sort is not represented.
How does keep alive work? Will the data from that external web site still be here for the reader in 15 years' time? |
Hm... I thought that at least full webrevs are stored somewhere on https://cr.openjdk.org/. Turns out they aren't. They are stored on https://openjdk.github.io/cr/. Okay, a cheap thing to do now is to ask the Web Archive to store it, which I just did:
|
Hi team, There are a big efforts now to improve sorting with x86_64 AVX512 But at the same time this PR offers algorithmic improvements, Alan, you mentioned that DualPivotQuicksort will need detailed review. I think it would be logical to integrate new DualPivotQuicksort first, |
As before, I think the main question with this change is whether adding radix sort to the mix is worth the complexity and additional code to maintain. Also as we discussed in the previous PR, the additional memory needed for the radix sort may have an effect on other things that are going on concurrently. I know it has been updated to handle OOME but I think potential reviewers would need to be comfortable with that part. |
Hi Alan (@AlanBateman) and team, I understand your doubts about Radix sort, let's me to provide my thoughts. Code of Radix sort is about 10% of all code and very clear for understanding: scan digits + then place elements according histogram. Other sorting algorithms have more complexity (like merging sort, quicksort partitioning, heap sort). Therefore, I don't expect problems with maintaining. Why Radix sort is important? Even Quicksort is very fast, but it relies on comparability of elements only and knows nothing about nature of elements to be sorted. Radix sort knows the structure of elements (32 or 64 bits) and therefore works faster (on random data). I was asked many times why we don't use Radix sort, so the idea is not new. Introducing of Radix sort is like using of counting sort for byte/short/char, the same approach. Existing version of sort() will request additional memory, if tryMergingSort() detects already sorted subsequences. In this case Radix sort or Quicksort will not be invoked at all. Radix sort is invoked, if merging sort is not completed (and therefore, no allocated memory). Additional memory is requested by either merging sort or Radix sort, not by both algorithms on the same input. It means these algorithms never request double memory. Memory effect (merging sort and Radix sort) is not summarized. In case of parallel sorting the additional buffer allocated for parallel merge sort will be reused by tryMergingSort() as well as by Radix sort. Summary: the size of total requested memory always equals to the size of part to be sorted (not more), no any duplicates. Need to say that Radix sort is invoked not on all inputs, many inputs are covered by Quicksort or merging sort. If Quicksort suggests O(n*ln(n)), Radix sort shows linear, much better, O(n) time! As it was mentioned before, we added check against OOM, sorting will be completed successfully in any case. Existing implementation will fail with error, if there is no enough memory. In few words, Radix sort is simple and it impacts not more than already introduced algorithms in JDK. I encourage reviewers to look at this PR. Thank you, |
Hi team, @AlanBateman Alan, Reviewers, We have two improvements of sorting: algorithmic approach (this PR) and based on AVX512 (#14227). Thank you, |
Vladimir, please do not consider me a reviewer. I'm merely a passer-by: I didn't review a single line of this PR, nor do I have expertise to do so. |
I too share concerns about the potential increased use of memory for sorting ints/longs/floats/doubles. With modern SIMD hardware and data parallel techniques we can apply quicksort much more efficiently. I think it is important to determine to what extent this reduces the need for radix sort. To determine that we would need to carefully measure against the AVX-512 implementation (with ongoing improvements) with appropriately initialized data to sort, and further measure against an AVX2 version. |
Hi @PaulSandoz Paul, @AlanBateman Alan, I agree that additional memory must not change current behavior of the core. When parallel sort was introduced in JDK 8, parallel implementation with additional We take care of memory usages in sequential sorting, therefore I suggest the There are no memory changes in sequential sorting and in parallel sorting Radix Paul, Alan, |
Hi Paul (@PaulSandoz), Alan (@AlanBateman), |
delay due to holidays and vacations |
@bourgesl This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration! |
Keep alive |
@bourgesl This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration! |
Keep alive |
@bourgesl This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration! |
Sorting of short/byte/char is in progress |
Good luck vladimir... |
@bourgesl This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration! |
Keep alive for few cycles more |
@bourgesl this pull request can not be integrated into git checkout dpqs23
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push |
@bourgesl This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration! |
Sorting of short/byte/char is still in progress |
@bourgesl This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration! |
Sorting of short/byte/char is almost finished |
@bourgesl This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration! |
Sorting of short/byte/char is almost finished |
Hi. I've seen this PR being worked on for a long time. Did you discuss the motivation and objectives for this PR in the relevant mailing list as indicated in the JDK issue? Reviewing this PR seems like a handful and given new features like Valhalla and the Vector API will soon be available, did you consider waiting for those new features rather than pressing forward with this PR? |
Hi @minborg, Thank you for your comments! You're right, this is a long period of activity, please see the short history of it. Dual-Pivot Quicksort was suggested and integrated into JDK 7 by Josh Bloch, John Bentley and me in September-November 2009. Later I improved the sorting several times faster in JDK 7, 8 and 14. In April 2021 Laurent Bourges suggested adding Radix sort which is several times faster than Quicksort on a large array of random data. In May 2021 I created my PR, but my account was not enabled very fast, so I continued optimization. Later we had to move to Laurent's PR with new portions of optimizations. As we discussed with Alan Bateman and Paul Sandoz, Radix sort will be applied to parallel sort only. Even though Radix sort is not called for sequential sorting, Dual-Pivot Quicksort became faster for many data types, like double/float, char/byte/short, on various inputs (both, sequential and parallel cases). In August 2023 Vamsi Parsa improved the sorting using AVX512 instructions. It made sorting faster on Linux on Intel processors with AVX512. I can say my current changes speedup this AVX512-ed version, it happens due to different types of optimizations (algorithmic vs. instruction). My algorithmic changes don't correlate with new features (Valhalla / Vector API / etc.) and can be developed / integrated independently. Now I have the final version of Dual-Pivot Quicksort with fine benchmarking results. I need time to check test coverage. I plan to create new PR under my account and publish updated classes and benchmarks within one month. I hope it makes sense. @minborg, what do you think, what is your vision? Best regards, |
Thanks for the update @iaroslavski . I am a bit worried that it will be difficult to find reviewers for this PR since it has a relatively large spanning scope. But I might be wrong about that. |
@bourgesl This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration! |
Need time to check test coverage |
@bourgesl This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration! |
Need more time to check test coverage |
@bourgesl This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply issue a |
keep alive, need time to complete |
@bourgesl This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply issue a |
keep alive, need time to finish |
I am going on previous PR by Vladimir Yaroslavskyi: #3938
Progress
Error
Warnings
Issue
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/13568/head:pull/13568
$ git checkout pull/13568
Update a local copy of the PR:
$ git checkout pull/13568
$ git pull https://git.openjdk.org/jdk.git pull/13568/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 13568
View PR using the GUI difftool:
$ git pr show -t 13568
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/13568.diff
Using Webrev
Link to Webrev Comment