The merge sort algorithm is a comparison-based sorting algorithm. In this sample, we use a top-down implementation, which recursively splits the list into two halves (called sublists) until each sublist is size 1. We then merge sublists two at a time to produce a sorted list. This sample can run serially or in parallel using the OpenMP* tasking directives `#pragma omp task` and `#pragma omp taskwait`.
3
+
The `MergeSort OMP` sample uses merge sort, which is a comparison-based sorting algorithm. In this sample, we use a top-down implementation, which recursively splits the list into two halves (called sublists) until each sublist is size 1.
4
4
5
-
For more details, see the wiki on [merge sort](http://en.wikipedia.org/wiki/Merge_sort) algorithm and top-down implementation.
5
+
>**Note**: For more details, see the [Merge sort](http://en.wikipedia.org/wiki/Merge_sort) article on the algorithm and top-down implementation.
6
6
7
-
| Optimized for | Description
8
-
|:--- |:---
9
-
| OS | MacOS Catalina or newer
10
-
| Hardware | Skylake with GEN9 or newer
11
-
| Software | Intel® oneAPI C++ Compiler Classic
12
-
| What you will learn | How to accelerate a scalar program using OpenMP* tasks
13
-
| Time to complete | 15 minutes
14
-
15
-
Performance number tabulation
16
-
17
-
| Version | Performance data
18
-
|:--- |:---
19
-
| Scalar baseline | 1.0
20
-
| OpenMP Task | 4.0x speedup
7
+
| Area | Description
8
+
|:--- |:---
9
+
| What you will learn | How to accelerate a scalar program using OpenMP* tasks
10
+
| Time to complete | 15 minutes
21
11
22
12
23
13
## Purpose
24
14
25
-
Merge sort is a highly efficient recursive sorting algorithm. Known for its
26
-
greater efficiency over other common sorting algorithms, it can compute in
27
-
O(nlogn) time instead of O(n^2), making it a common choice for sorting
28
-
implementations that deal with large quantities of elements. While it is
29
-
already a very fast algorithm-- capable of sorting lists in a fraction of the
30
-
time it would take an algorithm such as quicksort or insertion sort, it can be
31
-
further accelerated with parallelism using OpenMP.
15
+
Merge sort is an efficient recursive sorting algorithm. It runs in O(n log n) time even in the worst case, whereas insertion sort, and quicksort in its worst case, take O(n^2) time, which makes merge sort a common choice for sorting implementations that deal with large quantities of elements. Although the algorithm is already fast, you can accelerate it further with parallelism using OpenMP.
32
16
33
-
This code sample demonstrates how to convert a scalar implementation of merge
34
-
sort into a parallelized version with minimal changes to the original, using
35
-
OpenMP pragmas.
17
+
We then merge sublists two at a time to produce a sorted list. This sample can run serially or in parallel using the OpenMP* tasking directives `#pragma omp task` and `#pragma omp taskwait`.
36
18
19
+
## Prerequisites
20
+
21
+
| Optimized for | Description
22
+
|:--- |:---
23
+
| OS | macOS* Catalina or newer
24
+
| Hardware | Skylake with GEN9 or newer
25
+
| Software | Intel® oneAPI DPC++ Compiler
37
26
38
27
## Key Implementation Details
39
28
40
-
The OpenMP* version of the merge sort implementation uses the #pragma omp task
41
-
in its recursive calls, which allows the recursive calls to be handled by
42
-
different threads. The #pragma omp taskawait preceding the function call to
43
-
merge() ensures the two recursive calls are completed before the merge() is
44
-
executed. Through this use of OpenMP* pragmas, the recursive sorting algorithm
45
-
can effectively run in parallel, where each recursion is a unique task able to
46
-
be performed by any available thread.
29
+
This code sample demonstrates how to convert a scalar implementation of merge sort into a parallelized version with minimal changes to the original, using OpenMP pragmas.
47
30
48
-
## License
31
+
The OpenMP* version of the merge sort implementation uses `#pragma omp task` on its recursive calls, which allows the recursive calls to be handled by different threads. The `#pragma omp taskwait` preceding the call to `merge()` ensures that the two recursive calls complete before `merge()` runs. Through this use of OpenMP* pragmas, the recursive sorting algorithm can effectively run in parallel, where each recursive call is a separate task that any available thread can execute.
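The task and taskwait pattern described above can be sketched roughly as follows. This is a minimal illustration, not the sample's actual source: the function names `merge_sort_tasks` and `parallel_merge_sort`, the `merge()` signature, and the use of `std::vector<int>` are assumptions made for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: merges the sorted ranges a[lo, mid) and a[mid, hi),
// using tmp as scratch space, and writes the result back into a.
void merge(std::vector<int>& a, std::vector<int>& tmp,
           std::size_t lo, std::size_t mid, std::size_t hi) {
  assert(lo <= mid && mid <= hi);
  std::merge(a.begin() + lo, a.begin() + mid,
             a.begin() + mid, a.begin() + hi, tmp.begin() + lo);
  std::copy(tmp.begin() + lo, tmp.begin() + hi, a.begin() + lo);
}

// Each recursive call becomes an OpenMP task. The taskwait blocks until both
// child tasks finish, so merge() only ever sees two sorted halves.
void merge_sort_tasks(std::vector<int>& a, std::vector<int>& tmp,
                      std::size_t lo, std::size_t hi) {
  if (hi - lo < 2) return;  // zero or one element: already sorted
  const std::size_t mid = lo + (hi - lo) / 2;
  // a and tmp are references, so mark them shared explicitly; lo, mid, and hi
  // are copied into each task (firstprivate) by default.
#pragma omp task shared(a, tmp)
  merge_sort_tasks(a, tmp, lo, mid);
#pragma omp task shared(a, tmp)
  merge_sort_tasks(a, tmp, mid, hi);
#pragma omp taskwait
  merge(a, tmp, lo, mid, hi);
}

// Hypothetical entry point: one thread creates the root call inside a
// parallel region, and the remaining threads pick up tasks as they appear.
void parallel_merge_sort(std::vector<int>& a) {
  std::vector<int> tmp(a.size());
#pragma omp parallel
#pragma omp single
  merge_sort_tasks(a, tmp, 0, a.size());
}
```

When the code is compiled without OpenMP support, the pragmas are ignored and the same functions run serially, which is what makes the conversion from the scalar version a small change.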
49
32
50
-
Code samples are licensed under the MIT license. See
51
-
[License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt) for details.
33
+
Performance number tabulation.
52
34
53
-
Third party program Licenses can be found here: [third-party-programs.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/third-party-programs.txt)
35
+
| Version | Performance Data
36
+
|:--- |:---
37
+
| Scalar baseline | 1.0
38
+
| OpenMP* Task | 4.0x speedup
54
39
55
40
41
+
## Build the `MergeSort OMP` Program
56
42
57
-
### Using Visual Studio Code* (Optional)
43
+
> **Note**: If you have not already done so, set up your CLI
44
+
> environment by sourcing the `setvars` script in the root of your oneAPI installation.
45
+
>
46
+
> - For system wide installations: `. /opt/intel/oneapi/setvars.sh`
47
+
> - For private installations: `. ~/intel/oneapi/setvars.sh`
48
+
> - For non-POSIX shells, like csh, use the following command: `bash -c 'source <install-dir>/setvars.sh ; exec csh'`
49
+
>
50
+
> For more information on configuring environment variables, see [Use the setvars Script with Linux* or macOS*](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/oneapi-development-environment-setup/use-the-setvars-script-with-linux-or-macos.html).
51
+
52
+
### Use Visual Studio Code* (VS Code) (Optional)
58
53
59
-
You can use Visual Studio Code (VS Code) extensions to set your environment,
54
+
You can use Visual Studio Code* (VS Code) extensions to set your environment,
60
55
create launch configurations, and browse and download samples.
61
56
62
57
The basic steps to build and run a sample using VS Code include:
63
-
- Download a sample using the extension **Code Sample Browser for Intel oneAPI Toolkits**.
64
-
- Configure the oneAPI environment with the extension **Environment Configurator for Intel oneAPI Toolkits**.
65
-
- Open a Terminal in VS Code (**Terminal>New Terminal**).
66
-
- Run the sample in the VS Code terminal using the instructions below.
58
+
1. Configure the oneAPI environment with the extension **Environment Configurator for Intel® oneAPI Toolkits**.
59
+
2. Download a sample using the extension **Code Sample Browser for Intel® oneAPI Toolkits**.
60
+
3. Open a terminal in VS Code (**Terminal > New Terminal**).
61
+
4. Run the sample in the VS Code terminal using the instructions below.
67
62
68
-
To learn more about the extensions and how to configure the oneAPI environment, see
69
-
[Using Visual Studio Code with Intel® oneAPI Toolkits](https://software.intel.com/content/www/us/en/develop/documentation/using-vs-code-with-intel-oneapi/top.html).
63
+
To learn more about the extensions and how to configure the oneAPI environment, see the
64
+
[Using Visual Studio Code with Intel® oneAPI Toolkits User Guide](https://www.intel.com/content/www/us/en/develop/documentation/using-vs-code-with-intel-oneapi/top.html).
70
65
71
-
After learning how to use the extensions for Intel oneAPI Toolkits, return to this readme for instructions on how to build and run a sample.
66
+
### On macOS*
72
67
73
-
## Building the `Merge Sort` Program
68
+
1. Build the program
69
+
```
70
+
make
71
+
```
72
+
Alternatively, you can enable the performance tabulation mode and then build the program.
74
73
75
-
> **Note**: If you have not already done so, set up your CLI
76
-
> environment by sourcing the `setvars` script located in
>For more information on environment variables, see Use the setvars Script for [Linux or macOS](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/oneapi-development-environment-setup/use-the-setvars-script-with-linux-or-macos.html), or [Windows](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/oneapi-development-environment-setup/use-the-setvars-script-with-windows.html).
74
+
```
75
+
export perf_num=1
76
+
make
77
+
```
86
78
87
-
Perform the following steps:
88
-
1. Build the program using the following `make` commands.
89
-
```
90
-
$ export perf_num=1 *optional, will enable performance tabulation mode
91
-
$ make
92
-
```
79
+
#### Troubleshooting
80
+
81
+
If you receive an error message, troubleshoot the problem using the **Diagnostics Utility for Intel® oneAPI Toolkits**. The diagnostic utility provides configuration and system checks to help find missing dependencies, permissions errors, and other issues. See the [Diagnostics Utility for Intel® oneAPI Toolkits User Guide](https://www.intel.com/content/www/us/en/develop/documentation/diagnostic-utility-user-guide/top.html) for more information on using the utility.
82
+
83
+
## Run the `MergeSort OMP` Program
93
84
94
-
##Running the Sample
85
+
### Configurable Parameters
95
86
96
-
2. Run the program:
97
-
```
98
-
make run
99
-
```
87
+
There are two configurable options defined in the source code. Both parameters affect program performance.
100
88
101
-
3. Clean the program using:
102
-
```
103
-
make clean
104
-
```
89
+
- `constexpr int task_threshold` - The minimum list size for which the OpenMP merge sort function recurses into itself; smaller lists are passed to the scalar version instead. This cutoff limits threading overhead, which outweighs the benefit of parallelism on small lists. Setting this value too small can reduce the OpenMP implementation's performance because more small workloads incur threading overhead.
90
+
- `constexpr int n` - This determines the size of the list used to test the merge sort functions. Setting it larger will result in a longer runtime and is useful for analyzing the algorithm's runtime growth rate.
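The effect of `task_threshold` can be illustrated with a short sketch. It assumes a cutoff value of 5000 and uses hypothetical function names (`merge_sort_scalar`, `merge_sort_openmp`, `sort_with_threshold`); the values and signatures in the sample's source may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Name taken from the sample; the value 5000 is an assumption for this sketch.
constexpr int task_threshold = 5000;

// Scalar top-down merge sort on a[lo, hi), with tmp as scratch space.
void merge_sort_scalar(int* a, int* tmp, int lo, int hi) {
  if (hi - lo < 2) return;
  const int mid = lo + (hi - lo) / 2;
  merge_sort_scalar(a, tmp, lo, mid);
  merge_sort_scalar(a, tmp, mid, hi);
  std::merge(a + lo, a + mid, a + mid, a + hi, tmp + lo);
  std::copy(tmp + lo, tmp + hi, a + lo);
}

// Task-based version: lists shorter than task_threshold go to the scalar
// version, so no tasks are created for small workloads.
void merge_sort_openmp(int* a, int* tmp, int lo, int hi) {
  if (hi - lo < task_threshold) {
    merge_sort_scalar(a, tmp, lo, hi);
    return;
  }
  const int mid = lo + (hi - lo) / 2;
  // The pointers and indices are copied into each task by default.
#pragma omp task
  merge_sort_openmp(a, tmp, lo, mid);
#pragma omp task
  merge_sort_openmp(a, tmp, mid, hi);
#pragma omp taskwait
  std::merge(a + lo, a + mid, a + mid, a + hi, tmp + lo);
  std::copy(tmp + lo, tmp + hi, a + lo);
}

// Hypothetical driver: a single thread starts the recursion inside a
// parallel region, and other threads execute the tasks it spawns.
void sort_with_threshold(std::vector<int>& v) {
  std::vector<int> scratch(v.size());
  int* a = v.data();
  int* tmp = scratch.data();
  const int n = static_cast<int>(v.size());
#pragma omp parallel
#pragma omp single
  merge_sort_openmp(a, tmp, 0, n);
}
```

With a list of 20,000 elements and this cutoff, only the top two levels of recursion create tasks; every sublist below 5,000 elements is sorted by the scalar function within a single task.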
105
91
106
-
If an error occurs, troubleshoot the problem using the Diagnostics Utility for
There are two configurable options defined near the top of the code, both of
113
-
which affect the program's performance:
99
+
2. Clean the program. (Optional)
100
+
```
101
+
make clean
102
+
```
114
103
115
-
- constexpr int task_threshold - This determines the minimum size of the list passed to the OpenMP merge sort function required to call itself and not the scalar version recursively. Its purpose is to reduce the threading overhead as it gets less efficient on smaller list sizes. Setting this value too small can reduce the OpenMP implementation's performance as it has more threading overhead for smaller workloads.
116
-
- constexpr int n - This determines the size of the list used to test the merge sort functions. Setting it larger will result in longer runtime and is useful for analyzing the algorithm's runtime growth rate.
104
+
## Example Output
117
105
106
+
You are prompted to select a test type. The following example output shows the results of choosing `[0] all tests`.
118
107
119
-
### Example of Output
120
108
```
121
109
N = 100000000
122
110
Merge Sort Sample
@@ -137,3 +125,10 @@ Shuffling the array
137
125
Sorting
138
126
Sort succeeded in 3.17086 seconds.
139
127
```
128
+
129
+
## License
130
+
131
+
Code samples are licensed under the MIT license. See
132
+
[License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt) for details.
133
+
134
+
Third-party program licenses can be found here: [third-party-programs.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/third-party-programs.txt).