Tools/Advisor/matrix_multiply_advisor/README.md (+132 -2; 132 additions & 2 deletions)
@@ -24,15 +24,19 @@ Code samples are licensed under the MIT license. See
 
 Third party program Licenses can be found here: [third-party-programs.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/third-party-programs.txt)
 
-## How to Build
+
+### Running Samples In DevCloud
+Running samples in the Intel DevCloud requires you to specify a compute node. For specific instructions, jump to [Run the Matrix Multiply Advisor sample on the DevCloud](#run-matmul-advisor-on-devcloud).
+
+## How to Build
 
 This sample contains 3 versions of matrix multiplication using DPC++:
 
 multiply1 – basic implementation of matrix multiply using DPC++
 multiply1_1 – basic implementation that replaces the buffer store with a local accessor “acc” to reduce memory traffic
 multiply1_2 – the basic implementation, plus adding the local accessor and matrix tiling
 
-Edit the line in multiply.h to select the version of the multiply function:
+Edit the line in src/multiply.hpp to select the version of the multiply function:
 #define MULTIPLY multiply1
 
 
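Selecting a kernel version means changing the name after `#define MULTIPLY` in src/multiply.hpp. A minimal sketch of doing that from the command line instead of an editor is shown here; the `select_kernel` function name is hypothetical and not part of the sample, and the sketch assumes the header contains a single line that starts with `#define MULTIPLY`.

```shell
#!/bin/bash
# Hypothetical helper (not part of the sample): rewrite the MULTIPLY macro.
# Usage: select_kernel <kernel-name> [header-path]
select_kernel() {
    local kernel="$1"
    local header="${2:-src/multiply.hpp}"
    # Replace whatever name currently follows "#define MULTIPLY".
    # A temporary file is used instead of "sed -i" to stay portable.
    sed -E "s/^#define[[:space:]]+MULTIPLY[[:space:]]+.*/#define MULTIPLY ${kernel}/" \
        "$header" > "${header}.tmp" && mv "${header}.tmp" "$header"
}

# Example (run from the sample directory):
#   select_kernel multiply1_2
```

After switching kernels, the sample has to be rebuilt for the change to take effect.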
@@ -68,8 +72,134 @@ Edit the line in multiply.h to select the version of the multiply function:
 
 Elapsed Time: 0.539631s
 
+
+## Running an Intel Advisor analysis
+------------------------------------------
+
+See the Advisor Cookbook here: https://software.intel.com/en-us/advisor-cookbook
+
+
+### Running the Matrix Multiply Advisor sample in the DevCloud<a name="run-matmul-advisor-on-devcloud"></a>
+This sample contains 3 versions of matrix multiplication using DPC++:
+
+multiply1 – basic implementation of matrix multiply using DPC++
+multiply1_1 – basic implementation that replaces the buffer store with a local accessor “acc” to reduce memory traffic
+multiply1_2 – the basic implementation, plus adding the local accessor and matrix tiling
+
+Edit the line in src/multiply.hpp to select the version of the multiply function:
+4. Change directories to the Matrix Multiply Advisor sample directory.
+```
+cd ~/oneAPI-samples/Tools/Advisor/matrix_multiply_advisor
+```
+#### Build and run the sample in batch mode
+The following describes the process of submitting build and run jobs to PBS.
+A job is a script that is submitted to PBS through the qsub utility. By default, the qsub utility does not inherit the current environment variables or your current working directory. For this reason, it is necessary to submit jobs as scripts that handle the setup of the environment variables. To address the working directory issue, you can either use absolute paths or pass the -d \<dir\> option to qsub to set the working directory.
+
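Because a job does not start in the directory it was submitted from, a job script usually begins by fixing its own working directory. The preamble below is a sketch under stated assumptions, not the sample's actual build.sh: it relies on `PBS_O_WORKDIR`, which PBS sets to the directory where qsub was invoked, and the `job_workdir` function name is hypothetical. Environment setup for the oneAPI tools is left as a comment because the path depends on the installation.

```shell
#!/bin/bash
# Sketch of a job-script preamble (hypothetical, not the sample's build.sh).

# Print the directory the job should run in: the submission directory
# when running under PBS, otherwise the current directory.
job_workdir() {
    printf '%s\n' "${PBS_O_WORKDIR:-$PWD}"
}

# Set up the toolchain environment here; the script to source depends
# on how oneAPI is installed on the node.

cd "$(job_workdir)" || exit 1
echo "Job running in: $PWD"
```

Passing `-d .` to qsub, as the steps that follow do, achieves the same effect from the submission side, so a script needs only one of the two approaches.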
+#### Create the Job Scripts
+1. Create a build.sh script with your preferred text editor:
+Jobs submitted in batch mode are placed in a queue waiting for the necessary resources (compute nodes) to become available. The jobs will be executed on a first-come, first-served basis on the first available node(s) having the requested property or label.
+1. Build the sample on a GPU node.
+
+```
+qsub -l nodes=1:gpu:ppn=2 -d . build.sh
+```
+
+Note: -l nodes=1:gpu:ppn=2 (lower case L) is used to assign one full GPU node to the job.
+Note: -d . sets the current folder as the working directory for the job.
+
+2. To inspect the job progress, use the qstat utility.
+```
+watch -n 1 qstat -n -1
+```
+Note: The watch -n 1 command runs qstat -n -1 and displays its results every second. If no results are displayed, the job has completed.
+
+3. After the build job completes successfully, run the sample on a GPU node:
+```
+qsub -l nodes=1:gpu:ppn=2 -d . run.sh
+```
+4. When a job terminates, two files are written to disk:
+
+<script_name>.sh.eXXXX, which is the job stderr
+
+<script_name>.sh.oXXXX, which is the job stdout
+
+Here XXXX is the job ID, which gets printed to the screen after each qsub command.
+
+5. Inspect the output of the sample.
+```
+cat run.sh.oXXXX
+```
+You should see output similar to this:
+
+```
+Scanning dependencies of target run
+Address of buf1 = 0x7f570456f010
+Offset of buf1 = 0x7f570456f180
+Address of buf2 = 0x7f5703d6e010
+Offset of buf2 = 0x7f5703d6e1c0
+Address of buf3 = 0x7f570356d010
+Offset of buf3 = 0x7f570356d100
+Address of buf4 = 0x7f5702d6c010
+Offset of buf4 = 0x7f5702d6c140
+Using multiply kernel: multiply1
+Running on Intel(R) UHD Graphics P630 [0x3e96]
+Elapsed Time: 1.79388s
+Built target run
+```
+
+6. Remove the stdout and stderr files and clean up the project files.
+```
+rm build.sh.*; rm run.sh.*; make clean
+```
+7. Disconnect from the Intel DevCloud.
+```
+exit
+```
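The batch steps above (wait for the job, locate its output files, read the result) can be sketched as a few helper functions. All function names here are hypothetical and not part of the sample. The sketch assumes that qsub prints an ID of the form `<number>.<server>` and that XXXX in the output file names is the numeric part; check both against what your queue actually prints.

```shell
#!/bin/bash
# Hypothetical helpers for the batch workflow; not part of the sample.

# Block until the given job number no longer appears in the queue listing.
# QSTAT_CMD can be overridden to try the loop without a PBS installation.
wait_for_job() {
    local job_num="$1"
    local interval="${2:-1}"
    while ${QSTAT_CMD:-qstat} 2>/dev/null | grep -qF -- "$job_num"; do
        sleep "$interval"
    done
}

# Print the stdout and stderr file names for a script and a job ID.
# Assumes the ID looks like "<number>.<server>"; text after the first
# dot is dropped.
job_output_files() {
    local script="$1"
    local num="${2%%.*}"
    printf '%s.o%s\n%s.e%s\n' "$script" "$num" "$script" "$num"
}

# Print "<kernel> <seconds>" from a captured stdout file.
summarize_run() {
    awk '
        /Using multiply kernel:/ { kernel = $NF }
        /Elapsed Time:/          { sub(/s$/, "", $NF); time = $NF }
        END { if (kernel != "" && time != "") printf "%s %s\n", kernel, time }
    ' "$1"
}

# Example, run from the sample directory on the DevCloud:
#   job_id=$(qsub -l nodes=1:gpu:ppn=2 -d . run.sh)
#   wait_for_job "${job_id%%.*}"
#   summarize_run "$(job_output_files run.sh "$job_id" | head -n 1)"
```

Keeping the three steps as separate functions makes it easy to reuse them for build.sh as well, or to compare elapsed times across the three kernel versions.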
 ## Running an Intel Advisor analysis
 ------------------------------------------
 
 See the Advisor Cookbook here: https://software.intel.com/en-us/advisor-cookbook
 
+### Build and run additional samples
+Several sample programs are available for you to try, many of which can be compiled and run in a similar fashion to this sample. Experiment with running the various samples on different kinds of compute nodes or adjust their source code to experiment with different workloads.