Commit e3a7e29

AMD Open Source version of XSBench for C++AMP
1 parent 5105128 commit e3a7e29

15 files changed: 2644 additions & 0 deletions

xsbench-amp/CHANGES

Lines changed: 164 additions & 0 deletions
@@ -0,0 +1,164 @@
+=====================================================================
+NEW IN VERSION 13
+=====================================================================
+- (Feature) Added in the ability for XSBench to write out a binary
+  file containing a randomized XS dataset. The code is also capable
+  of reading in this file instead of generating a new XS dataset
+  each time the program is run. This feature may be useful for those
+  running in simulation environments where walltime minimization
+  is key for logistical reasons.
+
+- Minor refactoring/reorganization of code to make the code clearer
+  and easier to read. After many updates, the code had become a
+  little bloated and difficult to read, so a cleanup was in order.
+
+- Removed synthetic delay injection (via dummy FLOPS or loads).
+  These were not very useful or accurate and had not been used by
+  anyone after the initial analyses were done with them. As they were
+  definitely adding to the code bloat of the program, they were
+  removed.
+
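The write-once, read-many dataset feature described for version 13 could look roughly like the sketch below. It is a minimal illustration and not the XSBench implementation: the function names, the file layout (a `long` count followed by raw doubles), and the use of a flat array of doubles are all assumptions. A raw dump like this is only readable on a machine with the same endianness and type sizes as the one that wrote it.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: writes a count header followed by n raw doubles.
 * Returns 0 on success, -1 on any I/O failure. */
static int write_dataset(const char *path, const double *data, long n)
{
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    int ok = fwrite(&n, sizeof n, 1, fp) == 1 &&
             fwrite(data, sizeof *data, (size_t)n, fp) == (size_t)n;
    return (fclose(fp) == 0 && ok) ? 0 : -1;
}

/* Hypothetical helper: reads a file produced by write_dataset.
 * Returns a malloc'd array (caller frees) and stores its length in
 * *n_out, or returns NULL if the file is missing or truncated. */
static double *read_dataset(const char *path, long *n_out)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;
    long n = 0;
    double *data = NULL;
    if (fread(&n, sizeof n, 1, fp) == 1 && n > 0) {
        data = malloc((size_t)n * sizeof *data);
        if (data != NULL &&
            fread(data, sizeof *data, (size_t)n, fp) != (size_t)n) {
            free(data);          /* short read: treat as failure */
            data = NULL;
        }
    }
    fclose(fp);
    if (data != NULL)
        *n_out = n;
    return data;
}
```

A caller would try `read_dataset` first and fall back to generating (and then writing) a fresh dataset only when it returns NULL, which is what saves walltime on repeated runs.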
+=====================================================================
+NEW IN VERSION 12
+=====================================================================
+- (Bugfix) The XL and XXL runtime options didn't work correctly.
+  The unionized energy grid overflowed the bounds of normal 4 byte
+  integers, and actually required use of 8 byte integers.
+
+  The variables "n_isotopes" and "n_gridpoints" have been refactored
+  to 8 byte long integers. All variables that use n_isotopes and
+  n_gridpoints as input have also been refactored to 8 byte longs.
+
+  Note that a simple "patch" from version 11 to version 12 can be
+  manually done by simply changing line 73 of GridInit.c to be a
+  long instead of an int. The more thorough refactoring done in v12
+  is done to "future proof" the code.
+
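The overflow described for version 12 can be illustrated with a short sketch. The layout is an assumption, not something the changelog states: the unionized grid is taken to hold one point per isotope gridpoint, with one index per isotope stored for each of those points. The function names and the example sizes in the usage note are hypothetical.

```c
#include <stdint.h>

/* Assumed layout: n_isotopes * n_gridpoints unionized points, each
 * carrying one index per isotope. All arithmetic is done in 64 bits. */
static int64_t grid_index_entries(int64_t n_isotopes, int64_t n_gridpoints)
{
    int64_t n_unionized = n_isotopes * n_gridpoints;
    return n_isotopes * n_unionized;
}

/* Returns 1 if the value could be held in a 4 byte signed integer. */
static int fits_in_int32(int64_t value)
{
    return value >= INT32_MIN && value <= INT32_MAX;
}
```

Under these assumptions, 355 isotopes with 11303 gridpoints each still fit in a 4 byte integer, while 355 isotopes with 250000 gridpoints each (a made-up size) do not, which is the kind of case where an `int` counter silently wraps and an 8 byte type is required.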
+=====================================================================
+NEW IN VERSION 11
+=====================================================================
+
+- Updated & greatly improved the PAPI capability of XSBench. Events
+  can now be tallied during multi-core runs. See README for more
+  info.
+
+- Added in option for thread sleep pause in between macro XS lookups.
+  Very similar to adding dummy flops, but a little cleaner.
+
+  With a sleep as short as 0.1 ms, we get linear scaling with threads.
+  While this appears to confirm our initial suspicions
+  regarding memory contention / latency problems, I think the delays
+  resulting from the sleeps could potentially just be washing out
+  the scaling numbers. Even with just 0.1 ms, over 15 million lookups,
+  the majority of the runtime (>90%) is just sleep, so scaling numbers
+  aren't very expressive anymore. Need to implement timers that
+  ignore the sleep parts.
+
+- Specified OpenMP schedule mode as 'dynamic'. The default schedule
+  is implementation defined, so it is now set explicitly, since
+  'dynamic' is a lot faster than 'static' or other modes.
+
+- Added in a "benchmarking" mode, which will attempt all possible
+  thread combinations between 1 <= nthreads <= max_threads.
+  This helps to save considerable benchmarking time, as the
+  data structures can be re-used between runs rather than regenerated
+  each time. Benchmarking mode is enabled in the makefile.
+
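Setting the OpenMP schedule explicitly, as described for version 11, might look like the sketch below. It is not taken from XSBench: `fake_lookup` is a hypothetical stand-in for a macro XS lookup whose cost varies from one iteration to the next, which is the situation where dynamic scheduling tends to balance load better than static chunks.

```c
/* Hypothetical stand-in for one macro XS lookup. The amount of work
 * depends on i, so iterations are deliberately uneven in cost. */
static double fake_lookup(long i)
{
    double acc = 0.0;
    for (long k = 0; k <= i % 64; k++)
        acc += 1.0;
    return acc;
}

/* Runs n_lookups lookups and returns the accumulated result.
 * schedule(dynamic) hands iterations to threads as they become idle
 * instead of pre-assigning fixed ranges. */
static double run_lookups(long n_lookups)
{
    double total = 0.0;
    #pragma omp parallel for schedule(dynamic) reduction(+:total)
    for (long i = 0; i < n_lookups; i++)
        total += fake_lookup(i);
    return total;
}
```

The pragma only takes effect when the compiler's OpenMP support is enabled (for example `-fopenmp` with GCC or Clang); otherwise it is ignored and the loop runs serially with the same result.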
+=====================================================================
+NEW IN VERSION 10
+=====================================================================
+
+- Changed verification mode to be more portable. The verification
+  strategy introduced in version 9 had discrepancies on different
+  platforms and compilers. This was because the compiler-provided
+  rand() function produces a different series of random numbers
+  from one implementation to another. Also, there were some issues
+  with the associativity of floating point arithmetic. These issues
+  have now all been solved, and the verification hash is consistent
+  across all tested platforms.
+
+- Revised "XL" size parameters and added an "XXL" size option.
+  The XL size now uses 120 GB of XS data. The XXL mode uses
+  252 GB of XS data. More details are in the verification section of the
+  readme.
+
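One common way to remove the dependence on the compiler-provided rand() is to carry a small generator in the code itself. The sketch below is an assumption about how that could be done, not the generator XSBench uses: it is a 64-bit linear congruential generator with the multiplier and increment from Knuth's MMIX, and the name `portable_rand` is hypothetical. Because it uses only fixed-width unsigned arithmetic, the same seed yields the same sequence on any conforming compiler.

```c
#include <stdint.h>

/* Hypothetical portable replacement for rand(): advances the 64-bit
 * state in *seed and returns a double in [0, 1). Unsigned overflow
 * wraps modulo 2^64 by definition, so results do not vary by platform. */
static double portable_rand(uint64_t *seed)
{
    *seed = *seed * 6364136223846793005ULL + 1442695040888963407ULL;
    /* Keep the top 53 bits so the value is exactly representable,
     * then scale by 2^53. */
    return (double)(*seed >> 11) / 9007199254740992.0;
}
```

The floating point associativity issue is separate: a verification hash only stays stable if results are also combined in a fixed order, since reordering a floating point sum can change its low-order bits.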
+=====================================================================
+NEW IN VERSION 9
+=====================================================================
+
+- Added in new code verification mode. This can be toggled on in
+  the makefile. When code is compiled and run, a hash of the results
+  will be generated which can then be compared to other versions and
+  configurations of XSBench. See readme for more details.
+
+- Moved PAPI def to makefile. Makes it easier to toggle.
+
+- Added -l command line option to set the number of cross section
+  lookups performed by XSBench.
+
+=====================================================================
+NEW IN VERSION 8
+=====================================================================
+
+- Simplified command line interface (CLI) read in process. XSBench
+  now supports a more traditional CLI, as follows:
+
+  Usage: ./XSBench <options>
+  Options include:
+    -n <threads>     Number of OpenMP threads to run
+    -s <size>        Size of H-M Benchmark to run (small, large, XL)
+    -g <gridpoints>  Number of gridpoints per isotope
+  Default is equivalent to: -s large
+
+- Updated README with new CLI usage details.
+
+- Fixed several typos in the XSBench Theory PDF.
+
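The version 8 command line could be parsed along the lines of the sketch below. Only the three flags and the `-s large` default come from the usage text; the `Options` struct, the `parse_cli` name, the default thread count of 1, and the convention that 0 gridpoints means "use the size preset" are assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the three documented options. */
typedef struct {
    int nthreads;         /* -n <threads> */
    const char *size;     /* -s <size>: "small", "large", or "XL" */
    long n_gridpoints;    /* -g <gridpoints>; 0 means use the size preset */
} Options;

/* Fills *opt from argv. Returns 0 on success, or -1 on an unknown
 * flag, a flag with no value, or an unrecognized size name. */
static int parse_cli(int argc, char *argv[], Options *opt)
{
    opt->nthreads = 1;        /* assumed default, not from the changelog */
    opt->size = "large";      /* documented default: -s large */
    opt->n_gridpoints = 0;

    for (int i = 1; i < argc; i++) {
        if (i + 1 >= argc)
            return -1;        /* every documented flag takes a value */
        const char *flag = argv[i];
        const char *val = argv[++i];
        if (strcmp(flag, "-n") == 0) {
            opt->nthreads = atoi(val);
        } else if (strcmp(flag, "-s") == 0) {
            if (strcmp(val, "small") != 0 && strcmp(val, "large") != 0 &&
                strcmp(val, "XL") != 0)
                return -1;
            opt->size = val;
        } else if (strcmp(flag, "-g") == 0) {
            opt->n_gridpoints = atol(val);
        } else {
            return -1;
        }
    }
    return 0;
}
```

With this sketch, `./XSBench -n 4 -s XL` would yield four threads and the XL preset, while running with no arguments leaves the large preset in place.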
+=====================================================================
+NEW IN VERSION 7
+=====================================================================
+
+- Added MPI support. Multithreaded run executes on all ranks.
+  Problem size or data is not subdivided - the exact same problem
+  is solved in parallel by all ranks. The only MPI communication is
+  a single reduce at the end to aggregate timing data.
+
+  To enable MPI mode, simply change the MPI flag in the makefile
+  to "MPI = yes". Make sure mpicc is available on your system.
+
+- Added in "XL" size option for a giant 277 GB energy grid. This
+  is unlikely to fit on a single node, but is useful for
+  experimentation purposes.
+
+- Removed "BGQ mode" CLI argument option, as it wasn't being used
+  by anything in the code anymore.
+
+=====================================================================
+NEW IN VERSION 6
+=====================================================================
+
+- Fixed small bug in calculate_micro_xs() function. Occasionally,
+  the index returned would be the last nuclide gridpoint for that
+  nuclide, causing the "high" energy point to be off the end of the
+  grid (likely into the next nuclide's energy grid). Added a check
+  to correct for when this occurs.
+
+  Note that this bug did not affect performance - only made the
+  calculation of XS values more "correct".
+
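The bounds check described for version 6 can be sketched as follows. This is an illustration of the idea rather than the XSBench source: `GridPoint`, `clamp_low_index`, and `interpolate_xs` are hypothetical names, and linear interpolation between the low and high points is an assumed use of the pair.

```c
#include <assert.h>

/* Hypothetical per-nuclide grid entry: an energy and one XS value. */
typedef struct {
    double energy;
    double total_xs;
} GridPoint;

/* If idx is the last gridpoint of the nuclide, step back by one so
 * that idx + 1 (the "high" point) still lies inside this nuclide's
 * grid instead of spilling into the next nuclide's data. */
static long clamp_low_index(long idx, long n_gridpoints)
{
    assert(n_gridpoints >= 2);
    if (idx >= n_gridpoints - 1)
        idx = n_gridpoints - 2;
    return idx;
}

/* Linearly interpolates total_xs at energy e between the bracketing
 * low and high points selected from idx. */
static double interpolate_xs(const GridPoint *grid, long n_gridpoints,
                             long idx, double e)
{
    long lo = clamp_low_index(idx, n_gridpoints);
    const GridPoint *low = &grid[lo];
    const GridPoint *high = &grid[lo + 1];
    double f = (high->energy - e) / (high->energy - low->energy);
    return high->total_xs - f * (high->total_xs - low->total_xs);
}
```

Without the clamp, an index pointing at the final gridpoint would make `high` read one element past the nuclide's grid, which matches the changelog's note that the bug affected correctness rather than performance.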
+=====================================================================
+NEW IN VERSION 5
+=====================================================================
+
+- Added ChangeLog
+
+- Moved source code files to src/ directory.
+
+- Updated README.txt file to enhance documentation
+
+- Added significant documentation with regards to theory
+  in the docs/XSBench_Theory.pdf file. The README.txt file is now
+  more of a quick-start & users guide, whereas the XSBench_Theory.pdf
+  guide covers the details and theory behind the code.
+
+=====================================================================

xsbench-amp/LICENSE

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+Copyright (c) 2012-2013 Argonne National Laboratory
+
+Permission is hereby granted, free of charge, to any person obtaining a copy of
+this software and associated documentation files (the "Software"), to deal in
+the Software without restriction, including without limitation the rights to
+use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
+the Software, and to permit persons to whom the Software is furnished to do so,
+subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
+FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
+COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
+IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
