Commit ba0e925

Fortran misc edits to conform to dir struc and naming (oneapi-src#55)
* Initial commit for macOS Fortran Samples 08 July 2020

  Signed-off-by: Ron Green <[email protected]>
* Update to README.md added link to online User Guide for OpenMP
* Test README.md link syntax
* Edits misc
* Edited json files to add CI stanzas
* rwg - updates 3 README.md files to Joe O's outline
* Fix for sample.json syntax
* fixed syntax in jsons
* Moved to directory structure
* Renamed directories to conform to standards
* Misc edits to fix issues found in REVIEW by Barbara Perz.

Co-authored-by: Ron Green <[email protected]>
Co-authored-by: Ron Green <[email protected]>
Co-authored-by: Ron Green <[email protected]>
Co-authored-by: mkitez <[email protected]>
Co-authored-by: Ron Green <[email protected]>
Co-authored-by: Ron Green <[email protected]>
Co-authored-by: Ron Green <[email protected]>
1 parent 3336c22 commit ba0e925

File tree

16 files changed

+1028
-0
lines changed

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
1+
Copyright 2020 Intel Corporation
2+
3+
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
4+
5+
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
6+
7+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
1+
## =============================================================
2+
## Copyright © 2020 Intel Corporation
3+
##
4+
## SPDX-License-Identifier: MIT
5+
## =============================================================
6+
##
7+
##
8+
##******************************************************************************
9+
## Content:
10+
##
11+
## Build for openmp_sample
12+
##******************************************************************************
13+
14+
FC = ifort
15+
16+
release: openmp_sample.exe
17+
18+
debug: openmp_sample_dbg.exe
19+
20+
run: release ; @export DYLD_LIBRARY_PATH="$(LIBRARY_PATH)" ; ./openmp_sample.exe
21+
22+
debug_run: debug ; @export DYLD_LIBRARY_PATH="$(LIBRARY_PATH)" ; ./openmp_sample_dbg.exe
23+
24+
openmp_sample.exe: openmp_sample.o
25+
$(FC) -O2 -fpp -qopenmp $^ -o $@
26+
27+
openmp_sample_dbg.exe: openmp_sample_dbg.o
28+
$(FC) -O0 -g -fpp -qopenmp $^ -o $@
29+
30+
%.o: src/%.f90
31+
$(FC) -O2 -c -fpp -qopenmp -o $@ $<
32+
33+
%_dbg.o: src/%.f90
34+
$(FC) -O0 -g -c -fpp -qopenmp -o $@ $<
35+
36+
clean:
37+
/bin/rm -f core.* *.o *.exe
Lines changed: 98 additions & 0 deletions
@@ -0,0 +1,98 @@
1+
# `OpenMP Primes`
2+
This sample is designed to illustrate how to use
3+
the OpenMP* API with the Intel® Fortran Compiler.
4+
5+
This program finds all primes in the first 40,000,000 integers,
6+
the number of 4n+1 primes, and the number of 4n-1 primes in the same range.
7+
It illustrates two OpenMP* directives to help speed up the code.
8+
9+
10+
| Optimized for | Description
11+
|:--- |:---
12+
| OS | macOS* with Xcode* installed
13+
| Software | Intel&reg; oneAPI Intel Fortran Compiler (Beta)
14+
| What you will learn | How to build and run a Fortran OpenMP application using the Intel Fortran Compiler
15+
| Time to complete | 10 minutes
16+
17+
## Purpose
18+
19+
This program finds all primes in the first 40,000,000 integers, the number of 4n+1 primes,
20+
and the number of 4n-1 primes in the same range. It illustrates two OpenMP* directives
21+
to help speed up the code.
22+
23+
First, a dynamic schedule clause is used with the OpenMP* DO directive.
24+
Because the DO loop's workload increases as its index gets bigger,
25+
the default static scheduling does not work well. Instead, dynamic scheduling
26+
is used to account for the increasing workload.
27+
But dynamic scheduling itself has more overhead than static scheduling,
28+
so a chunk size of 10 is used to reduce the overhead for dynamic scheduling.
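
The scheduling choice described above can be illustrated with a minimal sketch. This is not the sample's own source: the program name `schedule_sketch`, the size `n`, and the `work` array are illustrative assumptions. Iteration `i` performs `i` inner steps, so the cost grows with the index, and `schedule(dynamic,10)` hands out iterations in chunks of 10 to whichever thread is free.

```fortran
program schedule_sketch
  use iso_fortran_env, only: int64
  implicit none
  integer, parameter :: n = 2000          ! hypothetical problem size
  integer :: i, j
  integer(int64) :: work(n), total

  ! Iteration i runs i inner steps, so late iterations cost more.
  ! With schedule(dynamic,10), an idle thread fetches the next 10
  ! iterations instead of owning one fixed block of the index range.
  !$omp parallel do private(j) schedule(dynamic,10)
  do i = 1, n
     work(i) = 0
     do j = 1, i
        work(i) = work(i) + j
     end do
  end do
  !$omp end parallel do

  total = sum(work)

  ! Closed form of sum_{i=1..n} i(i+1)/2, used only as a self-check.
  if (total /= int(n, int64) * (n + 1) * (n + 2) / 6) error stop 'unexpected total'
  print *, 'total =', total
end program schedule_sketch
```

Changing the clause to `schedule(static)` should give the same result; only the distribution of iterations across threads, and therefore the load balance, differs.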
29+
30+
Second, a reduction clause is used instead of an OpenMP* critical directive
31+
to eliminate lock overhead. A critical directive would cause excessive lock overhead
32+
due to the one-thread-at-a-time update of the shared variables each time through the DO loop.
33+
Instead, the reduction clause causes only one update of the shared variables, at the end of the loop.
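
To make the difference concrete, here is a small sketch that counts multiples of 3 in two ways. It is an illustration under assumed names (`reduction_sketch`, `count_critical`, `count_reduction`, and the loop length `n` are hypothetical), not code taken from the sample.

```fortran
program reduction_sketch
  implicit none
  integer, parameter :: n = 100000        ! hypothetical loop length
  integer :: i, count_critical, count_reduction

  count_critical = 0
  count_reduction = 0

  ! Variant 1: every matching iteration enters a critical section,
  ! so threads take turns updating the shared counter.
  !$omp parallel do
  do i = 1, n
     if (mod(i, 3) == 0) then
        !$omp critical
        count_critical = count_critical + 1
        !$omp end critical
     end if
  end do
  !$omp end parallel do

  ! Variant 2: each thread increments its own private copy; the copies
  ! are combined into the shared variable when the loop finishes.
  !$omp parallel do reduction(+:count_reduction)
  do i = 1, n
     if (mod(i, 3) == 0) count_reduction = count_reduction + 1
  end do
  !$omp end parallel do

  print *, 'critical :', count_critical
  print *, 'reduction:', count_reduction
end program reduction_sketch
```

Both variants print the same count; the reduction version simply avoids serializing the threads on every update.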
34+
35+
The sample can be compiled unoptimized (-O0), or at any level of
36+
optimization (-O1 through -O3). In addition, the following compiler options are needed.
37+
38+
The option -qopenmp enables compiler recognition of OpenMP* directives.
39+
This option can also be omitted, in which case the generated executable will be a serial program.
40+
41+
The option -fpp enables the Fortran preprocessor.
42+
Read the Intel® Fortran Compiler Documentation for more information about these options.
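
As a rough sketch of how the two options interact, the program below uses the `_OPENMP` preprocessor macro to select between an OpenMP and a serial code path. The program name `openmp_check` is an assumption made for illustration. The file must be run through the preprocessor (for example with -fpp), and `_OPENMP` is defined only when OpenMP is enabled (for example with -qopenmp).

```fortran
program openmp_check
#ifdef _OPENMP
  use omp_lib                             ! provides omp_get_max_threads()
#endif
  implicit none
  integer :: nthr

  nthr = 1                                ! serial default
#ifdef _OPENMP
  ! Compiled only when OpenMP is enabled.
  nthr = omp_get_max_threads()
  print *, 'OpenMP build: up to', nthr, 'thread(s)'
#else
  ! Compiled when the OpenMP option is omitted.
  print *, 'Serial build:', nthr, 'thread'
#endif
end program openmp_check
```

Building the same file with and without the OpenMP option shows which branch was compiled in.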
43+
44+
## Key Implementation Details
45+
The Intel&reg; oneAPI Intel Fortran Compiler (Beta) includes all libraries and headers necessary to compile and run OpenMP*-enabled Fortran applications. Use the -qopenmp compiler option to compile and link OpenMP*-enabled applications.
46+
47+
## License
48+
This code sample is licensed under the MIT license.
49+
50+
## Building the `Fortran OpenMP*` sample
51+
52+
### Experiment 1: Unoptimized build and run
53+
* Build openmp_samples
54+
55+
cd openmp_samples
56+
make clean
57+
make debug
58+
59+
* Run the program
60+
61+
make debug_run
62+
63+
* What did you see?
64+
65+
Did the debug, unoptimized code run slower?
66+
67+
### Experiment 2: Default Optimized build and run
68+
69+
* Build openmp_samples
70+
71+
make
72+
* Run the program
73+
74+
make run
75+
76+
### Experiment 3: Controlling number of threads
77+
By default, an OpenMP application creates and uses as many threads as there are logical processors in the system. On cores with hyperthreading enabled, the number of logical processors is twice the number of physical cores.
78+
79+
OpenMP uses the environment variable `OMP_NUM_THREADS` to set the number of threads to use. Try this!
80+
81+
export OMP_NUM_THREADS=1
82+
make run
83+
Note the number of threads reported by the application. Now try 2 threads:
84+
85+
export OMP_NUM_THREADS=2
86+
make run
87+
Did this make the application run faster? Experiment with the number of threads and see how it affects performance.
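
For a more direct measurement than watching the run, a small timing sketch can help. The program below is a hypothetical example (the name `thread_timing`, the size `n`, and the workload are assumptions, not part of the sample). It reports the thread limit that results from `OMP_NUM_THREADS` and times a loop with `omp_get_wtime()`. It must be built with OpenMP enabled (for example with -qopenmp), because it uses the `omp_lib` module unconditionally.

```fortran
program thread_timing
  use omp_lib                             ! requires an OpenMP-enabled build
  implicit none
  integer, parameter :: n = 10000         ! hypothetical workload size
  integer :: i, j
  double precision :: t0, t1, total

  total = 0.0d0
  t0 = omp_get_wtime()                    ! wall-clock time before the loop

  !$omp parallel do private(j) schedule(dynamic,10) reduction(+:total)
  do i = 1, n
     do j = 1, i
        total = total + 1.0d0 / dble(j)
     end do
  end do
  !$omp end parallel do

  t1 = omp_get_wtime()                    ! wall-clock time after the loop

  print *, 'max threads :', omp_get_max_threads()
  print *, 'elapsed (s) :', t1 - t0
  print *, 'checksum    :', total
end program thread_timing
```

Running it after `export OMP_NUM_THREADS=1`, then `2`, and so on, gives elapsed times that can be compared directly. The checksum may differ in its last digits between runs because the order of floating-point additions changes with the thread count.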
88+
89+
### Clean up
90+
* Clean the program
91+
make clean
92+
93+
## Further Reading
94+
Interested in learning more? We have a wealth of information
95+
on using OpenMP with the Intel Fortran Compiler in our
96+
[OpenMP section of Developer Guide and Reference][1]
97+
98+
[1]: https://software.intel.com/content/www/us/en/develop/documentation/fortran-compiler-developer-guide-and-reference/top/optimization-and-programming-guide/openmp-support.html "Developer Guide and Reference"
Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
1+
{
2+
"name": "openmp-primes",
3+
"categories": [ "Toolkit/Intel® oneAPI HPC Toolkit" ],
4+
"description": "Fortran Tutorial - Using OpenMP",
5+
"toolchain": [ "ifort" ],
6+
"languages": [ { "fortran": {} } ],
7+
"targetDevice": [ "CPU" ],
8+
"os": [ "darwin" ],
9+
"builder": [ "make" ],
10+
"ciTests":{
11+
"darwin": [
12+
{
13+
"id": "fort_release_cpu",
14+
"steps": [
15+
"make release",
16+
"make run",
17+
"make clean"
18+
]
19+
},
20+
{
21+
"id": "fort_debug_cpu",
22+
"steps": [
23+
"make debug",
24+
"make debug_run",
25+
"make clean"
26+
]
27+
}
28+
]
29+
}
30+
}
Lines changed: 117 additions & 0 deletions
@@ -0,0 +1,117 @@
1+
! ==============================================================
2+
! Copyright © 2020 Intel Corporation
3+
!
4+
! SPDX-License-Identifier: MIT
5+
! =============================================================
6+
!
7+
! [DESCRIPTION]
8+
! This code finds all primes in the first 40,000,000 integers, the number of
9+
! 4n+1 primes, and the number of 4n-1 primes in the same range.
10+
!
11+
! This source illustrates two OpenMP directives to help speed up
12+
! the code. First, a dynamic "schedule" clause is used with the OpenMP "do"
13+
! directive. Because the "do" loop's workload increases as its index
14+
! gets bigger, the default "static" scheduling does not work well.
15+
! Instead dynamic scheduling is used to account for the increasing
16+
! workload. But dynamic scheduling itself has more overhead than
17+
! static scheduling, so a "chunk size" of 10 is used to reduce the
18+
! overhead for dynamic scheduling. Second, a "reduction" clause is
19+
! used instead of an OpenMP "critical" directive to eliminate lock overhead.
20+
! A "critical" directive would cause excessive lock overhead due to
21+
! the one-thread-at-a-time update of the shared variables each
22+
! time through the "do" loop. Instead, the reduction clause causes only
23+
! one update of the shared variables, at the end of the loop.
24+
!
25+
! [COMPILE]
26+
! Use the following compiler options to compile both multi- and
27+
! single-threaded versions.
28+
!
29+
! Parallel compilation:
30+
!
31+
! Windows*: /Qopenmp /fpp
32+
!
33+
! Linux* and macOS*: -qopenmp -fpp
34+
!
35+
! Serial compilation:
36+
!
37+
! Use the same command, but omit the -qopenmp (Linux* and macOS*)
38+
! or /Qopenmp (Windows) option.
39+
!
40+
41+
program ompPrime
42+
43+
#ifdef _OPENMP
44+
include 'omp_lib.h' !needed for OMP_GET_NUM_THREADS()
45+
#endif
46+
47+
integer :: start = 1
48+
integer :: end = 40000000
49+
integer :: number_of_primes = 0
50+
integer :: number_of_41primes = 0
51+
integer :: number_of_43primes = 0
52+
integer index, factor, limit, nthr
53+
real rindex, rlimit
54+
logical prime, print_primes
55+
56+
print_primes = .false.
57+
nthr = 1 ! assume just one thread
58+
print *, ' Range to check for Primes:',start,end
59+
60+
#ifdef _OPENMP
61+
!$omp parallel
62+
63+
!$omp single
64+
nthr = OMP_GET_NUM_THREADS()
65+
print *, ' We are using',nthr,' thread(s)'
66+
!$omp end single
67+
!
68+
69+
!
70+
!$omp do private(factor, limit, prime) &
71+
!$omp schedule(dynamic,10) &
72+
!$omp reduction(+:number_of_primes,number_of_41primes,number_of_43primes)
73+
#else
74+
print *, ' We are using',nthr,' thread(s)'
75+
#endif
76+
77+
do index = start, end, 2 !workshared loop
78+
79+
limit = int(sqrt(real(index)))
80+
prime = .true. ! assume number is prime
81+
factor = 3
82+
83+
do
84+
if(prime .and. factor .le. limit) then
85+
if(mod(index,factor) .eq. 0) then
86+
prime = .false.
87+
endif
88+
factor = factor + 2
89+
else
90+
exit ! we can jump out of non-workshared loop
91+
endif
92+
enddo
93+
94+
if(prime) then
95+
if(print_primes) then
96+
print *, index, ' is prime'
97+
endif
98+
99+
number_of_primes = number_of_primes + 1
100+
101+
if(mod(index,4) .eq. 1) then
102+
number_of_41primes = number_of_41primes + 1
103+
endif
104+
105+
if(mod(index,4) .eq. 3) then
106+
number_of_43primes = number_of_43primes + 1
107+
endif
108+
109+
endif ! if(prime)
110+
enddo
111+
!$omp end do
112+
!$omp end parallel
113+
114+
print *, ' Number of primes found:',number_of_primes
115+
print *, ' Number of 4n+1 primes found:',number_of_41primes
116+
print *, ' Number of 4n-1 primes found:',number_of_43primes
117+
end program ompPrime
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
1+
Copyright 2020 Intel Corporation
2+
3+
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
4+
5+
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
6+
7+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
1+
## =============================================================
2+
## Copyright © 2020 Intel Corporation
3+
##
4+
## SPDX-License-Identifier: MIT
5+
## =============================================================
6+
##
7+
##
8+
##******************************************************************************
9+
## Content:
10+
##
11+
## Build for optimize_sample
12+
##******************************************************************************
13+
#
14+
# >>>>> SET OPTIMIZATION LEVEL BELOW <<<<<
15+
#
16+
#Uncomment one of the following with which you wish to compile
17+
18+
FC = ifort -O0
19+
#FC = ifort -O1
20+
#FC = ifort -O2
21+
#FC = ifort -O3
22+
23+
OBJS = int_sin.o
24+
25+
all: int_sin
26+
27+
run: int_sin
28+
./int_sin
29+
30+
int_sin: $(OBJS)
31+
ifort $^ -o $@
32+
33+
%.o: src/%.f90
34+
$(FC) $^ -c
35+
36+
clean:
37+
/bin/rm -f core.* $(OBJS) int_sin
38+
