CAQA5e ch1
Chapter 1
Fundamentals of Quantitative
Design and Analysis
CAPP
CA Definition
PC Definition
PARALLEL Computing!
Save wall clock time
Solve larger problems
Provide concurrency
(do multiple things at the same time)
Parallelism is the future of computing!
WHERE IS IT USEFUL?
Application Drivers for HPC
Weather Forecasting:
Atmosphere modeled by dividing it into 3-dimensional cells.
Calculations of each cell repeated many times to model passage of time.
Consider the atmosphere divided into cells of size 1 mile × 1 mile × 1 mile to a height of 10 miles
Global region can be modeled as about 2 × 10^9 cells.
Each cell has many physical variables (temperature, pressure, humidity, wind speed and direction, etc.) to be
computed
Assume about 10^11 floating-point (FP) operations per time step.
To forecast the weather over 7 days using 1-minute intervals takes 7 × 24 × 60 × 10^11 = 10080 × 10^11 ≈ 10^15
operations.
A computer operating at 1 Gflops (10^9 floating-point operations/s) takes about 10^6 seconds, or over 10 days.
To perform the calculation in 5 minutes requires a computer operating at 3.4 Tflops (3.4 × 10^12 floating-point
operations/s).
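The forecast arithmetic above can be checked with a few lines of C. This is a minimal sketch, not part of the original slides: the function names are hypothetical, and the figure of 10^11 operations per time step is the value implied by the slide's own calculation.

```c
#include <assert.h>

/* Total floating-point operations for a forecast: one pass over the
 * whole grid per time step. ops_per_step is an input; the slide's
 * arithmetic implies about 1e11. */
double forecast_total_ops(double days, double step_minutes, double ops_per_step)
{
    assert(days > 0.0 && step_minutes > 0.0 && ops_per_step > 0.0);
    double steps = days * 24.0 * 60.0 / step_minutes;  /* number of time steps */
    return steps * ops_per_step;
}

/* Seconds needed at a sustained rate of `flops` operations per second. */
double seconds_at_rate(double total_ops, double flops)
{
    assert(flops > 0.0);
    return total_ops / flops;
}

/* Sustained rate (operations per second) needed to finish in time. */
double required_flops(double total_ops, double deadline_seconds)
{
    assert(deadline_seconds > 0.0);
    return total_ops / deadline_seconds;
}
```

With 7 days, 1-minute steps, and 1e11 operations per step, `forecast_total_ops` gives 1.008e15 operations; `seconds_at_rate` at 1e9 flops gives about 1.008e6 seconds (roughly 11.7 days), and `required_flops` for a 300-second deadline gives about 3.36e12, which the slide rounds to 3.4 Tflops.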
TOP500
Jack Dongarra, H. Simon, E. Strohmaier and H. Meuer
Listing of the 500 most powerful Supercomputers in the World
Based on the LINPACK benchmark: a measure of a system's
floating-point computing power
How fast a computer solves a dense N by N system of linear
equations
Updated twice a year:
Supercomputing Conference (SC) in the United States in
November
International Supercomputing Conference (ISC) in Europe
in June
www.top500.org
Indian Supercomputers
PC TO PP: LANGUAGES
OpenMP, MPI
OpenCL
CUDA
OpenACC
OPENMP
Open Specifications for Multi Processing
What is OpenMP?
De-facto standard API for writing shared memory
parallel applications in C, C++, and Fortran
Consists of:
Compiler Directives
Runtime Routines
Environment variables
Specification maintained by the OpenMP Architecture
Review Board (http://www.openmp.org)
Version 4.0 was released in July 2013
History
OPENMP
When to consider
When compiler cannot find parallelism
The granularity is not high enough
OPENMP
Memory Model
Shared Memory, Thread Based Parallelism
Explicit Parallelism
Fork - Join Model
Compiler Directive Based
Nested Parallelism Support
Dynamic Thread
Memory Model : Flush often
OPENMP
Data is private or shared.
All threads have access to same
globally shared memory.
Shared data accessible by all
threads.
Private data is accessible only by
the thread that owns it.
Data transfer is transparent to
programmer.
Synchronization takes place, but it
is mostly implicit.
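The shared/private distinction and the fork-join model can be illustrated with a small reduction loop. This is a minimal sketch under stated assumptions: `parallel_sum` is a hypothetical name, not something from the slides, and the clauses shown are one reasonable way to express the data sharing.

```c
/* Sums n doubles.
 * Built with -fopenmp, the loop forks a team of threads, divides the
 * iterations among them, and joins at the end of the loop.
 * Built without -fopenmp, the pragma is ignored and the loop runs
 * serially with the same result. */
double parallel_sum(const double *a, int n)
{
    double sum = 0.0;  /* each thread gets a private copy; copies are combined at the join */

    /* a and n are shared: every thread reads the same array and length.
     * The loop variable i is private to each thread. */
    #pragma omp parallel for shared(a, n) reduction(+:sum)
    for (int i = 0; i < n; i++) {
        sum += a[i];
    }
    return sum;
}
```

The `reduction(+:sum)` clause is what keeps synchronization implicit here: no explicit lock or flush is written, yet the per-thread partial sums are merged safely when the threads join. Because floating-point addition is not associative, the last digits of the result may differ between thread counts for general inputs.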
OPENMP: Compilation
GNU Compiler Example :
gcc -o omp_helloc -fopenmp omp_hello.c
Advantages of OpenMP
Good performance and scalability
If you do it right ....
Performance improvements:
Introduction
Computer Technology
Lightweight computers
Productivity-based managed/interpreted
programming languages
Move to multi-processor
Introduction
RISC
Introduction
Desktop Computing
Emphasis on price-performance
Servers
Classes of Computers
Embedded Computers
Emphasis: price
Classes of Computers
Parallelism
Classes of Computers
Flynn's Taxonomy
Vector architectures
Multimedia extensions
Graphics processor units
No commercial implementation
Tightly-coupled MIMD
Loosely-coupled MIMD
Trends in Technology
Bandwidth or throughput
Trends in Technology
Feature size
Trends in Technology
Dynamic energy
Dynamic power
Intel 80386
consumed ~ 2 W
3.3 GHz Intel Core
i7 consumes 130 W
Heat must be
dissipated from 1.5
x 1.5 cm chip
This is the limit of
what can be cooled
by air
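The formulas for dynamic energy and dynamic power did not survive on this slide. The sketch below uses the standard CMOS proportionalities as the textbook states them: energy per transition is proportional to 1/2 × capacitive load × voltage², and power is that energy multiplied by the switching frequency. The function names and the choice of units (farads, volts, hertz) are assumptions made for illustration.

```c
#include <assert.h>

/* Dynamic energy of one logic transition (0->1 or 1->0):
 * proportional to 1/2 * capacitive load * voltage^2. */
double dynamic_energy(double cap_load, double voltage)
{
    assert(cap_load >= 0.0);
    return 0.5 * cap_load * voltage * voltage;
}

/* Dynamic power: energy per transition times switching frequency. */
double dynamic_power(double cap_load, double voltage, double freq_switched)
{
    assert(freq_switched >= 0.0);
    return dynamic_energy(cap_load, voltage) * freq_switched;
}
```

Because voltage enters squared, lowering voltage and frequency together by 15% cuts dynamic energy to about 0.72 of the original and dynamic power to about 0.61, which is why voltage scaling is such an effective lever.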
Power
Do nothing well
Dynamic Voltage-Frequency Scaling
Low power state for DRAM, disks
Overclocking, turning off cores
Reducing Power
Current_static × Voltage
Scales with number of transistors
To reduce: power gating
Static Power
Trends in Cost
Yield
Integrated circuit
Bose-Einstein formula:
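The formula itself is missing from the extracted slide. For reference, the die-yield model as given in the fifth edition of the textbook is reproduced below; treat it as a recollection of that source rather than as the slide's exact wording. N is the process-complexity factor, and the defect density and die area must use matching area units.

```latex
\[
\text{Die yield} = \text{Wafer yield} \times
\frac{1}{\left(1 + \text{Defects per unit area} \times \text{Die area}\right)^{N}}
\]
```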
Trends in Cost
Module reliability
Dependability
Mean time to failure (MTTF)
Mean time to repair (MTTR)
Mean time between failures (MTBF) = MTTF + MTTR
Availability = MTTF / MTBF
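The two relations above translate directly into code. A minimal sketch, with hypothetical function names; any time unit works as long as MTTF and MTTR use the same one.

```c
#include <assert.h>

/* Mean time between failures = MTTF + MTTR. */
double mtbf(double mttf, double mttr)
{
    assert(mttf > 0.0 && mttr >= 0.0);
    return mttf + mttr;
}

/* Availability = MTTF / MTBF: the fraction of time the module is in service. */
double availability(double mttf, double mttr)
{
    return mttf / mtbf(mttf, mttr);
}
```

For example, a module with an MTTF of 999 hours and an MTTR of 1 hour has an MTBF of 1000 hours and an availability of 0.999.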
Speedup of X relative to Y
Execution time
Response time
Throughput
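"Speedup of X relative to Y" is conventionally the ratio of execution times, with Y's time in the numerator. The sketch below assumes that definition, which is the one the textbook uses; the function name is hypothetical.

```c
#include <assert.h>

/* Speedup of X relative to Y = execution time of Y / execution time of X.
 * A result of n means "X is n times faster than Y". */
double speedup_x_over_y(double exec_time_x, double exec_time_y)
{
    assert(exec_time_x > 0.0 && exec_time_y > 0.0);
    return exec_time_y / exec_time_x;
}
```

If X runs a program in 10 seconds and Y runs it in 15 seconds, the speedup of X relative to Y is 1.5.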
Measuring Performance
Benchmarks
Principle of Locality
Principles
Amdahl's Law
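The slide names the law without stating it. In its usual form, overall speedup = 1 / ((1 - f) + f / s), where f is the fraction of execution time that benefits from an enhancement and s is the speedup of that fraction. A minimal sketch with hypothetical names:

```c
#include <assert.h>

/* Amdahl's Law: overall speedup when a fraction f of the original
 * execution time is sped up by a factor s and the rest is unchanged. */
double amdahl_speedup(double fraction_enhanced, double speedup_enhanced)
{
    assert(fraction_enhanced >= 0.0 && fraction_enhanced <= 1.0);
    assert(speedup_enhanced > 0.0);
    return 1.0 / ((1.0 - fraction_enhanced)
                  + fraction_enhanced / speedup_enhanced);
}
```

Speeding up 40% of a program by 10× gives an overall speedup of 1 / (0.6 + 0.04) = 1.5625. The unenhanced fraction sets the ceiling: with f = 0.5 the overall speedup stays below 2 no matter how large s becomes, which is the same limit that caps parallel speedup when part of a program remains serial.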
Principles