Distributed Shared Memory
Outline
- Introduction
- Shared Memory vs. Distributed Memory
- Distributed Shared Memory Architecture
- DSM vs. Message Passing
- Design and Implementation
- Consistency Models
- DSM Algorithms
- Conclusion
Introduction
From the system interconnection perspective, parallel systems fall into two main categories:

Multiprocessors (tightly-coupled)
[Figure: several CPUs attached to a single shared memory]
- Shared Memory architecture

Multicomputers (loosely-coupled)
[Figure: CPU/memory nodes, each with its own private memory, connected by a network]
- Distributed Memory architecture
Shared Memory vs. Distributed Memory

Shared Memory
- Global address space
- Cache coherent
- Lack of scalability
- Expensive to build
- Easy to program; reusability
- Data sharing fast and uniform

Distributed Memory
- No concept of global address space
- No concept of cache coherency
- Scalable performance
- Cost-effective: can use commodity, off-the-shelf processors and networking
- Programmer is responsible for data communication
- Data sharing by message passing; non-uniform memory access times

Cache coherent means that if one processor updates a location in shared memory, all the other processors know about the update.
Distributed-Shared Memory Architecture

Definition
A Distributed Shared Memory (DSM) is an abstraction that allows physically separated memories to be addressed as one logically shared address space.

General Characteristics
- Hybrid architecture
- Virtual space shared between all processes
- Shared-memory model implemented over physically distributed memory
- Shared-memory programming techniques can be used: when reading and updating, processes see the DSM as ordinary memory within their address space
- Mapping manager: maps a shared-memory address to physical memory (remote or local)
[Figure: processes P1..Pn with memories M1..Mn, each with a mapping manager (MM), connected by an interconnection network into one shared virtual space]
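The mapping manager's role can be illustrated with a small sketch. All class and method names here are hypothetical, and the static page-striping ownership policy is an assumption chosen only to keep the example self-contained:

```python
# Minimal sketch of a DSM mapping manager (hypothetical names).
# Each node runs a mapping manager that translates a shared virtual
# address into a page owned either locally or by a remote node; a local
# access is served from local memory, while a remote one would trigger
# communication over the interconnection network.

PAGE_SIZE = 4096

class MappingManager:
    def __init__(self, node_id, num_nodes):
        self.node_id = node_id
        self.num_nodes = num_nodes
        self.local_memory = {}  # local page frames: page_no -> bytearray

    def owner_of(self, addr):
        # Assumed policy for illustration: pages striped across nodes.
        page = addr // PAGE_SIZE
        return page % self.num_nodes

    def read(self, addr):
        page, offset = divmod(addr, PAGE_SIZE)
        if self.owner_of(addr) == self.node_id:
            frame = self.local_memory.setdefault(page, bytearray(PAGE_SIZE))
            return frame[offset]
        # Remote page: a real DSM would send a request to the owner;
        # here we only signal the miss.
        raise LookupError(f"page {page} owned by node {self.owner_of(addr)}")

mm = MappingManager(node_id=0, num_nodes=4)
print(mm.read(100))  # page 0 is local to node 0; untouched memory reads 0
```

The point of the sketch is that the process issues an ordinary address; whether the data is local or remote is decided entirely inside the mapping manager.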
Distributed-Shared Memory Architecture (contd.)

General Characteristics
- Hidden communication operations: communications are still needed to exchange data, but they are hidden from the programmer (inter-process communication transparency)
- Heterogeneous nodes: the shared-memory component can be a cache-coherent SMP machine and/or a graphics processing unit (GPU)
- Processes on different computers observe the updates made by one another
- Cache-coherent SMP or UMA (bus-based): a model with identical processors that have equal access times to a shared memory
  - UMA: Uniform Memory Access
  - SMP: Symmetric Multiprocessor
Distributed-Shared Memory Architecture (contd.)

Advantages
- Implicit data sharing: shields the programmer from Send/Receive primitives
- Less expensive to build and scalable (inherited from the distributed-memory architecture)
- Very large total physical memory across all nodes: large programs can run more efficiently
- Software portability and reusability: programs written for shared-memory multiprocessors can run on DSM systems with minimal changes

Disadvantages
- Little programmer control over the actual messages being generated
- DSM implementations use asynchronous message passing, and thus cannot be more efficient than message-passing implementations
Distributed-Shared Memory Architecture (contd.)

Best suited
- When individual shared data items can be accessed directly
- e.g., parallel applications

Less appropriate
- When data is accessed by request
- e.g., client-server systems
- A server may still be used to assist in providing DSM functionality for data shared between clients
DSM vs. Message Passing

Property             | DSM                                  | Message Passing
---------------------|--------------------------------------|----------------------------------
Marshalling          | No; variables are shared directly    | Yes; the programmer's job
Address space        | Single; interference may occur       | Private; processes are protected
Data representation  | Uniform                              | Heterogeneous
Synchronization      | Normal shared-memory constructs      | Message-passing primitives
Process execution    | Lifetimes may be non-overlapping     | Must execute at the same time
Communication cost   | Invisible                            | Obvious

There is no conclusive evidence for or against either of the two communication mechanisms.
Design and Implementation

Main Issues
- Granularity: the size of the sharing unit, which can be a byte, a page, or a complex data structure
  - Small pages: increased parallelism, but an increase in directory size
  - Large pages: reduce paging overhead, but increase sharing overhead
- Structure: the arrangement of shared data; most systems view DSM as a linear array of words, but it can also be uniform chunks of memory or data structures
- Synchronization primitives: DSM must allow simultaneous access to shared data on different machines (single writer / multiple readers, etc.); coherence protocols must ensure the consistency of shared data
- Replacement strategies: similar to caching mechanisms in multiprocessors
  - In cache systems, LRU is often used
  - In DSM, shared pages need to be given higher priority than exclusively owned pages, so exclusively owned pages could be replaced first
Consistency Models

Definition
A memory consistency model for a shared address space specifies constraints on the order in which memory operations must appear to be performed (i.e., become visible to the processors) with respect to one another.

Strict Consistency Model
- Any read of a memory location returns the value stored by the most recent write to that address, irrespective of the locations of the processors performing the read and the write.

Sequential Consistency Model
- "... the result of any execution is the same as if the operations of all processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program." (Leslie Lamport)
- Definition restated: sequential consistency requires that a shared-memory multiprocessor appear to be a multiprogramming uniprocessor to any program running on it
  - All instructions are executed in order
  - Every write operation becomes instantaneously visible throughout the system
Consistency Models (contd.)

Sequential Consistency Model

Example 1

    P1:             P2:
    Data = 2000     while (Head == 0) {}
    Head = 1        ... = Data

- Sequential consistency requires program order:
  - The write to Data has to complete before the write to Head can begin
  - The read of Head has to complete before the read of Data can begin
Example 2

Initially A = B = 0

    P1:        P2:             P3:
    A = 1      if (A == 1)     if (B == 1)
                   B = 1           register = A

- Sequential consistency can be maintained if a process makes sure that everyone has seen an update before that value is read
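Example 1 can be run as a small two-thread sketch. Note this relies on CPython's threading semantics making the writes visible across threads; on real hardware with a weaker memory model, the same pattern would need fences or atomic operations:

```python
# Example 1 as a runnable sketch: P1 publishes Data and then sets Head;
# P2 spins on Head and only then reads Data. Sequential consistency is
# what makes the handoff safe: the write to Data must become visible
# before the write to Head, and the read of Head completes before the
# read of Data.
import threading

Data = 0
Head = 0
result = []

def p1():
    global Data, Head
    Data = 2000
    Head = 1          # ordered after Data = 2000 under sequential consistency

def p2():
    while Head == 0:  # spin until P1 signals via Head
        pass
    result.append(Data)

t2 = threading.Thread(target=p2)
t2.start()
p1()
t2.join()
print(result)  # [2000] -- P2 observes the value published by P1
```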
Consistency Models (contd.)

Causal Consistency Model
- Writes that are potentially causally related must be seen by all processors in the same order
- Writes that are not potentially causally related may be seen in a different order on different machines

Processor Consistency Model
- Writes done by a single processor are seen by all other processors in the order in which they were issued on that processor
- Writes from different processors may be seen in a different order by different processors

Release Consistency Model
- Weak consistency with two types of synchronization operations: acquire and release
- Each type of operation is guaranteed to be processor consistent
DSM Algorithms

The Central Server Algorithm
[Figure: clients send read/write requests to a central server and receive data or acknowledgements]
- A central server maintains all shared data
  - Read request: the server just returns the data
  - Write request: the server updates the data and sends an acknowledgement to the client
- Two messages for each data access
- Implementation
  - A timeout is used to resend a request if an acknowledgement fails to arrive
  - Sequence numbers associated with requests can be used to detect duplicate write requests
  - If an application's request to access shared data fails repeatedly, a failure condition is sent to the application
- Issues: performance and reliability; the server becomes a bottleneck
- Possible solutions
  - Partition shared data between several servers
  - Use a mapping function to distribute/locate data
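A minimal sketch of the central-server algorithm, with the duplicate detection via sequence numbers described above. The message format and names are assumptions for illustration, not a real protocol:

```python
# Sketch of the central-server algorithm. The server holds all shared
# data; clients issue read/write requests and the server replies with
# data or an acknowledgement -- two messages per access. Retransmitted
# writes (same client, same sequence number) are detected and dropped.

class CentralServer:
    def __init__(self):
        self.store = {}
        self.seen = set()  # (client_id, seq) pairs already applied

    def handle(self, request):
        kind, client_id, seq, key, value = request
        if kind == "read":
            return ("data", self.store.get(key))
        if kind == "write":
            if (client_id, seq) not in self.seen:  # drop duplicate writes
                self.seen.add((client_id, seq))
                self.store[key] = value
            return ("ack", seq)

server = CentralServer()
server.handle(("write", "c1", 1, "x", 42))
server.handle(("write", "c1", 1, "x", 99))  # retransmission of seq 1: ignored
print(server.handle(("read", "c1", 2, "x", None)))  # ('data', 42)
```

The duplicate write with sequence number 1 is ignored, so a client timing out and resending a request cannot apply the same write twice.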
DSM Algorithms (contd.)

The Migration Algorithm
[Figure: a data block migrating to the requesting node in response to a migration request]
- Data is always migrated to the site where it is accessed
- Only one node may access a shared data item at a time: single reader / single writer (SRSW) protocol
- Data is typically migrated between servers in a fixed-size unit called a block, to facilitate management instead of migrating individual data items
- Advantages
  - Takes advantage of locality of reference
  - No communication costs are incurred when a process accesses data held locally
- DSM can be integrated with the virtual memory of the OS at each node
  - The size of the block is chosen to be equal to a virtual-memory page, or a multiple thereof
  - A locally held shared-memory page can be mapped into the application's virtual address space, so normal machine instructions for accessing memory can be used
  - Access to data items on blocks not held locally triggers a page fault; the fault handler communicates with remote hosts to obtain the requested data
  - When a data block is migrated away, it is removed from any local address space it was mapped to
- To locate a remote data object:
  - Use a location server
  - Broadcast a query
- Issues
  - Pages can thrash between hosts; to minimize thrashing, set a minimum time for data objects to reside at a node
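The block-migration behaviour above can be sketched in a few lines. The class and counter names are hypothetical, and the "page fault" is modelled as an ordinary method call rather than a real hardware trap:

```python
# Sketch of the migration (SRSW) algorithm: each block lives at exactly
# one node at a time; an access from any other node migrates the whole
# block there before the access proceeds.

BLOCK_SIZE = 4096  # typically one virtual-memory page

class MigratingDSM:
    def __init__(self):
        self.location = {}   # block_no -> node currently holding it
        self.blocks = {}     # block_no -> {offset: value}
        self.migrations = 0  # counts blocks moved between nodes

    def _fault(self, node, block):
        # Fault handler: migrate the block to the faulting node if needed.
        if self.location.get(block, node) != node:
            self.migrations += 1
        self.location[block] = node
        self.blocks.setdefault(block, {})

    def write(self, node, addr, value):
        block, off = divmod(addr, BLOCK_SIZE)
        self._fault(node, block)
        self.blocks[block][off] = value

    def read(self, node, addr):
        block, off = divmod(addr, BLOCK_SIZE)
        self._fault(node, block)
        return self.blocks[block].get(off)

dsm = MigratingDSM()
dsm.write(node=0, addr=10, value="a")  # block 0 now resides at node 0
print(dsm.read(node=1, addr=10))       # migrates block 0 to node 1 -> a
print(dsm.migrations)                  # 1
```

Repeated alternating accesses from nodes 0 and 1 would bump the migration counter on every access: exactly the thrashing problem the minimum-residence-time heuristic is meant to limit.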
DSM Algorithms (contd.)

The problem with the previous techniques is the sequential access to each data block.

The Read-Replication Algorithm
[Figure: a block request, followed by a multicast invalidate to all replica holders on a write]
- Extends the migration algorithm: replicates data blocks at multiple nodes for read access
- Replication can reduce the average cost of read operations
- Multiple nodes can have read access, or one node write access: multiple readers / single writer (MRSW) protocol
- After a write, all copies are invalidated or updated
- DSM has to keep track of the locations of all copies of data objects
  - IVY: the owner node of a data object knows all nodes that have copies
  - PLUS: a distributed linked list tracks all nodes that have copies
- Advantages
  - Read replication can lead to substantial performance improvements if the ratio of reads to writes is large
- Disadvantages
  - Write operations may be more expensive, since replicas may have to be invalidated or updated to maintain consistency
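The invalidation-based MRSW scheme can be sketched as follows, tracking the copy set at the owner in the IVY style. The names and the per-call bookkeeping are assumptions for illustration:

```python
# Sketch of the read-replication (MRSW) algorithm: reads add the reader
# to the block's copy set; a write invalidates every other copy holder
# before it proceeds (IVY-style: the owner tracks the copy set).

class ReadReplicatedDSM:
    def __init__(self):
        self.value = {}          # block -> current value at the owner
        self.copyset = {}        # block -> set of nodes holding a read copy
        self.invalidations = 0   # invalidate messages sent so far

    def read(self, node, block):
        self.copyset.setdefault(block, set()).add(node)
        return self.value.get(block)

    def write(self, node, block, value):
        # Multicast invalidate to every other copy holder.
        holders = self.copyset.get(block, set()) - {node}
        self.invalidations += len(holders)
        self.copyset[block] = {node}   # writer keeps the only valid copy
        self.value[block] = value

dsm = ReadReplicatedDSM()
dsm.write(0, block=7, value="v1")
dsm.read(1, block=7)
dsm.read(2, block=7)
dsm.write(0, block=7, value="v2")  # invalidates the copies at nodes 1 and 2
print(dsm.invalidations)           # 2
```

With many readers per write, reads are served locally and the occasional multicast invalidate is cheap; with frequent writes, the invalidation traffic dominates, matching the advantage/disadvantage trade-off above.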
DSM Algorithms (contd.)

The Full-Replication Algorithm
[Figure: clients send writes to a sequencer, which multicasts ordered updates to all replica sites]
- Extension of the read-replication algorithm
- Multiple nodes have both read and write access to shared data blocks: multiple readers / multiple writers (MRMW) protocol
- Issue: consistency of data with multiple writers
- Solution: use a gap-free sequencer
  - All writes are sent to the sequencer
  - The sequencer assigns a sequence number and sends the write request to all sites that have copies
  - Each node performs writes according to the sequence numbers
  - A gap in the sequence numbers indicates a missing write request; the node asks for retransmission of the missing write requests
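The gap-free sequencer described above can be sketched as follows; the class names and the held-back-writes buffer are assumptions for illustration:

```python
# Sketch of the full-replication (MRMW) algorithm with a gap-free
# sequencer: every write gets a global sequence number, and each replica
# applies writes strictly in sequence order. A gap leaves later writes
# buffered until the missing one is retransmitted.

class Sequencer:
    def __init__(self):
        self.next_seq = 0

    def order(self, write):
        seq = self.next_seq
        self.next_seq += 1
        return (seq, write)   # stamped write, multicast to all replicas

class Replica:
    def __init__(self):
        self.store = {}
        self.applied = 0      # next sequence number this replica expects
        self.pending = {}     # out-of-order writes, held back

    def deliver(self, seq, write):
        self.pending[seq] = write
        # Apply in order; a gap stops the loop, so later writes stay
        # pending until the missing one arrives (via retransmission).
        while self.applied in self.pending:
            key, value = self.pending.pop(self.applied)
            self.store[key] = value
            self.applied += 1

seq = Sequencer()
r = Replica()
w0 = seq.order(("x", 1))
w1 = seq.order(("x", 2))
r.deliver(*w1)    # arrives first: gap at 0, held back
print(r.store)    # {} -- nothing applied yet
r.deliver(*w0)    # missing write arrives (e.g. after retransmission)
print(r.store)    # {'x': 2} -- both writes applied in sequence order
```

Because every replica applies the same writes in the same global order, concurrent writers cannot leave the copies inconsistent, which is exactly the role of the sequencer.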
DSM Algorithms

Performance Measure
- Needs to take into account the cost of accessing local and remote data blocks
- Parameters
  - p: cost of sending or receiving a short packet
  - P: cost of sending or receiving a data block (assume P/p equals 20)
  - S: number of sites participating in the distributed shared memory
  - r: read/write ratio
  - f: probability of an access fault on a non-replicated data block
  - f': probability of an access fault on replicated data blocks
Conclusion
Being a hybrid of the distributed- and shared-memory architectures, DSM systems offer a trade-off between the ease of programming of shared-memory machines and the efficiency and scalability of distributed-memory systems.

While programmers are relieved of the communication details, they still have to take care of many design and implementation issues. The algorithms above offer various solutions, with cost and performance varying for each.

- No single algorithm is good for all applications
- Algorithms need to be adaptive to application characteristics

Thank you for paying attention