Lecture_02 - System Models

Uploaded by

Andrew Koh

Distributed System Models

From Coulouris, Dollimore, Kindberg and Blair,
Distributed Systems: Concepts and Design, 5e

What distributed systems are really about – abstraction (extracting simplicity)

! Extracting simplicity (abstraction) vs. mastering complexity
♦ Network administrators are usually called “masters of complexity”
♦ Why? – they are the only ones who know how to run the system
♦ Why? – networking equipment is vertically integrated and needs to be
managed individually
♦ Software-Defined Networking (SDN) should be able to change this, but
that’s a different story…
♦ Layering provides abstraction – but only for the data path, not for the network
control plane

! Distributed systems are (mostly) software-based
♦ Easier to extract simplicity through abstraction; horizontalisation is more
natural; more advanced concepts can be built
♦ Open interfaces are the key – know what is provided by the “lower layers”
and use this abstraction; don’t worry about mastering it!
♦ Of course, we will master some of it in this course :-)

Cloud and Distributed Computing


Architectural Models

! Architecture of a system is its structure in terms of separately
specified components

! Overall goal is to ensure that the structure will meet present and
likely future demands upon it

! Major concerns – performance, reliability, availability,
cost-effectiveness

! An architectural model simplifies and abstracts the functions of
the individual distributed system components and then considers:
♦ The placement of components across a network of computers, seeking to
define useful patterns for the distribution of data and workload
♦ The interrelationships between components – i.e., their functional
roles and the patterns of communication between them


Software and hardware service layers in distributed systems

[Figure: service layers, from top to bottom]
Applications, services
Middleware
Operating system
Computer and network hardware
(The operating system and the computer and network hardware together form the platform.)


Platform

! Lowest-level hardware and software layers

! Provide services to layers above them
♦ Implemented independently in each computer

! Bring the system’s programming interface up to a level
that facilitates communication and coordination
between processes
♦ E.g. Intel x86/Solaris


System Architectures: Client-Server Model

! Division of responsibilities between system
components and their placement on computers in the
network
♦ Major impact on performance, reliability, and security

! Client processes interact with individual server
processes in separate host computers in order to
access the shared resources that they manage

! Servers may in turn be clients of other servers – e.g. a
web server is often a client of a local file server that
manages the files in which the web pages are stored

! Web servers and most other Internet services are clients
of the Domain Name System (DNS), which translates Internet
domain names to network addresses

Clients invoke individual servers

[Figure: clients send invocations to servers and receive results; a server may in turn invoke another server and receive its result. Key: process, computer.]
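
The invocation/result exchange in the figure can be sketched in a few lines. This is a minimal illustration, not a real service: a connected socket pair stands in for the network, and the names `serve_one`, `invoke` and `demo`, as well as the upper-casing "service", are assumptions made for the example.

```python
import socket
import threading


def serve_one(conn: socket.socket) -> None:
    """Server process: receive one invocation and send back one result."""
    with conn:
        request = conn.recv(1024).decode("utf-8")
        result = request.upper()  # hypothetical service: upper-case the request
        conn.sendall(result.encode("utf-8"))


def invoke(conn: socket.socket, request: str) -> str:
    """Client process: send an invocation and block until the result arrives."""
    with conn:
        conn.sendall(request.encode("utf-8"))
        return conn.recv(1024).decode("utf-8")


def demo(request: str) -> str:
    """Run one client and one server, connected by a local socket pair."""
    client_end, server_end = socket.socketpair()
    server = threading.Thread(target=serve_one, args=(server_end,))
    server.start()
    try:
        return invoke(client_end, request)
    finally:
        server.join()
```

The client blocks in `recv` until the server replies, which is the request-reply pattern the slide describes. A server that is itself a client of another server would simply call `invoke` on a second connection inside `serve_one`.
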


Peer-to-peer Model

! All processes involved in a task or activity play similar roles
♦ Interact cooperatively as peers
♦ No distinction between client and server processes or the computers
upon which they run

! Scales better than client-server
♦ System capacity and bandwidth effectively increase because the load is distributed
among many participating entities
♦ Today’s desktop systems have more capacity than yesterday's servers

! File sharing applications (Napster, BitTorrent)

! Variations of the p2p theme are used in a number of application
areas today
♦ Application-level routing, p2p media streaming, etc.


A distributed application based on peer processes

[Figure: peers 1 to N, each running the application and holding sharable objects, interact directly with one another.]


Variations of the CS model


! Use of multiple servers and caches to increase
performance, availability and resilience
♦ Exploit data/service partition and replication

! Use of mobile code and mobile agents
♦ Can improve interactive response by performing local
operations at the client

! Thin clients
♦ Low-cost computers with limited hardware resources that are
simple to manage
♦ Hold minimum software locally; download OS and application
software from server


A service provided by multiple servers

[Figure: clients invoke a service that is provided by several cooperating servers.]
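
One possible way to combine data partition and replication across the servers of a service is sketched below. The scheme (hash the key to pick a primary, then use the next servers in the list as replicas) and the name `home_servers` are assumptions chosen for illustration; the slides do not prescribe a particular placement method.

```python
import hashlib
from typing import List


def home_servers(key: str, servers: List[str], replicas: int = 2) -> List[str]:
    """Return the servers that hold `key`.

    Partition: a hash of the key selects the primary server.
    Replication: the following servers in the list hold copies.
    """
    if not 1 <= replicas <= len(servers):
        raise ValueError("replicas must be between 1 and the number of servers")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(servers)
    return [servers[(start + i) % len(servers)] for i in range(replicas)]
```

Because the placement depends only on the key and the server list, every client computes the same answer without asking a central directory. A drawback of this simple modulo scheme is that changing the number of servers moves most keys.
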


Web proxy server

[Figure: two clients reach two web servers through a shared proxy server.]
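
The benefit of a caching proxy can be shown with a small sketch. The class `CachingProxy` and its `fetch_from_origin` callback are hypothetical names; a real web proxy would also have to deal with expiry and validation of cached pages, which is left out here.

```python
from typing import Callable, Dict


class CachingProxy:
    """Serve repeated requests from a local cache instead of the origin server."""

    def __init__(self, fetch_from_origin: Callable[[str], str]) -> None:
        self._fetch = fetch_from_origin
        self._cache: Dict[str, str] = {}
        self.origin_requests = 0  # how many requests actually reached a web server

    def get(self, url: str) -> str:
        if url not in self._cache:
            # Cache miss: forward the request to the origin server once.
            self.origin_requests += 1
            self._cache[url] = self._fetch(url)
        return self._cache[url]
```

All clients that share the proxy benefit from each other's requests, so the load on the web servers and the response time for popular pages both go down.
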


Web applets

a) Client request results in the downloading of applet code
[Figure: the web server sends the applet code to the client.]

b) Client interacts with the applet
[Figure: the applet runs in the client and communicates with the web server.]

E.g., JavaScript, AJAX

Thin clients and compute servers

[Figure: a thin client on a network computer or PC connects over the network to an application process running on a compute server.]

Examples: terminals; X11; Virtual Network Computing (VNC); remote desktop protocols; cloud computing

Design Requirements for Distributed Architectures

! Performance issues – arising from limited processing and
communication capacities
♦ Responsiveness – context switches and data transfer between
processes are slow, which hurts interactivity; use few software layers and transfer
small amounts of data
♦ Throughput – the rate at which computational work is done; the ability of
a distributed system to perform work for all its users; forcing data
through middleware layers can have a negative impact on throughput
♦ Load balancing; caching and replication

! Quality of service – applies to OS as well as networks
♦ Non-functional properties of systems that affect the quality of the service
experienced by clients and users – reliability, security and
performance

! Dependability
♦ Crucial for safety-critical systems; correctness, security, fault
tolerance

Fundamental models

! All different system models share some fundamental properties
♦ Composed of processes communicating with one another by sending
messages over a network

! Fundamental models: based on fundamental properties that
allow us to be more specific about their characteristics, failures
and security risks
♦ Address correctness, fault tolerance, QoS

! A system model should address:
♦ What are the main entities in the system?
♦ How do they interact?
♦ Which characteristics affect their individual and collective behaviour?

! Purpose of a model:
♦ To make explicit all relevant assumptions about the systems
♦ To make generalisations about what is possible or impossible, given
those assumptions (these can then be proved formally)

Fundamental models: Interaction

! Computation occurs within processes

! The processes interact by passing messages, resulting
in communication (information flow) and coordination
(synchronisation and ordering of activities) between
processes
♦ Distributed systems design is concerned especially with these
interactions

! Communication takes place with delays that are often of
considerable duration

! The accuracy with which independent processes can be
coordinated is limited by these delays and by the
difficulty of maintaining a common notion of time
across all computers in a distributed system

Interaction Model (cont.)

! Performance of communication channels
♦ The delay between the start of a message’s transmission in one process
and the beginning of receipt by another is referred to as latency. It
includes:
" Propagation time: constant; depends on the physical length of the link and the transmission medium
" Transmission time: (fairly) variable; depends on message size and bandwidth
" Queuing: (very) variable; depends on network and system load in routers and end systems (OS)
♦ The bandwidth of a computer network is the total amount of information
that can be transmitted over it in a given time
" Shared among competing channels
♦ Jitter is the variation in the time taken to deliver a series of messages
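
The three delay components listed above add up to the one-way delay of a message. The helper below is a back-of-the-envelope sketch; the function name, parameter names and the example numbers are assumptions for illustration, not measurements.

```python
def one_way_delay(distance_m: float, propagation_speed_m_s: float,
                  message_bits: float, bandwidth_bit_s: float,
                  queuing_s: float = 0.0) -> float:
    """Delay in seconds = propagation + transmission + queuing."""
    propagation = distance_m / propagation_speed_m_s  # fixed by link length and medium
    transmission = message_bits / bandwidth_bit_s     # grows with message size
    return propagation + transmission + queuing_s     # queuing varies with load


# Assumed example: a 1 MB message over 1000 km of fibre (about 2e8 m/s)
# on a 100 Mbit/s link, with 2 ms spent in queues.
delay = one_way_delay(1_000_000, 2e8, 8_000_000, 100e6, queuing_s=0.002)
```

In this example propagation contributes 5 ms, transmission 80 ms and queuing 2 ms, so the message size dominates. For a very small message on the same link the propagation time would dominate instead.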

! Clocks and timing events
♦ Each computer has its own internal clock, which can be used to timestamp events
♦ Clocks have different offsets and drift rates; very hard to synchronise
♦ Solutions exist but have limitations (e.g. GPS requires sky visibility)

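
To see how offsets and drift rates pull clocks apart, the sketch below uses a simple linear clock model: a clock reads real time plus a fixed offset plus a drift proportional to elapsed time. The model, the function names and the drift value of 10 parts per million are assumptions for illustration.

```python
from typing import Tuple

Clock = Tuple[float, float]  # (offset in seconds, drift rate in seconds per second)


def clock_reading(real_time_s: float, clock: Clock) -> float:
    """What a clock with the given offset and drift rate shows at `real_time_s`."""
    offset_s, drift_rate = clock
    return real_time_s + offset_s + drift_rate * real_time_s


def skew(real_time_s: float, clock_a: Clock, clock_b: Clock) -> float:
    """Absolute difference between two clock readings at the same real time."""
    return abs(clock_reading(real_time_s, clock_a) - clock_reading(real_time_s, clock_b))


# Two clocks that start in agreement but drift in opposite directions by 10 ppm.
one_day_s = 24 * 60 * 60
skew_after_one_day = skew(one_day_s, (0.0, 1e-5), (0.0, -1e-5))
```

Under these assumed drift rates the two clocks disagree by about 1.7 seconds after one day, which is why timestamps taken on different computers cannot be compared directly.
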
Two variants of the interaction model

! Synchronous distributed systems: strong assumptions about time
♦ The time to execute each step of a process has known lower and
upper bounds
♦ Each message transmitted over a channel is received within a known
bounded time
♦ Each process has a local clock whose drift rate from real time has a
known bound
♦ Hard to arrive at realistic values for bounds and provide guarantees

! Asynchronous distributed systems: no bounds on
♦ Process execution speeds
♦ Message transmission delays
♦ Clock drift rates
♦ E.g. the Internet: no intrinsic bound on server or network load; how
long does it take to download a file?
♦ Any solution valid for an asynchronous distributed system is also valid
for a synchronous one

Interaction Model – Event ordering

! In many cases, we are interested in knowing whether an event (sending or
receiving a message) at one process occurred before, after or
concurrently with another event in another process
♦ The execution of a system can be described in terms of events and their
ordering despite the lack of accurate clocks
♦ Consider the following set of exchanges between a group of email users
X, Y, Z, and A on a mailing list
" User X sends a message with the subject Meeting
" Users Y and Z reply by sending a message with the subject Re: Meeting
♦ In real time, X’s message was sent first; Y reads it and replies; Z reads
both X’s message and Y’s reply and then sends another reply, which
references both X’s and Y’s messages
♦ Due to the independent delays, messages may be delivered in a
different order at different recipients

Real-time ordering of events

[Figure: timelines for X, Y, Z and A against physical time. X sends m1; Y receives m1 and sends m2; Z receives m1 and m2 and sends m3. A receives the messages in the order m3, m1, m2. The times t1, t2 and t3 are marked on the physical time axis.]

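
One well-known way to recover the order of causally related events without accurate physical clocks is Lamport's logical clock. The slides above do not name it, so the sketch below is an illustration of that technique applied to the mailing-list example, not a description of a method from the slides. The class `Process` and the tuple layout `(timestamp, sender, subject)` are assumptions.

```python
from typing import List, Tuple

Message = Tuple[int, str, str]  # (logical timestamp, sender, subject)


class Process:
    """A process with a Lamport logical clock (a counter, not a physical clock)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.clock = 0

    def send(self, subject: str) -> Message:
        self.clock += 1  # sending is an event, so the clock ticks
        return (self.clock, self.name, subject)

    def receive(self, message: Message) -> None:
        # Move past the sender's timestamp so the receive is ordered after the send.
        self.clock = max(self.clock, message[0]) + 1


def mailing_list_example() -> Tuple[List[Message], List[Message]]:
    x, y, z, a = (Process(name) for name in "XYZA")
    m1 = x.send("Meeting")
    y.receive(m1)
    m2 = y.send("Re: Meeting")
    z.receive(m1)
    z.receive(m2)
    m3 = z.send("Re: Meeting")
    arrival_order = [m3, m1, m2]  # the order in which A receives the messages
    for message in arrival_order:
        a.receive(message)
    # Sorting by logical timestamp restores the causal order at A.
    return arrival_order, sorted(arrival_order)
```

Here every reply is sent after its sender has received the earlier messages, so the timestamps increase along the chain m1, m2, m3 and A can display them in that order. For events that are concurrent, the timestamp order is arbitrary and carries no causal meaning.
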
Fundamental models: Failure

! Correct operation of a distributed system is threatened
whenever a fault occurs in any of the computers on which it
runs or in the network that connects them
♦ The failure model defines and classifies faults

! In a distributed system, both processes and communication
channels may fail – i.e., depart from what is considered
correct or desirable behaviour

! Three types of failures are considered for each type of
component
♦ Omission failures – a process or communication channel fails to
perform actions that it is supposed to do
♦ Arbitrary/Byzantine failures – any type of error may occur
♦ Timing failures – a process does not meet its execution deadline


Failure Model (cont.)

! Process omission failure
♦ Process has crashed – it has halted and will not execute any
further steps of its program
♦ Normal detection approach for a crashed process is to observe
that it repeatedly fails to respond to queries – relies upon the
use of timeouts
♦ Fail-stop behaviour: other processes can detect with certainty that
a process has crashed
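
The timeout-based detection described above can be sketched as follows. The function `suspect_crashed` and the `query` callback (which is assumed to return True if a reply arrived within the timeout) are hypothetical names used only for this illustration.

```python
from typing import Callable


def suspect_crashed(query: Callable[[float], bool],
                    timeout_s: float, attempts: int = 3) -> bool:
    """Return True if the process fails to answer every one of `attempts` queries.

    `query(timeout_s)` is assumed to send one query and report whether a
    reply arrived within `timeout_s` seconds.
    """
    for _ in range(attempts):
        if query(timeout_s):
            return False  # a single reply proves the process is still running
    return True  # repeated silence: the process is suspected to have crashed
```

In an asynchronous system the result is only a suspicion, because a slow process or a slow network looks the same as a crash. In a synchronous system the timeout can be derived from the known bounds on execution and transmission time, and only then does silence amount to certain (fail-stop) detection.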

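! Communication omission failure
♦ A process p performs a send by inserting message m into its
outgoing message buffer
♦ The communication channel transports m to q’s incoming
message buffer; q performs a receive by taking m from that buffer and delivering it
♦ An omission failure occurs when m is lost on the way: in the send
(send-omission), in the channel (channel omission) or in the receive (receive-omission)
♦ The message buffers are usually provided by the operating systems
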
Processes and channels

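[Figure: process p sends m by placing it in its outgoing message buffer; the communication channel carries m to the incoming message buffer of process q, which receives it. A send-omission failure occurs between p and its outgoing buffer, a channel omission failure within the channel, and a receive-omission failure between the incoming buffer and q.]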


Failure Model (cont.)

! Arbitrary failures
♦ An arbitrary (or Byzantine) process failure is one in which the
process omits intended processing steps or takes unintended
processing steps
♦ Worst possible failure
" Any type of error can occur
" Cannot be detected by seeing whether the process responds to invocations
" E.g. return a wrong value in response to an invocation
♦ Examples of arbitrary communication failure are:
" Message contents are corrupted
" Non-existent messages are delivered
" Real messages are delivered more than once


Failure Model (cont.)

! Masking failures
♦ Possible to construct reliable services from components that
exhibit failures
♦ A service masks a failure by hiding it altogether or by
converting it into a more acceptable type of failure
♦ E.g., checksums mask corrupted messages: they convert an
arbitrary failure into an omission failure
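
The checksum example above can be made concrete. In this sketch the sender appends a CRC-32 checksum and the receiver silently drops any message whose checksum does not match, so a corrupted message (arbitrary failure) looks to the application like a lost message (omission failure). The function names `frame` and `deliver` and the 4-byte trailer layout are assumptions for illustration.

```python
import zlib
from typing import Optional


def frame(payload: bytes) -> bytes:
    """Sender side: append a 4-byte CRC-32 checksum to the payload."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def deliver(received: bytes) -> Optional[bytes]:
    """Receiver side: return the payload, or None if the message is corrupted."""
    if len(received) < 4:
        return None
    payload, checksum = received[:-4], received[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") != checksum:
        return None  # drop the message: corruption becomes an omission
    return payload
```

A CRC guards against accidental corruption only. It offers no protection against an attacker who changes the message and recomputes the checksum.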

! Reliability of one-to-one communication
♦ Reliable communication service can be built by masking some
of the failures of a basic communication channel
♦ Defined in terms of validity and integrity
" Validity: any message in the outgoing message buffer is eventually delivered
to the incoming message buffer
" Integrity: the message received is identical to the one sent; no messages
are delivered twice
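
A common way to approach validity and the "no duplicates" part of integrity is to retransmit until an acknowledgement arrives and to discard repeated sequence numbers at the receiver. The sketch below shows that idea over a deliberately unreliable, deterministic channel. All names (`Receiver`, `FlakyChannel`, `send_reliably`) are hypothetical, lost acknowledgements are not modelled, and detecting corrupted content is assumed to be handled separately, for example by checksums.

```python
from typing import List, Optional, Set


class Receiver:
    def __init__(self) -> None:
        self.delivered: List[str] = []
        self._seen: Set[int] = set()

    def on_packet(self, seq: int, payload: str) -> int:
        if seq not in self._seen:  # integrity: deliver each message at most once
            self._seen.add(seq)
            self.delivered.append(payload)
        return seq  # acknowledgement


class FlakyChannel:
    """Stand-in for a faulty channel: drops every third packet, duplicates the rest."""

    def __init__(self, receiver: Receiver) -> None:
        self._receiver = receiver
        self._transmissions = 0

    def transmit(self, seq: int, payload: str) -> Optional[int]:
        self._transmissions += 1
        if self._transmissions % 3 == 0:
            return None  # omission failure: the packet is lost
        self._receiver.on_packet(seq, payload)
        return self._receiver.on_packet(seq, payload)  # duplicate delivery


def send_reliably(messages: List[str], channel: FlakyChannel, max_tries: int = 5) -> None:
    for seq, payload in enumerate(messages):
        for _ in range(max_tries):  # validity: retransmit until acknowledged
            if channel.transmit(seq, payload) == seq:
                break
        else:
            raise RuntimeError(f"no acknowledgement for message {seq}")
```

Retransmission masks the channel's omission failures and the sequence numbers mask the duplicates, so the receiver sees each message exactly once and in order. The guarantee holds only as long as the channel does not drop a message on every attempt.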


Security Model

! The security of a distributed system can be achieved by
securing the processes and the channels used for their
interactions and by protecting the objects (and resources of
all types) that they encapsulate against unauthorized access

! Object protection is achieved via use of the concepts of
principals and access rights
♦ Principal can be a user or a process
♦ Access rights specify who is allowed to perform the operations of an
object; e.g. who is allowed to read or write the state of an object

! A server is responsible for verifying the identity of the
principal behind each invocation and checking that the
principal has sufficient access rights to perform the requested
operation on the particular object invoked

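
The access-rights check that a server performs on each invocation might look like the sketch below. It assumes the principal's identity has already been verified (authentication is outside the example), and the class `ProtectedObject` with its "read" and "write" operations is a hypothetical illustration.

```python
from typing import Dict, Optional, Set


class ProtectedObject:
    """An object whose operations are guarded by per-principal access rights."""

    def __init__(self, state: str, access_rights: Dict[str, Set[str]]) -> None:
        self._state = state
        self._rights = access_rights  # principal -> operations it may perform

    def invoke(self, principal: str, operation: str,
               value: Optional[str] = None) -> Optional[str]:
        # Check the rights before touching the object's state.
        if operation not in self._rights.get(principal, set()):
            raise PermissionError(f"{principal} may not {operation} this object")
        if operation == "read":
            return self._state
        if operation == "write":
            self._state = "" if value is None else value
            return None
        raise ValueError(f"unknown operation: {operation}")
```

For example, with rights `{"alice": {"read", "write"}, "bob": {"read"}}`, an invocation `invoke("bob", "write", "x")` is rejected with `PermissionError`, while both principals may read.
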
Objects and principals

[Figure: a client acting for a principal (user) sends an invocation across the network to a server acting for a principal (server); the server checks the access rights of the object being invoked and returns a result.]


The enemy

[Figure: process p sends message m to process q over a communication channel; the enemy can obtain a copy of m and can inject a message m’ into the channel.]


Threats

! To processes and channels
♦ Validity, integrity, privacy

! Denial of service
♦ Make excessive and pointless invocations on services or message
transmissions in a network
♦ Results in overload on physical resources; e.g. processing capacity,
network bandwidth

! Mobile code
♦ A problem for any process that receives and executes program code
from elsewhere
♦ Such code may easily play a Trojan horse role (accessing or modifying resources that are
available to the host node, but not to the originator of the code)

! Security and threat models
♦ Basis for the analysis and design of secure systems
♦ Careful analysis of threats (all forms of attack) arising from the network,
physical and human environment
♦ Evaluates the risks and consequences of each

Summary

! Most distributed systems are arranged according to one of a variety of
architectural models

! Client-server model is prevalent
♦ Use of multiple servers and data partition and replication to accommodate
large demand

! Peer-to-peer model
♦ All processes play similar roles; exploit large number of available resources

! Fundamental models
♦ Interaction: concerned with performance of processes and communication
channels and absence of global clocks
♦ Failure: classifies failures of processes and basic communication
channels
♦ Security: identifies possible threats to processes and communication
channels