Distributed Systems Architecture and Models

The document discusses the architecture and models of distributed systems, highlighting hardware and software concepts, including multiprocessors, multicomputers, and middleware. It outlines the characteristics of distributed operating systems and network operating systems, as well as communication paradigms such as interprocess communication and remote invocation. Additionally, it addresses the challenges faced by distributed systems, including varying workloads, heterogeneous environments, and security threats.

Uploaded by

Mohamed Amin
Distributed Systems Architecture and Models
Distributed Systems: Hardware Concepts

• Multiprocessors
• Multicomputers
• Networks of computers
Multiprocessors and multicomputers, distinguishing features:
• Private versus shared memory
• Bus versus switched interconnection
High degree of node heterogeneity:
• High-performance parallel systems (multiprocessors as
well as multicomputers)
• High-end PCs and workstations (servers)
• Simple network computers (offer users only network
access)
• Mobile computers (palmtops, laptops)
• Multimedia workstations
High degree of network
heterogeneity:
• Local-area gigabit networks
• Wireless connections
• Long-haul, high-latency connections
• Wide-area switched megabit connections
Distributed Systems: Software
Concepts
• Distributed operating system
• Network operating system
• Middleware
Characteristics of DOS
• OS on each computer
knows about the other
computers
• OS on different computers
generally the same
• Services are generally
(transparently) distributed
across computers
Characteristics of Network
Operating System
• Each computer has its own
operating system with
networking facilities
• Computers work
independently (i.e., they may
even have different operating
systems)
• Services are tied to individual
nodes (ftp, telnet, WWW)
• Highly file oriented (basically,
processors share only files)
Characteristics of Middleware
• OS on each computer
need not know about the
other computers
• OS on different computers
need not generally be the
same
• Services are generally
(transparently) distributed
across computers
Need for Middleware
• Motivation: Too many networked applications were difficult to
integrate:
• Departments are running different NOSs
• Integration and interoperability only at level of primitive NOS services
• Combining different databases, but providing a single view to applications
• Setting up enterprise-wide Internet services, making use of existing
information systems
• Allow transactions across different databases
• Allow extensibility for future services (e.g., mobility, teleworking,
collaborative applications)
Interfaces in Distributed
Systems
When modules are in different processes or on different hosts, there are
limitations on the interactions that can occur. Modules can request or
provide services across process boundaries only through actions whose
parameters are fully specified and understood by both sides.
• A service interface allows a client to request and a server to provide
particular services
• A remote interface allows objects to be passed as arguments to and
results from distributed modules
• Object Interfaces - An interface defines the signatures of a set of
methods, including arguments, argument types, return values and
exceptions.
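The idea of an interface as a set of method signatures can be sketched in Python with an abstract base class. This is only an illustration: the names BalanceService, InMemoryBalanceService and AccountNotFound are hypothetical and do not come from the slides. The abstract class fixes the arguments, return type and exception; the implementation behind it can change freely.

```python
from abc import ABC, abstractmethod


class AccountNotFound(Exception):
    """Exception that is part of the interface's contract (hypothetical)."""


class BalanceService(ABC):
    """Hypothetical service interface: signatures only, no implementation."""

    @abstractmethod
    def get_balance(self, account_id: str) -> int:
        """Return the balance in cents; raise AccountNotFound if unknown."""


class InMemoryBalanceService(BalanceService):
    """One possible server-side implementation of the interface."""

    def __init__(self, balances: dict[str, int]) -> None:
        self._balances = dict(balances)

    def get_balance(self, account_id: str) -> int:
        try:
            return self._balances[account_id]
        except KeyError:
            raise AccountNotFound(account_id) from None


# A client depends only on the interface type, not on the implementation.
service: BalanceService = InMemoryBalanceService({"alice": 1250})
print(service.get_balance("alice"))
```

A client written against BalanceService keeps working if the in-memory implementation is later replaced by one that forwards the call to a remote server.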
System architectures –
Client/Server
• Client-server: This is the architecture that is most often cited when
distributed systems are discussed. It is historically the most important
and remains the most widely employed.
• In particular, client processes interact with individual server
processes in potentially separate host computers in order to access
the shared resources that they manage.
• Servers may in turn be clients of other servers, as the figure indicates.
For example, a web server is often a client of a local file server that
manages the files in which the web pages are stored. Web servers and
most other Internet services are clients of the DNS service, which
translates Internet domain names to network addresses.
[Figure: client/server architecture]
System architectures –
Peer/Peer
• Peer-to-peer: In this architecture all of the processes
involved in a task or activity play similar roles, interacting
cooperatively as peers without any distinction between client
and server processes or the computers on which they run.
• In practical terms, all participating processes run the same
program and offer the same set of interfaces to each other.
• While the client-server model offers a direct and relatively
simple approach to the sharing of data and other resources, it
scales poorly.
[Figure: peer-to-peer architecture]
System Models
Difficulties and Threats For
Distributed Systems
• Widely varying modes of use: The component parts of systems are
subject to wide variations in workload – for example, some web pages are
accessed several million times a day.
• Wide range of system environments: A distributed system must
accommodate heterogeneous hardware, operating systems and networks.
• Internal problems: Non-synchronized clocks, conflicting data updates
and many modes of hardware and software failure involving the individual
system components.
• External threats: Attacks on data integrity and secrecy, denial of service
attacks.
Introduction To System
Models
• Systems that are intended for use in real-world environments should be
designed to function correctly in the widest possible range of
circumstances and in the face of many possible difficulties and threats.
• Each type of model is intended to provide an abstract, simplified but
consistent description of a relevant aspect of distributed system design:
• Physical models are the most explicit way in which to describe a
system; they capture the hardware composition of a system in terms of
the computers (and other devices, such as mobile phones) and their
interconnecting networks.
Introduction To System
Models ..cont’d
• Architectural models describe a system in terms of the computational
and communication tasks performed by its computational elements; the
computational elements being individual computers or aggregates of
them supported by appropriate network interconnections.
• Fundamental models take an abstract perspective in order to
examine individual aspects of a distributed system. Three are
considered here: interaction models, which consider the structure and
sequencing of the communication between the elements of the system;
failure models, which consider the ways in which a system may fail to
operate correctly; and security models, which consider how the system
is protected against attempts to interfere with its correct operation
or to steal its data.
Physical Model
• A physical model is a representation of the underlying hardware
elements of a distributed system that abstracts away from specific details
of the computer and networking technologies employed.
• Baseline physical model: A distributed system was defined as one in
which hardware or software components located at networked computers
communicate and coordinate their actions only by passing messages.
• This leads to a minimal physical model of a distributed system as an
extensible set of computer nodes interconnected by a computer network
for the required passing of messages.
• Beyond this baseline model, we can usefully identify three generations of
distributed systems.
Three Generations of Distributed
Systems
Early distributed systems:
• Such systems emerged in the late 1970s and early 1980s in
response to the emergence of local area networking
technology, usually Ethernet.
• These systems typically consisted of between 10 and 100
nodes interconnected by a local area network, with limited
Internet connectivity, and supported a small range of services
such as shared local printers and file servers as well as email
and file transfer across the Internet.
Three Generations of Distributed
Systems
Internet-scale distributed systems:
• Building on this foundation, larger-scale distributed systems
started to emerge in the 1990s in response to the dramatic
growth of the Internet during this time
• Such systems exploit the infrastructure offered by the Internet to
become truly global.
• They incorporate large numbers of nodes and provide distributed
system services for global organizations and across
organizational boundaries.
Three Generations of Distributed
Systems
Contemporary distributed systems:
• The emergence of mobile computing has led to physical models where
nodes such as laptops or smart phones may move from location to location
in a distributed system, leading to the need for added capabilities such as
service discovery and support for spontaneous interoperation.
• The emergence of ubiquitous computing has led to a move from
discrete nodes to architectures where computers are embedded in
everyday objects and in the surrounding environment
• The emergence of cloud computing and, in particular, cluster
architectures has led to a move from autonomous nodes performing a
given role to pools of nodes that together provide a given service
Generations of distributed
systems
Architectural models
• The architecture of a system is its structure in terms of
separately specified components and their interrelationships.
• The overall goal is to ensure that the structure will meet present
and likely future demands on it. Major concerns are to make the
system reliable, manageable, adaptable and cost-effective.
• The architectural design of a building has similar aspects – it
determines not only its appearance but also its general structure
and architectural style (gothic, neo-classical, modern) and
provides a consistent frame of reference for the design.
Architectural Elements
• To understand the fundamental building blocks of a distributed
system, it is necessary to consider four key questions:
• What are the entities that are communicating in the distributed
system?
• How do they communicate, or, more specifically, what communication
paradigm is used?
• What (potentially changing) roles and responsibilities do they have in
the overall architecture?
• How are they mapped on to the physical distributed infrastructure
(what is their placement)?
Architectural Element -
Communicating Entities
• The first two questions above are absolutely central to an understanding
of distributed systems; what is communicating and how those entities
communicate together define a rich design space for the distributed
systems developer to consider. It is helpful to address the first question
from a system-oriented and a problem-oriented perspective.
• From a system perspective, the answer is normally very clear in that the
entities that communicate in a distributed system are typically
processes, leading to the prevailing view of a distributed system as
processes coupled with appropriate interprocess communication
paradigms
Communicating entities…
cont’d
• From a system perspective:
• In some primitive environments, such as sensor networks,
the underlying operating systems may not support process
abstractions (or indeed any form of isolation), and hence
the entities that communicate in such systems are nodes.
• In most distributed system environments, processes are
supplemented by threads, so, strictly speaking, it is
threads that are the endpoints of communication.
Communicating entities…
cont’d
• From a programming perspective (problem-oriented approach):
• Objects: Objects have been introduced to enable and encourage the use of
object-oriented approaches in distributed systems.
• Components: Since their introduction a number of significant problems have been
identified with distributed objects, and the use of component technology has
emerged as a direct response to such weaknesses. Component-based middleware
often provides additional support for key areas such as deployment and support for
server-side programming
• Web services: Web services represent the third important paradigm for the
development of distributed systems. Web services are closely related to objects and
components, again taking an approach based on encapsulation of behaviour and
access through interfaces.
Architectural Element - Communication
Paradigms
We now turn our attention to how entities communicate in a
distributed system, and consider three types of
communication paradigm:
• interprocess communication;
• remote invocation;
• indirect communication.
Interprocess communication
Interprocess communication refers to the relatively low-level
support for communication between processes in distributed systems,
including message-passing primitives, direct access to the
API offered by Internet protocols (socket programming) and
support for multicast communication.
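As a rough illustration of message passing over the socket API, the sketch below sends two length-prefixed messages across a connected socket pair. It makes several simplifying assumptions: both endpoints live in one Python process (socket.socketpair stands in for two separate processes), the 4-byte length prefix is an illustrative framing choice, and the helper names send_message and recv_message are hypothetical.

```python
import socket

# socketpair() returns two already-connected stream sockets; here they
# stand in for the endpoints owned by two communicating processes.
sender, receiver = socket.socketpair()


def send_message(sock: socket.socket, payload: bytes) -> None:
    """Send one message, prefixed with its length as a 4-byte big-endian int."""
    sock.sendall(len(payload).to_bytes(4, "big") + payload)


def recv_exactly(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes; a stream socket may return fewer per recv()."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_message(sock: socket.socket) -> bytes:
    """Receive one length-prefixed message."""
    length = int.from_bytes(recv_exactly(sock, 4), "big")
    return recv_exactly(sock, length)


send_message(sender, b"hello")
send_message(sender, b"world")
first = recv_message(receiver)
second = recv_message(receiver)
print(first, second)

sender.close()
receiver.close()
```

The framing is needed because a stream socket delivers bytes, not messages; without the length prefix the receiver could not tell where "hello" ends and "world" begins.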
Remote Invocation
• Remote invocation represents the most common communication paradigm in
distributed systems, covering a range of techniques based on a two-way exchange
between communicating entities in a distributed system and resulting in the calling of
a remote operation, procedure or method, as defined further below
• Request-reply protocols: Request-reply protocols are effectively a pattern imposed
on an underlying message-passing service to support client-server computing.
• Remote procedure calls: The concept of a remote procedure call (RPC) represents
a major intellectual breakthrough in distributed computing.
• Remote method invocation: Remote method invocation (RMI) strongly resembles
remote procedure calls but in a world of distributed objects. With this approach, a
calling object can invoke a method in a remote object. As with RPC, the underlying
details are generally hidden from the user.
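The relationship between request-reply messages and a remote procedure call can be sketched as follows. This is a toy under stated assumptions, not a real RPC system: JSON is used for marshalling, the "network" is a plain function call (server_handle is passed in as the transport), and the names ClientStub, server_handle, PROCEDURES and RpcError are hypothetical.

```python
import json


def add(a, b):
    """An ordinary procedure that the server exposes remotely."""
    return a + b


PROCEDURES = {"add": add}  # server-side dispatch table


def server_handle(request_bytes: bytes) -> bytes:
    """Server side: unmarshal the request, call the procedure, marshal a reply."""
    request = json.loads(request_bytes)
    try:
        result = PROCEDURES[request["method"]](*request["args"])
        reply = {"ok": True, "result": result}
    except Exception as exc:  # report any failure back to the caller
        reply = {"ok": False, "error": repr(exc)}
    return json.dumps(reply).encode()


class RpcError(Exception):
    """Raised on the client when the server reports a failure."""


class ClientStub:
    """Client side: makes a remote call look like a local method call."""

    def __init__(self, transport):
        # transport: any callable taking request bytes and returning reply bytes
        self._transport = transport

    def call(self, method, *args):
        request = json.dumps({"method": method, "args": list(args)}).encode()
        reply = json.loads(self._transport(request))
        if not reply["ok"]:
            raise RpcError(reply["error"])
        return reply["result"]


# In-process transport; a real system would send the bytes over a network.
stub = ClientStub(transport=server_handle)
total = stub.call("add", 2, 3)
print(total)
```

The caller sees only stub.call("add", 2, 3); the request message, the reply message and the marshalling are hidden, which is the point the slide makes about RPC and RMI.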
Indirect Communication
In contrast to the paradigms above, which rely on a direct
relationship between a sender and a receiver, a number of
techniques have emerged whereby communication is indirect,
through a third entity, allowing a strong degree of decoupling
between senders and receivers. In particular:
• Senders do not need to know who they are sending to
(space uncoupling).
• Senders and receivers do not need to exist at the same
time (time uncoupling).
Indirect Communication
Key techniques for indirect communication include:
• Group communication: Group communication is concerned with
the delivery of messages to a set of recipients and hence is a
multiparty communication paradigm supporting one-to-many
communication. Group communication relies on the abstraction of
a group which is represented in the system by a group identifier.
• Publish-subscribe systems: Many systems can be classified as
information-dissemination systems wherein a large number of
producers (or publishers) distribute information items of interest
(events) to a similarly large number of consumers (or subscribers).
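A minimal in-process sketch of publish-subscribe is shown below. The Broker class and its subscribe and publish methods are hypothetical names chosen for illustration, and topics are assumed to be plain strings. The publisher names only a topic, never a subscriber, which illustrates space uncoupling; this sketch delivers events immediately and does not store them, so it does not show time uncoupling.

```python
from collections import defaultdict
from typing import Any, Callable


class Broker:
    """Hypothetical in-process broker sitting between publishers and subscribers."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[Any], None]) -> None:
        """Register interest in a topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, event: Any) -> int:
        """Deliver an event to every current subscriber; return how many were notified."""
        callbacks = list(self._subscribers.get(topic, []))
        for callback in callbacks:
            callback(event)
        return len(callbacks)


broker = Broker()
received_a: list = []
received_b: list = []
broker.subscribe("prices", received_a.append)
broker.subscribe("prices", received_b.append)

notified = broker.publish("prices", {"symbol": "XYZ", "price": 10})
unheard = broker.publish("weather", "sunny")  # no subscribers for this topic
print(notified, unheard)
```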
Indirect Communication …
cont’d
Key techniques for indirect communication include:
• Message queues: Whereas publish-subscribe systems offer a one-to-
many style of communication, message queues offer a point-to-point
service whereby producer processes can send messages to a
specified queue and consumer processes can receive messages from
the queue or be notified of the arrival of new messages in the queue.
• Tuple spaces: Tuple spaces offer a further indirect communication
service by supporting a model whereby processes can place arbitrary
items of structured data, called tuples, in a persistent tuple space
and other processes can either read or remove such tuples from the
tuple space by specifying patterns of interest.
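The read and remove operations on a tuple space can be sketched as below. This is a single-process, non-blocking toy: the class TupleSpace and its methods write, read and take are hypothetical names, and using None as a wildcard in a pattern is an assumption made here for illustration.

```python
class TupleSpace:
    """Toy tuple space: processes write tuples and match them by pattern."""

    def __init__(self) -> None:
        self._tuples: list[tuple] = []

    def write(self, tup) -> None:
        """Place a tuple in the space."""
        self._tuples.append(tuple(tup))

    @staticmethod
    def _matches(pattern, tup) -> bool:
        # None in the pattern matches any value in that position.
        return len(pattern) == len(tup) and all(
            p is None or p == t for p, t in zip(pattern, tup)
        )

    def read(self, pattern):
        """Return the first matching tuple without removing it, or None."""
        for tup in self._tuples:
            if self._matches(pattern, tup):
                return tup
        return None

    def take(self, pattern):
        """Return the first matching tuple and remove it, or None."""
        tup = self.read(pattern)
        if tup is not None:
            self._tuples.remove(tup)
        return tup


space = TupleSpace()
space.write(("job", 1, "resize image"))
space.write(("job", 2, "send email"))

peeked = space.read(("job", None, None))  # first job stays in the space
taken = space.take(("job", 2, None))      # job 2 is removed
gone = space.read(("job", 2, None))       # no longer present
print(peeked, taken, gone)
```

The writer and the reader never refer to each other, only to the shape of the data, which is what makes the communication indirect.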
Indirect Communication …
cont’d
Key techniques for indirect communication include:
• Distributed shared memory: Distributed shared memory (DSM)
systems provide an abstraction for sharing data between processes
that do not share physical memory.
• Programmers are nevertheless presented with a familiar abstraction
of reading or writing (shared) data structures as if they were in their
own local address spaces, thus presenting a high level of distribution
transparency.
• The underlying infrastructure must ensure a copy is provided in a
timely manner and also deal with issues relating to synchronization
and consistency of data.
Architectural patterns
• Architectural patterns build on the more primitive architectural
elements discussed above and provide composite recurring structures
that have been shown to work well in given circumstances. They are
not themselves necessarily complete solutions but rather offer partial
insights that, when combined with other patterns, lead the designer to
a solution for a given problem domain.
• In this section, we learn about layering and tiered architectures and
the related concept of thin clients (including the specific mechanism
of virtual network computing). We also examine web services as an
architectural pattern and give pointers to others that may be
applicable in distributed systems.
Layering
• The concept of layering is a familiar
one and is closely related to
abstraction.
• In a layered approach, a complex
system is partitioned into a number of
layers, with a given layer making use of
the services offered by the layer below.
• A given layer therefore offers a
software abstraction, with higher layers
being unaware of implementation
details, or indeed of any other layers
beneath them.
Tiered Architecture
• Tiered architectures are complementary to layering.
• Whereas layering deals with the vertical organization of
services into layers of abstraction, tiering is a technique
to organize functionality of a given layer and place this
functionality into appropriate servers and, as a secondary
consideration, on to physical nodes.
• This technique is most commonly associated with the
organization of applications
Concepts of Two and Three-Tiered
Architecture
• The presentation logic, which is concerned with handling
user interaction and updating the view of the application as
presented to the user;
• The application logic, which is concerned with the detailed
application-specific processing associated with the
application (also referred to as the business logic, although
the concept is not limited only to business applications);
• The data logic, which is concerned with the persistent
storage of the application, typically in a database
management system.
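The three kinds of logic can be separated in code even before they are placed on different machines. The sketch below is an illustrative assumption, not taken from the slides: the class names PresentationTier, ApplicationTier and DataTier are hypothetical, and a dictionary stands in for the database management system.

```python
class DataTier:
    """Data logic: persistent storage (a dict stands in for a database)."""

    def __init__(self) -> None:
        self._rows: dict[str, int] = {}

    def load(self, key: str) -> int:
        return self._rows.get(key, 0)

    def save(self, key: str, value: int) -> None:
        self._rows[key] = value


class ApplicationTier:
    """Application (business) logic: rules specific to the application."""

    def __init__(self, data: DataTier) -> None:
        self._data = data

    def deposit(self, account: str, amount: int) -> int:
        if amount <= 0:
            raise ValueError("amount must be positive")
        balance = self._data.load(account) + amount
        self._data.save(account, balance)
        return balance


class PresentationTier:
    """Presentation logic: parses user input and formats what the user sees."""

    def __init__(self, app: ApplicationTier) -> None:
        self._app = app

    def handle_deposit(self, account: str, amount_text: str) -> str:
        try:
            balance = self._app.deposit(account, int(amount_text))
        except ValueError:
            return "Please enter a positive whole number."
        return f"New balance for {account}: {balance}"


ui = PresentationTier(ApplicationTier(DataTier()))
ok_message = ui.handle_deposit("alice", "50")
bad_message = ui.handle_deposit("alice", "abc")
print(ok_message)
print(bad_message)
```

In a two-tier deployment these three parts would be split between a client and a server; in a three-tier deployment each could run on its own server, with the calls between classes replaced by remote invocations.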
Two-Tier Architecture
Three-Tier Architecture
Fundamental models
• In general, a fundamental model should contain only the essential
ingredients that we need to consider in order to understand and
reason about some aspects of a system’s behaviour.
• By abstracting only the essential system entities and characteristics
away from details such as hardware, we can clarify our understanding
of our systems.
• There is much to be gained by knowing what our designs do, and do
not, depend upon. It allows us to decide whether a design will work if
we try to implement it in a particular system: we need only ask
whether our assumptions hold in that system.
Fundamental models ..cont’d
The aspects of distributed systems that we wish to capture in our
fundamental models are intended to help us to discuss and reason
about:
• Interaction: Computation occurs within processes; the processes
interact by passing messages, resulting in communication (information
flow) and coordination (synchronization and ordering of activities)
between processes.
• Failure: The correct operation of a distributed system is threatened
whenever a fault occurs in any of the computers on which it runs
(including software faults) or in the network that connects them.
• Security: The modular nature of distributed systems and their
openness exposes them to attack by both external and internal agents.
Fundamental model –
Interaction Model
Let's discuss two significant factors affecting interacting
processes in a distributed system:
1. Communication performance is often a limiting
characteristic.
2. It is impossible to maintain a single global notion of time.
Performance of Communication Channels

Communication over a computer network has the following
performance characteristics relating to latency, bandwidth and jitter:
• Latency - The delay between the start of a message’s transmission
from one process and the beginning of its receipt by another
• The bandwidth of a computer network is the total amount of
information that can be transmitted over it in a given time.
• Jitter is the variation in the time taken to deliver a series of
messages. Jitter is relevant to multimedia data. For example, if
consecutive samples of audio data are played with differing time
intervals, the sound will be badly distorted.
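A common first approximation combines the first two characteristics: message transmission time is roughly latency plus message length divided by bandwidth. The sketch below computes that estimate and one simple measure of jitter. The function names are hypothetical, the numbers are made-up examples, and taking jitter as the population standard deviation of delivery times is one possible definition assumed here.

```python
from statistics import pstdev


def transmission_time(latency_s: float, length_bits: float, bandwidth_bps: float) -> float:
    """Approximate time to deliver one message: latency + length / bandwidth."""
    return latency_s + length_bits / bandwidth_bps


def jitter(delivery_times_s: list[float]) -> float:
    """Variation in delivery time, measured here as a population standard deviation."""
    return pstdev(delivery_times_s)


# Example values (assumed): 5 ms latency, a 1 MB message, a 100 Mbit/s link.
estimate = transmission_time(0.005, 8_000_000, 100_000_000)
print(f"estimated transmission time: {estimate:.3f} s")

# Steady delivery has no jitter; uneven delivery does.
steady = jitter([0.020, 0.020, 0.020])
uneven = jitter([0.010, 0.030])
print(steady, uneven)
```

For small messages the latency term dominates, while for large messages the length / bandwidth term does, which is why both figures matter when judging a channel.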
Computer Clocks and Timing
Events
• Each computer in a distributed system has its own internal clock,
which can be used by local processes to obtain the value of the
current time.
• Therefore two processes running on different computers can each
associate timestamps with their events. However, even if the two
processes read their clocks at the same time, their local clocks may
supply different time values.
• This is because computer clocks drift from perfect time and, more
importantly, their drift rates differ from one another. The term clock
drift rate refers to the rate at which a computer clock deviates from a
perfect reference clock.
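The effect of differing drift rates can be made concrete with a small calculation. The sketch assumes a constant drift rate per clock, which real clocks only approximate, and the rates used (plus and minus one microsecond per second) are illustrative values rather than figures from the slides. The function names are hypothetical.

```python
def clock_reading(real_elapsed_s: float, drift_rate: float) -> float:
    """Time shown by a clock that gains drift_rate seconds per real second."""
    return real_elapsed_s * (1 + drift_rate)


def skew_after(real_elapsed_s: float, drift_a: float, drift_b: float) -> float:
    """Difference between two clocks that started out synchronized."""
    return abs(
        clock_reading(real_elapsed_s, drift_a) - clock_reading(real_elapsed_s, drift_b)
    )


one_day_s = 24 * 60 * 60
# One clock runs fast by 1e-6 s/s, the other slow by the same amount (assumed).
skew = skew_after(one_day_s, 1e-6, -1e-6)
print(f"skew after one day: {skew:.4f} s")
```

Even with drift rates this small, the two clocks disagree by a noticeable fraction of a second after a day, so timestamps from different computers cannot be compared directly without some form of synchronization.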
Two Variants of the Interaction
Model
In a distributed system it is hard to set limits on the time that can
be taken for process execution, message delivery or clock drift.
• Two opposing extreme positions provide a pair of simple models –
the first has a strong assumption of time and the second makes
no assumptions about time:
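• Synchronous distributed systems: The time to execute each
step of a process has known lower and upper bounds, each
message is received within a known bounded time, and the
drift rate of each local clock has a known bound.
• Asynchronous distributed systems: There are no bounds on
process execution speeds, message transmission delays or
clock drift rates. For example, one message from process A to
process B may be delivered in negligible time and another
may take several years.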
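One practical consequence of the two models concerns timeouts. If message delay and processing time are bounded, as the synchronous model assumes, a missing reply after the worst-case round trip indicates a failure. The sketch below shows that reasoning; the function names and the bound values are hypothetical, and in an asynchronous system no such deadline exists, so a timeout can only raise a suspicion.

```python
def reply_deadline(max_delay_s: float, max_processing_s: float) -> float:
    """Worst-case request-reply time: request delay + processing + reply delay."""
    return 2 * max_delay_s + max_processing_s


def has_failed(elapsed_s: float, max_delay_s: float, max_processing_s: float) -> bool:
    """In a synchronous system, no reply by the deadline means something failed."""
    return elapsed_s > reply_deadline(max_delay_s, max_processing_s)


# Assumed bounds: 0.5 s per message, 0.25 s to process a request.
deadline = reply_deadline(0.5, 0.25)
still_waiting = has_failed(1.0, 0.5, 0.25)  # within the deadline
timed_out = has_failed(2.0, 0.5, 0.25)      # past the deadline
print(deadline, still_waiting, timed_out)
```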
Failure Model
• In a distributed system both processes and communication
channels may fail – that is, they may depart from what is
considered to be correct or desirable behaviour.
• The failure model defines the ways in which failure may occur
in order to provide an understanding of the effects of failures.
Failure Model - Omission
Failures
• The faults classified as omission failures refer to cases when a
process or communication channel fails to perform actions that it
is supposed to do.
• Process omission failures: The chief omission failure of a
process is to crash.
• Communication omission failures: The communication
channel produces an omission failure if it does not transport a
message from p’s outgoing message buffer to q’s incoming
message buffer.
Failure Model - Arbitrary
Failures
• The term arbitrary or Byzantine failure is used to describe the worst
possible failure semantics, in which any type of error may occur.
• For example, a process may set wrong values in its data items, or it
may return a wrong value in response to an invocation.
• An arbitrary failure of a process is one in which it arbitrarily omits
intended processing steps or takes unintended processing steps.
• Communication channels can suffer from arbitrary failures; for
example, message contents may be corrupted, nonexistent messages
may be delivered or real messages may be delivered more than once.
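Two of the channel failures listed above, corrupted contents and duplicate delivery, can be detected with a checksum and a sequence number. The sketch below is one simple way to do it under stated assumptions: the frame layout (4-byte sequence number, payload, 4-byte CRC-32) and the names make_frame and Receiver are hypothetical. A CRC guards against accidental corruption only; it offers no protection against deliberate tampering.

```python
import zlib


def make_frame(seq: int, payload: bytes) -> bytes:
    """Build a frame: 4-byte sequence number + payload + 4-byte CRC-32."""
    body = seq.to_bytes(4, "big") + payload
    return body + zlib.crc32(body).to_bytes(4, "big")


class Receiver:
    """Accepts frames, rejecting corrupted ones and duplicates."""

    def __init__(self) -> None:
        self._seen: set[int] = set()

    def accept(self, frame: bytes):
        """Return the payload, or None if the frame is corrupt or a duplicate."""
        if len(frame) < 8:
            return None  # too short to hold a sequence number and a checksum
        body, checksum = frame[:-4], frame[-4:]
        if zlib.crc32(body).to_bytes(4, "big") != checksum:
            return None  # contents were corrupted in transit
        seq = int.from_bytes(body[:4], "big")
        if seq in self._seen:
            return None  # same message delivered more than once
        self._seen.add(seq)
        return body[4:]


receiver = Receiver()
frame = make_frame(1, b"transfer 10")

delivered = receiver.accept(frame)
duplicate = receiver.accept(frame)  # the channel delivers it a second time

damaged = bytearray(make_frame(2, b"transfer 20"))
damaged[5] ^= 0xFF                  # flip the bits of one payload byte
corrupted = receiver.accept(bytes(damaged))
print(delivered, duplicate, corrupted)
```

Checks like these turn an arbitrary channel failure into an omission failure: the bad message is simply dropped, and a higher layer can recover by retransmitting.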
Security Model
• The security of a distributed system can be achieved by securing the
processes and the channels used for their interactions and by protecting the
objects that they encapsulate against unauthorized access.
Security Model
Distributed systems are often deployed and used in tasks that are
likely to be subject to external attacks by hostile users.
• Protecting objects
• Securing processes and their interactions
Defeating Security Threats
• Cryptography is the science of keeping messages secure, and
encryption is the process of scrambling a message in such a way as to
hide its contents.
• Authentication: The use of shared secrets and encryption provides
the basis for the authentication of messages
• Secure channels: Encryption and authentication are used to build
secure channels as a service layer on top of existing communication
services.
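Authentication of messages with a shared secret can be sketched using a keyed hash (HMAC) from the Python standard library. This shows message authentication only, not encryption: the message body stays readable. The secret value and the function names sign and verify are hypothetical, and a real system would also need a safe way to distribute the secret.

```python
import hashlib
import hmac

SHARED_SECRET = b"example-secret-known-to-both-parties"  # assumed, for illustration


def sign(secret: bytes, message: bytes) -> bytes:
    """Compute an authentication tag that only holders of the secret can produce."""
    return hmac.new(secret, message, hashlib.sha256).digest()


def verify(secret: bytes, message: bytes, tag: bytes) -> bool:
    """Check the tag using a constant-time comparison."""
    return hmac.compare_digest(sign(secret, message), tag)


message = b"pay 100 to alice"
tag = sign(SHARED_SECRET, message)

genuine = verify(SHARED_SECRET, message, tag)
tampered = verify(SHARED_SECRET, b"pay 900 to alice", tag)
wrong_key = verify(b"attacker-guess", message, tag)
print(genuine, tampered, wrong_key)
```

A secure channel would combine a check like this with encryption of the message contents, so that both the origin and the secrecy of each message are protected.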
