
Unit 1: Distributed System

A distributed system consists of autonomous computer systems connected via a network, allowing resource sharing and communication. Key characteristics include resource sharing, openness, concurrency, scalability, fault tolerance, transparency, and heterogeneity. Advantages include better performance and reliability; challenges include security and data consistency. Models of distributed systems include architectural, interaction, and fault models, with client-server and peer-to-peer networks as examples of architectures.

Uploaded by

Deepika
Copyright
© © All Rights Reserved

UNIT-1

DISTRIBUTED SYSTEM

A distributed system is a collection of autonomous computer systems that are physically
separated but connected by a computer network and equipped with distributed system
software. The autonomous computers communicate with one another, sharing resources
and files and performing the tasks assigned to them.

Example of Distributed System:

A social media service is one example: the provider's own servers form its central
computer network (its headquarters), and the computer systems through which users
access its services are the autonomous systems in the distributed system architecture.

Characteristics of Distributed System:

Resource Sharing: It is the ability to use any Hardware, Software, or Data anywhere in
the System.

Openness: It concerns how easily the system can be extended and improved (i.e., how
openly the software is developed and shared with others).

Concurrency: It is naturally present in distributed systems, because users in separate,
remote locations can perform the same activity or function at the same time. Every local
system has its own independent operating system and resources.

Scalability: It is the ability to increase the scale of the system, for example by adding
processors to serve more users, while maintaining or improving the responsiveness of
the system.

Fault tolerance: It concerns the reliability of the system: if there is a failure in
hardware or software, the system continues to operate properly without degrading its
performance.

Transparency: It hides the complexity of the distributed system from users and
application programs, so that the system appears to them as a single system.

Heterogeneity: Networks, computer hardware, operating systems, programming
languages, and developer implementations can all vary among the components of a
distributed system.

Advantages of Distributed System:

● Some applications are inherently distributed, so a distributed system suits them
naturally.
● Information is shared among geographically distributed users.
● Resource sharing: autonomous systems can share resources from remote locations.
● It has a better price-performance ratio and more flexibility.
● It has shorter response time and higher throughput.
● It has higher reliability and availability in the face of component failure.
● It is extensible: the system can be extended to more remote locations and can grow
incrementally.

Disadvantages of Distributed System:

● Software designed for distributed systems is less widely available and harder to
develop than software for a single machine.
● Security poses a problem, because resources shared among multiple systems make
data easier to access.
● Network saturation may hinder data transfer, i.e., if there is a lag in the network,
the user will have trouble accessing data.
● Compared with a single-user system, the database associated with a distributed
system is much more complex and challenging to manage.
● If every node in a distributed system tries to send data at once, the network may
become overloaded.

Application Areas of Distributed System:

● Finance and Commerce: Amazon, eBay, online banking, e-commerce websites.
● Information Society: search engines, Wikipedia, social networking, cloud computing.
● Cloud Technologies: AWS, Salesforce, Microsoft Azure, SAP.
● Entertainment: online gaming, music, YouTube.
● Healthcare: online patient records, health informatics.
● Education: e-learning.
● Transport and Logistics: GPS, Google Maps.
● Environment Management: sensor technologies.

Challenges of Distributed Systems:

While distributed systems offer many advantages, they also present some challenges that
must be addressed. These challenges include:

● Network latency: The communication network in a distributed system can
introduce latency, which can affect the performance of the system.
● Distributed coordination: The nodes must coordinate with one another, which is
challenging because they are separate machines that communicate only through the
network.
● Security: The distributed and heterogeneous nature of a distributed system makes it
more vulnerable to security threats than a centralized system, so security is a major
challenge for data processing systems.
● Data consistency: Maintaining data consistency across multiple nodes in a
distributed system can be challenging.
● Scalability: As distributed systems grow in size and complexity, it becomes
increasingly difficult to maintain their performance and availability. The major
difficulties are security, keeping data consistent in every system, network latency
between systems, resource allocation, and proper load balancing across multiple nodes.
● Transparency: Transparency refers to the level of abstraction the system provides to
hide complex information from the user. It is essential that failures are hidden from
users and do not affect the overall system's performance. Systems with different
hardware and software configurations prove to be a challenge for transparency.
Maintaining security is a further concern when providing transparency.

DISTRIBUTED SYSTEM MODELS ARE AS FOLLOWS:

1. Architectural Models
2. Interaction Models
3. Fault Models

1. Architectural Models
An architectural model describes how responsibilities are distributed among system
components and how these components are placed.
The Client-Server model is described as follows:

Client: A client is a computer (host) that is capable of receiving information or using a
particular service from service providers (servers).
Server: A server is a remote computer that provides information (data) or access to
particular services.
So, basically, the client requests something and the server serves it, as long as it is
present in the database.

Client Server Process

A client interacts with a server in the following steps:
● The user enters the URL (Uniform Resource Locator) of the website or file. The
browser then sends a request to the DNS (Domain Name System) server.
● The DNS server looks up the address of the web server.
● The DNS server responds with the IP address of the web server.
● The browser sends an HTTP/HTTPS request to the web server's IP address (provided
by the DNS server).
● The server sends back the necessary files of the website.
● The browser then renders the files and the website is displayed. This rendering is done
with the help of the HTML parser (which builds the DOM, the Document Object Model),
the CSS engine, and the JavaScript engine, which typically uses a JIT (Just-in-Time)
compiler.
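
The request steps above can be sketched in Python. This is only an illustration under stated assumptions: the "web server" is a small stand-in started on the local machine, the host name is `localhost`, and names such as `PageHandler`, `fetch_page`, and `run_demo` are made up for this example. The rendering step is not shown.

```python
import http.client
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class PageHandler(BaseHTTPRequestHandler):
    """Stand-in web server: returns one small HTML page for any GET."""

    def do_GET(self):
        body = b"<html><body>Hello from the server</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_page(hostname, port, path="/"):
    # Name resolution: turn the host name into an IP address.
    ip_address = socket.gethostbyname(hostname)
    # Send an HTTP request to that IP address.
    connection = http.client.HTTPConnection(ip_address, port, timeout=5)
    connection.request("GET", path, headers={"Host": hostname})
    # The server sends back the requested file.
    response = connection.getresponse()
    body = response.read()
    connection.close()
    return ip_address, response.status, body


def run_demo():
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch_page("localhost", server.server_port)
    finally:
        server.shutdown()
        server.server_close()
```

Calling `run_demo()` returns the resolved address, the HTTP status code, and the page bytes. A real browser would do the same against a public DNS server and a remote web server, and would then parse and render the result.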

Advantages of Client-Server model:

● Centralized system with all data in a single place.
● Cost-efficient: it requires less maintenance, and data recovery is possible.
● The capacity of the clients and servers can be changed separately.

Disadvantages of Client-Server model:
● Clients are prone to viruses, Trojans, and worms if these are present on the server or
uploaded to it.
● Servers are prone to Denial of Service (DoS) attacks.
● Data packets may be spoofed or modified during transmission.
● Phishing (capturing login credentials or other useful information of the user) and
MITM (Man-in-the-Middle) attacks are common.

Peer-to-Peer Process
A peer-to-peer (P2P) network is a type of network in which each participant (or "peer")
can act as both a client and a server, allowing them to share resources and information
directly with one another without the need for a central server. P2P networks are
decentralized, meaning that there is no central authority or organization that controls
the network or its resources.
In a P2P network, each peer has equal status and can connect to any other peer on the
network. Peers can share a variety of resources, including files, data, and computing
power, with one another. P2P networks are often used for file sharing, as they allow
users to download files directly from other users rather than from a central server.

Types of P2P networks

1. Unstructured P2P networks: In this type of P2P network, each device can make an
equal contribution. The network is easy to build because devices can be connected
randomly, but its lack of structure makes content difficult to find. An example is
Gnutella.
2. Structured P2P networks: These are designed using software that creates a virtual
layer to place the nodes in a specific structure. They are not easy to set up but give
users easy access to content. Examples are P-Grid and Kademlia.
3. Hybrid P2P networks: These combine features of both P2P networks and
client-server architecture, for example by using a central server to find a node. Napster,
which relied on a central server to index files, worked this way.
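
A hybrid P2P network can be sketched as follows. This is a minimal in-memory illustration, not the protocol of any real network: `CentralIndex` and `Peer` are hypothetical names, and "transferring" a file is just a method call between two objects. The central index only answers the question "who has this file?", while the file itself moves directly from peer to peer.

```python
class CentralIndex:
    """Hypothetical central server: it only maps file names to peers."""

    def __init__(self):
        self.holders = {}

    def register(self, filename, peer):
        peers = self.holders.setdefault(filename, [])
        if peer not in peers:
            peers.append(peer)

    def find(self, filename):
        return list(self.holders.get(filename, []))


class Peer:
    """A peer stores files locally and can serve them to other peers."""

    def __init__(self, name):
        self.name = name
        self.files = {}

    def share(self, filename, content, index):
        self.files[filename] = content
        index.register(filename, self)  # tell the central index what we hold

    def serve(self, filename):
        # Server role: hand the file to whichever peer asks for it.
        return self.files[filename]

    def download(self, filename, index):
        # Client role: ask the index who has the file, then fetch it directly.
        holders = index.find(filename)
        if not holders:
            return None
        content = holders[0].serve(filename)
        self.share(filename, content, index)  # this peer can now serve it too
        return content
```

For example, after `alice.share("notes.txt", "unit 1", index)`, a call to `bob.download("notes.txt", index)` returns the content from `alice`, and the index then lists both peers as holders. This also shows why serving capacity can grow as more peers download a file.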

Features of P2P network

Why are these Peer-To-Peer Networks so Useful?


Peer-to-peer (P2P) networks are useful for a variety of reasons. Some of the main
benefits of P2P networks include −
 Decentralization − P2P networks are decentralized, meaning that there is no
central server or authority that controls the network. This makes P2P networks
more resilient and less vulnerable to downtime or attack, as there is no single point
of failure.
 Efficiency − In a P2P network, resources can be shared directly between peers
rather than being passed through a central server. This can make P2P networks
more efficient, as it reduces the need for intermediate servers and the associated
overhead.
 Resource sharing − P2P networks allow users to share a variety of resources with
one another, including files, data, and computing power. This can be especially
useful for file sharing, as it allows users to download files directly from other users
rather than from a central server.
 Cost savings − P2P networks can help to reduce the costs associated with hosting
and maintaining central servers, as resources are shared directly between peers
rather than being stored on a central server. This can be especially beneficial for
large organizations or businesses that need to share large amounts of data or
resources.
 Scalability − P2P networks can scale more easily than traditional client-server
networks, as there is no central server that needs to handle all of the traffic. This
can make P2P networks more suitable for large-scale applications or for use in
environments with fluctuating demand.

P2P Network Architecture

In the P2P network architecture, the computers connect with each other in a workgroup
to share files and access to the internet and printers.

● Each computer in the network has the same set of responsibilities and capabilities.
● Each device in the network serves as both a client and a server.
● The architecture is useful in residences, small offices, or small companies where
each computer acts as an independent workstation and stores data on its own hard
drive.
● Each computer in the network can share data with the other computers in
the network.
● The architecture is usually composed of small workgroups of about a dozen computers
or fewer.

How Does P2P Network Work?

One of the most well-known peer-to-peer networks is BitTorrent. All computers in this
kind of network are linked to the internet, allowing users to download resources shared
by any other computer.

Applications of P2P Network

Below are some of the common uses of P2P networks:

● File sharing: A P2P network is a convenient, cost-efficient method of file sharing for
businesses. With this type of network there is no need for intermediate servers to
transfer the file.
● Blockchain: The P2P architecture is based on the concept of decentralization. When a
blockchain runs on a peer-to-peer network, every peer helps maintain a complete replica
of the records, which helps ensure both the accuracy and the security of the data.
● Direct messaging: A P2P network provides a secure, quick, and efficient way to
communicate. This is possible due to the use of encryption at both peers and
access to easy messaging tools.
● Collaboration: Easy file sharing also helps peers in the network collaborate.
● File sharing networks: P2P file sharing networks such as G2 and eDonkey have
popularized peer-to-peer technologies.
● Content distribution: In a P2P network, unlike in a client-server system, the clients
can both provide and use resources. Thus, the content-serving capacity of a P2P
network can actually increase as more users begin to access the content.
● IP Telephony: Skype, in its original design, is one well-known example of a P2P
application in VoIP.

Advantages of P2P Network

● Easy to maintain: The network is easy to maintain because each node is independent
of the others.
● Less costly: Since each node acts as a server, the cost of a central server is saved.
Thus, there is no need to buy an expensive server.
● No network manager: Since each user manages their own computer, there is no need
for a network manager.
● Adding nodes is easy: Adding, deleting, and repairing nodes in this network is easy.
● Less network traffic: In a P2P network, there is less network traffic than in a
client/server network.

Disadvantages of P2P Network

● Data is vulnerable: Because there is no central server and no central backup, data is
always vulnerable to being lost.
● Less secure: It becomes difficult to secure the complete network because each node
is independent.
● Slow performance: In a P2P network, each computer is accessed by other computers
in the network, which slows down performance for its user.
● Files hard to locate: In a P2P network, files are not stored centrally; they are
stored on individual computers, which makes them difficult to locate.

Examples of P2P networks:

● BitTorrent: a popular P2P file-sharing protocol. All computers in this kind of network
are linked to the internet, allowing users to download resources shared by any other
computer. It is often associated with piracy.
● Local area network (LAN): a LAN used for resource sharing, typically in small
workplaces, is another frequently used example of a peer-to-peer network.
● Skype: it used to use a proprietary hybrid P2P protocol, but has used a client-server
model since Microsoft's acquisition.
● Bitcoin: a P2P cryptocurrency without a central monetary authority.

2. Interaction Model
Interaction models deal with time, i.e., with process execution, message delivery, clock
drift, etc.

a)Synchronous distributed systems

Main features:

● Lower and upper bounds on execution time of processes can be set.


● Transmitted messages are received within a known bounded time.
● Drift rates between local clocks have a known bound.

Important consequences:
1. In a synchronous distributed system there is a notion of global physical time (with
a known relative precision depending on the drift rate).
2. Only synchronous distributed systems have a predictable behavior in terms of
timing. Only such systems can be used for hard real-time applications.
3. In a synchronous distributed system it is possible and safe to use timeouts in order
to detect failures of a process or communication link.
4. It is difficult and costly to implement synchronous distributed systems.
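
The third consequence can be made concrete with a small calculation. The sketch below assumes a heartbeat scheme that the text does not describe: a process sends a heartbeat every `period` seconds by its own clock, each message takes at most `max_delay` seconds, and each clock runs at a rate within `max_drift_rate` of real time. All function and parameter names are illustrative.

```python
def failure_deadline(last_heartbeat, period, max_delay, max_drift_rate):
    """Latest local time by which the next heartbeat must have arrived.

    Worst case: the sender's clock runs slow, the next message takes the
    maximum delay, and the monitor's clock runs fast.
    """
    longest_real_gap = period / (1 - max_drift_rate) + max_delay
    return last_heartbeat + longest_real_gap * (1 + max_drift_rate)


def has_failed(now, last_heartbeat, period, max_delay, max_drift_rate):
    # With known bounds, silence past the deadline cannot be mere slowness.
    deadline = failure_deadline(last_heartbeat, period, max_delay, max_drift_rate)
    return now > deadline
```

With a heartbeat every second, a delay bound of 0.2 seconds, and no drift, a monitor that last heard from a process at local time 10.0 can safely declare it failed once its clock passes 11.2. The conclusion is only as good as the bounds, which is why this reasoning does not carry over to asynchronous systems.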

b) Asynchronous distributed systems

Many distributed systems (including those on the Internet) are asynchronous.

Main features:

● No bound on process execution time (nothing can be assumed about the speed, load,
and reliability of computers).
● No bound on message transmission delays (nothing can be assumed about the speed,
load, and reliability of interconnections).
● No bounds on drift rates between local clocks.

Important consequences:

1. In an asynchronous distributed system there is no global physical time. Reasoning
can be only in terms of logical time.
2. Asynchronous distributed systems are unpredictable in terms of timing.
3. Timeouts cannot reliably detect failures, because a slow process or message cannot be
distinguished from a failed one.
4. Asynchronous systems are nevertheless widely and successfully used in practice.
5. In practice, timeouts are still used with asynchronous systems for failure detection,
even though an expired timeout only suggests a failure.
6. Additional measures then have to be applied in order to avoid duplicated
messages, duplicated execution of operations, etc.
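
One common way to reason in logical time is Lamport's logical clock. The text does not name a specific technique, so the sketch below is offered only as an illustration of the idea: each process keeps a counter, increments it on every event, and attaches it to every message it sends.

```python
class LamportClock:
    """Logical clock: orders events without any shared physical time."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # A local event advances the logical clock by one.
        self.time += 1
        return self.time

    def send(self):
        # Sending is an event; the returned timestamp travels with the message.
        return self.tick()

    def receive(self, message_time):
        # Move past the sender's timestamp so the receive is ordered after the send.
        self.time = max(self.time, message_time) + 1
        return self.time
```

If process A performs one local event and then sends a message, the message carries timestamp 2. When process B, whose clock is still at 0, receives it, B's clock jumps to 3. The receive is therefore ordered after the send even though the two machines share no physical clock.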

3. Fault Models
● Failures can occur both in processes and in communication channels. The cause can
be either a software or a hardware fault.
● Fault models are needed in order to build systems with predictable behavior in
case of faults (systems which are fault tolerant).
● Such a system will function according to the predictions only as long as the real
faults behave as defined by the “fault model”.

Types of Fault
● Omission faults
● Arbitrary faults
● Timing faults

Omission Faults:
A processor or communication channel fails to perform actions it is supposed to
perform: the particular action is simply not carried out by the faulty component.

- If a component is faulty, it does not produce any output.
- If a component produces an output, this output is correct.

Arbitrary Faults: A faulty component may behave in any way at all, for example by
producing wrong output or by giving different responses to different parts of the
system that communicate with it. These are also known as Byzantine faults.

Timing Faults: These can occur in synchronous distributed systems, where time limits
are set on process execution, communication, and clock drift. A timing fault results in
any of these time limits being exceeded.

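
The three fault types can be contrasted with a toy simulation of a message channel. Everything here is an assumption made for illustration: the function names, the use of a random number generator to decide when a fault strikes, and the choice of "reversing the message" as one possible arbitrary behavior.

```python
import random


def send_with_omission(messages, drop_probability, rng):
    """Omission fault: each message is delivered intact or not at all."""
    return [m for m in messages if rng.random() >= drop_probability]


def send_with_arbitrary_fault(messages, rng):
    """Arbitrary fault: anything may happen; here a message may be altered."""
    return [m[::-1] if rng.random() < 0.5 else m for m in messages]


def is_timing_fault(sent_at, received_at, max_delay):
    """Timing fault: delivery took longer than the agreed time limit."""
    return received_at - sent_at > max_delay
```

Under omission faults every delivered message is still correct, so a receiver only has to cope with missing messages. Under arbitrary faults a delivered message may itself be wrong, which is why they are harder to tolerate.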
Comparison of Client-Server Network and Peer-to-Peer Network

Basic
  Client-Server: There is a specific server, and specific clients are connected to it.
  Peer-to-Peer: Clients and servers are not distinguished; every node acts as both a
  client and a server.

Expense
  Client-Server: More expensive to implement.
  Peer-to-Peer: Less expensive to implement.

Stability
  Client-Server: More stable and scalable than a peer-to-peer network.
  Peer-to-Peer: Less stable and scalable as the number of peers in the system increases.

Data
  Client-Server: The data is stored on a centralized server.
  Peer-to-Peer: Each peer has its own data.

Server
  Client-Server: A server may get overloaded when many clients make simultaneous
  service requests.
  Peer-to-Peer: No single server becomes a bottleneck, since the services are dispersed
  among numerous servers (the peers).

Focus
  Client-Server: Sharing information.
  Peer-to-Peer: Connectivity.

Service
  Client-Server: The server provides the requested service in response to the client's
  request.
  Peer-to-Peer: Each node can both request and deliver services.

Performance
  Client-Server: Because the server does the bulk of the work, performance is unaffected
  by the growth of clients.
  Peer-to-Peer: Because resources are shared in a big peer-to-peer network, performance
  is likely to suffer.

Security
  Client-Server: The network is secure because the server can verify a client's access to
  any area of the network.
  Peer-to-Peer: The network's security deteriorates, and its susceptibility grows, as the
  number of peers rises.

Remote Method Invocation (RMI)

RMI stands for Remote Method Invocation. It is a mechanism that allows an object
residing in one system (JVM) to invoke methods of an object running in another JVM.
RMI is used to build distributed applications; it provides remote communication
between Java programs. It is provided in the package java.rmi.

a)Architecture of an RMI Application

In an RMI application, we write two programs, a server program (resides on the server)
and a client program (resides on the client).
● Inside the server program, a remote object is created and reference of that object
is made available for the client (using the registry).
● The client program requests the remote objects on the server and tries to invoke
its methods.
The following diagram shows the architecture of an RMI application.

Let us now discuss the components of this architecture.


● Transport Layer − This layer connects the client and the server. It manages the
existing connection and also sets up new connections.
● Stub − A stub is a representation (proxy) of the remote object at the client. It resides
in the client system and acts as a gateway for the client program.
● Skeleton − This is the object which resides on the server side. The stub communicates
with this skeleton to pass requests to the remote object.
● RRL (Remote Reference Layer) − It is the layer which manages the references
made by the client to the remote object.

b)Working of an RMI Application

The following points summarize how an RMI application works −


● When the client makes a call to the remote object, it is received by the stub which
eventually passes this request to the RRL.
● When the client-side RRL receives the request, it invokes a method
called invoke() of the object remoteRef. It passes the request to the RRL on the
server side.
● The RRL on the server side passes the request to the Skeleton (proxy on the
server) which finally invokes the required object on the server.
● The result is passed all the way back to the client.
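
The call path described above can be imitated in a few lines of Python. This is a language-neutral sketch of the stub and skeleton idea, not the java.rmi API: the class names are invented, JSON stands in for the wire format, and a plain function call (`transport`) stands in for both remote reference layers and the network.

```python
import json


class Calculator:
    """The remote object: ordinary code that lives on the server."""

    def add(self, a, b):
        return a + b


class Skeleton:
    """Server-side proxy: unpacks a request and calls the real object."""

    def __init__(self, target):
        self.target = target

    def handle(self, request_bytes):
        request = json.loads(request_bytes)
        method = getattr(self.target, request["method"])
        result = method(*request["args"])
        return json.dumps({"result": result}).encode()


class Stub:
    """Client-side proxy: looks like the remote object to the caller."""

    def __init__(self, transport):
        # transport: callable taking request bytes and returning reply bytes.
        self.transport = transport

    def __getattr__(self, method_name):
        def remote_call(*args):
            request = json.dumps({"method": method_name, "args": list(args)})
            reply = self.transport(request.encode())
            return json.loads(reply)["result"]

        return remote_call
```

With `stub = Stub(Skeleton(Calculator()).handle)`, the call `stub.add(2, 3)` looks like a local method call to the client, but the request is packed, handed to the skeleton, executed on the real object, and the result is passed all the way back.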

c)Marshalling and Unmarshalling

Whenever a client invokes a method that accepts parameters on a remote object, the
parameters are bundled into a message before being sent over the network. These
parameters may be of primitive type or objects. In case of primitive type, the parameters
are put together and a header is attached to it. In case the parameters are objects, then
they are serialized. This process is known as marshalling.
At the server side, the packed parameters are unbundled and then the required method
is invoked. This process is known as unmarshalling.
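
Marshalling and unmarshalling can be sketched with Python's standard library. The layout below is an assumption chosen for illustration and is not the format RMI uses on the wire: two integers are packed behind a small header that counts them, and `pickle` stands in for Java object serialization.

```python
import pickle
import struct


def marshal_primitives(a, b):
    # Primitive parameters: a 2-byte header holding the number of values,
    # followed by two signed 32-bit integers, all in network byte order.
    return struct.pack("!Hii", 2, a, b)


def unmarshal_primitives(message):
    count, a, b = struct.unpack("!Hii", message)
    return a, b


def marshal_object(obj):
    # Object parameters are serialized into a byte stream.
    return pickle.dumps(obj)


def unmarshal_object(message):
    # Only unpickle data from a trusted source.
    return pickle.loads(message)
```

A round trip such as `unmarshal_primitives(marshal_primitives(7, -3))` gives back the original pair, and the marshalled message is 10 bytes long: 2 for the header and 4 for each integer.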

d)RMI Registry

The RMI registry is a namespace in which all server objects are placed. Each time the
server creates an object, it registers this object with the RMI registry
(using the bind() or rebind() method). Objects are registered under a unique name known
as the bind name.
To invoke a remote object, the client needs a reference to that object. The client
fetches the object from the registry using its bind name (using the lookup() method).
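
The behavior of the three registry operations can be modelled with a toy name service. This is a Python imitation for illustration only, not the Java registry itself. `KeyError` stands in for the exceptions Java raises when a name is already bound or not bound.

```python
class Registry:
    """Toy name service modelled loosely on the RMI registry."""

    def __init__(self):
        self._objects = {}

    def bind(self, name, obj):
        # bind refuses to overwrite an existing name.
        if name in self._objects:
            raise KeyError(f"{name!r} is already bound")
        self._objects[name] = obj

    def rebind(self, name, obj):
        # rebind replaces any existing binding for the name.
        self._objects[name] = obj

    def lookup(self, name):
        if name not in self._objects:
            raise KeyError(f"{name!r} is not bound")
        return self._objects[name]
```

A server would call `registry.bind("calc", calculator)` once at start-up, and a client would later call `registry.lookup("calc")` to obtain the reference. The practical difference between the two registration methods is that `rebind` is safe to call again after a server restart, while `bind` is not.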
The following illustration explains the entire process −
e)Goals of RMI

Following are the goals of RMI −

● To minimize the complexity of the application.


● To preserve type safety.
● Distributed garbage collection.
● Minimize the difference between working with local and remote objects.

REMOTE PROCEDURE CALLS (RPCS) IN DISTRIBUTED SYSTEMS:

Communication Protocols for Remote Procedure Calls:


The following are the communication protocols that are used:

 Request Protocol
 Request/Reply Protocol
 The Request/Reply/Acknowledgement-Reply Protocol

Request Protocol:

● The Request Protocol is also known as the R protocol.
● It is used in Remote Procedure Call (RPC) when the called procedure has nothing to
return to the calling procedure and no confirmation of its execution is required.
● Because there is no acknowledgement or reply message, only one message is sent
from client to server.
● Since no reply is required, the client can proceed with its next request immediately
after sending the request message.
● This protocol provides may-be call semantics and does not retransmit request packets.
● Asynchronous RPC employs the R protocol to improve the combined performance of the
client and server: the client need not wait for a reply from the server, and the server
need not send one.
● In asynchronous RPC, the RPC runtime does not retry the request if communication
fails. A connection-oriented transport such as TCP is therefore a better option than
UDP, because TCP takes care of retransmission itself.
● Asynchronous RPC over an unreliable transport protocol is mostly used to implement
periodic update services. One of its applications is in distributed window systems.

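
The R protocol amounts to "send one message and move on". The sketch below uses a UDP datagram as the single request message. The function name and the idea of sending a load report are assumptions made for illustration.

```python
import socket


def send_request_only(payload, server_address):
    """R protocol: send one datagram and return without waiting for a reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, server_address)
    # No reply, no acknowledgement, no retry: if the datagram is lost,
    # the call simply never happens (may-be call semantics).
```

A periodic update service could call `send_request_only(b"load=0.42", address)` once per interval. A lost update does no lasting harm, because the next one replaces it.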
Request/Reply Protocol:

● The Request/Reply Protocol is also known as the RR protocol.
● It works well for systems that involve simple RPCs.
● In simple RPCs, the parameters and result values fit in a single packet buffer, and
both the duration of a call and the time between calls are brief.
● The protocol is based on implicit acknowledgements instead of explicit
acknowledgements.
● Here, a reply from the server is treated as the acknowledgement (ACK) of the
client's request message, and the client's following call is treated as an
acknowledgement (ACK) of the server's reply to the previous call made by
the client.
● To handle failures such as lost messages, the RR protocol is used with the
timeout-and-retransmission technique.
● If a client does not get a response message within the predetermined timeout period,
it retransmits the request message.
● Servers can provide exactly-once semantics by holding responses in a reply cache,
which is used to filter out duplicate request messages: the cached reply is
retransmitted without processing the request again.
● If there is no mechanism for filtering duplicate messages, the RR protocol combined
with timeout-based retransmission provides at-least-once call semantics.
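
The interplay of retransmission and the reply cache can be shown with a small simulation. It is a sketch under simplifying assumptions: the network is replaced by a hypothetical `LossyChannel` object that loses a fixed number of replies, a lost reply is modelled as the value `None` instead of a real timeout, and all names are invented for this example.

```python
class Server:
    def __init__(self):
        self.reply_cache = {}  # request id -> reply already computed
        self.executions = 0    # how many times the operation really ran

    def handle(self, request_id, amount):
        if request_id in self.reply_cache:
            # Duplicate request: resend the stored reply, do not re-execute.
            return self.reply_cache[request_id]
        self.executions += 1
        reply = f"deposited {amount}"
        self.reply_cache[request_id] = reply
        return reply


class LossyChannel:
    """Hypothetical channel that loses the first `losses` replies."""

    def __init__(self, server, losses):
        self.server = server
        self.losses = losses

    def call(self, request_id, amount):
        reply = self.server.handle(request_id, amount)
        if self.losses > 0:
            self.losses -= 1
            return None  # reply lost; the client sees a timeout
        return reply


def call_with_retry(channel, request_id, amount, max_attempts=5):
    for _ in range(max_attempts):
        reply = channel.call(request_id, amount)
        if reply is not None:
            return reply
        # Timeout expired without a reply: retransmit with the same request id.
    raise TimeoutError("no reply after retransmissions")
```

If the channel loses the first two replies, the client sends the request three times, yet the server executes the deposit only once, because the second and third copies are answered from the reply cache. Without the cache, the deposit would run three times, which is the at-least-once behavior described in the last point.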

The Request/Reply/Acknowledgement-Reply Protocol:

● This protocol is also known as the RRA protocol (request/reply/acknowledge-reply).
● In the RR protocol, exactly-once semantics depends on responses being held in the
servers' reply cache. Since the server cannot tell when a reply has been delivered, it
cannot safely discard cached replies, and discarding them anyway can lose replies that
have not been delivered.
● The RRA (Request/Reply/Acknowledgement-Reply) Protocol is used to get rid of
this drawback of the RR (Request/Reply) Protocol.
● In this protocol, the client acknowledges the receipt of reply messages, and the server
deletes the information from its cache only when it gets the acknowledgement back
from the client.
● Because the reply acknowledgement message may be lost at times, the RRA protocol
requires unique, ordered message identifiers. These let the server keep track of the
series of acknowledgements that have been sent.
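
The effect of acknowledgements on the reply cache can be sketched as follows. The class and method names are invented, and the server's "work" is just converting text to upper case. The sketch also assumes one common convention: an acknowledgement for request n covers all earlier requests from the same client, which is what makes ordered identifiers useful when an acknowledgement is lost.

```python
class RRAServer:
    def __init__(self):
        self.reply_cache = {}  # request id -> reply kept until acknowledged

    def handle(self, request_id, payload):
        if request_id not in self.reply_cache:
            self.reply_cache[request_id] = payload.upper()
        return self.reply_cache[request_id]

    def acknowledge(self, request_id):
        # Ordered ids: an ack for request n also covers every earlier request,
        # so a lost acknowledgement is repaired by the next one.
        for cached_id in [i for i in self.reply_cache if i <= request_id]:
            del self.reply_cache[cached_id]
```

After handling requests 1, 2, and 3, a single `acknowledge(2)` removes the replies for requests 1 and 2 even if the acknowledgement for request 1 never arrived, and only the reply for request 3 stays in the cache.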
