CC Notes
Module-1
Distributed System Models and Enabling Technologies:
Scalable Computing Over the Internet, Technologies for Network Based Systems, System Models
for Distributed and Cloud Computing, Software Environments for Distributed Systems and
Clouds, Performance, Security and Energy Efficiency.
Textbook 1: Chapter 1: 1.1 to 1.5
• Grids enable access to shared computing power and storage capacity from your desktop.
• Clouds enable access to leased computing power and storage capacity from your desktop.
• Grids are an open source technology. Resource users and providers alike can understand and
contribute to the management of their grid
• Clouds are a proprietary technology. Only the resource provider knows exactly how their cloud
manages data, job queues, security requirements and so on.
• The concept of grids was proposed in 1995. The Open Science Grid (OSG) started in 2005, and the
EDG (European Data Grid) project began in 2001.
• In the late 1990s, Oracle and EMC offered early private cloud solutions, but the term cloud
computing did not gain prominence until 2007.
• The raw speed of high-performance computing (HPC) applications is no longer the optimal metric
for measuring system performance.
• The emergence of computing clouds instead demands high-throughput computing (HTC)
systems built with parallel and distributed computing technologies
• We have to upgrade data centers using fast servers, storage systems, and high-bandwidth
networks.
1.1 SCALABLE COMPUTING OVER THE INTERNET
Instead of using a centralized computer to solve computational problems, a parallel and distributed
computing system uses multiple computers to solve large-scale problems over the Internet. Thus,
distributed computing becomes data-intensive and network-centric.
The Age of Internet Computing
The Platform Evolution
o From 1950 to 1970, a handful of mainframes, including the IBM 360 and CDC 6400, were built to
satisfy the demands of large businesses and government organizations.
o From 1960 to 1980, lower-cost minicomputers such as the DEC PDP 11 and VAX Series became
popular among small businesses and on college campuses.
o From 1970 to 1990, we saw widespread use of personal computers built with VLSI microprocessors.
o From 1980 to 2000, massive numbers of portable computers and pervasive devices appeared in both
wired and wireless applications
o Since 1990, the use of both HPC and HTC systems hidden in clusters, grids, or Internet clouds has
proliferated
The transition from HPC to HTC marks a strategic shift in computing paradigms, focusing on
scalability, efficiency, and real-world usability over pure processing power.
Computing Paradigm Distinctions
• Centralized computing
A computing paradigm where all computer resources are centralized in a single physical
system. In this setup, processors, memory, and storage are fully shared and tightly integrated
within one operating system. Many data centers and supercomputers operate as centralized
systems, but they are also utilized in parallel, distributed, and cloud computing applications.
• Parallel computing
In parallel computing, processors are either tightly coupled with shared memory or loosely
coupled with distributed memory. Communication occurs through shared memory or message
passing. A system that performs parallel computing is a parallel computer, and the programs
running on it are called parallel programs. Writing these programs is referred to as parallel
programming.
• Distributed computing studies distributed systems, which consist of multiple autonomous
computers with private memory communicating through a network via message passing.
Programs running in such systems are called distributed programs, and writing them is known
as distributed programming.
• Cloud computing refers to a system of Internet-based resources that can be either centralized
or distributed. It uses parallel, distributed computing, or both, and can be established with
physical or virtualized resources over large data centers. Some regard cloud computing as a
form of utility computing or service computing. Alternatively, terms such as concurrent
computing or concurrent programming are used within the high-tech community, typically
referring to the combination of parallel and distributed computing, although interpretations
may vary among practitioners.
• Ubiquitous computing refers to computing with pervasive devices at any place and time
using wired or wireless communication. The Internet of Things (IoT) is a networked connection
of everyday objects including computers, sensors, humans, etc. The IoT is supported by Internet
clouds to achieve ubiquitous computing with any object at any place and time. Finally, the term
Internet computing is even broader and covers all computing paradigms over the Internet. This
book covers all the aforementioned computing paradigms, placing more emphasis on
distributed and cloud computing and their working systems, including the clusters, grids, P2P,
and cloud systems.
Internet of Things The traditional Internet connects machines to machines or web pages to web
pages. The concept of the IoT was introduced in 1999 at MIT.
• The IoT refers to the networked interconnection of everyday objects, tools, devices, or computers.
One can view the IoT as a wireless network of sensors that interconnect all things in our daily life.
• It allows objects to be sensed and controlled remotely across existing network infrastructure
HTC systems prioritize task throughput over raw speed, addressing challenges like cost, energy
efficiency, security, and reliability.
The Shift Toward Utility Computing
Utility computing follows a pay-per-use model where computing resources are delivered as a service.
Cloud computing extends this concept, allowing distributed applications to run on edge networks.
Challenges include:
• Efficient network processors
• Scalable storage and memory
• Virtualization middleware
• New programming models
The Hype Cycle of Emerging Technologies
New technologies follow a hype cycle, progressing through:
1. Technology Trigger – Early development and research.
2. Peak of Inflated Expectations – High expectations but unproven benefits.
3. Trough of Disillusionment – Realization of limitations.
4. Slope of Enlightenment – Gradual improvements.
5. Plateau of Productivity – Mainstream adoption.
For example, in 2010, cloud computing was moving toward mainstream adoption, while broadband
over power lines was expected to become obsolete.
The Internet of Things (IoT) and Cyber-Physical Systems (CPS)
• IoT: Interconnects everyday objects (sensors, RFID, GPS) to enable real-time tracking and
automation.
• CPS: Merges computation, communication, and control (3C) to create intelligent systems
for virtual and physical world interactions.
Both IoT and CPS will play a significant role in future cloud computing and smart infrastructure
development.
1.2 Technologies for Network-Based Systems
Advancements in multicore CPUs and multithreading technologies have played a crucial role in
the development of high-performance computing (HPC) and high-throughput computing (HTC).
Advances in CPU Processors
• Modern multicore processors integrate dual, quad, six, or more processing cores to
enhance parallelism at the instruction level (ILP) and task level (TLP).
• Processor speed growth has followed Moore’s Law, increasing from 1 MIPS (VAX 780,
1978) to 22,000 MIPS (Sun Niagara 2, 2008) and 159,000 MIPS (Intel Core i7 990x, 2011).
• Clock rates have increased from 10 MHz (Intel 286) to 4 GHz (Pentium 4) but have
stabilized due to heat and power limitations.
Multicore CPU and Many-Core GPU Architectures
• Multicore processors house multiple processing units, each with private L1 cache and
shared L2/L3 cache for efficient data access.
• Many-core GPUs (e.g., NVIDIA and AMD architectures) leverage hundreds to thousands
of cores, excelling in data-level parallelism (DLP) and graphics processing.
• Example: Sun Niagara II – Built with eight cores, each supporting eight threads, achieving
a maximum parallelism of 64 threads.
Key Trends in Processor and Network Technology
• Multicore chips continue to evolve with improved caching mechanisms and increased
processing cores per chip.
• Network speeds have improved from Ethernet (10 Mbps) to Gigabit Ethernet (1 Gbps)
and beyond 100 Gbps to support high-speed data communication.
Modern distributed computing systems rely on scalable multicore architectures and high-speed
networks to handle massive parallelism, optimize efficiency, and enhance overall performance.
Multicore CPU and Many-Core GPU Architectures
Advancements in multicore CPUs and many-core GPUs have significantly influenced modern
high-performance computing (HPC) and high-throughput computing (HTC) systems. As CPUs
approach their parallelism limits, GPUs have emerged as powerful alternatives for massive
parallelism and high computational efficiency.
Multicore CPU and Many-Core GPU Trends
• Multicore CPUs continue to evolve from tens to hundreds of cores, but they face challenges
like the memory wall problem, limiting data-level parallelism (DLP).
• Many-core GPUs, with hundreds to thousands of lightweight cores, excel in DLP and
task-level parallelism (TLP), making them ideal for massively parallel workloads.
• Hybrid architectures are emerging, combining fat CPU cores and thin GPU cores on a
single chip for optimal performance.
Multithreading Technologies in Modern CPUs
• Different microarchitectures exploit parallelism at instruction-level (ILP) and thread-
level (TLP):
o Superscalar Processors – Execute multiple instructions per cycle.
o Fine-Grained Multithreading – Switches between threads every cycle.
o Coarse-Grained Multithreading – Runs one thread for multiple cycles before
switching.
o Simultaneous Multithreading (SMT) – Executes multiple threads in the same cycle.
• Modern GPUs (e.g., NVIDIA CUDA, Tesla, and Fermi) feature hundreds of cores,
handling thousands of concurrent threads.
• Example: The NVIDIA Fermi GPU has 512 CUDA cores and delivers 82.4 teraflops,
contributing to the performance of top supercomputers like Tianhe-1A.
VM primitive operations supported by the virtual machine monitor include the following:
• First, the VMs can be multiplexed between hardware machines, as shown in Figure 1.13(a).
• Second, a VM can be suspended and stored in stable storage, as shown in Figure 1.13(b).
• Third, a suspended VM can be resumed or provisioned to a new hardware platform, as shown in Figure 1.13(c).
• Finally, a VM can be migrated from one hardware platform to another, as shown in Figure 1.13(d).
The integration of memory, storage, networking, virtualization, and cloud data centers is
transforming distributed systems. By leveraging virtualization, scalable networking, and cloud
computing, modern infrastructures achieve higher efficiency, flexibility, and cost-effectiveness,
paving the way for future exascale computing.
• Clusters are connected to the Internet via a VPN gateway, which assigns an IP address to
locate the cluster.
• Each node operates independently, with its own OS, creating multiple system images
(MSI).
• The cluster manages shared I/O devices and disk arrays, providing efficient resource
utilization.
1.3.1.2 Single-System Image (SSI)
An ideal cluster should merge multiple system images into a single-system image (SSI), where all
nodes appear as a single powerful machine.
• SSI is achieved through middleware or specialized OS support, enabling CPU, memory,
and I/O sharing across all cluster nodes.
• Clusters without SSI function as a collection of independent computers rather than a unified
system.
Critical cluster design issues and feasible implementations:
• Availability and Support — hardware and software support for sustained HA in the cluster (failover, failback, checkpointing, rollback recovery, nonstop OS, etc.).
• Hardware Fault Tolerance — automated failure management to eliminate all single points of failure (component redundancy, hot swapping, RAID, multiple power supplies, etc.).
• Single System Image (SSI) — achieving SSI at the functional level with hardware and software support, middleware, or OS extensions (hardware mechanisms or middleware support to achieve DSM at the coherent-cache level).
• Efficient Communications — reducing message-passing system overhead and hiding latencies (fast message passing, active messages, enhanced MPI library, etc.).
• Cluster-wide Job Management — using a global job management system with better scheduling and monitoring (application of single-job management systems such as LSF, Codine, etc.).
• Dynamic Load Balancing — balancing the workload of all processing nodes along with failure recovery (workload monitoring, process migration, job replication and gang scheduling, etc.).
• Scalability and Programmability — adding more servers to a cluster or more clusters to a grid as the workload or data set increases (use of scalable interconnects, performance monitoring, distributed execution environments, and better software tools).
• Lack of a cluster-wide OS limits full resource sharing.
• Middleware solutions provide necessary functionalities like scalability, fault tolerance, and
job management.
• Key challenges include efficient message passing, seamless fault tolerance, high
availability, and performance scalability.
Server clusters are scalable, high-performance computing systems that utilize networked
computing nodes for parallel and distributed processing. Achieving SSI and efficient
middleware support remains a key challenge in cluster computing. Virtual clusters and cloud
computing are evolving to enhance cluster flexibility and resource management.
1.3.2 Grid Computing, Peer-to-Peer (P2P) Networks, and System Models
Grid Computing Infrastructures
Grid computing has evolved from Internet and web-based services to enable large-scale
distributed computing. It allows applications running on remote systems to interact in real-time.
1.3.2.1 Computational Grids
• A grid connects distributed computing resources (workstations, servers, clusters,
supercomputers) over LANs, WANs, and the Internet.
• Used for scientific and enterprise applications, including SETI@Home and astrophysics
simulations.
• Provides an integrated resource pool, enabling shared computing, data, and information
services.
1.3.2.2 Grid Families
Grid families compared along three design issues:
• Grid Applications Reported — Computational and Data Grids: distributed supercomputing, National Grid initiatives, etc.; P2P Grids: open grids with P2P flexibility, all resources contributed by client machines.
• Representative Systems — Computational and Data Grids: TeraGrid built in the US, ChinaGrid in China, and the e-Science grid built in the UK; P2P Grids: JXTA, FightAIDS@Home, SETI@home.
• Development Lessons Learned — Computational and Data Grids: restricted user groups, middleware bugs, protocols to acquire resources; P2P Grids: unreliable user-contributed resources, limited to a few applications.
• Computational and Data Grids – Used in national-scale supercomputing projects (e.g.,
TeraGrid, ChinaGrid, e-Science Grid).
• P2P Grids – Utilize client machines for open, distributed computing (e.g., SETI@Home,
JXTA, FightAIDS@Home).
• Challenges include middleware bugs, security issues, and unreliable user-contributed
resources.
1.3.3 Peer-to-Peer (P2P) Network Families
P2P systems eliminate central coordination, allowing client machines to act as both servers and
clients.
1.3.3.1 P2P Systems
• Collaboration Platforms — example systems include Skype, MSN, and multiplayer games; challenges include privacy risks, spam, and lack of trust.
P2P networks support user-driven resource sharing. Future developments will focus on security, standardization, and
efficiency improvements.
Cloud Computing over the Internet
Cloud computing has emerged as a transformative on-demand computing paradigm, shifting
computation and data storage from desktops to large data centers. This approach enables the
virtualization of hardware, software, and data resources, allowing users to access scalable
services over the Internet.
1.3.4.1 Internet Clouds
• Public Cloud – Built over the Internet and offered to the general public by a service provider.
• Private Cloud – Built and operated within an organization's own infrastructure for internal use.
• Hybrid Cloud – Combines public and private clouds, optimizing cost and security.
Advantages of Cloud Computing
Cloud computing provides several benefits over traditional computing paradigms, including:
1. Energy-efficient data centers in secure locations.
2. Resource sharing, optimizing utilization and handling peak loads.
3. Separation of infrastructure maintenance from application development.
4. Cost savings compared to traditional on-premise infrastructure.
5. Scalability for application development and cloud-based computing models.
6. Enhanced service and data discovery for content and service distribution.
7. Security and privacy improvements, though challenges remain.
8. Flexible service agreements and pricing models for cost-effective computing.
Cloud computing fundamentally changes how applications and services are developed, deployed,
and accessed. With virtualization, scalability, and cost efficiency, it has become the backbone of
modern Internet services and enterprise computing. Future advancements will focus on security,
resource optimization, and hybrid cloud solutions.
1.4 Software Environments for Distributed Systems and Clouds
This section introduces Service-Oriented Architecture (SOA) and other key software environments
that enable distributed and cloud computing systems. These environments define how
applications, services, and data interact within grids, clouds, and P2P networks.
1.4.1 Service-Oriented Architecture (SOA)
SOA enables modular, scalable, and reusable software components that communicate over a
network. It underpins web services, grids, and cloud computing environments.
1.4.1.1 Layered Architecture for Web Services and Grids
• Distributed computing builds on the OSI model, adding layers for service interfaces,
workflows, and management.
SOA has expanded from basic web services to complex multi-layered ecosystems:
• Sensor Services (SS) – Devices like ZigBee, Bluetooth, GPS, and WiFi collect raw data.
• Filter Services (FS) – Process data before feeding into computing, storage, or discovery
clouds.
• Cloud Ecosystem – Integrates compute clouds, storage clouds, and discovery clouds for
managing large-scale applications.
SOA enables data transformation from raw data → useful information → knowledge → wisdom
→ intelligent decisions.
SOA defines the foundation for web services, distributed systems, and cloud computing. By
integrating sensors, processing layers, and cloud resources, SOA provides a scalable, flexible
approach for modern computing applications. The future of distributed computing will rely on
intelligent data processing, automation, and service-driven architectures.
1.4.1.4 Grids vs. Clouds
• Grids use static resources, whereas clouds provide elastic, on-demand resources via
virtualization.
• Clouds focus on automation and scalability, while grids are better for negotiated
resource allocation.
• Hybrid models exist, such as clouds of grids, grids of clouds, and inter-cloud architectures.
• Distributed OS models are evolving, with MOSIX2 enabling process migration and
resource sharing across Linux clusters.
• Parallel programming models like MPI and MapReduce optimize large-scale computing (a minimal MapReduce sketch follows this list).
• Cloud computing and grid computing continue to merge, leveraging virtualization and
elastic resource management.
• Standardized middleware (OGSA, Globus) enhances grid security, interoperability, and
automation.
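As a minimal illustration of the MapReduce model mentioned above, the following Python sketch performs a word count with explicit map, shuffle, and reduce phases. This is a toy, single-process version written for these notes; a real MapReduce framework distributes the same phases across many machines.

    # Sketch: the MapReduce programming model on a single machine (word count).
    from collections import defaultdict

    def map_phase(document):
        """Map: emit (key, value) pairs -- here (word, 1) for every word."""
        for word in document.split():
            yield (word.lower(), 1)

    def shuffle(pairs):
        """Shuffle: group all values by key, as the framework would do."""
        groups = defaultdict(list)
        for key, value in pairs:
            groups[key].append(value)
        return groups

    def reduce_phase(groups):
        """Reduce: combine the values of each key -- here, sum the counts."""
        return {key: sum(values) for key, values in groups.items()}

    if __name__ == "__main__":
        docs = ["cloud computing uses parallel and distributed computing",
                "grid computing and cloud computing continue to merge"]
        pairs = [pair for doc in docs for pair in map_phase(doc)]
        print(reduce_phase(shuffle(pairs)))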
1.5 Performance, Security, and Energy Efficiency
This section discusses key design principles for distributed computing systems, covering
performance metrics, scalability, system availability, fault tolerance, and energy efficiency.
1.5.1 Performance Metrics and Scalability Analysis
Performance is measured using MIPS, Tflops, TPS, and network latency. Scalability is crucial in
distributed systems and has multiple dimensions:
1. Size Scalability – Expanding system resources (e.g., processors, memory, storage) to improve
performance.
2. Software Scalability – Upgrading OS, compilers, and libraries to accommodate larger
systems.
3. Application Scalability – Increasing problem size to match system capacity for cost-
effectiveness.
4. Technology Scalability – Adapting to new hardware and networking technologies while
ensuring compatibility.
1.5.1.3 Scalability vs. OS Image Count
• SMP systems scale up to a few hundred processors due to hardware constraints.
• NUMA systems use multiple OS images to scale to thousands of processors.
• Clusters and clouds scale further by using virtualization.
• Grids integrate multiple clusters, supporting hundreds of OS images.
• P2P networks scale to millions of nodes with independent OS images.
• Amdahl's Law (fixed workload): Speedup S = 1 / [α + (1 − α)/n],
where α is the fraction of the workload that is sequential and n is the number of processors.
• Even with hundreds of processors, speedup is limited if the sequential fraction α is high; as n grows, S approaches the upper bound 1/α.
Problem with Fixed Workload
• In Amdahl's law, we assume the same amount of workload for both sequential and parallel
execution of the program, with a fixed problem size or data set. This was called fixed-workload speedup
by Hwang and Xu [14]. To execute a fixed workload on n processors, parallel processing leads to
a system efficiency E = S/n = 1 / [α·n + (1 − α)].
• Scaled-workload speedup: if the workload is scaled to W′ = αW + (1 − α)nW, the speedup becomes
S′ = W′/W = α + (1 − α)n.
• This speedup is known as Gustafson's law. By fixing the parallel execution time at
level W, the efficiency expression E′ = S′/n = α/n + (1 − α) is obtained.
• More efficient for large clusters, as workload scales dynamically with system size.
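The difference between the two laws is easy to see numerically. Below is a minimal Python sketch, written for these notes rather than taken from the textbook, that evaluates both formulas; the function names and the sample values of α and n are illustrative assumptions.

    # Sketch: comparing Amdahl's (fixed-workload) and Gustafson's (scaled-workload)
    # speedup for a sequential fraction alpha and n processors.

    def amdahl_speedup(alpha: float, n: int) -> float:
        """Fixed-workload speedup: S = 1 / (alpha + (1 - alpha) / n)."""
        return 1.0 / (alpha + (1.0 - alpha) / n)

    def gustafson_speedup(alpha: float, n: int) -> float:
        """Scaled-workload speedup: S' = alpha + (1 - alpha) * n."""
        return alpha + (1.0 - alpha) * n

    if __name__ == "__main__":
        alpha = 0.05          # assume 5% of the work is strictly sequential
        for n in (16, 256, 1024):
            s = amdahl_speedup(alpha, n)
            s_scaled = gustafson_speedup(alpha, n)
            print(f"n={n:5d}  Amdahl S={s:7.2f} (efficiency {s / n:.3f})  "
                  f"Gustafson S'={s_scaled:8.1f} (efficiency {s_scaled / n:.3f})")

For the assumed α = 0.05, Amdahl's speedup saturates near 20 regardless of n, while Gustafson's scaled speedup keeps growing with the cluster size, which is why the scaled model is considered more appropriate for large clusters.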
1.5.2 Fault Tolerance and System Availability
• High availability (HA) is essential in clusters, grids, P2P networks, and clouds.
• System availability depends on Mean Time to Failure (MTTF) and Mean Time to Repair
(MTTR): Availability = MTTF / (MTTF + MTTR). A small worked example follows this list.
• Eliminating single points of failure (e.g., hardware redundancy, fault isolation) improves
availability.
• P2P networks are highly scalable but have low availability due to frequent peer failures.
• Grids and clouds offer better fault isolation and thus higher availability than traditional
clusters.
• Scalability and performance depend on resource expansion, workload distribution, and
parallelization.
• Amdahl’s Law limits speedup for fixed workloads, while Gustafson’s Law optimizes
large-scale computing.
• High availability requires redundancy, fault tolerance, and system design improvements.
• Clouds and grids balance scalability and availability better than traditional SMP or
NUMA systems.
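As a quick worked example of the availability formula above, here is a short Python sketch; the MTTF and MTTR values are assumed purely for illustration, not measurements from the textbook.

    # Sketch: system availability from MTTF and MTTR (assumed example values).

    def availability(mttf_hours: float, mttr_hours: float) -> float:
        """Availability = MTTF / (MTTF + MTTR)."""
        return mttf_hours / (mttf_hours + mttr_hours)

    if __name__ == "__main__":
        # Assumed numbers: a node that fails about once a year (MTTF = 8760 h)
        # and takes 4 hours to repair.
        a = availability(8760.0, 4.0)
        print(f"Availability = {a:.5f}  (~{a * 100:.3f}% uptime)")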
Network Threats, Data Integrity, and Energy Efficiency
This section highlights security challenges, energy efficiency concerns, and mitigation strategies
in distributed computing systems, including clusters, grids, clouds, and P2P networks.
1.5.3 Network Threats and Data Integrity
Distributed systems require security measures to prevent cyberattacks, data breaches, and
unauthorized access.
1.5.3.1 Threats to Systems and Networks
• Improper Authentication – Allows attackers to steal resources, modify data, and conduct
replay attacks.
1.5.3.2 Security Responsibilities
Security in cloud computing is divided among different stakeholders based on the cloud service
model:
• SaaS: Cloud provider handles security, availability, and integrity.
• PaaS: Provider manages integrity and availability, while users control confidentiality.
• IaaS: Users are responsible for most security aspects, while providers ensure availability.
1.5.3.3 Copyright Protection
• Collusive piracy in P2P networks allows unauthorized file sharing.
• Content poisoning and timestamped tokens help detect piracy and protect digital rights.
1.5.3.4 System Defense Technologies
Three generations of network security have evolved:
1. Prevention-based – Access control, cryptography.
2. Detection-based – Firewalls, intrusion detection systems (IDS), Public Key Infrastructure
(PKI).
3. Intelligent response systems – AI-driven threat detection and response.
1.5.3.5 Data Protection Infrastructure
• Trust negotiation ensures secure data sharing.
• Worm containment & intrusion detection protect against cyberattacks.
• Cloud security responsibilities vary based on the service model (SaaS, PaaS, IaaS).
Module 2
Virtual Machines and Virtualization of Clusters and Data Centers: Implementation Levels of
Virtualization, Virtualization Structure/Tools and Mechanisms, Virtualization of CPU/Memory
and I/O devices, Virtual Clusters and Resource Management, Virtualization for Data Center
Automation.
Example Scenarios
Benefits of Virtualization
Virtualization Layer
The virtualization layer is a software layer that abstracts physical hardware resources (CPU,
memory, storage, network, etc.) and presents them as virtual resources to applications and
operating systems. It acts as a bridge between the physical hardware and virtual instances,
ensuring proper allocation, isolation, and management of resources.
1. Instruction Emulation:
o The source ISA (e.g., MIPS) is emulated on the target ISA (e.g., x86) through
a software layer.
o The software layer interprets or translates the source instructions into target
machine instructions.
2. Virtual ISA (V-ISA):
o A V-ISA is the instruction set that the virtual machine exposes to software; a
processor-specific software translation layer maps it onto the host's native ISA.
1. Code Interpretation:
o Process: An interpreter program translates source instructions into host
(native) instructions one-by-one during execution.
o Characteristics:
▪ Simple to implement.
▪ High overhead due to the need to process each instruction individually.
o Performance: Slow, as each source instruction may require tens or even
hundreds of native instructions to execute.
2. Dynamic Binary Translation:
o Process:
▪ Instead of interpreting instructions one-by-one, this method translates
blocks of source instructions (basic blocks, traces, or superblocks)
into target instructions.
▪ The translated blocks are cached, so subsequent executions do not need
re-translation.
o Characteristics:
▪ Faster than interpretation due to caching and reuse of translated
instructions.
▪ Optimization opportunities arise from analyzing multiple instructions
in a block.
o Performance: Significantly better than interpretation but requires more
complex implementation.
3. Binary Translation and Optimization:
o Purpose: Enhance performance and reduce the overhead of translation.
o Methods:
▪ Static Binary Translation: Translates the entire binary code before
execution, which avoids runtime translation but can miss opportunities
for runtime optimizations.
▪ Dynamic Binary Translation: Translates instructions at runtime,
enabling better adaptability to runtime conditions.
▪ Dynamic Optimizations: Includes reordering, inlining, and loop
unrolling to improve the efficiency of translated code.
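To make the contrast between interpretation and dynamic binary translation concrete, here is a minimal Python sketch for a toy source ISA. The three-instruction toy ISA, the dictionary-based translation cache, and all function names are assumptions made for these notes, not part of any real emulator.

    # Sketch: interpreting a toy source ISA vs. translating basic blocks and
    # caching the translations (the essence of dynamic binary translation).

    TOY_PROGRAM = [            # a toy "source binary": one basic block
        ("LOAD", "r1", 10),
        ("LOAD", "r2", 32),
        ("ADD",  "r0", "r1", "r2"),
    ]

    def interpret(program, regs):
        """Decode and execute one source instruction at a time (slow path)."""
        for op, *args in program:
            if op == "LOAD":
                regs[args[0]] = args[1]
            elif op == "ADD":
                regs[args[0]] = regs[args[1]] + regs[args[2]]
        return regs

    TRANSLATION_CACHE = {}     # block id -> translated (host-native) code

    def translate_block(block_id, program):
        """Translate a whole basic block once, cache it, and reuse it afterwards."""
        if block_id not in TRANSLATION_CACHE:
            lines = ["def _block(regs):"]
            for op, *args in program:
                if op == "LOAD":
                    lines.append(f"    regs['{args[0]}'] = {args[1]}")
                elif op == "ADD":
                    lines.append(f"    regs['{args[0]}'] = regs['{args[1]}'] + regs['{args[2]}']")
            lines.append("    return regs")
            namespace = {}
            exec("\n".join(lines), namespace)        # build "native" code for the host
            TRANSLATION_CACHE[block_id] = namespace["_block"]
        return TRANSLATION_CACHE[block_id]

    if __name__ == "__main__":
        print(interpret(TOY_PROGRAM, {}))            # decoded on every run
        block = translate_block("block0", TOY_PROGRAM)
        print(block({}))                             # same result, no re-decoding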
ISA-level virtualization via instruction set emulation opens immense possibilities for
running diverse workloads across platforms, supporting legacy systems, and enabling
hardware independence. The shift from simple interpretation to more advanced
techniques like dynamic binary translation and optimizations has significantly
improved its performance and applicability, making it a key enabler for cross-platform
software execution.
1. Legacy Software Support:
o Allows legacy binary code (e.g., for MIPS or PowerPC) to run on newer
hardware (e.g., x86 or ARM).
o Extends the lifespan of legacy software without needing hardware redesign.
2. Cross-Architecture Compatibility:
o Applications can run on hardware with different ISAs, enhancing portability
and flexibility.
3. Facilitates Hardware Upgrades:
o Software compiled for older processors can run on modern processors, easing
transitions to new architectures.
4. Enables Platform Independence:
o Virtual ISAs abstract the underlying hardware, enabling software to operate
across heterogeneous platforms.
1. Performance Overhead:
o Emulating an ISA on another is inherently slower due to instruction-by-
instruction interpretation or translation.
o Dynamic binary translation improves performance but still adds runtime
overhead.
2. Complexity:
o Implementing dynamic binary translation and optimizations requires advanced
techniques and significant development effort.
3. Scalability:
o Supporting highly diverse ISAs can become challenging, especially when
optimizing performance for multiple architectures.
1. Bare-Metal Hypervisors:
o A hypervisor (Type 1) operates directly on the hardware without requiring an
underlying host operating system.
o It creates and manages virtual hardware environments for virtual machines.
2. Resource Virtualization:
o Virtualizes hardware components such as CPUs, memory, network interfaces,
and storage.
o VMs appear to have dedicated hardware, even though they share the
underlying physical resources.
3. Improved Hardware Utilization:
o Allows multiple users or workloads to share the same hardware, increasing
resource utilization and efficiency.
4. Isolation:
o Each VM operates in isolation, meaning that the failure or compromise of one
VM does not affect others.
1. High Performance:
o Since the hypervisor runs directly on hardware, it minimizes overhead and
provides near-native performance for VMs.
2. Scalability:
o Easily supports multiple VMs, enabling efficient use of physical server
resources.
3. Fault Isolation:
o Problems in one VM (e.g., OS crashes or software bugs) do not impact other
VMs or the host system.
4. Versatility:
o Supports running different operating systems or environments on the same
physical hardware.
Operating System (OS) level virtualization is a type of virtualization that operates at the OS
kernel layer, creating isolated environments called containers or virtual environments within
a single instance of an operating system. This approach allows multiple isolated user spaces to
run on the same physical hardware while sharing the same operating system kernel.
1. Single OS Kernel:
o All containers share the same underlying OS kernel, eliminating the need for
separate kernels for each environment.
o More lightweight compared to traditional hardware-level virtualization since
there's no need to emulate hardware.
2. Isolated Environments (Containers):
o Containers behave like independent servers, with their own libraries, binaries,
and configuration files.
o Processes running inside one container are isolated from processes in other
containers.
3. Efficient Resource Utilization:
o OS-level virtualization efficiently shares hardware resources like CPU,
memory, and storage across containers.
o Reduces overhead compared to full virtualization, as there is no need for a
hypervisor or virtual hardware.
2. High Performance: Since all containers share the same OS kernel, there is minimal
overhead, resulting in near-native performance.
3. Scalability: Containers can be created, started, stopped, and destroyed quickly, making
them ideal for dynamic environments.
4. Isolation: Although containers share the same kernel, they provide process and file
system isolation, ensuring that one container does not interfere with another.
5. Ease of Deployment: Containers package applications with their dependencies, making
them portable across different environments.
1. Single OS Limitation: Since all containers share the same kernel, they must use the
same operating system. For example, you cannot run a Windows container on a Linux
host.
2. Weaker Isolation: Compared to hardware-level virtualization, OS-level virtualization
provides less isolation. If the kernel is compromised, all containers are at risk.
3. Compatibility Issues: Applications that require specific kernel modules or features not
supported by the shared kernel may face compatibility challenges.
1. API Hooks:
o Applications typically interact with the operating system via APIs exported by
user-level libraries.
o Library-level virtualization works by intercepting API calls and redirecting
them to virtualized implementations.
2. Controlled Communication:
o Virtualization happens by managing the communication link between the
application and the underlying system.
o This approach avoids direct interaction with the operating system and replaces
it with controlled, virtualized responses.
3. Application-Specific Virtualization:
o Focused on enabling specific features or compatibility, such as supporting
applications from one environment on another.
• Applications are written to use standard library calls for their functionality, such as file
access, networking, or graphics.
• Library-level virtualization intercepts these calls (using API hooks) and replaces the
original functionality with emulated or redirected behavior.
Library-level virtualization improves application compatibility and portability. It plays a
critical role in scenarios like running software across platforms, leveraging hardware features
in virtualized environments, and extending the life of legacy applications. While it may not
provide the full isolation or flexibility of OS- or hardware-level virtualization, its efficiency
and simplicity make it invaluable for specific use cases.
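A minimal Python sketch of the API-hook idea: intercept a library call an application already uses and redirect it to a virtualized implementation. The hooked call here is Python's built-in open(), and the redirected sandbox directory is purely an assumption for illustration; real library-level virtualization (e.g., WINE) hooks native OS library calls.

    # Sketch: library-level virtualization by hooking an API call.
    # The application keeps calling open() as usual; the hook transparently
    # redirects file paths into a per-application "virtual" directory.

    import builtins
    import os

    _real_open = builtins.open           # keep a reference to the original API
    VIRTUAL_ROOT = "/tmp/app_sandbox"    # assumed redirection target

    def hooked_open(path, mode="r", *args, **kwargs):
        """Intercept open() and rewrite the path before calling the real API."""
        redirected = os.path.join(VIRTUAL_ROOT, str(path).lstrip("/"))
        os.makedirs(os.path.dirname(redirected), exist_ok=True)
        return _real_open(redirected, mode, *args, **kwargs)

    builtins.open = hooked_open          # install the hook

    if __name__ == "__main__":
        # The "application" below is unaware of the redirection.
        with open("/etc/myapp.conf", "w") as f:
            f.write("setting=1\n")
        print("File actually written under:", VIRTUAL_ROOT)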
1. Cross-Platform Compatibility:
o Applications written for an abstract VM (e.g., JVM, CLR) can run on any
system with the corresponding VM implementation.
2. Improved Security:
o Applications are isolated from the host OS and other applications, reducing the
risk of system compromise or interference.
3. Simplified Deployment:
o Applications can be distributed as self-contained packages, eliminating the need
for complex installation procedures or OS-level dependencies.
4. Resource Efficiency:
o Compared to hardware- or OS-level virtualization, application-level
virtualization has lower overhead as it focuses only on individual processes.
5. Portability:
o Virtualized applications can be easily moved between systems or platforms.
1. Performance Overhead:
o Running applications in a virtualized environment may introduce some latency
compared to native execution.
2. Limited Scope:
o Unlike OS- or hardware-level virtualization, application-level virtualization
cannot provide a full OS environment or support multiple users.
3. Compatibility Challenges:
o Not all applications can be easily virtualized, especially those with tight
integration with the underlying OS or hardware.
In the textbook's comparison table of virtualization levels, the column headings correspond to
four technical merits. "Higher Performance" and "Application Flexibility" are self-explanatory. "Implementation
Complexity” implies the cost to implement that particular virtualization level. “Application
Isolation” refers to the effort required to isolate resources committed to different VMs.
The number of X’s in the table cells reflects the advantage points of each implementation level.
Five X’s implies the best case and one X implies the worst case. Overall, hardware and OS
support will yield the highest performance. However, the hardware and application levels are
also the most expensive to implement. User isolation is the most difficult to achieve. ISA
implementation offers the best application flexibility.
Hardware-level virtualization adds a layer, the Virtual Machine Monitor (VMM), between
the hardware and operating systems. The VMM manages hardware resources and allows
multiple operating systems to run simultaneously on a single hardware setup by virtualizing
components like the CPU. A VMM must meet three key requirements: it should provide an
environment essentially identical to the original machine, programs running in that environment
should suffer only minor decreases in speed, and the VMM should remain in complete control of
system resources.
Efficiency is crucial for VMMs, as slow emulators or interpreters are unsuitable for real
machines. To ensure performance, most virtual processor instructions should execute directly
on physical hardware without VMM intervention.
The VMM manages resources by allocating them to programs, restricting unauthorized access,
and regaining control when needed. However, implementing VMMs can be challenging on
certain processor architectures (e.g., x86), where privileged instructions cannot always be
trapped. Processors not designed for virtualization may require hardware modifications to meet
VMM requirements, a method known as hardware-assisted virtualization.
Key Observations:
• VMware Workstation supports a wide range of guest operating systems and uses full
virtualization.
• VMware ESX Server eliminates a host OS, running directly on hardware with para-
virtualization.
• Xen supports diverse host OSs and uses a hypervisor-based architecture.
• KVM runs exclusively on Linux hosts and supports para-virtualization for multiple
architectures.
Cloud computing, enabled by VM technology, shifts the cost and responsibility of managing
computational centers to third parties, resembling the role of banks. While transformative, it
faces two significant challenges:
To address these challenges and enhance cloud computing efficiency, significant research and
development are needed.
VEs share the same OS kernel but appear as independent servers to users, each with its own
processes, file system, user accounts, network settings, and more. This approach, known as
single-OS image virtualization, is an efficient alternative to hardware-level virtualization.
Figure 3.3 illustrates operating systems virtualization from the point of view of a machine
stack.
1. Efficiency and Scalability: OS-level VMs have low startup/shutdown costs, minimal
resource requirements, and high scalability.
2. State Synchronization: VMs can synchronize state changes with the host environment
when needed.
In cloud computing, these features address the slow initialization of hardware-level VMs and
their inability to account for the current application state.
The primary disadvantage of OS-level virtualization is that all VMs on a single container must
belong to the same operating system family. For example, a Linux-based container cannot run
a Windows OS. This limitation challenges its usability in cloud computing, where users may
prefer different operating systems.
OS-level virtualization can provision each VM's resources in two ways:
1. Duplicating resources for each VM: This incurs high resource costs and overhead.
2. Sharing most resources with the host and creating private copies on demand: This
is more efficient and commonly used.
Due to its limitations and overhead in some scenarios, OS-level virtualization is often
considered a secondary choice compared to hardware-assisted virtualization.
• Most Linux platforms are not tied to a specific kernel, enabling a host to run multiple
VMs simultaneously on the same hardware.
• Linux-based tools, such as Linux vServer and OpenVZ, support running applications
from other platforms through virtualization.
• On Windows, FVM is a specific tool developed for OS-level virtualization on the
Windows NT platform.
Key Features:
1. Isolation:
o Each VPS has its own files, user accounts, process tree, virtual network, virtual
devices, and interprocess communication (IPC) mechanisms.
2. Resource Management:
o Disk Allocation: Two levels:
▪ First level: The OpenVZ server administrator assigns disk space limits
to each VM.
▪ Second level: VM administrators manage disk quotas for users and
groups.
o CPU Scheduling:
▪ First level: OpenVZ's scheduler allocates time slices based on virtual
CPU priority and limits.
▪ Second level: Uses the standard Linux CPU scheduler.
o Resource Control: OpenVZ has ~20 parameters to control VM resource usage.
3. Checkpointing and Live Migration:
o Allows saving the complete state of a VM to a disk file, transferring it to another
machine, and restoring it there.
o The process takes only a few seconds, although network connection re-
establishment causes minor delays.
Advantages:
Challenges:
1. WABI:
o Middleware that translates Windows system calls into Solaris system calls,
allowing Windows applications to run on Solaris systems.
2. Lxrun:
o A system call emulator enabling Linux applications designed for x86 hosts to
run on UNIX systems.
3. WINE:
o Provides library support to virtualize x86 processors, enabling Windows
applications to run on UNIX-based systems.
4. Visual MainWin:
o A compiler support system that allows developers to use Visual Studio to
create Windows applications capable of running on some UNIX hosts.
5. vCUDA:
o A virtualization solution for CUDA, enabling applications requiring GPU
acceleration to utilize GPU resources remotely. (Discussed in detail in Example
3.2.)
Key Benefits:
Challenges:
1. Purpose:
o Virtualizes the CUDA library for guest OSes, enabling CUDA applications to
execute GPU-based tasks indirectly through the host OS.
2. Architecture:
o Follows a client-server model with three main components:
▪ vCUDA Library:
▪ Resides in the guest OS as a substitute for the standard CUDA
library.
▪ Intercepts and redirects API calls to the host OS.
▪ Manages virtual GPUs (vGPUs).
▪ Virtual GPU (vGPU):
▪ Abstracts GPU hardware, provides a uniform interface, and
manages device memory allocation.
▪ Tracks and stores CUDA API flow.
▪ vCUDA Stub:
▪ Resides in the host OS.
▪ Receives and interprets requests from the guest OS.
▪ Creates execution contexts for CUDA API calls and manages the
physical GPU resources.
3. Functionality of vGPU:
o Abstracts the GPU structure, giving applications a consistent view of hardware.
o Handles memory allocation by mapping virtual addresses in the guest OS to real
device memory in the host OS.
o Stores the flow of CUDA API calls for proper execution.
4. Workflow:
o CUDA applications on the guest OS send API calls to the vCUDA library.
o The vCUDA library redirects these calls to the vCUDA stub on the host OS.
o The vCUDA stub processes the requests, executes them on the physical GPU,
and returns results to the guest OS.
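The client-server redirection described above can be sketched in a few lines of Python. This is a conceptual mock written for these notes, not real vCUDA or CUDA code; the request format, the fake device-memory dictionary, and the function names are all assumptions (the API names only mimic CUDA-style calls).

    # Sketch: vCUDA-style API redirection. A guest-side "library" packages API
    # calls and sends them to a host-side "stub", which executes them against
    # a (simulated) physical GPU and returns the results.

    import json

    # --- host side: the stub that owns the (simulated) physical GPU ---------
    DEVICE_MEMORY = {}     # pretend GPU memory: handle -> list of values

    def host_stub(request_json):
        """Receive a serialized API call, execute it, return a serialized result."""
        req = json.loads(request_json)
        if req["api"] == "cudaMalloc":                 # name mimics the CUDA API
            handle = f"devptr{len(DEVICE_MEMORY)}"
            DEVICE_MEMORY[handle] = [0] * req["count"]
            return json.dumps({"handle": handle})
        if req["api"] == "cudaMemcpyHtoD":
            DEVICE_MEMORY[req["handle"]][:] = req["data"]
            return json.dumps({"ok": True})
        if req["api"] == "vectorAddKernel":            # stands in for a GPU kernel
            a, b = DEVICE_MEMORY[req["a"]], DEVICE_MEMORY[req["b"]]
            return json.dumps({"result": [x + y for x, y in zip(a, b)]})
        return json.dumps({"error": "unknown API"})

    # --- guest side: the substitute library that intercepts and redirects ----
    def guest_call(api, **kwargs):
        """What the vCUDA library does conceptually: redirect the call to the host."""
        return json.loads(host_stub(json.dumps({"api": api, **kwargs})))

    if __name__ == "__main__":
        a = guest_call("cudaMalloc", count=3)["handle"]
        b = guest_call("cudaMalloc", count=3)["handle"]
        guest_call("cudaMemcpyHtoD", handle=a, data=[1, 2, 3])
        guest_call("cudaMemcpyHtoD", handle=b, data=[10, 20, 30])
        print(guest_call("vectorAddKernel", a=a, b=b))   # {'result': [11, 22, 33]}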
Benefits of vCUDA:
Challenges:
• Relies heavily on the client-server architecture and the efficiency of API call
redirection.
There are three typical classes of VM architectures, differentiated by the placement of the
virtualization layer in the system stack. Virtualization transforms a machine’s architecture by
inserting a virtualization layer between the hardware and the operating system. This layer
converts real hardware into virtual hardware, enabling different operating systems (e.g., Linux
and Windows) to run simultaneously on the same physical machine.
Classes of VM Architectures:
Key Points:
• The virtualization layer is crucial for translating real hardware into virtual hardware.
• These architectures enable flexibility in running multiple operating systems on the same
machine.
• Hypervisors (or VMMs) and other approaches vary in performance, complexity, and
implementation.
The hypervisor (or Virtual Machine Monitor, VMM) enables hardware-level virtualization by
acting as an intermediate layer between physical hardware (e.g., CPU, memory, disk, network
interfaces) and the operating systems (OS). It facilitates the creation of virtual resources that
guest OSes and applications can utilize.
1. Micro-Kernel Hypervisor:
o Only includes essential and unchanging functionalities, such as physical
memory management and processor scheduling.
o Device drivers and other changeable components are kept outside the
hypervisor.
o Examples: Microsoft Hyper-V.
o Advantages: Smaller code size, reduced complexity, and easier maintainability.
2. Monolithic Hypervisor:
o Integrates all functionalities, including device drivers, within the hypervisor
itself.
o Examples: VMware ESX for server virtualization.
o Advantages: Comprehensive functionality but with a larger codebase and
potential complexity.
• Supports virtualized access to physical hardware through hypercalls for guest OSes
and applications.
• Converts physical devices into virtual resources for use by virtual machines (VMs).
• Plays a critical role in resource management and scheduling for multiple VMs.
These architectures allow efficient use of physical hardware while enabling multiple OSes to
run simultaneously.
A key feature of Xen is Domain 0 (Dom0), a privileged guest OS that manages hardware
access and resource allocation for other guest domains (Domain U). Since Dom0 controls the
entire system, its security is critical. If compromised, an attacker can control all virtual
machines.
Xen allows users to manage VMs flexibly, creating, copying, migrating, and rolling back
instances. However, this flexibility also introduces security risks, as VMs can be rolled back to
previous states, potentially reintroducing vulnerabilities that had already been patched.
Note: A key feature of Xen is Domain 0 (Dom0), a privileged virtual machine responsible for
managing hardware, I/O operations, and other guest VMs (Domain U). Dom0 is the first OS
to load and has direct hardware access, allowing it to allocate resources and manage devices
for unprivileged guest domains.
FIGURE 3.5
The Xen architecture's special Domain 0 for control and I/O, with several guest domains (e.g., XenoLinux, XenoWindows) running user applications on top of the Xen hypervisor and the hardware devices.
Full Virtualization
Host-Based Virtualization
While host-based virtualization offers flexibility, it is generally less efficient than full
virtualization with a VMM.
Challenges of Para-Virtualization
1. Compatibility & Portability Issues: Since para-virtualization modifies the guest OS,
supporting unmodified OSes becomes difficult.
2. High Maintenance Costs: OS kernel modifications require ongoing updates and
support.
3. Variable Performance Gains: The performance improvement depends on the
workload and system architecture.
Para-Virtualization Architecture
• Guest OS Modification: The OS kernel is modified, but user applications may also
need changes.
• Hypercalls: Privileged instructions that would normally run at Ring 0 are replaced
with hypercalls to the hypervisor.
• Intelligent Compiler: A specialized compiler assists in identifying and replacing
nonvirtualizable instructions with hypercalls, optimizing performance.
• Improved Efficiency: Compared to full virtualization, para-virtualization
significantly reduces overhead, making VM execution closer to native
performance.
• Limitation: Since the guest OS is modified, it cannot run directly on physical
hardware without a hypervisor.
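A minimal sketch of the hypercall idea: instead of letting a privileged instruction trap at run time, the modified guest kernel calls the hypervisor explicitly. The hypercall names and dispatch table below are illustrative assumptions made for these notes, not Xen's actual interface.

    # Sketch: para-virtualization replaces privileged instructions in the guest
    # kernel with explicit hypercalls into the hypervisor.

    def hypercall_update_page_table(vm_id, entry):
        return f"hypervisor updated page table entry {entry} for VM {vm_id}"

    def hypercall_send_io(vm_id, data):
        return f"hypervisor performed I/O for VM {vm_id}: {data!r}"

    HYPERCALL_TABLE = {
        "update_pt": hypercall_update_page_table,
        "io_write": hypercall_send_io,
    }

    def hypercall(name, *args, **kwargs):
        """What a modified (para-virtualized) guest kernel invokes instead of a
        Ring-0 privileged instruction."""
        return HYPERCALL_TABLE[name](*args, **kwargs)

    if __name__ == "__main__":
        # An unmodified kernel would execute a privileged MMU instruction here;
        # the para-virtualized kernel asks the hypervisor instead.
        print(hypercall("update_pt", vm_id=7, entry=0x1000))
        print(hypercall("io_write", vm_id=7, data=b"packet"))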
Due to the inefficiency of binary translation, many virtualization solutions, including Xen,
KVM, and VMware ESX, use para-virtualization.
VMware pioneered the virtualization market, providing solutions for desktops, servers, and
data centers. VMware ESX is a bare-metal hypervisor designed for x86 symmetric
multiprocessing (SMP) servers, enabling efficient virtualization by directly managing
hardware resources.
Para-Virtualization in ESX
• The VMkernel interacts directly with the hardware, bypassing the need for a host OS.
• Para-virtualized drivers (e.g., VMXNET for networking, PVSCSI for disk I/O) improve
performance.
• Provides better efficiency than full virtualization while supporting unmodified guest
OSes via hardware-assisted virtualization (Intel VT, AMD-V).
Modern processors support multiple processes running simultaneously, but they require
protection mechanisms to prevent system crashes. This is achieved by dividing execution into
user mode and supervisor mode: the OS kernel runs in supervisor (privileged) mode, while
applications run in user mode. Representative virtualization systems include:
1. VMware Workstation
o A host-based virtualization software suite for x86 and x86-64 systems.
o Runs multiple VMs simultaneously on a host OS.
2. Xen Hypervisor
o Works on IA-32, x86-64, Itanium, and PowerPC 970 architectures.
o Modifies Linux to function as a hypervisor, controlling guest OSes.
3. KVM (Kernel-Based Virtual Machine)
o Integrated into the Linux kernel as a virtualization infrastructure.
o Supports hardware-assisted virtualization (Intel VT-x, AMD-V) and
paravirtualization via the VirtIO framework.
o VirtIO components include:
▪ Paravirtual Ethernet Card (for networking).
▪ Disk I/O Controller (optimized storage access).
▪ Balloon Device (dynamically adjusts VM memory allocation).
Example 3.4: Hardware Support for Virtualization in the Intel x86 Processor
Figure 3.10 provides an overview of Intel’s full virtualization techniques. For processor
virtualization, Intel offers the VT-x or VT-i technique. VT-x adds a privileged mode (VMX
Root Mode) and some instructions to processors. This enhancement traps all sensitive
instructions in the VMM automatically. For memory virtualization, Intel offers the EPT, which
translates the virtual address to the machine’s physical addresses to improve performance. For
I/O virtualization, Intel implements VT-d and VT-c to support this.
A Virtual Machine (VM) replicates a real computer system, executing most instructions on the
host processor in native mode for efficiency. However, critical instructions must be carefully
managed to ensure stability and correctness.
1. Privileged Instructions
o Execute only in privileged mode.
o If executed in user mode, they trigger a trap.
2. Control-Sensitive Instructions
o Modify system resources (e.g., changing memory configuration).
3. Behavior-Sensitive Instructions
o Behave differently depending on the configuration of resources (e.g., load and
store operations over virtual memory).
• RISC CPUs (e.g., PowerPC, SPARC) are naturally virtualizable since all sensitive
instructions are privileged.
• x86 CPUs were not originally designed for virtualization, as some sensitive instructions
(e.g., SGDT, SMSW) are not privileged.
o These instructions bypass the VMM, making virtualization difficult without
software-based techniques like binary translation.
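The trap-and-emulate idea behind this discussion can be sketched as follows. This is a toy simulation written for these notes; the instruction names, the privileged set, and the VMM handler are illustrative assumptions, and the sketch assumes every sensitive instruction actually traps, which is exactly what legacy x86 did not guarantee and what binary translation or hardware assists compensate for.

    # Sketch: trap-and-emulate. Unprivileged instructions run "directly"; when
    # the guest issues a privileged instruction in user mode, it traps to the
    # VMM, which emulates its effect on virtual (not physical) state.

    PRIVILEGED = {"HLT", "LGDT", "OUT"}            # assumed privileged instruction set

    class TrapToVMM(Exception):
        """Raised by the (simulated) CPU when a privileged instruction runs in user mode."""

    def cpu_execute(instr, user_mode=True):
        if user_mode and instr in PRIVILEGED:
            raise TrapToVMM(instr)
        return f"{instr} executed directly on hardware"

    def vmm_emulate(instr, vm_state):
        """VMM handler: emulate the privileged instruction against virtual state."""
        if instr == "OUT":
            vm_state["virtual_io_log"].append("I/O write captured by VMM")
        return f"{instr} emulated by VMM"

    def run_guest(program):
        vm_state = {"virtual_io_log": []}
        for instr in program:
            try:
                print(cpu_execute(instr))                 # fast path: direct execution
            except TrapToVMM as trap:
                print(vmm_emulate(str(trap), vm_state))   # slow path: trap to the VMM
        return vm_state

    if __name__ == "__main__":
        print(run_guest(["MOV", "ADD", "OUT", "MOV", "HLT"]))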
Performance Considerations
• High efficiency expected, but switching between hypervisor and guest OS causes
overhead.
• Hybrid Approach (used by VMware):
o Offloads some tasks to hardware while keeping others in software.
• Combining Para-Virtualization with Hardware-Assisted Virtualization further boosts
performance.
• Memory Management Unit (MMU) and Translation Lookaside Buffer (TLB) help
optimize performance.
• Guest OS controls virtual-to-physical mapping but cannot directly access machine
memory.
• VMM (Hypervisor) handles actual memory allocation to prevent conflicts.
VMware's Shadow Page Table Approach
• Software-based shadow page tables were inefficient and caused high performance
overhead.
• Frequent memory lookups and context switches slowed down virtualized environments.
Intel's Extended Page Table (EPT) Approach
• Hardware-assisted memory virtualization that eliminates the need for shadow page
tables.
• Works with Virtual Processor ID (VPID) to optimize Translation Lookaside Buffer
(TLB) usage.
• Reduces memory lookup time and improves performance significantly.
• Translation Process:
1. Guest OS uses Guest CR3 (Control Register 3) to point to L4 page table.
2. CPU must translate Guest Physical Address (GPA) to Host Physical Address
(HPA) using EPT.
3. The CPU first checks the EPT TLB for an existing translation.
4. If not found, it searches the EPT page tables (up to 5 times in the worst case).
5. If still not found, an EPT violation exception is triggered.
6. The CPU will access memory multiple times to resolve the mapping (up to 20
memory accesses).
• Intel increased the size of EPT TLB to store more translations and reduce memory
accesses.
• This dramatically improves memory access speed and virtualization efficiency.
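A simplified Python sketch of the two-stage lookup (guest virtual address to guest physical address, then guest physical to host physical), with a small EPT TLB checked before the EPT walk. The 4 KB page size, the dictionary-based page tables, and the addresses are illustrative assumptions; a real EPT walk traverses multi-level hardware page tables, which is what makes misses so costly.

    # Sketch: two-stage address translation with an EPT TLB.
    # GVA --(guest page table)--> GPA --(EPT, cached in the EPT TLB)--> HPA

    PAGE = 4096                              # assumed 4 KB pages

    GUEST_PAGE_TABLE = {0x1: 0x20}           # guest virtual page -> guest physical page
    EPT              = {0x20: 0x7A}          # guest physical page -> host physical page
    EPT_TLB          = {}                    # cache of recent GPA -> HPA translations

    def translate(gva):
        gvpn, offset = divmod(gva, PAGE)

        # Stage 1: guest OS mapping (conceptually reached through the guest CR3)
        gppn = GUEST_PAGE_TABLE[gvpn]

        # Stage 2: check the EPT TLB first, walk the EPT only on a miss
        if gppn in EPT_TLB:
            hppn = EPT_TLB[gppn]
        else:
            if gppn not in EPT:
                raise RuntimeError("EPT violation")   # the hypervisor would handle this
            hppn = EPT[gppn]
            EPT_TLB[gppn] = hppn                      # fill the TLB for next time
        return hppn * PAGE + offset

    if __name__ == "__main__":
        print(hex(translate(0x1234)))   # miss: walks the EPT, fills the TLB
        print(hex(translate(0x1ABC)))   # hit: served from the EPT TLB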
I/O virtualization allows virtual machines (VMs) to share and access physical I/O devices
efficiently. There are three main methods to implement I/O virtualization:
• Intel VT-d technology supports I/O DMA remapping and device interrupt remapping.
• Helps unmodified, specialized, or virtualization-aware guest OSes run efficiently.
Summary
I/O virtualization continues to evolve, with hardware-assisted methods like VT-d and SV-
IO improving efficiency and reducing overhead.
Disadvantages
As multi-core processors become more prevalent, virtualizing them presents unique challenges
compared to single-core processors. While multi-core CPUs offer higher performance by
integrating multiple cores on a single chip, virtualization introduces complexities in task
scheduling, parallel execution, and resource management.
3. Dynamic Heterogeneity
• The integration of different types of cores (fat CPU cores & thin GPU cores) on the
same chip makes resource management more complex.
• As transistor reliability decreases and complexity increases, system designers must
adapt scheduling techniques dynamically.
Conclusion:
Virtual clusters provide flexibility, efficient resource usage, and better fault tolerance.
However, they require careful management for fast deployment, effective load balancing, and
optimized storage. Strategies like automated configuration and optimized migration help
improve performance while reducing overhead.
In a mixed-node cluster, virtual machines (VMs) typically run on physical hosts, but if a host
fails, its VM role can be taken over by another VM on a different host. This enables flexible
failover compared to traditional physical-to-physical failover. However, if a host fails, its VMs
also fail, which can be mitigated through live VM migration.
1. Guest-Based Manager: The cluster manager runs inside the guest OS (e.g., OpenMosix,
Sun’s Oasis).
2. Host-Based Manager: The cluster manager runs on the host OS, supervising VMs (e.g.,
VMware HA).
3. Independent Manager: Both guest and host have separate cluster managers, increasing
complexity.
4. Integrated Cluster Management: A unified manager controls both virtual and physical
resources.
1. Start Migration: Identify the VM and destination host, often triggered by load balancing
or server consolidation strategies.
2. Memory Transfer: The VM’s memory is copied to the destination host in multiple
rounds, ensuring minimal disruption.
3. Suspend and Final Copy: The VM pauses briefly to transfer the last memory portion,
CPU, and network states.
4. Commit and Activate: The destination host loads the VM state and resumes execution.
5. Redirect Network & Cleanup: The network redirects to the new VM, and the old VM
is removed.
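The iterative precopy behind these steps can be sketched as a toy simulation in Python; the page counts, the dirtying pattern, and the stop-and-copy threshold are all assumed for illustration and are not the measurements quoted below.

    # Sketch: iterative precopy live migration. Copy all memory first, then keep
    # re-copying pages the running VM has dirtied, until the dirty set is small
    # enough to stop the VM and transfer the remainder plus CPU/network state.
    import random

    def live_migrate(num_pages=1000, stop_threshold=20, max_rounds=10, seed=1):
        rng = random.Random(seed)
        dirty = set(range(num_pages))            # round 0: every page must be sent
        total_sent = 0
        for round_no in range(max_rounds):
            sent = len(dirty)
            total_sent += sent                   # send the current dirty set (pre-copy)
            # While we were sending, the still-running VM dirtied some pages again.
            dirty = {rng.randrange(num_pages) for _ in range(sent // 4)}
            print(f"round {round_no}: sent {sent} pages, {len(dirty)} dirtied meanwhile")
            if len(dirty) <= stop_threshold:
                break
        # Stop-and-copy: suspend the VM, send the last dirty pages + CPU/net state.
        total_sent += len(dirty)
        print(f"suspend VM, send final {len(dirty)} pages; total pages sent = {total_sent}")
        print("commit on destination, redirect network, release source VM")

    if __name__ == "__main__":
        live_migrate()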
Performance Effects:
• The first memory copy takes 63 seconds, reducing network speed from 870 MB/s to
765 MB/s.
• Additional memory copy rounds further reduce speed to 694 MB/s in 9.8 seconds.
• The total downtime is only 165 milliseconds, ensuring minimal service disruption.
Live VM migration enhances cloud computing by enabling seamless workload balancing and
minimizing downtime during host failures. Platforms like VMware and Xen support these
migrations, allowing multiple VMs to run efficiently on a shared physical infrastructure.
Figure: live migration of an active VM from Host A to Host B, proceeding through stages including Stage 1 (Reservation) and Stage 4 (Commitment).
Shared clusters reduce costs and improve resource utilization. When migrating a system to a
new physical node, key factors include memory migration, file system migration, and
network migration.
1. Memory Migration
2. File System Migration
3. Network Migration
Memory migration approaches include:
• Precopy Approach:
o Transfers all memory pages first, then iteratively copies only modified pages.
o Reduces downtime but increases total migration time.
• Checkpoint/Recovery & Trace/Replay (CR/TR-Motion):
o Transfers execution logs instead of dirty pages, minimizing migration time.
o Limited by differences in source and target system performance.
• Postcopy Approach:
o Transfers memory pages once but has higher downtime due to fetch delays.
• Memory Compression:
o Uses spare CPU resources to compress memory pages before transfer, reducing
data size.
Key Takeaways
Trade-offs in Migration
• The compression algorithm must be fast and effective for different types of memory
data.
• Using a single compression method for all memory pages is not efficient because
different memory types require different strategies.
Conclusion
Live migration in Xen, enhanced by RDMA, allows seamless VM transfer with minimal impact
on performance. Techniques like precopying, dirty bitmaps, and compression improve
efficiency while ensuring smooth operation.
Note:
• Virtual clusters help efficiently manage and allocate resources.
• COD and VIOLIN show that dynamic adaptation can significantly improve resource
utilization.
• Live migration allows VMs to be moved with minimal downtime.
• These techniques enable scalable, flexible, and cost-effective cloud computing
solutions.
Data centers have expanded rapidly, with major IT companies like Google, Amazon, and
Microsoft investing heavily in automation. This automation dynamically allocates hardware,
software, and database resources to millions of users while ensuring cost-effectiveness and
Quality of Service (QoS). The rise of virtualization and cloud computing has driven this
transformation, with market growth from $1.04 billion in 2006 to a projected $3.2 billion by
2011.
• Chatty workloads (e.g., web video services) that have fluctuating demand.
• Noninteractive workloads (e.g., high-performance computing) that require consistent
resource allocation.
To meet peak demand, resources are often statically allocated, leading to underutilized servers
and wasted costs in hardware, space, and power. Server consolidation—particularly
virtualization-based consolidation—optimizes resource management by reducing physical
servers and improving hardware utilization.
By leveraging virtualization and multicore processing (CMP), data centers can enhance
efficiency, but optimization in memory access, VM reassignment, and power management
remains a challenge.
Parallax is a scalable virtual storage system designed for cluster-based environments. It enables
efficient storage management by using a set of per-host storage appliances that share access to
a common block device.
• Efficient Block Virtualization: Uses Xen’s block tap driver and tapdisk library for
handling block storage requests across VMs.
• Storage Appliance VM: Acts as an intermediary between client VMs and physical
hardware, facilitating live upgrades of block device drivers.
Parallax enhances flexibility, scalability, and ease of storage management in virtualized data
centers by integrating advanced block storage virtualization techniques.
To function as cloud providers, data centers must be virtualized using Virtual Infrastructure
(VI) managers and Cloud OSes. Table 3.6 outlines four such platforms:
1. Nimbus (Open-source)
2. Eucalyptus (Open-source)
3. OpenNebula (Open-source)
4. vSphere 4 (Proprietary, VMware)
• VM Creation & Management: All platforms support virtual machines and virtual
clusters for elastic cloud resources.
• Virtual Networking: Nimbus, Eucalyptus, and OpenNebula offer virtual network
support, enabling flexible communication between VMs.
• Dynamic Resource Provisioning: OpenNebula stands out by allowing advance
reservations of cloud resources.
• Hypervisor Support:
o Nimbus, Eucalyptus, and OpenNebula use Xen & KVM for virtualization.
o vSphere 4 utilizes VMware ESX & ESXi hypervisors.
• Virtual Storage & Data Protection: Only vSphere 4 supports virtual storage along with
networking and data protection.
Eucalyptus is an open-source software system designed for private cloud infrastructure and
Infrastructure as a Service (IaaS). It enables virtual networking and VM management, but does
not support virtual storage.
Eucalyptus provides a flexible and scalable solution for private cloud networking but lacks
some security and general-purpose cloud features.
vSphere 4, released by VMware in April 2009, is a virtualization platform designed for private
cloud management. It extends earlier VMware products like Workstation, ESX, and Virtual
Infrastructure. The system interacts with applications through vCenter and provides
infrastructure and application services.
Users must understand vCenter interfaces to manage applications effectively. More details are
available on the vSphere 4 website.
A Virtual Machine Monitor (VMM) creates and manages Virtual Machines (VMs) by acting
as a software layer between the operating system and hardware. It provides secure isolation
and manages access to hardware resources, making it the foundation of security in virtualized
environments. However, if a hacker compromises the VMM or management VM, the entire
system is at risk. Security issues also arise from random number reuse, which can lead to
encryption vulnerabilities and TCP hijacking attacks.
Intrusion Detection Systems (IDS) help identify unauthorized access. An IDS can be host-based (HIDS) or network-based (NIDS).
A VM-based IDS leverages virtualization to isolate VMs, preventing compromised VMs from
affecting others. The Virtual Machine Monitor (VMM) can audit access requests, combining
the strengths of HIDS and NIDS. There are two methods for implementation:
Garfinkel and Rosenblum proposed a VMM-based IDS that monitors guest VMs using a policy
framework and trace-based security enforcement. However, logs used for analysis can be
compromised if the operating system is attacked.
Besides IDS, honeypots and honeynets are used to detect attacks by tricking attackers into
interacting with fake systems. Honeypots can be physical or virtual, and in virtual honeypots,
the host OS and VMM must be protected to prevent attacks from guest VMs.
EMC and VMware collaborated to develop security middleware for trust management in
distributed systems and private clouds. The concept of trusted zones was introduced to enhance
security in virtual clusters, where multiple applications and OS instances for different tenants
operate in separate virtual environments.
The trusted zones ensure secure isolation of VMs while allowing controlled interactions among
tenants, providers, and global communities. This approach strengthens security in private cloud
environments.