
Nutanix Files Performance
Nutanix Tech Note

Version 3.1 • October 2020 • TN-2117



Copyright
Copyright 2020 Nutanix, Inc.
Nutanix, Inc.
1740 Technology Drive, Suite 150
San Jose, CA 95110
All rights reserved. This product is protected by U.S. and international copyright and intellectual
property laws.
Nutanix is a trademark of Nutanix, Inc. in the United States and/or other jurisdictions. All other
marks and names mentioned herein may be trademarks of their respective companies.


Contents

1. Executive Summary

2. Introduction
2.1. Audience
2.2. Purpose

3. Nutanix Enterprise Cloud Overview
3.1. Nutanix HCI Architecture
3.2. Nutanix Files

4. Nutanix Files Hardware Resource Usage
4.1. Nutanix Files Resources

5. Four Corners Workload
5.1. Linear Scalability with One, Four, Eight, and Twelve Nodes

6. Nutanix Files NFS Performance
6.1. NFS Workload: Tar and GCC
6.2. NFS Workload: rsync Performance

7. Application Workloads
7.1. Share Type: Random and Sequential
7.2. Random Share Type (Example Workload: VDI)
7.3. Sequential Share Type (Example Workload: Large File Ingestion)
7.4. Single-Client Workloads
7.5. NFS and SMB Workloads: Software Compilations
7.6. NFS and SMB Workloads: Electronic Chip Design Simulation
7.7. NFS and SMB Workloads: Video Data Streams Capture

8. Nutanix Files SMB Performance
8.1. SMB Workload: Drag and Drop and Robocopy Performance
8.2. Windows Home Directory Scalability

9. Conclusion

Appendix
Best Practices Checklist
Additional Test Configuration Notes
About the Authors
About Nutanix

List of Figures

List of Tables


1. Executive Summary
Nutanix Files is a software-defined, scale-out file storage solution that provides a repository
for unstructured data. This data can include home directories, user profiles, and departmental
shares. It can also include application data, application logs, backups, and archives. Flexible
and responsive to workload requirements, Files is a fully integrated, core component of the
Nutanix enterprise cloud that supports clients and servers connecting over SMB (formerly
known as CIFS) and NFS protocols. Nutanix Files offers native high availability and uses our
distributed storage fabric for intracluster data resiliency and intercluster asynchronous disaster
recovery. The distributed storage fabric also extends data efficiency techniques to Files, including
erasure coding (EC-X) and compression. Nutanix supports Files with both ESXi and AHV.
You can run Nutanix Files on a dedicated cluster or place it on a cluster running user VMs.
Unlike standalone NAS appliances, Files consolidates VM and file storage, eliminating the need
to create an infrastructure silo. Administrators can manage Files with Nutanix Prism, just like
VM services, which unifies and simplifies management. Integration with Active Directory (AD)
enables support for authentication, access-based enumeration, quotas, and the Self-Service
Restore feature. Nutanix Files also supports file server cloning, which lets you back up Files off-
site, as well as run antivirus scans and machine learning without affecting production.
A key question when comparing Nutanix Files to traditional NAS solutions is whether this solution
can provide the performance your users and applications need. We can break performance down
into workloads—sets of processes that a computer (in this case the file server) must complete in
a certain amount of time. In this paper we describe the performance capabilities of Nutanix Files
based on extensive testing with several well-known workload generator and copy tools. With this
performance data, customers can feel confident that Files provides the performance required for
many common workloads.


2. Introduction

2.1. Audience
This tech note is part of the Nutanix Solutions Library. We wrote it for storage and systems
administrators and partners who seek to understand Nutanix Files performance before deploying
the solution. Readers should already be familiar with Nutanix Files and the Nutanix architecture.
For additional documents on Nutanix Files, visit portal.nutanix.com.

2.2. Purpose
There are many different applications and client use cases for file server (NAS) solutions. In fact,
new use cases and workloads are created all the time. Nevertheless, there are some standard
workloads commonly used in customer environments. This paper details the performance
possible with Nutanix Files 3.7 on many common workloads, with results based on weeks of
detailed testing and analysis in the Performance Engineering labs at Nutanix. For our testing,
we assessed several of these common workloads for both Linux (NFS) and Windows (SMB)
environments. We ran the following performance tests:
• NFS and SMB:
⁃ Four Corners Microbenchmark performance test.
⁃ Developer software compilations.
⁃ Electronic chip design simulation.
⁃ Video data streams capture.
• NFS:
⁃ Tar and GNU Compiler Collection (GCC) to extract and compile source code data set.
⁃ rsync performance of a large (1.7 million files) data set.
• SMB:
⁃ Windows copy and paste to and from a Nutanix Files share (reads and writes).
⁃ Windows Robocopy.
⁃ Microsoft File Server Capacity Tool (FSCT).
New software releases can change performance for file server systems. Because Nutanix
Files is a software-defined file server, performance results are likely to change and improve as
new software versions and hardware platforms become available. To see these performance

2. Introduction | 6
Nutanix Files Performance

improvements in your own environment, you do not have to add hardware—simply upgrade or
deploy file servers on the latest releases. Check back for further improvements with the next
release of Files software.

Table 1: Document Version History

Version Number    Published         Notes
1.0               March 2019        Original publication.
2.0               October 2019      Updated for Nutanix Files version 3.6.
3.0               September 2020    Updated for Nutanix Files version 3.7.
3.1               October 2020      Updated the Four Corners Workload section.


3. Nutanix Enterprise Cloud Overview


Nutanix delivers a web-scale, hyperconverged infrastructure solution purpose-built for
virtualization and both containerized and private cloud environments. This solution brings the
scale, resilience, and economic benefits of web-scale architecture to the enterprise through the
Nutanix enterprise cloud platform, which combines the core HCI product families—Nutanix AOS
and Nutanix Prism management—along with other software products that automate, secure, and
back up cost-optimized infrastructure.
Available attributes of the Nutanix enterprise cloud OS stack include:
• Optimized for storage and compute resources.
• Machine learning to plan for and adapt to changing conditions automatically.
• Intrinsic security features and functions for data protection and cyberthreat defense.
• Self-healing to tolerate and adjust to component failures.
• API-based automation and rich analytics.
• Simplified one-click upgrades and software life cycle management.
• Native file services for user and application data.
• Native backup and disaster recovery solutions.
• Powerful and feature-rich virtualization.
• Flexible virtual networking for visualization, automation, and security.
• Cloud automation and life cycle management.
The Nutanix platform can be broken down into three main components: an HCI-based
distributed storage fabric, management and operational intelligence from Prism,
and AHV virtualization. Nutanix Prism furnishes one-click infrastructure management for
virtual environments running on AOS. AOS is hypervisor agnostic, supporting two third-party
hypervisors (VMware ESXi and Microsoft Hyper-V) in addition to the native Nutanix hypervisor,
AHV.


Figure 1: Nutanix Enterprise Cloud OS Stack

3.1. Nutanix HCI Architecture


Nutanix does not rely on traditional SAN or network-attached storage (NAS) or expensive storage
network interconnects. It combines highly dense storage and server compute (CPU and RAM)
into a single platform building block. Each building block delivers a unified, scale-out, shared-
nothing architecture with no single points of failure.
The Nutanix solution requires no SAN constructs, such as LUNs, RAID groups, or expensive
storage switches. All storage management is VM-centric, and I/O is optimized at the VM virtual
disk level. The software solution runs on nodes from a variety of manufacturers that are either
entirely solid-state storage with NVMe for optimal performance or a hybrid combination of SSD
and HDD storage that provides a combination of performance and additional capacity. The
storage fabric automatically tiers data across the cluster to different classes of storage devices
using intelligent data placement algorithms. For best performance, algorithms make sure the
most frequently used data is available in memory or in flash on the node local to the VM.
To learn more about Nutanix enterprise cloud software, visit the Nutanix Bible and Nutanix.com.

3.2. Nutanix Files


Nutanix Files is a scale-out solution that provides Server Message Block (SMB) and Network
File System (NFS) file services to clients. Nutanix Files server instances are composed of a set
of VMs (called FSVMs). Files requires at least three FSVMs running on three nodes to satisfy a
quorum for high availability.

Figure 2: Nutanix Files Server Instances Run as VMs for Isolation from the Distributed Storage Fabric

For more information on the Nutanix Files architecture, refer to the Nutanix Files tech note.
For a complete list of Nutanix Files prerequisites, refer to the Nutanix Files Guide.


4. Nutanix Files Hardware Resource Usage

4.1. Nutanix Files Resources


Reviewing how we allocate CPU and RAM resources between Nutanix Files and the Nutanix
AOS Controller VM (CVM) can help with sizing for performance. The simple model we
provide here can help you think through the layers of the storage stack, but it does not
comprehensively list the services within each operating system. FSVMs primarily use CPUs for
protocol work such as user connections, inodes and file handles, access-control lists (ACLs) and
permissions, metadata and caching, and network traffic. CVM CPUs interact with the underlying
hardware and process the I/O that the file server sends to its volume groups. They also provide
data path access, so workloads with significant I/O and bandwidth requirements use them more
heavily.
To get the performance required for file server workloads, the solution should be sized
properly. However, because Nutanix Files is software-defined, you can add hardware resources
including CPU, memory, nodes, shares, and so on as your performance and capacity needs
grow.
Nutanix Files can run on a dedicated cluster or share compute and storage with user VMs.
Running Files in a shared environment can be useful for light to modest workloads where
extra CPU, memory, and storage are available. Because the FSVMs share host CPU and
memory with user VMs and CVMs, a high combined workload can cause resource
contention. Provision the high-performance workloads that you move to
Files from a traditional NAS solution on a dedicated cluster. On a dedicated Nutanix Files
system, resource contention is not a concern because no user VMs are competing for CPU, so
you can allocate the maximum amount of CPU and RAM resources to the FSVM (current limits
are 12 CPU and 96 GB of memory). For optimal access speed between memory and CPU, try
not to allocate more than half the RAM on a node to a VM to avoid crossing a NUMA boundary.

Tip: For high-performance Files workloads, either use dedicated Nutanix Files
clusters or make sure your hardware is sufficient for both Nutanix Files and user
VMs.


5. Four Corners Workload


The first tests we ran used the standard Four Corners microbenchmark with the fio tool.
The four workloads, run in the following order, are:
• Random reads
• Sequential reads
• Random writes
• Sequential writes
All random results use an 8 KB block size and are reported in IOPS; all sequential results use
a 1 MB block size and are reported as throughput (MB/sec). We chose this varied-workload approach
because it gives us insight into the diverse capabilities of a storage system. Sequential workload
tests indicate the maximum throughput achievable on a system. The small random workloads
tend to be more processor intensive, as a smaller I/O size means that the storage subsystems
must process a larger number of individual operations (IOPS).
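
As a point of reference, the following command is a minimal sketch of how one corner (the 8 KB random read test) can be driven with fio from a Linux client; the mount point and job name are hypothetical, and the full settings we used appear in the appendix:

fio --name=randread-8k --directory=/mnt/files-share --rw=randread --bs=8k \
  --size=16g --direct=1 --iodepth=64 --ioengine=libaio --runtime=600 --time_based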
For the random workloads, the clients saw very low maximum latency. For the single-client
random read and write tests, the maximum latency was 1 ms. For the 24-client tests, the
maximum random read latency was 6 ms and the maximum random write latency was 14 ms.
We obtained these results in a closed lab environment with a clean network and with
clients deployed on a cluster separate from Nutanix Files. We deployed Nutanix Files on a
standalone four-node cluster with no other workloads or user VMs (UVMs) running. We used a single
distributed share for the multinode tests under a single namespace. For additional test notes,
refer to the appendix.
To quickly demonstrate the maximum performance per client and per file cluster, we tested with
1 and 24 clients. For SMB, we ran the fio workload with Windows Server 2016 clients mounting
Nutanix Files. We used Nutanix X-Ray to orchestrate the NFS performance testing. The X-Ray
software combines a powerful systems-testing tool with an intuitive user interface. X-Ray creates
and clones multiple Ubuntu Linux NFS clients with 4 vCPU and 4 GB of memory each. Each
client connects over NFS to a distributed share hosted by all Nutanix Files nodes, then runs
the Four Corners workload described earlier against its own 16 GB file. The results show that
whether using SMB or NFS, Nutanix Files can deliver the performance required.


Table 2: Four Corners Results with 24 Clients

Workload            Single Client    Four-Node Cluster (24 Clients)
Random read         28,600           166,200
Sequential read     1,500            8,400
Random write        9,100            58,380
Sequential write    900              3,000

Note: Random results are IOPS. Sequential results are MB/sec.

5.1. Linear Scalability with One, Four, Eight, and Twelve Nodes
The following results (which we have rounded) demonstrate the
clear linear performance scalability of the Nutanix Files solution. We tested performance with
one, four, eight, and twelve file server nodes in a hybrid system with multiple clients (2, 24, 48,
and 72, respectively). For the single-node test, we used one general share to ensure that the
client would act against only one node. For the rest of the tests, we used a distributed share to
spread the workload across all nodes in the cluster. The results demonstrate that as you need
additional performance and capacity, you can easily expand Nutanix Files.

Table 3: Nutanix Files Linear Performance Scalability

Nodes    Random Reads    Random Writes    Sequential Reads    Sequential Writes
1        37,000          17,600           2,200               1,200
4        164,500         48,500           9,300               3,050
8        312,000         92,000           16,800              5,800
12       479,000         136,500          20,500              8,500

Note: Random results are IOPS. Sequential results are MB/sec.


6. Nutanix Files NFS Performance

6.1. NFS Workload: Tar and GCC


NAS workloads typically perform many operation types—not just read and write.
Microbenchmarks like fio are great at stressing the I/O or block subsystems. However, they are
not as adept at heavily utilizing the protocol and filesystem components, which can involve such
activities as reading directories, enumerating and setting user permissions and access, and
deleting files and directories. We used tar and GCC to simulate the kind of intensive operations
found in semiconductor chip design and software compilation workflows. We set up a distributed
share and ran a heavy, file-based workload with 25 and 40 users to illustrate how the system
scales under load. Each user performed operations against approximately 100,000 files with the
following workflow:
1. Tar extract a large file into the share containing source code.
2. Run GCC to compile the source code.
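
A minimal sketch of one user's loop follows, assuming the distributed share is mounted at /mnt/dist-share and each user works in a private directory; the archive name and paths are hypothetical:

cd /mnt/dist-share/user01
tar -xf /tmp/source-tree.tar    # extract roughly 100,000 source files onto the share
cd source-tree
./configure && make             # compile the tree with GCC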
This complex set of file operations stressed the Nutanix Files protocol and metadata stacks and
the storage subsystem provided by AOS. The following results measure the time required to
complete the jobs running simultaneously across all users. The distributed share is highly
optimized for multiple clients running multiple streams of work. In order to get the most
performance for workloads with high file counts, ensure that the FSVMs are configured with
enough CPU and memory. Check the VM performance graphs in Prism to monitor CPU and RAM
usage, and use the FSVM configuration menu to scale up CPU and memory when needed.

Table 4: NFS Workload: 25 Users

Tar extraction 6 minutes, 48 seconds


GCC compilation 37 minutes, 40 seconds

Table 5: NFS Workload: 40 Users

Tar extraction 8 minutes, 3 seconds


GCC compilation 45 minutes, 33 seconds


6.2. NFS Workload: rsync Performance


rsync is a Linux tool that can synchronize two or more copies of a file or group of files and
directories between computers or file systems. We used rsync to copy a data set with 1.7 million
total files spread across 19,000 total directories and subdirectories from a client to a file server.
This workload models the backup or migration of a data set with a large file count to Nutanix
Files. The statistics for this scenario follow:
• Number of files: 1,737,759 (regular: 1,718,244; directories: 19,515)
• Number of created files: 1,737,759 (regular: 1,718,244; directories: 19,515)
• Number of deleted files: 0
• Number of regular files transferred: 1,718,244
• Total file size: 13,784,355,840 bytes
• Total transferred file size: 13,784,355,840 bytes
• Total time: 122 minutes, 29.064 seconds
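
The following sketch shows the kind of rsync invocation that models this scenario and prints statistics like those above; the source path and mount point are hypothetical:

rsync -a --stats /data/fileset/ /mnt/files-share/fileset/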


7. Application Workloads

7.1. Share Type: Random and Sequential


Nutanix Files offers multiple share types to accommodate different workloads. The default
share type provides a good balance between small and large operations and should be optimal
for mixed workloads. In Files 3.6.1 we implemented a new feature to allow for performance
optimization of a share. If the you know the characteristics of a workload, setting the type for the
share to either random or sequential can increase the max performance of that share. Setting the
share type to random has a positive impact on random writes. Similarly, setting the share type
to sequential can increase the amount of sequential throughput possible on the share. Note that
there are trade-offs for these benefits. The random setting causes sequential performance to be
lower, and the sequential setting lowers random performance.
It is important to understand the workload on the share before changing the share type. To
assess your workload, use the following command to obtain information about I/O size on the
share:
afs share.io_size_distribution


Figure 3: Output of Share I/O Size Distribution Command

The output in the preceding figure shows a histogram of the count of operation type at each block
size ranging from 512 bytes to 1,048,576 bytes. In this example, the majority of both read
and write operations on the share dist1 are 512 bytes and 8 KB, so setting the share type to
random should provide increased throughput capability on this share. The random share type
improves performance when overwriting files with small I/O operations (16 KB or smaller). If the
majority of your workload showed 1 MB operation sizes (for example, writing out large files, data
migrations, backups, and so on), then setting the share type to sequential would increase your
performance.
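
After confirming the I/O profile, you can change the share optimization from an FSVM with the afs CLI. The following sketch assumes the parameter is named block_size_optimization, which may differ between Files releases; verify the exact syntax with the afs CLI help for share.edit on your deployment:

afs share.edit dist1 block_size_optimization=random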

7.2. Random Share Type (Example Workload: VDI)


Setting a share to the random type decreases the filesystem block size, which is beneficial for
operations such as small random overwrites into larger files. An example of this type of workload
is OS operations on a virtual disk in a VDI environment.


The chart in the following figure illustrates performance test results that show the potential
benefits of setting a share to the random share type when most operations are 16 KB or smaller.

Figure 4: Random Share Type Comparison

7.3. Sequential Share Type (Example Workload: Large File Ingestion)


Setting the share to sequential increases the filesystem block size, which can improve
performance of large sequential workloads including large file copies, log files, and backups. The
chart in the following figure shows a single Windows client writing a large file sequentially, which
simulates a backup or large file copy from a Windows server to a file server share.


Figure 5: Sequential Share Type Comparison

7.4. Single-Client Workloads


One common performance issue with any storage solution is a limitation around a single client
connection. In many cases one client can only connect to a single node of your storage, even
though the storage may contain multiple cluster nodes. Your application may require a single
client to consume more throughput than that single connection can sustain.
The distributed share technology in Nutanix Files is a perfect solution for this issue. The
distributed share type makes the resources of multiple file server nodes available to a single NFS
or SMB namespace. When the files being accessed are contained in multiple directories, Nutanix
Files automatically shards the NAS traffic across multiple nodes. This sharding can dramatically
increase the amount of throughput available to a single client. Beginning in Files 3.7 we have
implemented a feature that allows the administrator to set the shard point anywhere within the
directory tree. If your application performs I/O operations on files in multiple directories, the I/O is
now automatically spread across multiple cluster nodes.
As an example, we mounted an NFSv4 general share to a CentOS 7 client and ran a 1 MB
sequential write workload with fio. Our file server was composed of four nodes. We then created
a distributed share with four directories and ran another fio workload with four jobs, each writing
to one of the four directories. The following figure shows the results.

Figure 6: Distributed Share Type Comparison
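
A sketch of the job file for the distributed-share run follows; the mount point and directory names are hypothetical, and each job writes 1 MB blocks sequentially into its own sharded directory:

[global]
rw=write
bs=1m
size=16g
ioengine=libaio
runtime=600
time_based

[dir1]
directory=/mnt/dist-share/dir1

[dir2]
directory=/mnt/dist-share/dir2

[dir3]
directory=/mnt/dist-share/dir3

[dir4]
directory=/mnt/dist-share/dir4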

One common use case is a video streaming workload, primarily made up of sequential write
I/O, with multiple I/O streams from many sources (cameras). When the files are contained in
different directories, a single server can theoretically write up to 2,420 MBps. Assuming 6.8 Mbps
per camera (1080P HD MJPEG at 5 frames per second, using the StarDot Technologies video
bandwidth calculator), the distributed share can support approximately 2,800 cameras. As this
data shows, the distributed share adds a lot of versatility and can enable Files to scale to your
requirements.

7.5. NFS and SMB Workloads: Software Compilations


We tested a simulated developer environment where multiple clients perform software
compilations simultaneously. This test simulates the workload performed in a production
environment by utilities such as Linux make, which automatically builds executable programs and
libraries from source code. The software compilation workload primarily performs statistics and
other metadata operations, so it is metadata heavy. The following table shows the breakdown of
specific operation types in this workload.


Table 6: Software Compilation Operation Types

Operation Percentage
read file 6
write file 7
access 6
chmod 5
create 1
stat 70
mkdir 1
readdir 2
unlink 2

We separately tested this workload with one group of 16 Linux NFS clients and one group of 36
Windows SMB clients connected to a data set located on a single Nutanix file server cluster. To
be considered successful, the average storage latency of the compilation workload had to stay
below 10 ms. The following table details the peak workloads achieved.

Table 7: Compilations per Nutanix Node, Four-Node Cluster

Client Type    Compilations    Achieved IOPS    Average Latency (ms)
NFS            110             54,124           6.5
SMB            101             49,185           6.5

7.6. NFS and SMB Workloads: Electronic Chip Design Simulation


We simulated a chip design environment where multiple clients perform sets of operations,
modeling the software suites that work on everything from specifications to fabrications. This
workload is heavy on compute operations against millions of small files and a subset of large
files. The following table shows the breakdown of specific operation types in this workload; it is
composed predominantly of metadata operations.


Table 8: Chip Design Simulation Operation Types: Workload 1

Operation     Percentage
read file     7
write file    10
access        15
chmod         1
create        2
stat          39
mkdir         1
unlink 1      1
unlink 2      1
rand read     8
rand write    15

Table 9: Chip Design Simulation Operation Types: Workload 2

Operation Percentage
read 50
write 50

In this case, one group of 24 Linux NFS clients and one group of 32 Windows SMB clients
connected to a data set contained on a single Nutanix file server cluster. To be considered
successful, the average storage latency of the simulation workload had to stay below 10 ms.
We tested both NFS and SMB, shown in the following table with the number of simulations
each achieved.


Table 10: Chip Design Simulations, Four-Node Cluster

Client Type    Chip Design Simulations    Achieved IOPS    Average Latency (ms)
NFS            126                        56,387           3.4
SMB            140                        61,307           6.1

7.7. NFS and SMB Workloads: Video Data Streams Capture


We tested a video camera repository environment where multiple clients streamed video data
files simultaneously. This workload simulates hundreds of cameras simultaneously writing
out data, with the system then performing a random read checksum of the data sets. The
following tables show the breakdown of specific operation types in this workload; workload 1
consists entirely of writes, while workload 2 is dominated by random reads.

Table 11: Video Data Streams Capture Operation Types: Workload 1

Operation Percentage
write 100

Table 12: Video Data Streams Capture Operation Types: Workload 2

Operation     Percentage
read          5
rmw           2
readdir       3
rand read     84
create        1
stat          2
unlink 1      1
access        2


In this case, one group of 8 NFS clients and one group of 32 SMB clients connected to a data set
contained on a single Nutanix file server cluster. To be considered successful, the system had to
achieve the target write throughput level. We tested both NFS and SMB, shown in the following
table with the number of streams each achieved.

Table 13: Video Data Streams, Four-Node Cluster

Client Type Video Data Streams Write Throughput (KBps)


NFS 445 1,873,800
SMB 375 1,576,600


8. Nutanix Files SMB Performance

8.1. SMB Workload: Drag and Drop and Robocopy Performance


The first test many users try with a file server is to copy and paste or drag and drop some data
from a Windows client to a file share, though this process is not an ideal test for a file server, as it
tends to be limited by the client. More sophisticated copy tools on Windows, like Robocopy, allow
for some control of the application, including using multiple threads to increase copy speed.
Nutanix tests both copy methods regularly to measure our performance. With Nutanix Files
3.7 we saw some great gains in performance compared to prior releases that specifically
benefit Windows copy speeds to and from file shares. The charts in the following figures show
the results we achieved recently when copying a set of large files.

Figure 7: Peak Copy and Paste Speed from Single Windows Client to Nutanix Files Share


Figure 8: Copy and Paste Speed

We also tested Robocopy with large files, achieving speeds similar to the results for copy and
paste. However, sets of small files tend to be slower, as larger data sets require thousands to
millions of metadata operations. These operations include find, create or open, close, and so
on. Combined with reads and writes of the actual data, a large number of metadata operations
can require significant time.
Besides dataset characteristics that limit speed, Robocopy itself offers several options that can
affect performance. Even the default Robocopy behavior of logging every copied file and folder
to the console can reduce throughput.
The best Robocopy options for your environment depend on your particular business
requirements. For testing, we used some Robocopy flags to optimize performance. We achieved
around 80 MB/sec for a data set of 50 GB of text and images (181,250 files, 57,493 folders). The
following command provides an example of the Robocopy flags we used:
robocopy $sourceDir $targetDir /COPY:DATSO /S /E /DCOPY:T /mt:64 /NFL /NDL /NP /LOG:"C:\log\small-robocopy-$shortdate.log"
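
To make the example runnable, the following PowerShell sketch defines the variables the command references; the paths and share name are hypothetical:

$sourceDir = "C:\data\small-files"
$targetDir = "\\files-server\share\small-files"
$shortdate = Get-Date -Format "yyyyMMdd"
robocopy $sourceDir $targetDir /COPY:DATSO /S /E /DCOPY:T /mt:64 /NFL /NDL /NP /LOG:"C:\log\small-robocopy-$shortdate.log"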


Note: Robocopy settings are independent of file server options. Decisions regarding
these settings or flags are not related to the file server solution you are using.
It’s important to research and understand the Robocopy flags you’re using to ensure
data validity for data migrations or backups.

8.2. Windows Home Directory Scalability


Nutanix Files works well as a storage solution for Windows home directories and user profiles.
As each user connects to a folder or share served by an FSVM, they create an SMB connection.
We regularly stress test Nutanix Files with a Microsoft File Server Capacity Tool (FSCT) workload
to ensure sustainable, consistent home directory performance. The chart in the following figure
breaks down the workload mix of FSCT; note that it is heavy on small metadata operations, with
thousands of connected users.

Figure 9: FSCT SMB Operation Mix

With Files 3.7, we can support 13,800 simultaneous FSCT users with a single four-node Files
cluster.
Based on memory analysis and capacity testing, we have defined the following configuration
limits in software tied to FSVM hardware. We have stress-tested these limits with Windows VDI
clients to ensure that they work well for general single user to single share connections. Note that
the following table presents user count numbers per node; we expect these numbers to scale
with Files cluster node count.


Table 14: Connection Count System Limits

vCPU Count    Memory (GB) per Nutanix Files Node    User Count
12            96                                    4,000
8             64                                    3,250
8             40                                    2,750
6             32                                    2,000
6             24                                    1,500
4             16                                    1,000
4             12                                    500

It is important to note that the connection count system limits listed in the preceding table do
not apply to terminal servers. A terminal server hosts sessions for many users on one system,
so those users access the network through the server rather than from individual client
machines. Two common terminal server solutions are Windows Terminal Servers and VMware
Horizon View. Each of these solutions allows multiple users to access Nutanix Files through a
single connection from one server. Given the density of users on one connection, this use case
has higher CPU processing
and memory requirements. We suggest configuring no more than 20 users per Windows Terminal
Server or VMware Horizon View server connecting to a Nutanix Files share. There are no Files
system hardware or software limits that enforce this recommendation; it is up to the architects
to design with this guidance in mind to achieve balanced density and consistent performance.
Terminal servers work well with Nutanix Files when properly designed.


9. Conclusion
Nutanix Files is a robust, software-defined file sharing solution that deploys like an app on top
of the Nutanix enterprise cloud. The workload testing detailed here shows that the solution can
perform well for a wide range of deployments and applications. Contact Nutanix to find out how
you can replace aging NAS appliance silos with a modern Files solution. For any questions or
updates, please contact the Nutanix Files team or use the Nutanix NEXT Community forums.


Appendix

Best Practices Checklist


• When testing performance, use one dedicated cluster for clients and one for Files and the
CVM only.
• When possible, use balance-slb for active-active load balancing of traffic with AHV. If your
switch is configured for Link Aggregation Control Protocol (LACP), use balance-tcp.
• For heavy I/O (random or sequential) environments, increase the CPU count for CVMs.
• NUMA tune the Files VM to a particular socket and NUMA node by pinning the Files VMs to
the opposite socket of the CVM for dedicated Files clusters.
• Mount Files shares with asynchronous options from the NFS client if possible, unless the
application specifically requires otherwise. With a synchronous mount, all write operations are
synchronous, which can incur a write performance cost (see the example mount commands
after this checklist).
• Increase FSVM memory to increase cache for larger data sets.
• For higher file count environments, dedicate more CPU and RAM to the FSVMs.
• Do not configure more than 20 users per Windows Terminal Server connecting to a Nutanix
Files share.
• Use the Nutanix Sizer and the Nutanix Files Sizing Guide when designing a Files solution for
performance.
• For dedicated Files deployments, we recommend assigning all CPU and memory resources
to the CVM and Files. Devoting one NUMA node’s CPU and RAM to the FSVM and the other
NUMA node to the CVM ensures that all hardware resources can be fully utilized.
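
As referenced in the checklist above, the following commands sketch the difference between the default asynchronous NFS client behavior and a forced synchronous mount; the server and share names are hypothetical:

# Default Linux NFS client mounts are asynchronous:
mount -t nfs -o vers=4.1 files-server:/share /mnt/share
# Adding the sync option forces synchronous writes, which can lower write performance:
mount -t nfs -o vers=4.1,sync files-server:/share /mnt/share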

Additional Test Configuration Notes


Test Environment
We used the following Nutanix hardware and software to host the Nutanix Files file servers and
data for testing.
• Nutanix NX-8055-G6 cluster (4 nodes):
⁃ All-flash storage (Samsung 960 GB or 1.92 TB drives)
⁃ Dual 20-core Intel® Xeon® Gold 6148 CPU @ 2.40 GHz per node
⁃ Dual 25 GbE Mellanox (CX-4) cards
⁃ CVM: 12 CPU, 32 GB of memory per node
⁃ FSVM: 12 CPU, 64 GB of memory per node
• Nutanix NX-8055-G6 hybrid cluster (4 nodes and 16 nodes):
⁃ 4 x 3.98 TB SSDs and 8 x 10 TB HDDs per node
⁃ Dual 14-core Intel® Xeon® Gold 5120 CPU @ 2.20 GHz per node
⁃ Dual 25 GbE Mellanox (CX-4) cards
⁃ CVM: 18 CPU, 32 GB of memory per node
⁃ FSVM: 12 CPU, 64 GB of memory per node
• Client hosting details:
⁃ Each NFS client VM had 4 vCPU and 4 GB of RAM for a virtual-to-physical core ratio of
1.5:1.
⁃ Windows Server 2016 Build 14393 client for SMB testing.
⁃ Each SMB client VM had 4 vCPU and 4 GB of RAM for a virtual-to-physical core ratio of
4.5:1.
⁃ Nutanix AHV version el7.nutanix.20190916.189
⁃ Nutanix AOS version 5.17.0.2
⁃ Nutanix Files version 3.7 or 3.7.1 (Four Corners)
⁃ X-Ray version 3.7
Unless otherwise stated, we ran all tests against file shares that were evenly balanced across
all four nodes in the cluster. For single-client tests, we ran the workload against a single
node and file share. To maximize performance, we set up the cluster with an active-active
networking configuration. To ensure that we used both physical NICs, we used balance-tcp for
load balancing in AHV. (For more information about AHV network load balancing, see the AHV
Networking best practices guide.) By providing solid hardware and networking, as well as
separate client clusters, we ensured that the clients and network were not bottlenecks. This
approach allowed for consistent performance testing of the Nutanix Files cluster, engaging all
nodes and shares.
We simulated customer workloads to the best of our ability, but real-world workloads vary
across applications or customer environments. Additionally, we obtained these results in a
closed lab setting on an isolated system running no other competing workloads. Because
performance varies based on platform CPU count, CPU speed, physical storage type (SSD or
HDD), data size, and other factors, results obtained in other environments may vary. However,
you can improve your results for Nutanix Files in any situation by following some of the practices
discussed in this paper.

Note: We performed all tests with standard Nutanix space-saving and data
protection features enabled, including Files share-level compression and fault
tolerance 1 (FT 1).

fio Four Corners Settings


We used a standard set of configuration values for fio whether running it on Linux or
on Windows. Each test run was 10 minutes. We used the default mount options. We
ran random write tests with the fio direct option set to 0. For the write multiclient tests, each
client performed its I/O against a 16 GB file. For the multiclient read tests, we kept the working
set size constant, for a total of 384 GB across all clients. We ran all random read tests
with direct=1 to bypass the client read cache. We initially ran these workloads with varying
amounts of concurrency (outstanding I/O or iodepth) to push the system as close to its maximum
capabilities as possible.
• For X-Ray testing, we used an iodepth of 64 per client for all tests except random write, where
we used an iodepth of 32.
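
Putting those settings together, the following job file is a sketch consistent with the Linux client configuration described above; the mount point is hypothetical, and Windows clients would use the windowsaio engine instead of libaio:

[global]
directory=/mnt/files-share
size=16g
runtime=600
time_based
ioengine=libaio
group_reporting

[randread-8k]
stonewall
rw=randread
bs=8k
direct=1
iodepth=64

[seqread-1m]
stonewall
rw=read
bs=1m
iodepth=64

[randwrite-8k]
stonewall
rw=randwrite
bs=8k
direct=0
iodepth=32

[seqwrite-1m]
stonewall
rw=write
bs=1m
iodepth=64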

About the Authors


Will Strickland is a Staff Performance Engineer at Nutanix. In this role, he works to improve
customer experience by measuring and improving file server performance and sharing the results. Follow
Will on Twitter @ObserverWill.
Dan Chilton is a Staff Performance Engineer at Nutanix. In this role, he partners with the
development team to help drive and measure performance improvements. Follow Dan on Twitter
@dan__chilton.

About Nutanix
Nutanix makes infrastructure invisible, elevating IT to focus on the applications and services that
power their business. The Nutanix enterprise cloud software leverages web-scale engineering
and consumer-grade design to natively converge compute, virtualization, and storage into
a resilient, software-defined solution with rich machine intelligence. The result is predictable
performance, cloud-like infrastructure consumption, robust security, and seamless application
mobility for a broad range of enterprise applications. Learn more at www.nutanix.com or follow us
on Twitter @nutanix.


List of Figures

Figure 1: Nutanix Enterprise Cloud OS Stack
Figure 2: Nutanix Files Server Instances Run as VMs for Isolation from the Distributed Storage Fabric
Figure 3: Output of Share I/O Size Distribution Command
Figure 4: Random Share Type Comparison
Figure 5: Sequential Share Type Comparison
Figure 6: Distributed Share Type Comparison
Figure 7: Peak Copy and Paste Speed from Single Windows Client to Nutanix Files Share
Figure 8: Copy and Paste Speed
Figure 9: FSCT SMB Operation Mix


List of Tables

Table 1: Document Version History
Table 2: Four Corners Results with 24 Clients
Table 3: Nutanix Files Linear Performance Scalability
Table 4: NFS Workload: 25 Users
Table 5: NFS Workload: 40 Users
Table 6: Software Compilation Operation Types
Table 7: Compilations per Nutanix Node, Four-Node Cluster
Table 8: Chip Design Simulation Operation Types: Workload 1
Table 9: Chip Design Simulation Operation Types: Workload 2
Table 10: Chip Design Simulations, Four-Node Cluster
Table 11: Video Data Streams Capture Operation Types: Workload 1
Table 12: Video Data Streams Capture Operation Types: Workload 2
Table 13: Video Data Streams, Four-Node Cluster
Table 14: Connection Count System Limits
