Chin Fang <fangchin[at]zettar.com>, Palo Alto, California, U.S.A
Programming is Gardening, not Engineering
This directory contains two bash scripts:
- trsync.sh is a wrapper script for rsync, rsyncd, and parsyncfp. Its main purpose is to enable automated testing of these three tools as easily as possible. Once it's installed, type trsync.sh -h for more info.
- harness.sh is a wrapper script for trsync.sh. Once it's installed, type harness.sh -h for more info.
Recommended installation directory: /usr/local/sbin.
The scripts assume that:
- rsync is installed on both the source and target systems.
- rsyncd (aka the rsync daemon) is installed and properly set up on the desired target system.
- parsyncfp is set up according to the tool's instructions. Should you have any questions about this tool, please open a new issue at the tool's GitHub repo.
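Before a first run, it may help to confirm that the tools are actually reachable from the shell. The following is a minimal sketch, not part of the two scripts: the function name missing_tools is hypothetical, and it assumes the executables are on PATH under the names rsync and parsyncfp.

```bash
# missing_tools TOOL...
# Prints the name of each tool that cannot be found on PATH, one per line.
# Prints nothing when every tool is found.
# (Hypothetical helper for illustration; not part of trsync.sh or harness.sh.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

For example, `missing_tools rsync parsyncfp trsync.sh harness.sh` run on the source system should print nothing if everything is in place. This only checks local executables. The rsync daemon on the target has to be checked separately, for instance by asking it to list its modules with `rsync rsync://192.168.15.20/` (substitute your own target address).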
The two bash scripts have been written to be almost as readable as
English. Both are implemented with a dry-run mode to facilitate the
learning of proper use and possible extension.
[root@zh0 ~]# harness.sh -t wan -i 192.168.15.20 -S /var/local/zettar/zx/src_data -D /var/local/zettar/zx/dst_data -n
trsync.sh -i 192.168.15.20 -S /var/local/zettar/zx/src_data -D /var/local/zettar/zx/dst_data -r s | tee /var/tmp/rsync_wan_losf.out
trsync.sh -i 192.168.15.20 -S /var/local/zettar/zx/src_data -D /var/local/zettar/zx/dst_data -r m | tee /var/tmp/rsync_wan_medium.out
trsync.sh -i 192.168.15.20 -S /var/local/zettar/zx/src_data -D /var/local/zettar/zx/dst_data -r l | tee /var/tmp/rsync_wan_large.out
==> TOTAL test time (with cleaning): 0 second
[root@zh0 ~]# harness.sh -t lan -i 192.168.15.20 -S /var/local/zettar/zx/src_data -D /var/local/zettar/zx/dst_data -n -d
trsync.sh -i 192.168.15.20 -S /var/local/zettar/zx/src_data -D /var/local/zettar/zx/dst_data -d -r s | tee /var/tmp/rsync_lan_losf.out
trsync.sh -i 192.168.15.20 -S /var/local/zettar/zx/src_data -D /var/local/zettar/zx/dst_data -d -r m | tee /var/tmp/rsync_lan_medium.out
trsync.sh -i 192.168.15.20 -S /var/local/zettar/zx/src_data -D /var/local/zettar/zx/dst_data -d -r l | tee /var/tmp/rsync_lan_large.out
==> TOTAL test time (with cleaning): 0 second
[root@nersc-tbn-7 src]# harness.sh -t wan -p 4023 -i 10.3.33.1 -S /data/zettar/zx/src -D /data/zettar/zx/dst -P -I eth200.4001 -n
6 rsync instances
trsync.sh -i 10.3.33.1 -S /data/zettar/zx/src -D /data/zettar/zx/dst -P -I eth200.4001 -r s -N 6 -a 0 | tee /var/tmp/parsyncfp_wan_losf.06
8 rsync instances
trsync.sh -i 10.3.33.1 -S /data/zettar/zx/src -D /data/zettar/zx/dst -P -I eth200.4001 -r s -N 8 -a 0 | tee /var/tmp/parsyncfp_wan_losf.08
10 rsync instances
trsync.sh -i 10.3.33.1 -S /data/zettar/zx/src -D /data/zettar/zx/dst -P -I eth200.4001 -r s -N 10 -a 0 | tee /var/tmp/parsyncfp_wan_losf.010
12 rsync instances
trsync.sh -i 10.3.33.1 -S /data/zettar/zx/src -D /data/zettar/zx/dst -P -I eth200.4001 -r s -N 12 -a 0 | tee /var/tmp/parsyncfp_wan_losf.012
6 rsync instances
trsync.sh -i 10.3.33.1 -S /data/zettar/zx/src -D /data/zettar/zx/dst -P -I eth200.4001 -r s -N 6 -a 1 | tee /var/tmp/parsyncfp_wan_losf.16
8 rsync instances
trsync.sh -i 10.3.33.1 -S /data/zettar/zx/src -D /data/zettar/zx/dst -P -I eth200.4001 -r s -N 8 -a 1 | tee /var/tmp/parsyncfp_wan_losf.18
10 rsync instances
trsync.sh -i 10.3.33.1 -S /data/zettar/zx/src -D /data/zettar/zx/dst -P -I eth200.4001 -r s -N 10 -a 1 | tee /var/tmp/parsyncfp_wan_losf.110
12 rsync instances
trsync.sh -i 10.3.33.1 -S /data/zettar/zx/src -D /data/zettar/zx/dst -P -I eth200.4001 -r s -N 12 -a 1 | tee /var/tmp/parsyncfp_wan_losf.112
[...]
==> TOTAL test time (with cleaning): 1 second
The following is excerpted from a U.S. DOE Technical Report, "When to use rsync".
Rsync was co-created by Andrew Tridgell and Paul Mackerras in the early
1990s, partially for Andrew’s Ph.D. dissertation and partially for
backing up his wife’s system. From such a modest
beginning, it has gained worldwide popularity and is bundled with all
major Linux distributions. It is often regarded as the go-to data
mover tool by many enterprise IT professionals. Despite its
popularity, the tool’s various problems in dealing with numerous small
files, very large files, and large network latencies are frequently
encountered by users, as a casual search would reveal. There are also
some attempts to address these shortcomings via the approach of
aggregation, typically using a wrapper to run multiple rsync
instances at the same time. A good example is parsyncfp.
Nevertheless, there have been no systematic investigations about
rsync’s proper range of operation. The rsync man page only states
"rsync - a fast, versatile, remote (and local) file-copying
tool", but neither "fast" nor "remote" is precisely defined. The
effectiveness of using the aggregation approach via a wrapper is also
not clear. This report intends to address the lack of knowledge in
this regard. Specifically, the report intends to answer this
question: "in today’s data-centric world, when to use rsync" (with or
without using a wrapper)? The answer is based on the results obtained
from a series of automated tests transferring files using two
testbeds. Both LAN and WAN transfer results are analyzed to show the
proper range of use for rsync-based tools.
The major goal is to inform and educate the public, especially
enterprise IT professionals. rsync is still a good tool, but it is
not a tool for moving hyperscale datasets over a network with round
trip time (RTT) >= 10ms, where the term "hyperscale dataset" is
defined as a dataset that has >= 1M files, or
overall size >= 1TB, or both.
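The definition above can be turned into a quick check of a dataset directory. This is only a sketch under stated assumptions: it reads 1M as 1,000,000 files and 1TB as 10^12 bytes (decimal units), it approximates the overall size by the disk usage that du reports, and the function names verdict and classify_dir are hypothetical. The RTT part of the criterion is not covered and has to be measured separately, e.g. with ping.

```bash
# Thresholds taken from the definition above.
# Assumption: decimal units (1M = 10^6 files, 1TB = 10^12 bytes).
FILE_THRESHOLD=1000000
BYTE_THRESHOLD=1000000000000

# verdict FILE_COUNT TOTAL_BYTES
# Prints "hyperscale" when either threshold is met, else "not hyperscale".
verdict() {
    if [ "$1" -ge "$FILE_THRESHOLD" ] || [ "$2" -ge "$BYTE_THRESHOLD" ]; then
        echo "hyperscale"
    else
        echo "not hyperscale"
    fi
}

# classify_dir DIR
# Counts the regular files under DIR, takes the disk usage of DIR in KiB
# from du, and passes both numbers to verdict.
classify_dir() {
    files=$(( $(find "$1" -type f | wc -l) ))
    kib=$(du -sk "$1" | awk '{print $1}')
    verdict "$files" "$(( kib * 1024 ))"
}
```

Running `classify_dir /var/local/zettar/zx/src_data` would then print one of the two verdicts for that source directory. Note that walking a directory with millions of files takes a while by itself.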
The author has witnessed firsthand at various data-intensive
enterprises, e.g. large biopharmaceutical businesses, that some IT
people still believe that rsync can be used for file system data
migration involving multiple PBs of data! This kind of
misconception should be purged. We all live once - our lives are
too precious to waste!
The aforementioned DOE Technical Report also has suggestions for alternatives.
rsync-based tools are really not meant for high-performance data
movement at scale and speed. Although the investigation carried out for
the aforementioned DOE Technical Report is as rigorous and extensive
as possible, the long running time of most of the transfer tests
makes continued use of the employed test environments impractical -
they both have other important projects waiting.
Thus, the two tester scripts are made freely available. Interested parties can use them in their respective environments to obtain their own results and draw concrete, number-based conclusions rather than rely on hearsay.
- Chin Fang, R. "Les" A. Cottrell, "Data Movement Categories" - knowing how to map a data movement task into one of the four categories helps select proper data mover tools. The prevalent problem of misusing data movers, which is one of the fundamental causes of regrettably poor data utilization, can be greatly mitigated with such knowledge.
- Chin Fang, "High-Performance Data Movement Services - DTNaaS", Rice 2021 Oil & Gas High-Performance Conference Technical Program Lightning talk, March 5, 2021 - any enterprise IT professional who moves data frequently should first gain a good understanding of how to benchmark a file storage service, then of computing and networking (including network security basics). This lightning talk provides some basics and best practices. Warning! Anyone who treats moving data at scale and/or speed as a software-only or network-only task will be running a fool's errand.
- Sven Breuner, "elbencho, A distributed storage benchmark for file systems and block devices with support for GPUs" - the best, free, and easiest to use storage benchmark!
- Chin Fang, "elbencho storage sweep tools" - an easy-button approach to gaining insight into a complex file storage service.
- Sven Breuner, Chin Fang, "elbencho - A new Storage Benchmark for AI et al", PPoPP'21 Workshop: "Benchmarking in the Data Center" - a short and sweet presentation about storage benchmarking with elbencho.
- Ezra Kissel, Chin Fang, "Zettar zx Evaluation for ESnet DTNs" - Figs. 1 and 5 show what a modern data mover can do for hyperscale data, even at PB level, over long-distance high-bandwidth networks. Note that the same environment used for this rsync investigation is employed. All ESnet data mover evaluations are done with fairness and rigor in mind. Fully automated too.
- Ezra Kissel, "100G DTN Experiment: Testing Technologies for Next-Generation File Transfer" - a short and sweet introduction to the "Zettar zx Evaluation for ESnet DTNs" report above. It presents some takeaways that dispel a few common misconceptions about moving data at scale and speed.
