Benchmarks

The performance of an external text file sort depends on multiple factors - the number of lines, the length of the lines, and the available computational resources - CPU, memory and type of disk. Lines are variable-length, so they cannot be treated as fixed-size records: when a line has to be moved to another location in the file we cannot simply swap it with another line, but have to shift large amounts of data. Another limiting factor is that the external sort contains at least two sequential parts - splitting the work into concurrent tasks and merging the results. It is not obvious that the concurrent part is of any benefit, since in the end all the data has to be merged into the result file anyway.
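To make the split/merge structure concrete, below is a minimal external merge sort sketch in Python. It illustrates the general technique only and is not the implementation benchmarked here; the chunk size and temporary-file handling are arbitrary choices for the example.

```python
import heapq
import os
import tempfile


def external_sort(input_path, output_path, chunk_bytes=100 * 1024 * 1024):
    """Sort a text file line by line without holding it all in memory."""
    run_paths = []

    # Split phase: read the input in roughly fixed-size chunks, sort each
    # chunk in memory, and write it out as a sorted intermediate "run".
    with open(input_path, "r") as src:
        while True:
            chunk = src.readlines(chunk_bytes)
            if not chunk:
                break
            chunk.sort()
            fd, run_path = tempfile.mkstemp(suffix=".run")
            with os.fdopen(fd, "w") as run:
                run.writelines(chunk)
            run_paths.append(run_path)

    # Merge phase: stream all runs through a k-way merge into the result.
    # This pass is sequential and rereads/rewrites every line, which is
    # why adding CPU tasks cannot speed the sort up proportionally.
    runs = [open(p, "r") for p in run_paths]
    try:
        with open(output_path, "w") as dst:
            dst.writelines(heapq.merge(*runs))
    finally:
        for f in runs:
            f.close()
        for p in run_paths:
            os.remove(p)
```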

Design

The goal of this benchmark is to verify that sort duration grows without significant anomalies as the file size increases, and that adding CPU resources improves performance. The basic unit of work is a text line of approximately 50 bytes, so a file of 1 million lines is about 50 MB. We run the sort on files of increasing size and evaluate both the growth in sort duration and the scalability with additional CPU tasks.
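Input files of this shape can be produced with a small generator like the sketch below. This is a hypothetical helper, not part of the project or its tooling; the alphabet and the exact line length are assumptions chosen to match the ~50 bytes per line described above.

```python
import random
import string


def generate_input(path, lines, line_len=50):
    """Write `lines` random text lines of roughly `line_len` bytes each."""
    alphabet = string.ascii_letters + string.digits
    with open(path, "w") as out:
        for _ in range(lines):
            # line_len - 1 characters plus the newline = line_len bytes
            out.write("".join(random.choices(alphabet, k=line_len - 1)))
            out.write("\n")


# Example: 1 million lines of ~50 bytes, i.e. a file of about 50 MB.
generate_input("input-1m.txt", 1_000_000)
```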

Analysis

The benchmark was run on a MacBook Pro, without reaching memory or CPU limits. See system details below.

Software

System Software Overview:

  System Version: macOS 12.2.1 (21D62)
  Kernel Version: Darwin 21.3.0
  Boot Volume: Macintosh HD
  Boot Mode: Normal
  Computer Name: *****
  User Name: *****
  Secure Virtual Memory: Enabled
  System Integrity Protection: Enabled
  Time since boot: 55 days 21:54

Hardware

Hardware Overview:

  Model Name: MacBook Pro
  Model Identifier: MacBookPro18,2
  Chip: Apple M1 Max
  Total Number of Cores: 10 (8 performance and 2 efficiency)
  Memory: 32 GB      

Performance

From the charts below we can see that when the input is read in large chunks - in our case 100 MB for files ranging from 50 MB to 10 GB - adding CPU tasks to the sorting work significantly reduces the sort duration. However, the improvement is not proportional to the number of tasks added. For example, using 8 tasks for a 10 GB file is only about 2.5 times faster than using 1 task. This is because a significant part of the time is spent reading and writing all of the content for the purpose of the merge. We can also see that merging intermediate results concurrently, before merging them into the final result, further reduces the sort time.
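The concurrent intermediate merge mentioned above can be sketched as follows. This is only an illustration of the idea under assumed names (merge_runs, two_level_merge) and an assumed group size; the benchmarked implementation may partition the merge work differently.

```python
import heapq
from concurrent.futures import ProcessPoolExecutor


def merge_runs(run_paths, out_path):
    """k-way merge a group of sorted run files into one sorted file."""
    runs = [open(p, "r") for p in run_paths]
    try:
        with open(out_path, "w") as dst:
            dst.writelines(heapq.merge(*runs))
    finally:
        for f in runs:
            f.close()


def two_level_merge(run_paths, output_path, group_size=4):
    # First level: merge groups of runs concurrently into intermediate files.
    groups = [run_paths[i:i + group_size]
              for i in range(0, len(run_paths), group_size)]
    intermediates = [f"intermediate-{i}.run" for i in range(len(groups))]
    with ProcessPoolExecutor() as pool:
        list(pool.map(merge_runs, groups, intermediates))

    # Second level: a single sequential merge of the fewer, larger
    # intermediate files into the final result.
    merge_runs(intermediates, output_path)
```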

Small Files (5 - 100 MB)

Configured to read the input in chunks of 10 MB.

small-files

Medium Files (50 - 1000 MB)

Configured to read the input in chunks of 100 MB.

medium-files

Large Files (0.5 - 10 GB)

Configured to read the input in chunks of 100 MB.

large-files

Next Steps