Coz: Finding Code that Counts with Causal Profiling

by Charlie Curtsinger and Emery Berger

Coz is a profiler for native code (C/C++/Rust) that unlocks optimization opportunities missed by traditional profilers. Coz employs a novel technique called causal profiling that measures optimization potential: it predicts the impact that optimizing a piece of code will have on overall throughput or latency.

Profiles generated by Coz show the "bang for buck" of optimizing a line of code in an application. In the profile below, almost every effort to optimize this line of code leads directly to an increase in overall performance, making it an excellent candidate for optimization.

Example Coz profile

Coz's measurement matches developers' assumptions about profilers: that optimizing highly-ranked code will have the greatest impact on performance. Causal profiling measures optimization potential for serial, parallel, and asynchronous programs without instrumentation or special handling for library calls and concurrency primitives. Instead, a causal profiler uses performance experiments to predict the effect of optimizations. This allows the profiler to establish causality: "optimizing function X will have effect Y," exactly the measurement developers had assumed they were getting all along.

Full details of Coz are available in our paper, Coz: Finding Code that Counts with Causal Profiling (pdf), SOSP 2015, October 2015 (recipient of a Best Paper Award).

Coz presentation at SOSP

Installation

On Debian and Ubuntu, you can install Coz via apt:

sudo apt install coz-profiler

An OpenSUSE package was prepared by user @zethra and is available at https://build.opensuse.org/package/show/home:zethra/coz-profiler.

Coz works on Linux systems (running version 2.6.32 or later, with support for the perf_event_open system call) and macOS (using Apple's kperf framework). Both platforms require a Python 3.x interpreter.

macOS Note: The macOS port uses Apple's private kperf framework for sampling. This requires either running with elevated privileges or adjusting System Integrity Protection settings. The kperf API is undocumented and may change in future macOS versions.

Libraries/Wrappers

By default, Coz works for C, C++, and Rust programs. It has also been ported to, or wrapped for, several other languages, listed below:

Language    Link
Java        JCoz: https://github.com/Decave/JCoz
Go          Cozgo: https://github.com/urjitbhatia/cozgo
Swift       Swift Coz: https://github.com/funcmike/swift-coz

Building Coz From Source

Install build prerequisites

On Debian/Ubuntu this covers everything (including the TypeScript viewer tooling and docs):

sudo apt-get update
sudo apt-get install -y build-essential cmake docutils-common git python3 pkg-config
sudo apt-get install -y nodejs npm
# Optional, but required if you plan to build the bundled benchmarks
sudo apt-get install -y libbz2-dev libsqlite3-dev

The repository vendors libelfin, so you do not need to build or install it separately.

Configure and build

Use the standard out-of-source workflow (shown with build/, but any directory works):

cmake -S . -B build          # Configure (defaults to Release with debug info)
cmake --build build -j       # Build libcoz, the CLI, and the tests
ctest --test-dir build -V    # Optional: run the regression tests
cmake --install build        # Optional: install into CMAKE_INSTALL_PREFIX

Before running Coz on Linux, relax perf_event_paranoid so sampling works:

sudo sh -c 'echo 1 >/proc/sys/kernel/perf_event_paranoid'

Building the Benchmarks

The benchmark suite is off by default because it pulls in extra dependencies. Enable it when configuring:

cmake -S . -B build-bench -DBUILD_BENCHMARKS=ON
cmake --build build-bench -j

When BUILD_BENCHMARKS is set, CMake automatically switches the build type to RelWithDebInfo (or keeps Debug) so DWARF line tables are available. Benchmark binaries live under build-bench/benchmarks/<name>/.

A number of the benchmarks are from the Phoenix benchmark suite, and several require data files. These are available for download via links in the README from the Phoenix repository.

Viewer

After profiling, open the results locally (coz plot, which launches the bundled HTML UI) or visit https://coz-profiler.github.io/coz-ui/ and drop in your profile.coz.

If you profiled on a remote system, first transfer profile.coz to your local machine, then open the Coz viewer in your browser at https://coz-profiler.github.io/coz-ui/ and load the file.

(You may need to move the "Minimum Points" slider on the left side to see the results.)

Using Coz

Using Coz requires a small amount of setup, but you can jump ahead to the section on the included sample applications in this repository if you want to try Coz right away.

To run your program with Coz, you will need to build it with debug information (-g). Coz now supports modern DWARF versions (including DWARF 5), so you can use your compiler's default debug format. You do not need to include debug symbols in the main executable: Coz uses the same procedure as gdb to locate debug information for stripped binaries.

Once your program is built with debug information, you can run it with Coz using the command coz run {coz options} --- {program name and arguments}. To produce a useful profile, however, you must first decide which part(s) of the application you want to speed up by specifying one or more progress points.

Profiling Modes

Coz departs from conventional profiling by making it possible to view the effect of optimizations on both throughput and latency. To profile throughput, you must specify a progress point. To profile latency, you must specify a pair of progress points.

Throughput Profiling: Specifying Progress Points

To profile throughput you must indicate a line in the code that corresponds to the end of a unit of work. For example, a progress point could be the point at which a transaction concludes, when a web page finishes rendering, or when a query completes. Coz then measures the rate of visits to each progress point to determine any potential optimization's effect on throughput.

To place a progress point, include coz.h (under the include directory in this repository) and add the COZ_PROGRESS macro to at least one line you would like to execute more frequently. Don't forget to link your program with libdl: use the -ldl option.

By default, Coz uses the source file and line number as the name for your progress points. If you use COZ_PROGRESS_NAMED("name for progress point") instead, you can provide an informative name for your progress points. This also allows you to mark multiple source locations that correspond to the same progress point.
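As a minimal sketch of the macros described above, the following marks one named progress point from two source locations. The function handle_request and its fast/slow paths are hypothetical, and the no-op fallback macro is an assumption added only so the snippet compiles where coz.h is not installed; it is not part of Coz.

```c
#include <assert.h>

/* Use the real header when it is on the include path. The no-op fallback
   below is NOT part of Coz; it only lets this sketch compile without it. */
#if defined(__has_include)
#  if __has_include(<coz.h>)
#    include <coz.h>
#  endif
#endif
#ifndef COZ_PROGRESS_NAMED
#  define COZ_PROGRESS_NAMED(name) ((void)0)
#endif

/* Hypothetical request handler with two exit paths. Both paths mark the
   same named progress point, so they count as the same unit of work. */
int handle_request(int id) {
  assert(id >= 0);
  if (id % 2 == 0) {
    /* Fast path: nothing to compute. */
    COZ_PROGRESS_NAMED("request done");
    return id;
  }
  /* Slow path: sum the integers 0..id. */
  int sum = 0;
  for (int i = 0; i <= id; i++) {
    sum += i;
  }
  COZ_PROGRESS_NAMED("request done");
  return sum;
}
```

Call handle_request from your request loop, build with -g, and link with -ldl as noted above. Replacing COZ_PROGRESS_NAMED("request done") with plain COZ_PROGRESS would instead name each location by its source file and line, so the two paths would be reported as separate progress points.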

Latency Profiling: Specifying Progress Points

To profile latency, you must place two progress points that correspond to the start and end of an event of interest, such as when a transaction begins and completes. Simply mark the beginning of a transaction with the COZ_BEGIN("transaction name") macro, and the end with the COZ_END("transaction name") macro. Unlike regular progress points, you always need to specify a name for your latency progress points. Don't forget to link your program with libdl: use the -ldl option.
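A minimal sketch of a latency progress-point pair might look like the following. The function run_transaction and the name "transaction" are hypothetical, and the no-op fallback macros are an assumption added only so the snippet compiles where coz.h is not installed; they are not part of Coz.

```c
/* Use the real header when it is on the include path. The no-op fallbacks
   below are NOT part of Coz; they only let this sketch compile without it. */
#if defined(__has_include)
#  if __has_include(<coz.h>)
#    include <coz.h>
#  endif
#endif
#ifndef COZ_BEGIN
#  define COZ_BEGIN(name) ((void)0)
#endif
#ifndef COZ_END
#  define COZ_END(name) ((void)0)
#endif

/* Hypothetical transaction: the begin and end markers use the same name
   so that they are treated as the two ends of one event. */
int run_transaction(int input) {
  COZ_BEGIN("transaction");
  int result = input * 2 + 1; /* placeholder for the real work */
  COZ_END("transaction");
  return result;
}
```

If a transaction can exit through several paths, each path needs its own COZ_END with the same name, otherwise begins and ends will not balance.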

When Coz tests a hypothetical optimization, it reports the effect of that optimization on the average latency between these two points. Thanks to Little's Law, Coz can track this information without any knowledge of individual transactions.
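As a rough illustration of why aggregate counts are enough, Little's Law can be rearranged to give average latency from two averages. The symbol names and the numbers below are illustrative assumptions, not output from Coz or a description of its internals.

```latex
% Little's Law for a system in steady state:
%   L       = average number of transactions in flight
%             (begun via COZ_BEGIN but not yet ended via COZ_END)
%   \lambda = average rate at which transactions begin
%   W       = average latency of one transaction
\[
  L = \lambda W
  \quad\Longrightarrow\quad
  W = \frac{L}{\lambda}
\]
% Hypothetical numbers: 4 transactions in flight on average,
% 200 transactions beginning per second.
\[
  W = \frac{4}{200\ \mathrm{s}^{-1}} = 0.02\ \mathrm{s} = 20\ \mathrm{ms}
\]
```

Neither quantity on the right-hand side requires matching a particular begin to a particular end, which is why no per-transaction bookkeeping is needed.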

Specifying Progress Points on the Command Line

Coz has command-line options for specifying progress points at profiling time instead of modifying the application's source. This feature is currently disabled because it did not work particularly well. Better support for command-line progress points is planned.

Processing Results

To plot profile results, go to http://plasma-umass.github.io/coz/ and load your profile. This page also includes several sample profiles from PARSEC benchmarks.

Sample Applications

The benchmarks/ directory includes several small programs with progress points already wired up. Once you configure with -DBUILD_BENCHMARKS=ON (see above), you can run them straight from the build tree:

./build-bench/benchmarks/toy/toy
coz run --- ./build-bench/benchmarks/toy/toy

These programs may need several runs before Coz accumulates enough samples to emit a useful profile. Upload profile.coz to the viewer when you are done.

CMake

Installing Coz also installs a CMake config file. To add Coz to a CMake project, use the command find_package(coz-profiler). This imports two targets: coz::coz, for the library and its include directories, and coz::profiler, for the coz binary. For guidance on how to use these targets, refer to the CMake documentation.

Limitations

Coz currently does not support interpreted or JIT-compiled languages such as Python, Ruby, or JavaScript. Interpreted languages will likely not be supported at any point, but support for JIT-compiled languages that produce debug information could be added in the future.

License

All source code is licensed under the BSD 2-clause license unless otherwise indicated. See LICENSE.md for details.

Sample applications (in the benchmarks directory) include several Phoenix programs and pbzip2, which are licensed separately and included with this release for convenience.