DatAFLow is a fuzzer built on top of AFL++. However, instead of a control-flow-based feedback mechanism (e.g., based on control-flow edge coverage), datAFLow uses a data-flow-based feedback mechanism; specifically, data flows based on def-use associations.
To enable performant fuzzing, datAFLow uses a custom low-fat pointer memory
allocator for efficiently tracking data flows at runtime. This is achieved via
two mechanisms: a runtime replacement for malloc and friends, libfuzzalloc,
and a set of LLVM passes to transform your target to use libfuzzalloc.
More details are available in our registered report, published at the 1st International Fuzzing Workshop (FUZZING) 2022. You can read our report here.
The datAFLow fuzzer requires a custom version of clang. Once this is built,
the fuzzalloc toolchain can be built. In the commands below, the FUZZALLOC_SRC
variable refers to this directory.
fuzzalloc requires a patch to the clang compiler to disable turning constant
arrays into packed constant structs.
To build the custom clang:
# Get the LLVM source code and update the clang source code
mkdir llvm
cd llvm
$FUZZALLOC_SRC/llvm-scripts/get_llvm_src.sh
$FUZZALLOC_SRC/llvm-scripts/update_clang_src.sh
# Build and install LLVM/clang/etc.
mkdir build
mkdir install
cd build
# If debugging you can also add -DCMAKE_BUILD_TYPE=Debug -DCOMPILER_RT_DEBUG=On
# Note that if you're going to use gclang, things seem to work better if you use
# the gold linker (https://llvm.org/docs/GoldPlugin.html)
cmake ../llvm -DLLVM_ENABLE_PROJECTS="clang;compiler-rt" \
-DLLVM_BUILD_EXAMPLES=Off -DLLVM_INCLUDE_EXAMPLES=Off \
-DLLVM_TARGETS_TO_BUILD="X86" -DCMAKE_INSTALL_PREFIX=$(realpath ../install)
cmake --build .
cmake --build . --target install
# Add the install's bin directory to your path so that you use the correct clang
export PATH=$(realpath ../install/bin):$PATH
Fuzzing is typically performed in conjunction with a sanitizer so that "silent"
bugs can be uncovered. Sanitizers such as ASan typically
hook and replace dynamic memory allocation routines such as malloc/free so
that they can detect buffer over/under flows, use-after-frees, etc.
Unfortunately, this means that we lose the ability to track dataflow (as we
rely on the memory allocator to do this). Therefore, we must use a custom
version of ASan in order to (a) detect bugs and (b) track dataflow.
To build the custom ASan, run the following after running get_llvm_src.sh and
update_clang_src.sh above:
cd llvm
$FUZZALLOC_SRC/llvm-scripts/update_compiler_rt_src.sh
$FUZZALLOC_SRC/llvm-scripts/update_llvm_src.sh
# Build and install LLVM/clang/etc.
cd build
# If debugging you can also add -DCMAKE_BUILD_TYPE=Debug -DCOMPILER_RT_DEBUG=On
cmake ../llvm -DLLVM_ENABLE_PROJECTS="clang;compiler-rt" \
-DFUZZALLOC_ASAN=On -DLIBFUZZALLOC_PATH=/path/to/libfuzzalloc.so \
-DLLVM_BUILD_EXAMPLES=Off -DLLVM_INCLUDE_EXAMPLES=Off \
-DLLVM_TARGETS_TO_BUILD="X86" -DCMAKE_INSTALL_PREFIX=$(realpath ../install)
cmake --build .
cmake --build . --target install
# Make sure the install path is available in $PATH
Note that after building LLVM with the custom ASan, you will have to rebuild
fuzzalloc with the new clang/clang++ (found under install/bin).
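The rebuild only picks up the patched compiler if clang and clang++ resolve to the custom install, so a quick check before configuring can save a wasted build. The sketch below assumes the install prefix is held in a hypothetical LLVM_INSTALL variable; tool_in_prefix is a helper name introduced here for illustration and is not part of fuzzalloc.

```shell
# Hypothetical variable: the LLVM install prefix created in the build steps.
LLVM_INSTALL="${LLVM_INSTALL:-$PWD/llvm/install}"

# tool_in_prefix TOOL PREFIX
# Succeeds if TOOL resolves (via PATH) to a file located under PREFIX.
tool_in_prefix() {
    resolved=$(command -v "$1" 2>/dev/null) || return 1
    case "$resolved" in
        "$2"/*) return 0 ;;
        *) return 1 ;;
    esac
}

for tool in clang clang++; do
    if tool_in_prefix "$tool" "$LLVM_INSTALL/bin"; then
        echo "$tool: using the custom build"
    else
        echo "$tool: not resolved under $LLVM_INSTALL/bin" >&2
    fi
done
```

If either tool is reported as not resolved, the PATH export from the clang build steps most likely needs to be repeated in the current shell.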
mkdir build
cd build
cmake -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DAFL_PATH=/path/to/afl++/source $FUZZALLOC_SRC
make -j
libfuzzalloc is a drop-in replacement for malloc and friends. When using
gcc, it's safest to pass in the flags
-fno-builtin-malloc -fno-builtin-calloc -fno-builtin-realloc -fno-builtin-free
All you have to do is link your target with -lfuzzalloc.
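As a rough sketch of the gcc case, the four flags can be kept in one variable and passed alongside -lfuzzalloc. The library directory and target.c are placeholder assumptions, as is the use of an rpath so the library is found at run time; adjust them to your build tree.

```shell
# Flags from the text above: stop gcc from treating the allocation routines
# as builtins, so it does not assume the standard implementations.
NO_BUILTIN_FLAGS="-fno-builtin-malloc -fno-builtin-calloc -fno-builtin-realloc -fno-builtin-free"

# Hypothetical paths: the directory holding libfuzzalloc.so and a target source file.
FUZZALLOC_LIB_DIR="${FUZZALLOC_LIB_DIR:-/path/to/fuzzalloc/lib}"
TARGET_SRC="${TARGET_SRC:-target.c}"

if [ -e "$FUZZALLOC_LIB_DIR/libfuzzalloc.so" ] && [ -e "$TARGET_SRC" ]; then
    # NO_BUILTIN_FLAGS is intentionally unquoted so it splits into separate flags.
    gcc $NO_BUILTIN_FLAGS -o target "$TARGET_SRC" \
        -L"$FUZZALLOC_LIB_DIR" -Wl,-rpath,"$FUZZALLOC_LIB_DIR" -lfuzzalloc
else
    echo "libfuzzalloc.so or $TARGET_SRC not found; adjust the paths above" >&2
fi
```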
The dataflow-cc (and dataflow-cc++) tools can be used as drop-in replacements
for clang (and clang++).
Note that this typically requires running dataflow-preprocess before running
dataflow-cc to collect the allocation sites to tag.
If the target uses custom memory allocation routines (i.e., wrapping malloc,
calloc, etc.), then a special case list containing these routines should be
provided to dataflow-preprocess. Doing so
ensures dynamically-allocated variable def sites are appropriately tagged. The
list is provided via the FUZZALLOC_MEM_FUNCS environment variable; i.e.,
FUZZALLOC_MEM_FUNCS=/path/to/special/case/list. The special case list must be
formatted as:
[fuzzalloc]
fun:malloc_wrapper
fun:calloc_wrapper
fun:realloc_wrapper
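One way to set this up is to write the list with a heredoc and export its location before preprocessing. This is only a sketch: the wrapper names are the placeholders from the format above, and the file location is an assumption.

```shell
# Hypothetical location for the special case list.
MEM_FUNCS_LIST="${TMPDIR:-/tmp}/fuzzalloc-mem-funcs.txt"

# Write the list in the format shown above; replace the wrapper names
# with the target's own allocation wrappers.
cat > "$MEM_FUNCS_LIST" <<'EOF'
[fuzzalloc]
fun:malloc_wrapper
fun:calloc_wrapper
fun:realloc_wrapper
EOF

# dataflow-preprocess reads the list location from this environment variable.
export FUZZALLOC_MEM_FUNCS="$MEM_FUNCS_LIST"
```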
The locations of variable tag sites are stored in a file specified by the
FUZZALLOC_TAG_LOG environment variable.
dataflow-cc is a drop-in replacement for clang. To use the tag list
generated by dataflow-preprocess, set it in the FUZZALLOC_TAG_LOG
environment variable (e.g., FUZZALLOC_TAG_LOG=/path/to/tags).
Other useful environment variables include:
- FUZZALLOC_FUZZER: Sets the fuzzer instrumentation to use. Valid fuzzers include: debug-log (logging to stderr; this requires fuzzalloc be built in debug mode, i.e., with -DCMAKE_BUILD_TYPE=Debug), AFL, and libfuzzer.
- FUZZALLOC_SENSITIVITY: Sets the use site sensitivity. Valid sensitivities are: mem-read, mem-write, mem-access, mem-read-offset, mem-write-offset, and mem-access-offset.
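Putting these variables together, a compile step might look like the sketch below. It assumes clang-style arguments (since dataflow-cc stands in for clang), a placeholder target.c, and the placeholder tag path from above; check_sensitivity is a hypothetical helper added here only to catch typos in the sensitivity name before compiling.

```shell
# check_sensitivity VALUE
# Succeeds only for the sensitivities listed above.
check_sensitivity() {
    case "$1" in
        mem-read|mem-write|mem-access) return 0 ;;
        mem-read-offset|mem-write-offset|mem-access-offset) return 0 ;;
        *) return 1 ;;
    esac
}

# Defaults are illustrative; any value already set in the environment wins.
export FUZZALLOC_TAG_LOG="${FUZZALLOC_TAG_LOG:-/path/to/tags}"
export FUZZALLOC_FUZZER="${FUZZALLOC_FUZZER:-AFL}"
export FUZZALLOC_SENSITIVITY="${FUZZALLOC_SENSITIVITY:-mem-access}"

if ! check_sensitivity "$FUZZALLOC_SENSITIVITY"; then
    echo "unknown FUZZALLOC_SENSITIVITY: $FUZZALLOC_SENSITIVITY" >&2
elif command -v dataflow-cc >/dev/null 2>&1 && [ -e target.c ]; then
    # Assumed invocation: the same arguments one would pass to clang.
    dataflow-cc -O2 -g -o target target.c
else
    echo "dataflow-cc or target.c not found; nothing compiled" >&2
fi
```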
The following flags are added to libFuzzer:
- use_dataflow: Enable dataflow-based coverage
- print_dataflows: Print out covered def/use chains
- job_prefix: fuzz-JOB.log prefix
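A run using these flags might look like the following sketch. It assumes the flags follow libFuzzer's usual -name=value syntax and that 1 enables a boolean flag; the fuzzer binary and corpus directory names are placeholders. The -max_total_time flag is a standard libFuzzer option, used here only to bound the run.

```shell
# Hypothetical names: a libFuzzer target built with dataflow instrumentation
# and a corpus directory.
FUZZ_TARGET="${FUZZ_TARGET:-./target_fuzzer}"
CORPUS_DIR="${CORPUS_DIR:-corpus}"

# Assumed syntax: libFuzzer-style -name=value flags.
FUZZ_ARGS="-use_dataflow=1 -print_dataflows=1 -max_total_time=60"

if [ -x "$FUZZ_TARGET" ]; then
    mkdir -p "$CORPUS_DIR"
    # FUZZ_ARGS is intentionally unquoted so it splits into separate flags.
    "$FUZZ_TARGET" $FUZZ_ARGS "$CORPUS_DIR"
else
    echo "$FUZZ_TARGET not found or not executable; nothing run" >&2
fi
```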