This project demonstrates GPU image processing with CUDA and NVIDIA Performance Primitives (NPP) by applying a Gaussian filter to images. It processes multiple images concurrently using threads and saves the filtered images to the output directory.
- Applies a Gaussian filter to input images using NPP.
- Supports batch processing with multithreading for faster performance.
- Configurable input and output directories via command-line arguments.
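To make the filtering step concrete, here is a minimal CPU reference for a 3x3 Gaussian blur on an 8-bit grayscale buffer. It is only a sketch for sanity-checking results: the project itself filters on the GPU through NPP, and the kernel size, the border handling (edge replication), and the name gaussianBlur3x3 are assumptions that do not come from the project source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU reference: 3x3 Gaussian blur on a row-major 8-bit grayscale image.
// Border pixels are handled by replicating the nearest edge pixel (an assumption).
std::vector<std::uint8_t> gaussianBlur3x3(const std::vector<std::uint8_t>& src,
                                          int width, int height) {
    assert(width > 0 && height > 0);
    assert(src.size() == static_cast<std::size_t>(width) * height);

    // Integer approximation of a Gaussian kernel; the weights sum to 16.
    static const int kernel[3][3] = {{1, 2, 1}, {2, 4, 2}, {1, 2, 1}};

    std::vector<std::uint8_t> dst(src.size());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            int sum = 0;
            for (int ky = -1; ky <= 1; ++ky) {
                for (int kx = -1; kx <= 1; ++kx) {
                    // Clamp coordinates so out-of-range taps reuse the edge pixel.
                    const int sy = std::clamp(y + ky, 0, height - 1);
                    const int sx = std::clamp(x + kx, 0, width - 1);
                    sum += kernel[ky + 1][kx + 1] * src[sy * width + sx];
                }
            }
            // Divide by the kernel weight sum, rounding to nearest.
            dst[y * width + x] = static_cast<std::uint8_t>((sum + 8) / 16);
        }
    }
    return dst;
}
```

A uniform image passes through unchanged, while an isolated bright pixel is spread over its neighbours, which is a quick way to check that a filter behaves like a blur.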
- CUDA Toolkit (version 11.4 or later)
- FreeImage library for image I/O operations
- C++17 compatible compiler
- NVIDIA GPU with Compute Capability 5.0 or higher (the build targets architectures 5.0 through 8.0)
- Install the required libraries:
    sudo apt-get install libfreeimage-dev
- Clone the repository:
    git clone https://github.com/snandasena/cuda-at-scale-for-the-enterprise.git
    cd cuda-at-scale-for-the-enterprise
- Build the project:
    mkdir build
    cd build
    cmake ..
    make
- The project has not been tested on Windows because of dependency issues.
- TODO: Test on Windows.
You can specify custom input and output directories:
./GaussFilter --input /path/to/input --output /path/to/output
- --input: Path to the input directory containing .bmp images.
- --output: Path to the output directory for the filtered images.
If no arguments are provided, the default directories are:
- Input Directory: data/
- Output Directory: output/
Example, run from the build directory:
./GaussFilter --input ../data/ --output ../output/
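The flag handling and defaults described above can be sketched as follows. This is an illustration under stated assumptions rather than the project's actual parser: the function name parseDirs and the choice to ignore a flag that has no value are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper: returns {inputDir, outputDir}, falling back to the
// documented defaults (data/ and output/) when a flag is absent.
std::pair<std::string, std::string> parseDirs(int argc, const char* const argv[]) {
    assert(argv != nullptr);
    std::string input = "data/";
    std::string output = "output/";

    // Stop one short of argc so every recognised flag has a value after it.
    for (int i = 1; i + 1 < argc; ++i) {
        const std::string flag = argv[i];
        if (flag == "--input") {
            input = argv[++i];
        } else if (flag == "--output") {
            output = argv[++i];
        }
    }
    return {input, output};
}
```

Called from main as parseDirs(argc, argv), it returns the defaults when no arguments are given and the supplied paths otherwise.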
To clean up the build directory:
make clean
- CUDA: Supports multiple GPU architectures (compute capabilities 5.0 to 8.0).
- External Dependencies: Uses FreeImage for image I/O and NPP for GPU-accelerated filtering.
- CMAKE_CUDA_ARCHITECTURES: Specifies the CUDA architectures to build for.
- FETCHCONTENT_QUIET: Set to OFF to enable detailed logging while external dependencies are fetched.
Feel free to fork, contribute, or file issues for bug fixes and feature improvements.
- Libraries Used:
  - CUDA Runtime
  - NPP (NVIDIA Performance Primitives)
  - FreeImage
- Main Components:
  - Gaussian Filter Application: Uses NPP to filter .bmp images with a Gaussian kernel.
  - Multithreading: Processes images in parallel for faster execution.
  - Directory Management: Input and output directories are customizable via command-line flags.
  - Error Handling: Catches and reports exceptions during image processing.
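As an illustration of the directory-management component, the sketch below collects the .bmp files from an input directory with std::filesystem. The helper names hasBmpExtension and listBmpFiles are hypothetical, and treating the extension as case-insensitive is an assumption; the project may select files differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: true when the path ends in .bmp (case-insensitive, an assumption).
bool hasBmpExtension(const fs::path& p) {
    std::string ext = p.extension().string();
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return ext == ".bmp";
}

// Hypothetical helper: returns the .bmp files found directly inside `dir`, sorted by path.
// A missing directory yields an empty list instead of an exception.
std::vector<fs::path> listBmpFiles(const fs::path& dir) {
    assert(!dir.empty());
    std::vector<fs::path> files;
    if (!fs::is_directory(dir)) {
        return files;
    }
    for (const auto& entry : fs::directory_iterator(dir)) {
        if (entry.is_regular_file() && hasBmpExtension(entry.path())) {
            files.push_back(entry.path());
        }
    }
    std::sort(files.begin(), files.end());  // deterministic processing order
    return files;
}
```

Sorting the result keeps the processing order stable between runs, which makes output easier to compare.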
- printfNPPinfo(): Prints detailed information about CUDA and NPP versions.
- applyGaussFilter(): Applies a Gaussian filter to a single image.
- cleanupOutputDirectory(): Deletes all files in the output directory before processing.
- processBatch(): Processes a batch of images concurrently using threads.
- processImagesInDirectory(): Processes all images in a given directory, organizing them into batches for parallel processing.
- parseInputOutputDirs(): Parses command-line arguments to get custom input and output directories.
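The batching and per-batch threading described for processImagesInDirectory() and processBatch() could look roughly like the sketch below. It is not the project's code: the name processInBatches, the one-thread-per-image policy, and the error message format are assumptions, and the filter callback stands in for the real NPP-based step.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <exception>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical sketch: runs `filter` on every path, one batch at a time, with one
// thread per image inside a batch. Returns the error messages that were collected.
std::vector<std::string> processInBatches(
    const std::vector<std::string>& paths, std::size_t batchSize,
    const std::function<void(const std::string&)>& filter) {
    assert(batchSize > 0);
    std::vector<std::string> errors;
    std::mutex errorsMutex;  // guards `errors`, which several threads may append to

    for (std::size_t start = 0; start < paths.size(); start += batchSize) {
        const std::size_t end = std::min(start + batchSize, paths.size());
        std::vector<std::thread> workers;
        workers.reserve(end - start);

        for (std::size_t i = start; i < end; ++i) {
            workers.emplace_back([&, i] {
                try {
                    filter(paths[i]);
                } catch (const std::exception& e) {
                    // An exception must not escape a thread, or std::terminate is called.
                    std::lock_guard<std::mutex> lock(errorsMutex);
                    errors.push_back(paths[i] + ": " + e.what());
                }
            });
        }
        // Wait for the whole batch before starting the next one.
        for (auto& worker : workers) {
            worker.join();
        }
    }
    return errors;
}
```

Bounding the number of threads by the batch size limits how many images are in flight at once, and catching exceptions inside each worker lets one bad image be reported without stopping the rest of the batch.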
The project is configured with CMake for CUDA, ensuring compatibility with multiple GPU architectures. It also integrates FreeImage for image loading and saving.
cmake_minimum_required(VERSION 3.30)
# Set CMake policy CMP0104 to NEW to ensure CUDA architectures are set properly
cmake_policy(SET CMP0104 NEW)
# Project details
project(GaussFilter LANGUAGES C CXX CUDA)
# Set CMake variables for better readability and maintainability
set(CUDA_ROOT /usr/local/cuda) # Path to CUDA installation (adjust as needed)
set(CMAKE_CUDA_STANDARD 17) # Specify the CUDA standard to use
set(CMAKE_CUDA_STANDARD_REQUIRED ON) # Ensure the CUDA standard is required
set(CMAKE_CUDA_ARCHITECTURES 50 60 70 75 80) # Target GPU architectures
# Enable detailed FetchContent logging for clarity
set(FETCHCONTENT_QUIET OFF)
# Include FetchContent module for external dependency management
include(FetchContent)
# Fetch NVIDIA CUDA samples repository for common utilities
FetchContent_Declare(
CudaDependencies
GIT_REPOSITORY https://github.com/NVIDIA/cuda-samples.git
GIT_TAG master
)
# Populate the fetched content (downloads and prepares the dependency)
FetchContent_MakeAvailable(CudaDependencies)
# Include the common utilities directory from the fetched CUDA samples
include_directories(${FETCHCONTENT_BASE_DIR}/cudadependencies-src/Common)
# Include the CUDA include directory for required headers
include_directories(${CUDA_ROOT}/include)
# Add the source file for the Gauss filter as the main executable target
add_executable(${PROJECT_NAME} gauss_filter.cu)
# Link required CUDA libraries
target_link_libraries(${PROJECT_NAME}
    freeimage                          # Image I/O library
    ${CUDA_ROOT}/lib64/libcudart.so    # CUDA runtime
    ${CUDA_ROOT}/lib64/libnppc.so      # NPP core library
    ${CUDA_ROOT}/lib64/libnppial.so    # NPP image arithmetic and logical operations
    ${CUDA_ROOT}/lib64/libnppif.so     # NPP image filtering
    ${CUDA_ROOT}/lib64/libnppicc.so    # NPP image color conversion
    ${CUDA_ROOT}/lib64/libnppig.so     # NPP image geometry transforms
    ${CUDA_ROOT}/lib64/libnppisu.so    # NPP image support functions (memory management)
)
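# Suppress nvcc warnings about deprecated GPU targets.
# Applied to the target directly, because add_compile_options() only affects
# targets created after it is called, and limited to CUDA sources.
target_compile_options(${PROJECT_NAME} PRIVATE
    $<$<COMPILE_LANGUAGE:CUDA>:-Wno-deprecated-gpu-targets>)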
# Print a summary of key configurations for transparency
message(STATUS "Project Name: ${PROJECT_NAME}")
message(STATUS "CUDA Root Directory: ${CUDA_ROOT}")
message(STATUS "CUDA Standard: ${CMAKE_CUDA_STANDARD}")
message(STATUS "CUDA Architectures: ${CMAKE_CUDA_ARCHITECTURES}")
message(STATUS "Using FetchContent for CUDA Dependencies from NVIDIA CUDA Samples.")