Skip to content
/ parrot Public

Parrot is a C++ library for fused array operations using CUDA/Thrust. It provides efficient GPU-accelerated operations with lazy evaluation semantics, allowing for chaining of operations without unnecessary intermediate materializations.

License

Notifications You must be signed in to change notification settings

NVlabs/parrot

Repository files navigation

Sean Parrot

Parrot

Parrot is a C++ library for fused array operations using CUDA/Thrust. It provides efficient GPU-accelerated operations with lazy evaluation semantics, allowing for chaining of operations without unnecessary intermediate materializations.

✨ Features

  • Fused Operations - Operations are fused when possible
  • GPU Acceleration - Built on CUDA/Thrust for high performance
  • Chainable API - Clean, expressive syntax for complex operations
  • Header-Only - Simple integration with #include "parrot.hpp"

¹ Lazy-ish means that any operation that can be lazily evaluated is lazily evaluated.

🚀 Quick Start

#include "parrot.hpp"

int main() {
    // Create arrays
    auto A = parrot::array({3, 4, 0, 8, 2});
    auto B = parrot::array({6, 7, 2, 1, 8});
    auto C = parrot::array({2, 5, 7, 4, 3});
    
    // Chain operations
    (B * C + A).print();  // Output: 15 39 14 12 26
}

Advanced Example: Row-wise Softmax

#include "parrot.hpp"

using namespace parrot::literals;

auto softmax(auto matrix) {
    auto cols = matrix.shape()[1];
    auto z    = matrix - matrix.maxr(2_ic).replicate(cols);
    auto num  = z.exp();
    auto den  = num.sum(2_ic);
    return num / den.replicate(cols);
}

int main() {
    auto matrix = parrot::range(6).as<float>().reshape({2, 3});
    softmax(matrix).print();
}

🏗️ Building

Prerequisites

  • CMake (3.10+)
  • C++ compiler with C++20 support
  • NVIDIA CUDA Toolkit 13.0 or later
  • NVIDIA GPU with compute capability 7.0+

Build Steps

# Clone the repository
git clone <repository-url>
cd parrot

# Create build directory
mkdir build && cd build

# Configure and build
cmake ..
cmake --build . -j$(nproc)

# Run tests
ctest

For detailed build instructions, see BUILDING.md.

📊 Comparisons

Parrot provides significant code reduction compared to other CUDA libraries:

Library Code Reduction
Thrust ~10x less code

See detailed comparisons in our documentation.

📚 Documentation

🧪 Examples

The examples/ directory contains:

  • getting_started/ - Simple examples to get started
  • thrust/ - Parrot implementations of Thrust examples

🛠️ Development

Running Tests

# All tests
ctest

# Individual test categories
./test_basic      # Basic operations
./test_sorting    # Sorting algorithms
./test_math       # Mathematical operations
./test_reductions # Reduction operations

Code Quality

# Run clang-tidy
./scripts/run-clang-tidy.sh

# Auto-fix issues
./scripts/run-clang-tidy.sh --fix

Building Documentation

# Install dependencies
pip install -r requirements.txt

# Build docs
cd script && ./build-docs.sh

📁 Repository Structure

parrot/
├── parrot.hpp              # Main header (single-file library)
├── thrustx.hpp             # Extended Thrust utilities
├── examples/               # Example code
│   ├── getting_started/    # Simple getting-started examples
│   ├── thrust/             # Parrot versions of Thrust examples
│   ├── real_world/         # Parrot versions of examples from open source projects
├── tests/                  # Unit tests
├── docs/                   # Documentation source
├── scripts/                # Development scripts

🤝 Contributing

We welcome contributions! Please see our CONTRIBUTING.md guide for details on:

  • Developer Certificate of Origin (DCO) requirements
  • Setting up the development environment
  • Code standards and review process
  • Running tests and code quality checks
  • Building documentation

All contributions must be signed off under the DCO and comply with the Apache License 2.0.

📄 License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.

Third-Party Licenses

This project includes third-party software components. See THIRD_PARTY_LICENSES for complete license information and attributions.

🙏 Acknowledgments

Built on top of NVIDIA Thrust and CUDA.


📖 Read the Full Documentation →

About

Parrot is a C++ library for fused array operations using CUDA/Thrust. It provides efficient GPU-accelerated operations with lazy evaluation semantics, allowing for chaining of operations without unnecessary intermediate materializations.

Topics

Resources

License

Contributing

Stars

Watchers

Forks

Releases

No releases published