Uses new Rust 2024

0.1.5	Oct 6, 2025

MIT license

CTranslate2-rs

This library provides Rust bindings for OpenNMT/CTranslate2.

Usage

Add this crate to your Cargo.toml and select the backends you want to use as features:

[dependencies]
ctranslate2 = { version = "2" }
ctranslate2-sys = { version = "0.1.5", features = ["cuda", "cudnn"] }

CMake must be installed to compile the library.

Setting the environment variable RUSTFLAGS=-C target-feature=+crt-static might be required.
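Whether the C runtime ends up statically linked can be checked from Rust itself. The sketch below relies only on the standard cfg!(target_feature = "crt-static") predicate; the function name crt_linkage is made up for illustration and is not part of this crate.

```rust
/// Reports how the C runtime is linked for the current compilation.
/// `crt_linkage` is a hypothetical helper, not part of ctranslate2-sys.
fn crt_linkage() -> &'static str {
    // `crt-static` is enabled when compiling with
    // RUSTFLAGS="-C target-feature=+crt-static", or when it is the target's default.
    if cfg!(target_feature = "crt-static") {
        "static"
    } else {
        "dynamic"
    }
}

fn main() {
    println!("C runtime linkage: {}", crt_linkage());
}
```

Running this once with and once without the RUSTFLAGS setting is a quick way to confirm that the flag reached the compiler.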

The following features control how the library is obtained and linked:

vendor: Use prebuilt binaries
shared: Build with ctranslate2 as a shared library
crt-dynamic: The CRT is statically linked in static Windows builds. To link the CRT dynamically, enable crt-dynamic

The features below only take effect when vendor is not used:

cuda: Enables CUDA support
- cudnn: Enables cuDNN support
- cuda-dynamic-loading: Enables dynamic loading of CUDA libraries at runtime instead of static linking (requires CUDA >= 11)
  - cuda-small-binary: Reduces binary size by compressing device code
mkl: Enables Intel MKL support
openblas: Enables OpenBLAS support (OpenBLAS needs to be installed manually via vcpkg on Windows)
ruy: Enables Ruy support
accelerate: Enables Apple Accelerate support (macOS only)
dnnl: Enables oneDNN support
openmp-runtime-comp: Enables the OpenMP runtime provided by the compiler
openmp-runtime-intel: Enables the Intel OpenMP runtime
msse4_1: Enables SSE4.1 support (the -msse4.1 compiler flag)
os-defaults
tensor-parallel: Enables Tensor Parallelism
- flash-attention: Enables Flash Attention
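Because the backend features are ignored when vendor is enabled, it can help to flag such combinations before building. The following is a minimal sketch under that reading of the note above; ignored_with_vendor and BUILD_ONLY_FEATURES are hypothetical names for illustration and are not provided by this crate.

```rust
/// Features that, per the note above, have no effect once `vendor` is enabled.
/// The names are copied from the feature list; the constant itself is hypothetical.
const BUILD_ONLY_FEATURES: &[&str] = &[
    "cuda", "cudnn", "cuda-dynamic-loading", "cuda-small-binary", "mkl", "openblas",
    "ruy", "accelerate", "dnnl", "openmp-runtime-comp", "openmp-runtime-intel",
    "msse4_1", "os-defaults", "tensor-parallel", "flash-attention",
];

/// Returns the requested features that would be ignored because `vendor`
/// is requested as well. Hypothetical helper, not part of ctranslate2-sys.
fn ignored_with_vendor<'a>(requested: &[&'a str]) -> Vec<&'a str> {
    // Without `vendor`, every backend feature is honored by the build.
    if !requested.iter().any(|f| *f == "vendor") {
        return Vec::new();
    }
    requested
        .iter()
        .copied()
        .filter(|f| BUILD_ONLY_FEATURES.iter().any(|b| *b == *f))
        .collect()
}

fn main() {
    let requested = ["vendor", "shared", "cuda", "cudnn"];
    for feature in ignored_with_vendor(&requested) {
        println!("warning: `{feature}` has no effect when `vendor` is enabled");
    }
}
```

Features such as shared are not reported, since they are listed before the note and are not restricted by it.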

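The prebuilt binaries cover the following configurations (platform, linkage, architecture, and the features built in):

macos static x86_64[openmp_intel, dnnl, mkl]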
macos static aarch64[accelerate, ruy]
linux static x86_64[openmp_comp, cuda, cudnn, cuda_small_binary, cuda-dynamic-loading, dnnl, mkl, tensor_parallel, msse4_1]
linux static aarch64[openmp_comp, ruy, openblas]
windows static x86_64[openmp_intel, cuda, cudnn, cuda_small_binary, cuda-dynamic-loading, dnnl, mkl]
windows static dynamic-crt x86_64[openmp_intel, cuda, cudnn, cuda_small_binary, cuda-dynamic-loading, dnnl, mkl]
macos shared aarch64[accelerate, ruy]
linux shared x86_64[openmp_comp, cuda, cudnn, cuda_small_binary, cuda-dynamic-loading, dnnl, mkl, tensor_parallel, msse4_1]
linux shared aarch64[openmp_comp, ruy, openblas]
windows shared x86_64[openmp_intel, cuda, cudnn, cuda_small_binary, cuda-dynamic-loading, dnnl, mkl]
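To see which of the static configurations listed above matches a given machine, the list can be encoded as a small lookup keyed by the values of std::env::consts::OS and std::env::consts::ARCH. This is only a sketch: static_prebuilt_features is a hypothetical helper, the feature names are copied verbatim from the list, and nothing here queries the crate itself.

```rust
use std::env::consts::{ARCH, OS};

/// Looks up the feature set of the static configuration for an OS/architecture pair.
/// Hypothetical helper for illustration; not part of ctranslate2-sys.
fn static_prebuilt_features(os: &str, arch: &str) -> Option<&'static [&'static str]> {
    let features: &'static [&'static str] = match (os, arch) {
        ("macos", "x86_64") => &["openmp_intel", "dnnl", "mkl"],
        ("macos", "aarch64") => &["accelerate", "ruy"],
        ("linux", "x86_64") => &[
            "openmp_comp", "cuda", "cudnn", "cuda_small_binary", "cuda-dynamic-loading",
            "dnnl", "mkl", "tensor_parallel", "msse4_1",
        ],
        ("linux", "aarch64") => &["openmp_comp", "ruy", "openblas"],
        ("windows", "x86_64") => &[
            "openmp_intel", "cuda", "cudnn", "cuda_small_binary", "cuda-dynamic-loading",
            "dnnl", "mkl",
        ],
        // Any other combination does not appear in the list.
        _ => return None,
    };
    Some(features)
}

fn main() {
    match static_prebuilt_features(OS, ARCH) {
        Some(features) => println!("{OS} static {ARCH}: {}", features.join(", ")),
        None => println!("{OS} {ARCH}: no static configuration listed"),
    }
}
```

Combinations that are absent from the list, such as windows on aarch64, return None.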
