
This crate provides a lightweight Rust implementation for loading and running inference with Model2Vec static embedding models. For distillation and training, use the Python Model2Vec package.
Add the crate:

```bash
cargo add model2vec-rs
```
Make embeddings:

```rust
use anyhow::Result;
use model2vec_rs::model::StaticModel;

fn main() -> Result<()> {
    // Load a model from the Hugging Face Hub or a local path
    // args = (repo_or_path, token, normalize, subfolder)
    let model = StaticModel::from_pretrained("minishlab/potion-base-8M", None, None, None)?;

    // Prepare a list of sentences
    let sentences = vec![
        "Hello world".to_string(),
        "Rust is awesome".to_string(),
    ];

    // Create embeddings
    let embeddings = model.encode(&sentences);
    println!("Embeddings: {:?}", embeddings);
    Ok(())
}
```
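The returned embeddings can be compared directly. A minimal sketch, assuming `encode` returns one `Vec<f32>` per input sentence (and reusing `embeddings` from the example above):

```rust
// Cosine similarity between two embedding vectors.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

let similarity = cosine_similarity(&embeddings[0], &embeddings[1]);
println!("Cosine similarity: {similarity}");
```

Note that if the model was loaded with normalization enabled (the third `from_pretrained` argument), the vectors are unit-length, so the plain dot product already equals the cosine similarity.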
Make embeddings with the CLI:

```bash
# Single sentence
cargo run -- encode "Hello world" minishlab/potion-base-8M

# Multiple lines from a file
echo -e "Hello world\nRust is awesome" > input.txt
cargo run -- encode input.txt minishlab/potion-base-8M --output embeds.json
```
Make embeddings with custom encode args:

```rust
let embeddings = model.encode_with_args(
    &sentences, // input texts
    Some(512),  // max length
    1024,       // batch size
);
```
We provide a number of models that can be used out of the box. These models are available on the Hugging Face Hub and can be loaded using the `from_pretrained` method. The models are listed below.
| Model | Language | Sentence Transformer | Params | Task |
|---|---|---|---|---|
| potion-base-32M | English | bge-base-en-v1.5 | 32.3M | General |
| potion-base-8M | English | bge-base-en-v1.5 | 7.5M | General |
| potion-base-4M | English | bge-base-en-v1.5 | 3.7M | General |
| potion-base-2M | English | bge-base-en-v1.5 | 1.8M | General |
| potion-retrieval-32M | English | bge-base-en-v1.5 | 32.3M | Retrieval |
| M2V_multilingual_output | Multilingual | LaBSE | 471M | General |
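Any of these model IDs can be passed to `from_pretrained`. As a sketch, loading the retrieval model with normalization turned on (assuming the argument order shown earlier, with `normalize` as an `Option<bool>`):

```rust
// Load the retrieval model from the Hugging Face Hub with normalized embeddings.
let model = StaticModel::from_pretrained(
    "minishlab/potion-retrieval-32M", // repo_or_path
    None,                             // token (assumed only needed for private repos)
    Some(true),                       // normalize embeddings
    None,                             // subfolder
)?;
```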
We compared the performance of the Rust implementation with the Python version of Model2Vec. The benchmark was run single-threaded on a CPU.
| Implementation | Throughput |
|---|---|
| Rust | 8000 samples/second |
| Python | 4650 samples/second |
The Rust version is roughly 1.7× faster than the Python version.
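Exact numbers depend on hardware and model. As a rough way to measure throughput locally, here is a sketch (not the harness used for the table above), reusing `model` from the earlier example:

```rust
use std::time::Instant;

// Rough throughput measurement: encode a batch of sentences and
// divide the sample count by the elapsed wall-clock time.
let sentences: Vec<String> = (0..10_000)
    .map(|i| format!("This is test sentence number {i}"))
    .collect();

let start = Instant::now();
let _embeddings = model.encode(&sentences);
let elapsed = start.elapsed().as_secs_f64();
println!("{:.0} samples/second", sentences.len() as f64 / elapsed);
```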
License: MIT