Description
We are using org.springframework.ai.transformers.TransformersEmbeddingModel to compute embeddings for a large corpus.
This works, but the embedding computation uses all available cores on the machine. Other processes run on the same server, so saturating every core is a problem for us.
Looking at the source code, this might be easy to remedy in TransformersEmbeddingModel#afterPropertiesSet:
try (var sessionOptions = new OrtSession.SessionOptions()) {
    if (this.gpuDeviceId >= 0) {
        // Run on a GPU or with another provider
        sessionOptions.addCUDA(this.gpuDeviceId);
    }
    this.session = this.environment.createSession(getCachedResource(this.modelResource).getContentAsByteArray(),
            sessionOptions);
}
The TransformersEmbeddingModel gives the user no way to create or modify the OrtSession.SessionOptions. One possible solution is to add a factory hook to the bean that creates the SessionOptions object, e.g. (pseudo-code):
Supplier<SessionOptions> supplier = () -> {
    // default implementation: same behaviour as the try-with-resources block in afterPropertiesSet
};
public void setSessionOptionsSupplier(Supplier<SessionOptions> s) {
    ...
}
We could copy the class and adapt it to our needs, but supporting this directly would be a good enhancement.
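To make the proposal more concrete, here is a minimal, self-contained sketch of the supplier hook. It is only an illustration under assumptions: StubSessionOptions is a hypothetical stand-in for ai.onnxruntime.OrtSession.SessionOptions (so the example compiles without ONNX Runtime on the classpath), and EmbeddingModelSketch is a hypothetical, stripped-down class, not the real TransformersEmbeddingModel. The two setters mirror the names of the real setIntraOpNumThreads and setInterOpNumThreads methods, which are what we would call to cap CPU usage.

```java
import java.util.Objects;
import java.util.function.Supplier;

// Hypothetical stand-in for ai.onnxruntime.OrtSession.SessionOptions.
// Only the two thread-count setters are mirrored; 0 means "not set" in this sketch.
class StubSessionOptions implements AutoCloseable {
    int intraOpNumThreads = 0;
    int interOpNumThreads = 0;
    boolean closed = false;

    void setIntraOpNumThreads(int numThreads) { this.intraOpNumThreads = numThreads; }
    void setInterOpNumThreads(int numThreads) { this.interOpNumThreads = numThreads; }

    @Override
    public void close() { this.closed = true; }
}

// Hypothetical, stripped-down model class; NOT the real TransformersEmbeddingModel.
class EmbeddingModelSketch {
    // Default supplier: plain options, i.e. the same behaviour as today.
    private Supplier<StubSessionOptions> sessionOptionsSupplier = StubSessionOptions::new;

    // Recorded only so the sketch has something observable.
    int intraOpThreadsAtSessionCreation = -1;

    public void setSessionOptionsSupplier(Supplier<StubSessionOptions> supplier) {
        this.sessionOptionsSupplier = Objects.requireNonNull(supplier, "supplier");
    }

    public void afterPropertiesSet() {
        // The options are still closed by the model, exactly as in the current code.
        try (StubSessionOptions options = this.sessionOptionsSupplier.get()) {
            // The real class would call environment.createSession(modelBytes, options) here.
            this.intraOpThreadsAtSessionCreation = options.intraOpNumThreads;
        }
    }
}

// Example caller that limits the session to two intra-op threads.
class SupplierUsageSketch {
    static EmbeddingModelSketch modelLimitedToTwoThreads() {
        EmbeddingModelSketch model = new EmbeddingModelSketch();
        model.setSessionOptionsSupplier(() -> {
            StubSessionOptions options = new StubSessionOptions();
            options.setIntraOpNumThreads(2);
            options.setInterOpNumThreads(1);
            return options;
        });
        model.afterPropertiesSet();
        return model;
    }
}
```

With a hook like this, the default behaviour stays unchanged for existing users, while callers who need to limit threads (or configure another execution provider) can do so without copying the class.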