TransformersEmbeddingModel and configuration of Threads #3761

Open
@tschoellhorn

We are using org.springframework.ai.transformers.TransformersEmbeddingModel to compute embeddings for a large corpus.

This works fine, but computing the embeddings uses all available cores on our machine. Because other processes run on the same server, this is a problem for us.

Looking at the source code, this might be easily remedied in TransformersEmbeddingModel#afterPropertiesSet:

try (var sessionOptions = new OrtSession.SessionOptions()) {
	if (this.gpuDeviceId >= 0) {
		sessionOptions.addCUDA(this.gpuDeviceId); // Run on a GPU or with another
		// provider
	}
	this.session = this.environment.createSession(getCachedResource(this.modelResource).getContentAsByteArray(),
			sessionOptions);
}

TransformersEmbeddingModel gives the user no way to create or manipulate the OrtSession.SessionOptions. One solution might be to extend the bean with a factory method that creates the SessionOptions objects, e.g. (pseudo-code):

Supplier<OrtSession.SessionOptions> supplier = () -> {
  // default impl: same behaviour as the try block in afterPropertiesSet
};

public void setSessionOptionsSupplier(Supplier<OrtSession.SessionOptions> s) {
...
}

We can copy the class and adapt it to our needs, but this might be a good enhancement.
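
To make the proposal more concrete, here is a minimal, self-contained sketch of the supplier hook. It is not Spring AI code: SessionOptionsStandIn and EmbeddingModelSketch are hypothetical stand-ins so the example compiles without ONNX Runtime on the classpath. The real OrtSession.SessionOptions does offer setIntraOpNumThreads and setInterOpNumThreads, which is what we would call from a custom supplier to cap the thread count.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for ai.onnxruntime.OrtSession.SessionOptions.
// Assumption for this sketch: 0 means "let the runtime decide".
final class SessionOptionsStandIn implements AutoCloseable {
    int intraOpNumThreads = 0;
    int interOpNumThreads = 0;

    void setIntraOpNumThreads(int n) { this.intraOpNumThreads = n; }
    void setInterOpNumThreads(int n) { this.interOpNumThreads = n; }

    @Override
    public void close() {
        // The real class releases native resources here.
    }
}

// Hypothetical, trimmed-down model that contains only the proposed hook.
final class EmbeddingModelSketch {
    // Default supplier: plain options, i.e. today's behaviour.
    private Supplier<SessionOptionsStandIn> sessionOptionsSupplier = SessionOptionsStandIn::new;

    // Recorded only so the sketch can show which options were applied.
    int effectiveIntraOpThreads = -1;

    void setSessionOptionsSupplier(Supplier<SessionOptionsStandIn> supplier) {
        this.sessionOptionsSupplier = supplier;
    }

    void afterPropertiesSet() {
        try (SessionOptionsStandIn options = this.sessionOptionsSupplier.get()) {
            // The real class would create the session with these options here.
            this.effectiveIntraOpThreads = options.intraOpNumThreads;
        }
    }
}

final class SessionOptionsSupplierDemo {
    public static void main(String[] args) {
        EmbeddingModelSketch model = new EmbeddingModelSketch();
        // Custom supplier that limits the session to two threads.
        model.setSessionOptionsSupplier(() -> {
            SessionOptionsStandIn options = new SessionOptionsStandIn();
            options.setIntraOpNumThreads(2);
            options.setInterOpNumThreads(1);
            return options;
        });
        model.afterPropertiesSet();
        System.out.println("intra-op threads: " + model.effectiveIntraOpThreads);
    }
}
```

One caveat for a real implementation: the thread setters on the real SessionOptions declare the checked OrtException, so a plain java.util.function.Supplier lambda would have to wrap it, or the hook would need a functional interface that permits checked exceptions.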
