TransformersEmbeddingModel and configuration of Threads

We are using org.springframework.ai.transformers.TransformersEmbeddingModel to compute embeddings for a large corpus.

While this works fine, the embedding computation uses *all* available cores on our machine. This is a problem for us because other processes run on the same server and are left with no free cores.

Looking at the source code, this could probably be remedied fairly easily in TransformersEmbeddingModel#afterPropertiesSet, where the ONNX Runtime session is created:

```
try (var sessionOptions = new OrtSession.SessionOptions()) {
	if (this.gpuDeviceId >= 0) {
		sessionOptions.addCUDA(this.gpuDeviceId); // Run on a GPU or with another
		// provider
	}
	this.session = this.environment.createSession(getCachedResource(this.modelResource).getContentAsByteArray(),
			sessionOptions);
}
```

TransformersEmbeddingModel gives the user no way to create or customize the OrtSession.SessionOptions, which is where ONNX Runtime's thread settings are configured. One possible solution would be to give the bean a factory method that creates the SessionOptions object, e.g. (pseudo-code):

```
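// Default implementation: same behaviour as the try-with-resources block in afterPropertiesSet
private Supplier<OrtSession.SessionOptions> sessionOptionsSupplier = () -> {
	...
};

public void setSessionOptionsSupplier(Supplier<OrtSession.SessionOptions> sessionOptionsSupplier) {
	this.sessionOptionsSupplier = sessionOptionsSupplier;
}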
```
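
To illustrate how a caller might use such a hook to cap the thread count, here is a minimal, self-contained sketch. It is not the actual Spring AI or ONNX Runtime code: StubSessionOptions and StubEmbeddingModel are hypothetical stand-ins so that the example compiles without any dependencies. The setter names mirror setIntraOpNumThreads(int) and setInterOpNumThreads(int), which the real OrtSession.SessionOptions provides (there they declare OrtException).

```java
import java.util.function.Supplier;

public class SessionOptionsSupplierSketch {

    /** Hypothetical stand-in for OrtSession.SessionOptions (assumption, not the real class). */
    static class StubSessionOptions {
        private int intraOpNumThreads = 0; // 0 is assumed to mean "runtime default"
        private int interOpNumThreads = 0;

        void setIntraOpNumThreads(int n) { this.intraOpNumThreads = n; }
        void setInterOpNumThreads(int n) { this.interOpNumThreads = n; }
        int getIntraOpNumThreads() { return this.intraOpNumThreads; }
        int getInterOpNumThreads() { return this.interOpNumThreads; }
    }

    /** Hypothetical stand-in for the embedding model, reduced to the proposed hook. */
    static class StubEmbeddingModel {
        // Default supplier: plain options, i.e. unchanged behaviour
        private Supplier<StubSessionOptions> sessionOptionsSupplier = StubSessionOptions::new;
        private StubSessionOptions appliedOptions;

        void setSessionOptionsSupplier(Supplier<StubSessionOptions> supplier) {
            this.sessionOptionsSupplier = supplier;
        }

        void afterPropertiesSet() {
            // The real class would hand these options to environment.createSession(...)
            this.appliedOptions = this.sessionOptionsSupplier.get();
        }

        StubSessionOptions getAppliedOptions() { return this.appliedOptions; }
    }

    /** Builds a model whose session options are capped at the given number of intra-op threads. */
    static StubEmbeddingModel modelWithThreadLimit(int threads) {
        StubEmbeddingModel model = new StubEmbeddingModel();
        model.setSessionOptionsSupplier(() -> {
            StubSessionOptions options = new StubSessionOptions();
            options.setIntraOpNumThreads(threads); // threads used inside a single operator
            options.setInterOpNumThreads(1);       // no parallelism across operators
            return options;
        });
        model.afterPropertiesSet();
        return model;
    }

    public static void main(String[] args) {
        StubEmbeddingModel model = modelWithThreadLimit(4);
        System.out.println("intra-op threads: " + model.getAppliedOptions().getIntraOpNumThreads());
    }
}
```

With a hook like this, the default supplier keeps today's behaviour, and only users who need a thread limit (or other session settings) have to provide their own supplier.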

While we can copy the class and modify it to suit our needs, we think this would be a worthwhile enhancement.




TransformersEmbeddingModel and configuration of Threads #3761
