[ML] Integrate SageMaker with OpenAI Embeddings #126856
Conversation
Hi @prwhelan, I've created a changelog YAML for you.
Looking good! Just left a few thoughts.
    return builder.endObject();
}

private static <T> void optionalField(String name, T value, XContentBuilder builder) throws IOException {
Nice, might be helpful to have this in a utility class somewhere eventually because we have to do stuff like this a lot.
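A minimal sketch of what such a shared helper could look like; the class name and placement are assumptions, not an existing utility:

```java
import java.io.IOException;

import org.elasticsearch.xcontent.XContentBuilder;

// Hypothetical shared utility for the "only write the field when the value is
// present" pattern discussed above; name and location are illustrative only.
public final class OptionalXContentFields {

    private OptionalXContentFields() {}

    // Mirrors the private optionalField helper in the diff: emit "name": value
    // only when the value is non-null.
    public static <T> void optionalField(String name, T value, XContentBuilder builder) throws IOException {
        if (value != null) {
            builder.field(name, value);
        }
    }
}
```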
Pinging @elastic/ml-core (Team:ML)
Looks good! Just a reminder to add docs in the elasticsearch-specification repo.
    return Collections.unmodifiableMap(configurationMap);
});
new LazyInitializable<>(
    () -> configuration(EnumSet.of(TaskType.TEXT_EMBEDDING, TaskType.COMPLETION)).collect(
nit: Would `Map.of()` work instead of using a stream?
Oh I see we're combining multiple streams in a separate place 👍
LGTM
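For context on the exchange above, an illustrative comparison (the names are made up, not the actual service code): `Map.of()` fits a fixed set of entries, while collecting a stream of `Map.Entry` is the natural fit when entries from several sources are concatenated first.

```java
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ConfigurationMapSketch {

    // A fixed set of entries: Map.of() is the simplest option.
    static Map<String, String> fixedEntries() {
        return Map.of("api", "The API format", "region", "The AWS region");
    }

    // Entries assembled from multiple sources (e.g. base settings plus
    // per-API settings): concatenating streams and collecting keeps it in
    // one expression, which Map.of() cannot express.
    static Map<String, String> combinedEntries(
        Stream<Map.Entry<String, String>> baseSettings,
        Stream<Map.Entry<String, String>> apiSettings
    ) {
        return Stream.concat(baseSettings, apiSettings)
            .collect(Collectors.toUnmodifiableMap(Map.Entry::getKey, Map.Entry::getValue));
    }
}
```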
} else {
    ExceptionsHelper.maybeError(t).ifPresent(ExceptionsHelper::maybeDieOnAnotherThread);
    log.atWarn().withThrowable(t).log("Unknown failure calling SageMaker.");
    listener.onFailure(new RuntimeException("Unknown failure calling SageMaker."));
Suggested change:
- listener.onFailure(new RuntimeException("Unknown failure calling SageMaker."));
+ listener.onFailure(new RuntimeException("Unknown failure calling SageMaker.", t));
public void subscribe(Flow.Subscriber<? super ResponseStream> subscriber) {
    if (holder.compareAndSet(null, Tuple.tuple(null, subscriber)) == false) {
        log.debug("Subscriber connecting to publisher.");
        var publisher = holder.getAndSet(null).v1();
Other implementations of this method call `onError()` if a subscriber is already set; should this do the same?
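For reference, a minimal sketch (not the actual ResponseStream publisher) of the convention the question refers to: a single-use `Flow.Publisher` that rejects a second subscriber with `onError` rather than silently ignoring it.

```java
import java.util.concurrent.Flow;
import java.util.concurrent.atomic.AtomicReference;

// Illustrative only: a single-subscriber publisher that signals onError when a
// second subscriber arrives, matching the Flow/Reactive Streams convention.
class SingleSubscriberPublisher<T> implements Flow.Publisher<T> {

    private final AtomicReference<Flow.Subscriber<? super T>> subscriber = new AtomicReference<>();

    @Override
    public void subscribe(Flow.Subscriber<? super T> newSubscriber) {
        if (subscriber.compareAndSet(null, newSubscriber) == false) {
            // The rejected subscriber still receives onSubscribe before the
            // terminal onError signal, per the Flow contract.
            newSubscriber.onSubscribe(new Flow.Subscription() {
                @Override
                public void request(long n) {}

                @Override
                public void cancel() {}
            });
            newSubscriber.onError(new IllegalStateException("only one subscriber is supported"));
            return;
        }
        // ... continue wiring up the accepted subscriber here
    }
}
```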
    Map<String, Object> config,
    ActionListener<Model> parsedModelListener
) {
    ActionListener.completeWith(parsedModelListener, () -> modelBuilder.fromRequest(modelId, taskType, NAME, config));
Nice
public class SageMakerService implements InferenceService {
    public static final String NAME = "sagemaker";
    private static final int DEFAULT_BATCH_SIZE = 2048;
Seems like a big number. 2048 may be an optimal size for SageMaker, but a batch this size would use quite a lot of memory and isn't sympathetic to how the inference API works.
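To illustrate the concern (this is not the SageMakerService code): whatever the batching looks like internally, a default of 2048 means up to that many inputs are buffered and serialized into a single endpoint invocation.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only. It shows what a default batch size typically
// controls: how many inputs are grouped into one request, and therefore how
// much input text is held in memory per invocation.
class BatchingSketch {

    private static final int DEFAULT_BATCH_SIZE = 2048; // the value under discussion

    static List<List<String>> partition(List<String> inputs) {
        List<List<String>> batches = new ArrayList<>();
        for (int from = 0; from < inputs.size(); from += DEFAULT_BATCH_SIZE) {
            int to = Math.min(from + DEFAULT_BATCH_SIZE, inputs.size());
            // Each sublist becomes one request payload, so a full batch of
            // 2048 documents is buffered and serialized at once.
            batches.add(inputs.subList(from, to));
        }
        return batches;
    }
}
```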
Map.entry(
    API,
    new SettingsConfiguration.Builder(supportedTaskTypes).setDescription("The API format that your SageMaker Endpoint expects.")
        .setLabel("Api")
.setLabel("Api") | |
.setLabel("API") |
public final void testXContentRoundTrip() throws IOException {
    var instance = createTestInstance();
    var instanceAsMap = toMap(instance);
    var roundTripInstance = fromMutableMap(new HashMap<>(instanceAsMap));
🙇
💔 Backport failed
You can use sqren/backport to manually backport by running:
Integrating with SageMaker.

Current design:
- SageMaker accepts any byte payload, which can be text, CSV, or JSON. `api` represents the structure of the payload that we will send, for example `openai`, `elastic`, `common`, and probably `cohere` or `huggingface` as well.
- `api` implementations are extensions of `SageMakerSchemaPayload` (see the sketch after this list), which supports:
  - "extra" service and task settings specific to the payload structure, so `cohere` would require `embedding_type` and `openai` would require `dimensions` in the `service_settings`
  - conversion logic from the model, service settings, task settings, and input to `SdkBytes`
  - conversion logic from the response `SdkBytes` to `InferenceServiceResults`
- Everything else is tunneling: there are a number of base `service_settings` and `task_settings`, independent of the API format, that we will store and set.
- We let the SDK do the bulk of the work in terms of connection details, rate limiting, retries, etc.
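A rough sketch of the payload abstraction described above. The method names and placeholder types are assumptions for illustration; the actual `SageMakerSchemaPayload` interface in this PR may differ.

```java
import software.amazon.awssdk.core.SdkBytes;

// Hypothetical sketch of the per-api payload extension point described above;
// names and placeholder types are illustrative, not the PR's actual interface.
interface SageMakerSchemaPayloadSketch {

    // Placeholder types standing in for the real model/settings/result classes.
    interface ModelAndSettings {}   // model + base and "extra" service/task settings
    interface InferenceInput {}     // the caller's input, e.g. the texts to embed
    interface InferenceResults {}   // stands in for InferenceServiceResults

    // Which `api` value this payload implements, e.g. "openai" or "elastic".
    String api();

    // Conversion logic from model, settings, and input to the raw bytes sent
    // to the SageMaker endpoint.
    SdkBytes requestBytes(ModelAndSettings model, InferenceInput input);

    // Conversion logic from the endpoint's response bytes back to results.
    InferenceResults responseBody(ModelAndSettings model, SdkBytes response);
}
```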