[ML] Custom Inference Service #125679

davidkyle · 2025-03-26T15:47:50Z

See this PR: #127939

elasticsearchmachine · 2025-03-26T15:48:16Z

Hi @davidkyle, I've created a changelog YAML for you.

...ce/src/main/java/org/elasticsearch/xpack/inference/services/custom/CustomSecretSettings.java

...ence/src/main/java/org/elasticsearch/xpack/inference/services/custom/CustomTaskSettings.java

weizijun · 2025-03-28T05:57:21Z

...inference/src/main/java/org/elasticsearch/xpack/inference/services/custom/CustomService.java

+        TimeValue timeout,
+        ActionListener<List<ChunkedInference>> listener
+    ) {
+        listener.onFailure(new ElasticsearchStatusException("Chunking not supported by the {} service", RestStatus.BAD_REQUEST, NAME));


we will see how to support this in the next PR, to support semantic_text

…nference-service

davidkyle · 2025-04-11T15:10:40Z

@weizijun the Elasticsearch security team have advised against adding JsonPATH as a dependency, the last commit to the GitHub project was over a year ago and the project does not appear to be activity maintained. If a critical vulnerability was found in JsonPATH Elasticsearch would be exposed to it and there are no guarantees that the CVE would be fixed.

The team at Elastic considered using another Json path library but have decided to implement the features we need ourselves. The Elasticsearch code base already contains a lot of code for parsing JSON that we can use and writing our own implementation avoids adding another dependency.

…nference-service

weizijun · 2025-04-17T02:14:08Z

@weizijun the Elasticsearch security team have advised against adding JsonPATH as a dependency, the last commit to the GitHub project was over a year ago and the project does not appear to be activity maintained. If a critical vulnerability was found in JsonPATH Elasticsearch would be exposed to it and there are no guarantees that the CVE would be fixed.

The team at Elastic considered using another Json path library but have decided to implement the features we need ourselves. The Elasticsearch code base already contains a lot of code for parsing JSON that we can use and writing our own implementation avoids adding another dependency.

Yeah, It's ok.

…icsearch into custom-inference-service

…nference-service

…icsearch into custom-inference-service

jonathan-buttner · 2025-05-02T18:38:50Z

...src/main/java/org/elasticsearch/xpack/inference/external/http/retry/BaseResponseHandler.java

@@ -36,7 +36,7 @@ public abstract class BaseResponseHandler implements ResponseHandler {
    public static final String METHOD_NOT_ALLOWED = "Received a method not allowed status code";

    protected final String requestType;
-    private final ResponseParser parseFunction;
+    protected final ResponseParser parseFunction;


Making this available so the custom response handler can immediately return on a parse failure instead of retrying.

jonathan-buttner · 2025-05-02T19:02:40Z

...inference/src/main/java/org/elasticsearch/xpack/inference/services/custom/CustomService.java

+        private static final LazyInitializable<InferenceServiceConfiguration, RuntimeException> configuration = new LazyInitializable<>(
+            () -> {
+                var configurationMap = new HashMap<String, SettingsConfiguration>();
+                // TODO revisit this


We'll need to create some more complex configuration types to support the fields (like maps, lists of lists etc). Maybe for now we don't expose this in the services API?

jonathan-buttner · 2025-05-02T19:06:11Z

...e/src/main/java/org/elasticsearch/xpack/inference/services/custom/CustomServiceSettings.java

+
+        Map<String, Object> headers = extractOptionalMap(map, HEADERS, ModelConfigurations.SERVICE_SETTINGS, validationException);
+        removeNullValues(headers);
+        var stringHeaders = validateMapStringValues(headers, HEADERS, validationException, false);


This should limit the values in the header map to only strings.

jonathan-buttner · 2025-05-02T19:07:09Z

...ence/src/main/java/org/elasticsearch/xpack/inference/services/custom/CustomTaskSettings.java

+        removeNullValues(parameters);
+        validateMapValues(
+            parameters,
+            List.of(String.class, Integer.class, Double.class, Float.class, Boolean.class),


Restricting the task settings to these types (no nested fields aka maps or lists).

jonathan-buttner · 2025-05-02T19:07:58Z

...ference/src/main/java/org/elasticsearch/xpack/inference/services/custom/QueryParameters.java

+    public static final String QUERY_PARAMETERS = "query_parameters";
+
+    public static QueryParameters fromMap(Map<String, Object> map, ValidationException validationException) {
+        List<Tuple<String, String>> queryParams = extractOptionalListOfStringTuples(


Query parameters can have duplicate keys which is why I'm not using a map here.

jonathan-buttner · 2025-05-02T19:11:09Z

...e/src/main/java/org/elasticsearch/xpack/inference/services/custom/request/CustomRequest.java

+        uri = buildUri();
+    }
+
+    private static void addStringParams(Map<String, String> stringParams, Map<String, ?> paramsToAdd) {


Fields like the url, query parameters, and headers should not have their values converted to json format. This only accepts strings and doesn't manipulate them.

jonathan-buttner · 2025-05-02T19:11:32Z

...e/src/main/java/org/elasticsearch/xpack/inference/services/custom/request/CustomRequest.java

+        }
+    }
+
+    private static void addJsonStringParams(Map<String, String> jsonStringParams, Map<String, ?> params) {


Fields like the request body need to be a valid json object so we'll convert the values into json

jonathan-buttner · 2025-05-02T19:14:53Z

.../main/java/org/elasticsearch/xpack/inference/services/settings/SerializableSecureString.java

+import java.io.IOException;
+import java.util.Objects;
+
+public class SerializableSecureString implements ToXContentFragment, Writeable {


If we need to serialize the api key or some secrets to the body of a request this class will make that process a little easier by implementing toXContent()

jonathan-buttner · 2025-05-02T20:41:45Z

...inference/src/test/java/org/elasticsearch/xpack/inference/services/AbstractServiceTests.java

+import static org.hamcrest.Matchers.is;
+import static org.mockito.Mockito.mock;
+
+/**


This class is an attempt to push a lot of the duplicate logic in the inference service tests into a central place. If we create more services we should leverage this base class to remove the copy/paste.

elasticsearchmachine · 2025-05-02T21:06:16Z

Pinging @elastic/ml-core (Team:ML)

…nference-service

…r' into custom-inference-service

jonathan-buttner · 2025-05-08T17:28:46Z

...ain/java/org/elasticsearch/xpack/inference/services/custom/response/ErrorResponseParser.java

@@ -97,7 +105,14 @@ public ErrorResponse apply(HttpResult httpResult) {
            var errorText = toType(MapPathExtractor.extract(map, messagePath).extractedObject(), String.class, messagePath);
            return new ErrorResponse(errorText);
        } catch (Exception e) {
-            // swallow the error
+            logger.info(


I ran into a scenario where azure openai didn't return a json response. Normally if that happens we'd swallow the parse error but wouldn't report anything useful back. With this change we'll log the parse failure. The error parsing logic should only be called if we receive a failure status code. If many requests fail, and we are unable to parse the error we could log many errors.

…nference-service

jonathan-buttner · 2025-05-08T19:42:41Z

I'll open a different PR that doesn't have 500+ changes 😆

Huaixinww and others added 9 commits March 26, 2025 15:43

add inference custom model

4f4c603

add unit test

e53b2e4

spotless apply

c593b1a

add custom validation

0a851c1

xpack core spotless apply

e240f06

update commons-lang3's version

3ea3053

Fix compilation after rebase

83daf69

Add missing licences and fix build checks

3cb0cfb

Remove some unused code

a3c862c

davidkyle added >enhancement :ml Machine learning v9.1.0 labels Mar 26, 2025

Update docs/changelog/125679.yaml

2b7e6fe

davidkyle mentioned this pull request Mar 27, 2025

[Inference API] Add Custom Model support to Inference API #124299

Closed

Fix services it

6cc593d

weizijun reviewed Mar 28, 2025

View reviewed changes

jonathan-buttner added 6 commits April 7, 2025 14:34

Merge branch 'main' of github.com:elastic/elasticsearch into custom-i…

a4630e3

…nference-service

Contuing refactor of service settings

95f23f0

Merge branch 'main' of github.com:elastic/elasticsearch into custom-i…

014f95b

…nference-service

Moving classes to reflect new structure

189edba

Refactoring service settings

4fe3a1f

Refactoring the request

4ef37f5

jonathan-buttner added 4 commits April 11, 2025 16:14

Adding files to handle generic error response

6bac18b

Making progress on tests

f644471

Merge branch 'main' of github.com:elastic/elasticsearch into custom-i…

11cf7cc

…nference-service

Adding more tests

f962d74

jonathan-buttner mentioned this pull request Apr 17, 2025

[ML] Implement JSONPath replacement for Inference API #127036

Merged

jonathan-buttner and others added 8 commits April 29, 2025 17:09

Removing licenses

d13191c

Adding custom service tests

12d46d7

Merge branch 'custom-inference-service' of github.com:davidkyle/elast…

0134346

…icsearch into custom-inference-service

[CI] Auto commit changes from spotless

e6fefc4

Correcting tranport version number

eef7188

Merge branch 'main' of github.com:elastic/elasticsearch into custom-i…

83837c8

…nference-service

Merge branch 'custom-inference-service' of github.com:davidkyle/elast…

dc02425

…icsearch into custom-inference-service

Cleaning up

59f75b9

jonathan-buttner reviewed May 2, 2025

View reviewed changes

jonathan-buttner marked this pull request as ready for review May 2, 2025 21:05

jonathan-buttner added Team:ML Meta label for the ML team auto-backport Automatically create backport pull requests when merged v8.19.0 labels May 2, 2025

jonathan-buttner added 7 commits May 5, 2025 10:39

Fixing counts

8a82163

Merge branch 'main' of github.com:elastic/elasticsearch into custom-i…

5f13d28

…nference-service

Fixing rerank and chat completions

c211d83

Missing a few changes

133ef4e

Passing request to the error response handler

5c28ee8

Merge remote-tracking branch 'origin/ml-expose-request-in-error-parse…

be84291

…r' into custom-inference-service

Adding inference id to error parser log message

8d1bd22

jonathan-buttner reviewed May 8, 2025

View reviewed changes

jonathan-buttner added 3 commits May 8, 2025 14:50

Reverting exposing request to error parsing logic

a0984c7

Refactoring the error parsing logic

4242a37

Merge branch 'main' of github.com:elastic/elasticsearch into custom-i…

6492cd7

…nference-service

jonathan-buttner force-pushed the custom-inference-service branch from 1ba25d3 to 6492cd7 Compare May 8, 2025 19:40

jonathan-buttner requested review from a team as code owners May 8, 2025 19:40

jonathan-buttner closed this May 8, 2025

[ML] Custom Inference Service #125679

[ML] Custom Inference Service #125679

Uh oh!

Conversation

davidkyle commented Mar 26, 2025 • edited by jonathan-buttner Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

elasticsearchmachine commented Mar 26, 2025

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

davidkyle commented Apr 11, 2025

Uh oh!

weizijun commented Apr 17, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

elasticsearchmachine commented May 2, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jonathan-buttner commented May 8, 2025

Uh oh!

Uh oh!

davidkyle commented Mar 26, 2025 •

edited by jonathan-buttner

Loading