[ML] Delay copying chunked input strings #125837

Merged
4 commits merged into elastic:main on Apr 1, 2025

Conversation

@davidkyle (Member) commented Mar 28, 2025

The chunker stores the position of each chunk's text within the original text rather than making copies. However, when an inference request is made the actual chunk text is required, and at that point a copy must be made. The copying happens when String#substring() is called.

This PR reduces the lifetime of the string copies by returning a string Supplier from the chunker and performing the copy closer to where the request will be made. See RequestExecutorService:

.execute(task.getInferenceInputs(), requestSender, task.getRequestCompletedFunction(), task.getListener());
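As a rough illustration of the idea (a minimal sketch with hypothetical class and method names, not the actual Elasticsearch code), the chunker can record offsets into the original input and hand out suppliers, so the substring copy only happens when a request is actually built:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Sketch only: record offsets into the original input and return Suppliers,
// so String#substring() (the copy) runs only when the supplier is invoked.
class OffsetChunker {

    record ChunkOffset(int start, int end) {}

    static List<Supplier<String>> chunk(String input, int maxChunkLength) {
        List<Supplier<String>> chunks = new ArrayList<>();
        for (int start = 0; start < input.length(); start += maxChunkLength) {
            int end = Math.min(start + maxChunkLength, input.length());
            ChunkOffset offset = new ChunkOffset(start, end);
            // No copy is made here; the substring call is deferred.
            chunks.add(() -> input.substring(offset.start(), offset.end()));
        }
        return chunks;
    }
}
```

Each copy then exists only between the supplier call (for example while serialising the request body) and the point where the request is sent, rather than for the full lifetime of the batched chunks.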

As a follow-up, once #125567 is merged, EmbeddingInputs will be moved to Request#createHttpRequest() so that the string copy is made at the point the HTTP request is constructed, further reducing the lifespan of the copy.

I had to change the logic around InferenceInput#inputSize() to avoid calling the supplier function early just to find out whether there was more than one input.
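One way to handle that (again a hypothetical sketch, not the PR's exact code) is to take the input count from the list of suppliers itself, so callers can check how many inputs there are without materialising any chunk text:

```java
import java.util.List;
import java.util.function.Supplier;

// Sketch only: the size comes from the list of suppliers, so checking
// whether there is more than one input never invokes a supplier.
record LazyInferenceInputs(List<Supplier<String>> chunks) {

    int inputSize() {
        return chunks.size();
    }

    boolean hasMultipleInputs() {
        return inputSize() > 1;
    }
}
```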

@elasticsearchmachine elasticsearchmachine added the Team:ML Meta label for the ML team label Mar 28, 2025
@elasticsearchmachine (Collaborator)

Pinging @elastic/ml-core (Team:ML)

@jan-elastic (Contributor) left a comment

LGTM

@davidkyle davidkyle added the auto-backport Automatically create backport pull requests when merged label Mar 31, 2025
@davidkyle davidkyle enabled auto-merge (squash) March 31, 2025 10:02
@davidkyle davidkyle merged commit c521264 into elastic:main Apr 1, 2025
17 checks passed
@elasticsearchmachine (Collaborator)

💔 Backport failed

Branch 8.x: commit could not be cherry-picked due to conflicts

You can use sqren/backport to backport manually by running `backport --upstream elastic/elasticsearch --pr 125837`.

davidkyle added a commit to davidkyle/elasticsearch that referenced this pull request Apr 7, 2025
The chunked text is only required when the actual inference request is made. Using a string supplier means the string creation can be done much closer to where the request is made, reducing the lifespan of the copied string.

(cherry picked from commit c521264)

# Conflicts:
#	x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/chunking/EmbeddingRequestChunkerTests.java
@davidkyle (Member, Author)

💚 All backports created successfully

Branch 8.x: backport created successfully

Questions? Please refer to the Backport tool documentation.

elasticsearchmachine pushed a commit that referenced this pull request Apr 7, 2025
The chunked text is only required when the actual inference request is made. Using a string supplier means the string creation can be done much closer to where the request is made, reducing the lifespan of the copied string.

(cherry picked from commit c521264)

# Conflicts:
#	x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/chunking/EmbeddingRequestChunkerTests.java
Labels
auto-backport (Automatically create backport pull requests when merged), backport pending, :ml (Machine learning), >refactoring, Team:ML (Meta label for the ML team), v8.19.0, v9.1.0

4 participants