Add max.chunks to EmbeddingRequestChunker to prevent OOM #123150

Merged
jan-elastic merged 20 commits into main from EmbeddingRequestChunker-max-chunks-2 on Mar 13, 2025

jan-elastic
Contributor

Fixes: #116022

jan-elastic added the :ml (Machine learning) and Team:ML (Meta label for the ML team) labels on Feb 21, 2025
elasticsearchmachine added the needs:triage (Requires assignment of a team area label) and v9.1.0 labels on Feb 21, 2025
jan-elastic added the >feature label and removed the needs:triage (Requires assignment of a team area label) label on Feb 21, 2025
jan-elastic marked this pull request as draft on February 21, 2025 14:38
jan-elastic force-pushed the EmbeddingRequestChunker-max-chunks-2 branch from a90c5b1 to 1140130 on February 21, 2025 14:49
jan-elastic force-pushed the EmbeddingRequestChunker-max-chunks-2 branch from 786c9cb to 70787fc on March 3, 2025 08:58
jan-elastic marked this pull request as ready for review on March 5, 2025 09:48
elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

elasticsearchmachine
Collaborator

Hi @jan-elastic, I've created a changelog YAML for you.

jan-elastic force-pushed the EmbeddingRequestChunker-max-chunks-2 branch 4 times, most recently from 2477af2 to ebbcd12 on March 7, 2025 08:34
jan-elastic force-pushed the EmbeddingRequestChunker-max-chunks-2 branch from bce99c3 to fd5fc22 on March 7, 2025 11:38
Member

davidkyle left a comment

LGTM

wwang500 added the cloud-deploy (Publish cloud docker image for Cloud-First-Testing) label on Mar 10, 2025
elasticsearchmachine
Collaborator

Hi @jan-elastic, I've created a changelog YAML for you.

jan-elastic merged commit a503497 into main on Mar 13, 2025
19 checks passed
jan-elastic deleted the EmbeddingRequestChunker-max-chunks-2 branch on March 13, 2025 10:38
albertzaharovits pushed a commit to albertzaharovits/elasticsearch that referenced this pull request Mar 13, 2025
)

* add max number of chunks

* wire merge function

* implement sparse merge function

* move tests to correct package/file

* float merge function

* bytes merge function

* more accurate byte average

* spotless

* Fix/improve EmbeddingRequestChunkerTests

* Remove TODO

* remove unnecessary field

* remove Chunk generic

* add TODO

* Remove specialized chunks

* add comment

* Update docs/changelog/123150.yaml

* update changelog
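
For readers skimming the commit list: the "float merge function", "bytes merge function", and "more accurate byte average" commits suggest that chunk embeddings beyond the maximum are combined by averaging. The sketch below only illustrates that general idea. It is an assumption about the approach, not code taken from this PR, and `EmbeddingMergeSketch`, `FloatAccumulator`, and `averageBytes` are hypothetical names.

```java
import java.util.List;

// Hypothetical sketch, not the PR's code: fold several chunk embeddings into one by averaging.
class EmbeddingMergeSketch {

    /** Accumulates float embeddings and returns their element-wise mean. */
    static final class FloatAccumulator {
        private final double[] sum; // double sums limit rounding drift across many chunks
        private int count;

        FloatAccumulator(int dimensions) {
            this.sum = new double[dimensions];
        }

        void add(float[] embedding) {
            if (embedding.length != sum.length) {
                throw new IllegalArgumentException("dimension mismatch");
            }
            for (int i = 0; i < sum.length; i++) {
                sum[i] += embedding[i];
            }
            count++;
        }

        float[] average() {
            if (count == 0) {
                throw new IllegalStateException("no embeddings added");
            }
            float[] result = new float[sum.length];
            for (int i = 0; i < sum.length; i++) {
                result[i] = (float) (sum[i] / count);
            }
            return result;
        }
    }

    /** Averages byte embeddings, summing in int and rounding once at the end. */
    static byte[] averageBytes(List<byte[]> embeddings) {
        if (embeddings.isEmpty()) {
            throw new IllegalArgumentException("no embeddings to average");
        }
        int dimensions = embeddings.get(0).length;
        int[] sums = new int[dimensions];
        for (byte[] embedding : embeddings) {
            if (embedding.length != dimensions) {
                throw new IllegalArgumentException("dimension mismatch");
            }
            for (int i = 0; i < dimensions; i++) {
                sums[i] += embedding[i];
            }
        }
        byte[] result = new byte[dimensions];
        for (int i = 0; i < dimensions; i++) {
            // The mean of byte values always fits in a byte, so the cast cannot overflow.
            result[i] = (byte) Math.round((double) sums[i] / embeddings.size());
        }
        return result;
    }
}
```

Summing in a wider type and rounding once avoids the error that builds up when byte vectors are averaged pairwise and truncated at each step. Whether the PR's byte average works this way is not stated on this page.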
jfreden pushed a commit to jfreden/elasticsearch that referenced this pull request Mar 13, 2025
davidkyle
Member

💚 All backports created successfully

Branch: 8.x

Questions?

Please refer to the Backport tool documentation

davidkyle pushed a commit to davidkyle/elasticsearch that referenced this pull request Apr 7, 2025

(cherry picked from commit a503497)

# Conflicts:
#	x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/inference/results/SparseEmbeddingResults.java
#	x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/inference/results/TextEmbeddingByteResults.java
#	x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/inference/results/TextEmbeddingFloatResults.java
#	x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/services/amazonbedrock/AmazonBedrockServiceTests.java
#	x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/services/cohere/action/CohereEmbeddingsActionTests.java
#	x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/services/googleaistudio/action/GoogleAiStudioEmbeddingsActionTests.java
#	x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/services/huggingface/action/HuggingFaceActionCreatorTests.java
#	x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/services/openai/action/OpenAiEmbeddingsActionTests.java
#	x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/services/voyageai/VoyageAIServiceTests.java
#	x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/services/voyageai/action/VoyageAIActionCreatorTests.java
#	x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/services/voyageai/action/VoyageAIEmbeddingsActionTests.java
elasticsearchmachine added a commit that referenced this pull request Apr 7, 2025
…) (#126383)

* Add max.chunks to EmbeddingRequestChunker to prevent OOM (#123150)

* [CI] Auto commit changes from spotless

* Revert deleting esql

---------

Co-authored-by: Jan Kuipers <[email protected]>
Co-authored-by: elasticsearchmachine <[email protected]>
Labels
cloud-deploy (Publish cloud docker image for Cloud-First-Testing), >feature, :ml (Machine learning), Team:ML (Meta label for the ML team), v9.1.0
Development

Successfully merging this pull request may close these issues.

OOM when performing inference on an extremely large document with a semantic_text field
4 participants