Add index_options to semantic_text field mappings #119967

Merged
kderusso merged 96 commits into elastic:main from the kderusso/semantic-text-index-options branch on Jun 17, 2025

Conversation

Member

@kderusso kderusso commented Jan 10, 2025

Adds index_options support for semantic_text fields using dense models.

Example:

PUT _inference/text_embedding/my-e5-model
{
  "service": "elasticsearch",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1,
    "model_id": ".multilingual-e5-small"
  }
}

PUT my-semantic-index
{
  "mappings": {
    "properties": {
      "inference_field": {
        "type": "semantic_text",
        "inference_id": "my-e5-model",
        "index_options": {
          "dense_vector": {
            "type": "bbq_hnsw",
            "ef_construction": 100
          }
        }
      }
    }
  }
}
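
For readers who script index creation, the mapping body above can also be built programmatically. The following is a minimal Python sketch and not part of this PR: `build_semantic_text_mapping` is a hypothetical helper, it uses only the standard library, and it only produces the JSON body for `PUT my-semantic-index`. Sending the request to a cluster is left out.

```python
import json


def build_semantic_text_mapping(field, inference_id, dense_vector_options=None):
    """Return an index-creation body with a single semantic_text field.

    Hypothetical helper for illustration only; just the JSON shape is
    taken from the example in this PR description.
    """
    field_mapping = {"type": "semantic_text", "inference_id": inference_id}
    if dense_vector_options:
        # index_options are nested under "dense_vector", as in the example above.
        field_mapping["index_options"] = {"dense_vector": dict(dense_vector_options)}
    return {"mappings": {"properties": {field: field_mapping}}}


body = build_semantic_text_mapping(
    "inference_field",
    "my-e5-model",
    {"type": "bbq_hnsw", "ef_construction": 100},
)
print(json.dumps(body, indent=2))
```

Calling the helper without `dense_vector_options` leaves `index_options` out of the body entirely, which presumably leaves the choice to the field's defaults.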

@kderusso kderusso force-pushed the kderusso/semantic-text-index-options branch from e096a61 to 342d769 Compare January 10, 2025 16:17
@kderusso kderusso force-pushed the kderusso/semantic-text-index-options branch from 342d769 to d822301 Compare January 10, 2025 16:29
@kderusso kderusso added the >enhancement, auto-backport, :SearchOrg/Relevance, and v8.18.0 labels Jan 10, 2025
@elasticsearchmachine
Collaborator

Hi @kderusso, I've created a changelog YAML for you.

@kderusso kderusso added the :Search Relevance/Search label Jan 10, 2025
@kderusso kderusso marked this pull request as ready for review January 10, 2025 16:38
@kderusso kderusso requested review from jimczi, Mikep86 and a team January 10, 2025 16:39
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@elasticsearchmachine
Collaborator

Pinging @elastic/search-eng (Team:SearchOrg)

@elasticsearchmachine
Collaborator

Pinging @elastic/search-relevance (Team:Search - Relevance)

Contributor

@Mikep86 Mikep86 left a comment

Good start to this! I have a bunch of comments, but they're mostly interrelated, so it's not as much as it seems.

@kderusso kderusso requested review from jimczi and Mikep86 June 16, 2025 20:16
Contributor

@jimczi jimczi left a comment

I left one comment regarding validation, but this feels very close, @kderusso!

Contributor

github-actions bot commented Jun 17, 2025

🔍 Preview links for changed docs:

🔔 The preview site may take up to 3 minutes to finish building. These links will become live once it completes.

Contributor

@Samiul-TheSoccerFan Samiul-TheSoccerFan left a comment

Nice implementation and tests, left a few nitpick comments.

Contributor

@jimczi jimczi left a comment

@kderusso kderusso merged commit 813814b into elastic:main Jun 17, 2025
27 checks passed
@elasticsearchmachine
Collaborator

💔 Backport failed

Status Branch Result
8.19 Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 119967

kderusso added a commit to kderusso/elasticsearch that referenced this pull request Jun 17, 2025
* Add index_options parameter to semantic_text field mapping

* Cleanup & tests

* Update docs

* Update docs/changelog/119967.yaml

* Addressed some PR feedback

* Update yaml tests

* Refactoring

* Cleanup

* Fix some tests

* Hack in inferring text_embedding task type from index options

* [CI] Auto commit changes from spotless

* Fix error inferring model settings

* Update docs

* Update tests

* Update docs/reference/mapping/types/semantic-text.asciidoc

Co-authored-by: Mike Pellegrini <[email protected]>

* Address some minor PR feedback

* Remove partial model_settings with inferred task type

* Cleanup

* Remove unnecessary changes

* Fix errors from merge

* [CI] Auto commit changes from spotless

* Cleanup

* Checkpoint, saving changes before merge

* Update parsing

* [CI] Auto commit changes from spotless

* Stash changes

* Fix compile errors

* [CI] Auto commit changes from spotless

* Cleanup error

* fix test

* fix test

* Fix another test

* A bit of cleanup

* Fix tests

* Spotless

* Respect index options if set over defaults

* Cleanup

* [CI] Auto commit changes from spotless

* Support updating to compatible versions, add some cleanup and validation

* Remove test that can't be done here - needs to be unit test

* Add validation

* Cleanup

* Fix some yaml tests

* [CI] Auto commit changes from spotless

* Happy path early index validation works now; edge cases surrounding default BBQ remain

* Always emit index options, even when using defaults

* Minor cleanup

* Fix test compilation failures

* Fix some tests

* Continue to iterate on test failures

* Remove index options from inference field metadata as it is only needed at field creation time

* Fix some tests

* Remove transport version, no longer needed

* Fix yaml tests

* Add tests

* IndexOptions don't need to implement Writeable

* [CI] Auto commit changes from spotless

* Refactor - move SemanticTextIndexOptions

* Remove writeable

* Move index_options parsing to semantic text field mapper

* Cleanup

* Fix test compilation issue

* Cleanup

* Remove whitespace

* Remove writeables from index options

* Disable merging null options?

* Add docs

* [CI] Auto commit changes from spotless

* Revert "Disable merging null options?"

This reverts commit 2ef8b1d.

* Remove default serialization

* Include default index option type to defaults

* [CI] Auto commit changes from spotless

* Go back to allowing null updates

* Cleanup

* Fix validation error

* Revert "Include default index option type to defaults"

This reverts commit b08e2a1.

* Update tests

* Revert "Update tests"

This reverts commit aedfafe.

* Better fix for null inputs

* Remove redundant merge validation

---------

Co-authored-by: elasticsearchmachine <[email protected]>
Co-authored-by: Mike Pellegrini <[email protected]>
(cherry picked from commit 813814b)

# Conflicts:
#	docs/reference/elasticsearch/mapping-reference/semantic-text.md
#	server/src/main/java/org/elasticsearch/index/mapper/vectors/DenseVectorFieldMapper.java
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/mapper/SemanticTextFieldMapper.java
#	x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/mapper/SemanticTextFieldMapperTests.java
@kderusso
Member Author

💚 All backports created successfully

Status Branch Result
8.19

Questions?

Please refer to the Backport tool documentation

@kderusso
Member Author

Had to create a manual backport: #129626

kderusso added a commit that referenced this pull request Jun 18, 2025
…129626)

* Add index_options to semantic_text field mappings (#119967)

* Fix errors in backport merge
Labels
auto-backport (Automatically create backport pull requests when merged)
backport pending
>enhancement
:Search Relevance/Search (Catch all for Search Relevance)
:SearchOrg/Relevance (Label for the Search (solution/org) Relevance team)
Team:Search - Relevance (The Search organization Search Relevance team)
Team:Search Relevance (Meta label for the Search Relevance team in Elasticsearch)
Team:SearchOrg (Meta label for the Search Org (Enterprise Search))
v8.19.0
v9.1.0
6 participants