IllegalStateException thrown in SemanticQueryBuilder #116106

Open
bpintea opened this issue Nov 1, 2024 · 16 comments
Labels
>bug priority:normal A label for assessing bug priority to be used by ES engineers :SearchOrg/Relevance Label for the Search (solution/org) Relevance team Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch

Comments

@bpintea
Contributor

bpintea commented Nov 1, 2024

Description

A _field_caps request with params {types=, ignore_unavailable=true, expand_wildcards=open, allow_no_indices=true, index=*,-.*, serverlessRequest=true, include_empty_fields=false} returns status 500, and the following suppressed exception is logged:

java.lang.IllegalStateException: No inference results set for [semantic_text] field [description_embedding]
	at [email protected]/org.elasticsearch.xpack.inference.queries.SemanticQueryBuilder.doRewriteBuildSemanticQuery(SemanticQueryBuilder.java:169)
	at [email protected]/org.elasticsearch.xpack.inference.queries.SemanticQueryBuilder.doRewrite(SemanticQueryBuilder.java:155)
	at [email protected]/org.elasticsearch.index.query.AbstractQueryBuilder.rewrite(AbstractQueryBuilder.java:281)
	at [email protected]/org.elasticsearch.index.query.BoolQueryBuilder.rewriteClauses(BoolQueryBuilder.java:399)
	at [email protected]/org.elasticsearch.index.query.BoolQueryBuilder.doRewrite(BoolQueryBuilder.java:353)
	at [email protected]/org.elasticsearch.index.query.AbstractQueryBuilder.rewrite(AbstractQueryBuilder.java:281)
	at [email protected]/org.elasticsearch.index.query.BoolQueryBuilder.rewriteClauses(BoolQueryBuilder.java:399)
	at [email protected]/org.elasticsearch.index.query.BoolQueryBuilder.doRewrite(BoolQueryBuilder.java:353)
	at [email protected]/org.elasticsearch.index.query.AbstractQueryBuilder.rewrite(AbstractQueryBuilder.java:281)
	at [email protected]/org.elasticsearch.search.builder.SubSearchSourceBuilder.rewrite(SubSearchSourceBuilder.java:90)
	at [email protected]/org.elasticsearch.search.builder.SubSearchSourceBuilder.rewrite(SubSearchSourceBuilder.java:38)
	at [email protected]/org.elasticsearch.index.query.Rewriteable.rewrite(Rewriteable.java:57)
	at [email protected]/org.elasticsearch.index.query.Rewriteable.rewrite(Rewriteable.java:40)
	at [email protected]/org.elasticsearch.index.query.Rewriteable.rewrite(Rewriteable.java:125)
	at [email protected]/org.elasticsearch.search.builder.SearchSourceBuilder.rewrite(SearchSourceBuilder.java:1178)
	at [email protected]/org.elasticsearch.search.builder.SearchSourceBuilder.rewrite(SearchSourceBuilder.java:94)
	at [email protected]/org.elasticsearch.index.query.Rewriteable.rewrite(Rewriteable.java:57)
	at [email protected]/org.elasticsearch.index.query.Rewriteable.rewrite(Rewriteable.java:40)
	at [email protected]/org.elasticsearch.search.internal.ShardSearchRequest$RequestRewritable.rewrite(ShardSearchRequest.java:588)
	at [email protected]/org.elasticsearch.search.internal.ShardSearchRequest$RequestRewritable.rewrite(ShardSearchRequest.java:577)
	at [email protected]/org.elasticsearch.index.query.Rewriteable.rewrite(Rewriteable.java:57)
	at [email protected]/org.elasticsearch.search.SearchService.queryStillMatchesAfterRewrite(SearchService.java:1740)
	at [email protected]/org.elasticsearch.action.fieldcaps.FieldCapabilitiesFetcher.canMatchShard(FieldCapabilitiesFetcher.java:245)
	at [email protected]/org.elasticsearch.action.fieldcaps.FieldCapabilitiesFetcher.doFetch(FieldCapabilitiesFetcher.java:113)
	at [email protected]/org.elasticsearch.action.fieldcaps.FieldCapabilitiesFetcher.fetch(FieldCapabilitiesFetcher.java:77)
	at [email protected]/org.elasticsearch.action.fieldcaps.TransportFieldCapabilitiesAction$NodeTransportHandler.lambda$messageReceived$0(TransportFieldCapabilitiesAction.java:591)
	at [email protected]/org.elasticsearch.action.ActionListener.completeWith(ActionListener.java:356)
	at [email protected]/org.elasticsearch.action.fieldcaps.TransportFieldCapabilitiesAction$NodeTransportHandler.messageReceived(TransportFieldCapabilitiesAction.java:570)
	at [email protected]/org.elasticsearch.action.fieldcaps.TransportFieldCapabilitiesAction$NodeTransportHandler.messageReceived(TransportFieldCapabilitiesAction.java:565)
	at [email protected]/org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$1.doRun(SecurityServerTransportInterceptor.java:579)
	at [email protected]/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)
	at [email protected]/org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$3.onResponse(SecurityServerTransportInterceptor.java:632)
	at [email protected]/org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$3.onResponse(SecurityServerTransportInterceptor.java:621)
	at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService$1.onResponse(AuthorizationService.java:640)
	at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService$1.onResponse(AuthorizationService.java:634)
	at [email protected]/org.elasticsearch.xpack.security.authz.interceptor.ResizeRequestInterceptor.intercept(ResizeRequestInterceptor.java:105)
	at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService$1.onResponse(AuthorizationService.java:638)
	at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService$1.onResponse(AuthorizationService.java:634)
	at [email protected]/org.elasticsearch.xpack.security.authz.interceptor.FieldAndDocumentLevelSecurityRequestInterceptor.intercept(FieldAndDocumentLevelSecurityRequestInterceptor.java:79)
	at [email protected]/org.elasticsearch.xpack.security.authz.interceptor.ValidateRequestInterceptor.intercept(ValidateRequestInterceptor.java:20)
	at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService$1.onResponse(AuthorizationService.java:638)
	at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService$1.onResponse(AuthorizationService.java:634)
	at [email protected]/org.elasticsearch.xpack.security.authz.interceptor.FieldAndDocumentLevelSecurityRequestInterceptor.intercept(FieldAndDocumentLevelSecurityRequestInterceptor.java:79)
	at [email protected]/org.elasticsearch.xpack.security.authz.interceptor.ShardSearchRequestInterceptor.intercept(ShardSearchRequestInterceptor.java:23)
	at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService$1.onResponse(AuthorizationService.java:638)
	at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService$1.onResponse(AuthorizationService.java:634)
	at [email protected]/org.elasticsearch.xpack.security.authz.interceptor.FieldAndDocumentLevelSecurityRequestInterceptor.intercept(FieldAndDocumentLevelSecurityRequestInterceptor.java:79)
	at [email protected]/org.elasticsearch.xpack.security.authz.interceptor.UpdateRequestInterceptor.intercept(UpdateRequestInterceptor.java:27)
	at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService$1.onResponse(AuthorizationService.java:638)
	at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService$1.onResponse(AuthorizationService.java:634)
	at [email protected]/org.elasticsearch.xpack.security.authz.interceptor.FieldAndDocumentLevelSecurityRequestInterceptor.intercept(FieldAndDocumentLevelSecurityRequestInterceptor.java:79)
	at [email protected]/org.elasticsearch.xpack.security.authz.interceptor.SearchRequestInterceptor.intercept(SearchRequestInterceptor.java:20)
	at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService$1.onResponse(AuthorizationService.java:638)
	at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService$1.onResponse(AuthorizationService.java:634)
	at [email protected]/org.elasticsearch.xpack.security.authz.interceptor.IndicesAliasesRequestInterceptor.intercept(IndicesAliasesRequestInterceptor.java:127)
	at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService$1.onResponse(AuthorizationService.java:638)
	at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService$1.onResponse(AuthorizationService.java:634)
	at [email protected]/org.elasticsearch.xpack.security.authz.interceptor.SearchRequestCacheDisablingInterceptor.intercept(SearchRequestCacheDisablingInterceptor.java:53)
	at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService$1.onResponse(AuthorizationService.java:638)
	at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService$1.onResponse(AuthorizationService.java:634)
	at [email protected]/org.elasticsearch.xpack.security.authz.interceptor.BulkShardRequestInterceptor.intercept(BulkShardRequestInterceptor.java:85)
	at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService$1.onResponse(AuthorizationService.java:638)
	at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService$1.onResponse(AuthorizationService.java:634)
	at [email protected]/org.elasticsearch.xpack.security.authz.interceptor.DlsFlsLicenseRequestInterceptor.intercept(DlsFlsLicenseRequestInterceptor.java:106)
	at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService.runRequestInterceptors(AuthorizationService.java:634)
	at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService.handleIndexActionAuthorizationResult(AuthorizationService.java:619)
	at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService.lambda$authorizeAction$13(AuthorizationService.java:517)
	at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService$AuthorizationResultListener.onResponse(AuthorizationService.java:1033)
	at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService$AuthorizationResultListener.onResponse(AuthorizationService.java:999)
	at [email protected]/org.elasticsearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:33)
	at [email protected]/org.elasticsearch.xpack.security.authz.RBACEngine.authorizeIndexAction(RBACEngine.java:384)
	at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService.authorizeAction(AuthorizationService.java:510)
	at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService.maybeAuthorizeRunAs(AuthorizationService.java:442)
	at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService.lambda$authorize$3(AuthorizationService.java:329)
	at [email protected]/org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:257)
	at [email protected]/org.elasticsearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:33)
	at [email protected]/org.elasticsearch.xpack.security.authz.RBACEngine.lambda$resolveAuthorizationInfo$0(RBACEngine.java:156)
	at [email protected]/org.elasticsearch.action.ActionListenerImplementations$ResponseWrappingActionListener.onResponse(ActionListenerImplementations.java:247)
	at [email protected]/org.elasticsearch.xpack.security.authz.store.CompositeRolesStore.lambda$getRoles$4(CompositeRolesStore.java:202)
	at [email protected]/org.elasticsearch.action.ActionListenerImplementations$ResponseWrappingActionListener.onResponse(ActionListenerImplementations.java:247)
	at [email protected]/org.elasticsearch.xpack.security.authz.store.CompositeRolesStore.lambda$getRole$5(CompositeRolesStore.java:220)
	at [email protected]/org.elasticsearch.action.ActionListenerImplementations$ResponseWrappingActionListener.onResponse(ActionListenerImplementations.java:247)
	at [email protected]/org.elasticsearch.xpack.core.security.authz.store.RoleReferenceIntersection.lambda$buildRole$0(RoleReferenceIntersection.java:49)
	at [email protected]/org.elasticsearch.action.ActionListenerImplementations$ResponseWrappingActionListener.onResponse(ActionListenerImplementations.java:247)
	at [email protected]/org.elasticsearch.action.support.GroupedActionListener.onResponse(GroupedActionListener.java:57)
	at [email protected]/org.elasticsearch.xpack.security.authz.store.CompositeRolesStore.buildRoleFromRoleReference(CompositeRolesStore.java:317)
	at [email protected]/org.elasticsearch.xpack.core.security.authz.store.RoleReferenceIntersection.lambda$buildRole$1(RoleReferenceIntersection.java:53)
	at java.base/java.lang.Iterable.forEach(Iterable.java:75)
	at [email protected]/org.elasticsearch.xpack.core.security.authz.store.RoleReferenceIntersection.buildRole(RoleReferenceIntersection.java:53)
	at [email protected]/org.elasticsearch.xpack.security.authz.store.CompositeRolesStore.getRole(CompositeRolesStore.java:218)
	at [email protected]/org.elasticsearch.xpack.security.authz.store.CompositeRolesStore.getRoles(CompositeRolesStore.java:195)
	at [email protected]/org.elasticsearch.xpack.security.authz.RBACEngine.resolveAuthorizationInfo(RBACEngine.java:152)
	at [email protected]/org.elasticsearch.xpack.security.authz.AuthorizationService.authorize(AuthorizationService.java:345)
	at [email protected]/org.elasticsearch.xpack.security.transport.ServerTransportFilter.lambda$inbound$1(ServerTransportFilter.java:114)
	at [email protected]/org.elasticsearch.action.ActionListenerImplementations$ResponseWrappingActionListener.onResponse(ActionListenerImplementations.java:247)
	at [email protected]/org.elasticsearch.action.ActionListenerImplementations$MappedActionListener.onResponse(ActionListenerImplementations.java:97)
	at [email protected]/org.elasticsearch.xpack.security.authc.AuthenticatorChain.authenticate(AuthenticatorChain.java:93)
	at [email protected]/org.elasticsearch.xpack.security.authc.AuthenticationService.authenticate(AuthenticationService.java:264)
	at [email protected]/org.elasticsearch.xpack.security.authc.AuthenticationService.authenticate(AuthenticationService.java:201)
	at [email protected]/org.elasticsearch.xpack.security.transport.ServerTransportFilter.authenticate(ServerTransportFilter.java:127)
	at [email protected]/org.elasticsearch.xpack.security.transport.ServerTransportFilter.inbound(ServerTransportFilter.java:105)
	at [email protected]/org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler.messageReceived(SecurityServerTransportInterceptor.java:643)
	at [email protected]/org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:76)
	at [email protected]/org.elasticsearch.transport.TransportService$6.doRun(TransportService.java:1098)
	at [email protected]/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1023)
	at [email protected]/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)
	at [email protected]/org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:34)
	at [email protected]/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1023)
	at [email protected]/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1570)

@bpintea bpintea added :Search Relevance/Vectors Vector search :Search/Search Search-related issues that do not fall into other categories >bug labels Nov 1, 2024
@elasticsearchmachine elasticsearchmachine added Team:Search Meta label for search team Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch labels Nov 1, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@elasticsearchmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

@benwtrent
Member

It is strange that field caps is making a semantic query at all :/ I wonder what is going on there.

@javanna javanna removed the :Search/Search Search-related issues that do not fall into other categories label Nov 4, 2024
@elasticsearchmachine elasticsearchmachine removed the Team:Search Meta label for search team label Nov 4, 2024
@benwtrent
Member

LOL, looking at the comment on the line of code that is throwing:

                // This should never happen, but throw on it in case it ever does
                throw new IllegalStateException(
                    "No inference results set for [" + semanticTextFieldType.typeName() + "] field [" + fieldName + "]"
                );

The curse of "This should never happen" has struck again!

@benwtrent
Member

@Mikep86 @carlosdelest ^ I wonder if this is because FieldCaps isn't rewriting queries like we expect for can_match?

Basically, field_caps allows index_filter, which runs a can_match phase to determine whether all the shards for a given index will rewrite to a match_none.

This makes me wonder if we should return a MatchAllQuery if the inference results are null?

@Mikep86
Contributor

Mikep86 commented Nov 4, 2024

++, smells like a missing rewrite to me. It's plausible that we missed this during testing since the can_match phase is skipped most of the time.

@Mikep86 Mikep86 self-assigned this Nov 5, 2024
@benwtrent benwtrent added low-risk An open issue or test failure that is a low risk to future releases :SearchOrg/Relevance Label for the Search (solution/org) Relevance team and removed :Search Relevance/Vectors Vector search labels Dec 5, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/search-eng (Team:SearchOrg)

@elasticsearchmachine
Collaborator

Pinging @elastic/search-relevance (Team:Search - Relevance)

@benwtrent benwtrent added priority:normal A label for assessing bug priority to be used by ES engineers and removed low-risk An open issue or test failure that is a low risk to future releases labels Dec 5, 2024
@Mikep86
Contributor

Mikep86 commented Jan 24, 2025

@benwtrent I investigated this issue and was able to reproduce it with these steps:

PUT field-caps-index
{
  "mappings": {
    "properties": {
      "semantic": {
        "type": "semantic_text"
      },
      "text_field": {
        "type": "text"
      }
    }
  }
}

POST field-caps-index/_doc/
{
  "semantic": "some test data",
  "text_field": "other test data"
} 

GET */_field_caps?fields=*&ignore_unavailable=true&expand_wildcards=open&allow_no_indices=true&include_empty_fields=false
{
  "index_filter": {
    "semantic": {
      "field": "semantic",
      "query": "test"
    }
  }
}

What's fascinating about the field caps API in this case is that it executes a subset of the normal rewrite rounds. Most importantly, the coordinator node rewrite round (where inference is performed) is skipped and it jumps straight to the can-match phase using a full SearchExecutionContext. This triggers the bug since the semantic query assumes that the coordinator node rewrite round is always executed first.

I think your proposed fix is the right one: if inferenceResults is null, we should rewrite to MatchAllQuery. This assumes that the only way we get here is via an index_filter on the field caps API. Are there any other API paths you're aware of where the coordinator node rewrite round is skipped?
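
The rewrite-round behavior and the proposed fallback can be sketched roughly as follows. This is a simplified, self-contained model and not Elasticsearch code: `SemanticRewriteSketch`, `SemanticQuery`, `RewrittenQuery`, and all method names are hypothetical stand-ins, and the embedding is a dummy value. It assumes, as described above, that inference runs only in the coordinator round and that a match-all result simply keeps the shard in play during can_match.

```java
/** Simplified model of the rewrite rounds discussed in this issue; NOT Elasticsearch code. */
public class SemanticRewriteSketch {

    /** Hypothetical stand-in for the query a shard ends up with after rewriting. */
    enum RewrittenQuery { VECTOR_SEARCH, MATCH_ALL }

    /**
     * Hypothetical stand-in for a semantic query.
     * inferenceResults stays null until the coordinator round has run.
     */
    record SemanticQuery(String field, String queryText, float[] inferenceResults) {

        /** Coordinator round: the only place inference happens (dummy embedding here). */
        SemanticQuery coordinatorRewrite() {
            return new SemanticQuery(field, queryText, new float[] {0.1f, 0.2f});
        }

        /** Shard round modeled on the reported behavior: throws if inference never ran. */
        RewrittenQuery shardRewriteStrict() {
            if (inferenceResults == null) {
                throw new IllegalStateException("No inference results set for field [" + field + "]");
            }
            return RewrittenQuery.VECTOR_SEARCH;
        }

        /** Shard round with the proposed fallback: missing inference results become match-all. */
        RewrittenQuery shardRewriteLenient() {
            return inferenceResults == null ? RewrittenQuery.MATCH_ALL : RewrittenQuery.VECTOR_SEARCH;
        }
    }

    public static void main(String[] args) {
        SemanticQuery query = new SemanticQuery("semantic", "test", null);

        // Normal search path: coordinator round first, then the shard round.
        System.out.println(query.coordinatorRewrite().shardRewriteStrict());

        // Field caps index_filter path: the coordinator round is skipped.
        try {
            query.shardRewriteStrict();
        } catch (IllegalStateException e) {
            System.out.println("strict rewrite failed: " + e.getMessage());
        }
        System.out.println(query.shardRewriteLenient());
    }
}
```

The lenient variant trades precision for safety: it never excludes a shard it cannot evaluate, which fits a phase whose only job is to rule out shards that would rewrite to match_none.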

@benwtrent
Member

> Are there any other API paths you're aware of where the coordinator node rewrite round is skipped?

Not offhand, but I would assume can_match, fetch, field_caps all skip that rewrite. Though, the first two are just part of the normal search phase.

Maybe we should check _count as well? It's an optimized search path, so it shouldn't do any skipping, but let's double check.

@Mikep86
Contributor

Mikep86 commented Feb 4, 2025

This bug is part of a larger issue with how the Field Caps API rewrites queries. See #121709.

@marfinbirt

marfinbirt commented Apr 21, 2025

Hello everyone, any progress on this? We're currently experiencing the same error in a Replicated ES Cluster and don't know how to proceed.

@Mikep86
Contributor

Mikep86 commented Apr 21, 2025

@marfinbirt No progress on this, but you can likely work around this by using an exists query on the semantic_text field instead. Since semantic queries perform a vector search operation, they have high recall (i.e. most, if not all, docs with a value for the field will be returned). Thus, an exists query on the field is effectively the same operation.

@marfinbirt

I'm not sure what kind of matrix operation the query is performing under the hood when using semantic_text; I'm not an expert on it yet. I'd assume the query_string passed in the query is transformed into an array, which is then compared to each of the documents in the index.

"should": [
  {
    "semantic": {
      "field": "embeddingSparse",
      "query": "{{query_string}}"
    }
  }...
]

If this is true, I don't quite understand how replacing the query with exists will generate the same result, especially when exists doesn't receive any query_string. Can you please elaborate?

@kderusso
Member

Hey there @marfinbirt, it does a vector search under the hood, and the exists query will filter to ensure only compatible fields are searched.

@Mikep86
Contributor

Mikep86 commented Apr 26, 2025

@marfinbirt The semantic query takes the query string and uses the model associated with the semantic_text field to generate an embedding, which is then used to determine document relevance using vector search.

Vector search operations are known to be high recall because every vector has some non-zero similarity to another vector. What matters in vector search is the degree of that similarity, which is why we use parameters like k in the knn query to restrict our results to documents that are similar enough to actually be relevant.

When vector search is used as a filter (as it is in this case), a field existence check (i.e. what the exists query does) is effectively the same due to the high recall nature of the search operation. Put another way, it doesn't matter what query string you use when using vector search as a filter. It will have some non-zero similarity to every embedding indexed, making the operation the same as a field existence check.

@Mikep86 Mikep86 removed their assignment Apr 30, 2025
7 participants