IllegalStateException thrown in SemanticQueryBuilder #116106
Comments
Pinging @elastic/es-search-relevance (Team:Search Relevance) |
Pinging @elastic/es-search (Team:Search) |
It is strange that field caps is making a semantic query at all :/ I wonder what is going on there. |
LOL, looking at the comments on that line of code this is throwing on:
The curse of "This should never happen" has struck again! |
@Mikep86 @carlosdelest ^ I wonder if this is because FieldCaps isn't rewriting queries like we expect for can_match? Basically, field_caps allows This makes me wonder if we should return a |
++, smells like a missing rewrite to me. It's plausible that we missed this during testing since the can_match phase is skipped most of the time. |
Pinging @elastic/search-eng (Team:SearchOrg) |
Pinging @elastic/search-relevance (Team:Search - Relevance) |
@benwtrent I investigated this issue and was able to reproduce it with these steps:
PUT field-caps-index
{
"mappings": {
"properties": {
"semantic": {
"type": "semantic_text"
},
"text_field": {
"type": "text"
}
}
}
}
POST field-caps-index/_doc/
{
"semantic": "some test data",
"text_field": "other test data"
}
GET */_field_caps?fields=*&ignore_unavailable=true&expand_wildcards=open&allow_no_indices=true&include_empty_fields=false
{
"index_filter": {
"semantic": {
"field": "semantic",
"query": "test"
}
}
}
What's fascinating about the field caps API in this case is that it executes a subset of the normal rewrite rounds. Most importantly, the coordinator node rewrite round (where inference is performed) is skipped and it jumps straight to the can-match phase using a full I think your proposed fix is the right one, if |
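To illustrate the skipped rewrite round described above, here is a toy Python model. It is not Elasticsearch code: `ToySemanticQuery`, `fake_infer`, `search_path`, `field_caps_path`, and `UnrewrittenQueryError` are all hypothetical names invented for this sketch. It only assumes what the comment states, namely that inference happens in the coordinator rewrite and that the field caps path goes straight to the shard-level step.

```python
class UnrewrittenQueryError(RuntimeError):
    """Hypothetical stand-in for the IllegalStateException in the issue title."""


class ToySemanticQuery:
    """Toy query that needs an embedding before it can run on a shard."""

    def __init__(self, field, text, embedding=None):
        self.field = field
        self.text = text
        self.embedding = embedding

    def coordinator_rewrite(self, infer):
        # Coordinator round: inference is performed here.
        return ToySemanticQuery(self.field, self.text, embedding=infer(self.text))

    def shard_rewrite(self):
        # Shard round: assumes the coordinator round already ran.
        if self.embedding is None:
            raise UnrewrittenQueryError("query was not rewritten on the coordinator")
        return ("vector_search", self.field, self.embedding)


def fake_infer(text):
    # Placeholder for an inference call; returns a made-up one-dimensional vector.
    return [float(len(text))]


def search_path(query):
    # Normal search in this toy model: coordinator rewrite, then shard rewrite.
    return query.coordinator_rewrite(fake_infer).shard_rewrite()


def field_caps_path(query):
    # Field caps in this toy model: the coordinator round is skipped.
    return query.shard_rewrite()


if __name__ == "__main__":
    print(search_path(ToySemanticQuery("semantic", "test")))
    try:
        field_caps_path(ToySemanticQuery("semantic", "test"))
    except UnrewrittenQueryError as err:
        print("field caps path failed:", err)
```

The same query object succeeds on one path and fails on the other, which is the shape of the bug discussed in this thread.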
Not off hand, but I would assume Maybe we should check |
This bug is part of a larger issue with how the Field Caps API rewrites queries. See #121709. |
Hello everyone, any progress on this? We're currently experiencing the same error in a Replicated ES Cluster and don't know how to proceed. |
@marfinbirt No progress on this, but you can likely work around this by using an exists query on the |
I'm not sure what kind of matrix operation the query is performing under the hood when using semantic_text; I'm not an expert on it yet. I'd assume the query_string passed in the query is transformed into an array, which is then compared to each of the documents in the index.
If this is true, I don't quite understand how replacing the query with exists will generate the same result, especially when exists doesn't receive any query_string. Can you please elaborate? |
Hey there @marfinbirt it does a vector search under the hood and the exists will filter to ensure only compatible fields are searched. |
@marfinbirt Vector search operations are known to be high recall because every vector has some non-zero similarity to another vector. What matters in vector search is the degree of that similarity, which is why we use parameters like When vector search is used as a filter (as it is in this case), a field existence check (i.e. what the |
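For anyone who wants to try the workaround discussed above, here is a minimal sketch of the request body in Python. It assumes the field to check is the `semantic` field from the reproduction steps earlier in this thread; `exists_index_filter` is a hypothetical helper name, and the request line is printed rather than sent, so adapt the index pattern and parameters to your own cluster.

```python
import json


def exists_index_filter(field: str) -> dict:
    """Build a _field_caps body whose index_filter is an exists query on `field`."""
    return {"index_filter": {"exists": {"field": field}}}


# Assumption: "semantic" is the semantic_text field from the reproduction steps.
body = exists_index_filter("semantic")

if __name__ == "__main__":
    print("GET */_field_caps?fields=*")
    print(json.dumps(body, indent=2))
```

The body replaces the `semantic` clause in `index_filter` with an `exists` clause, so no inference is needed to evaluate the filter.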
Description
For a _field_caps request with params: {types=, ignore_unavailable=true, expand_wildcards=open, allow_no_indices=true, index=*,-.*, serverlessRequest=true, include_empty_fields=false}, resulting in a status: 500, the following suppressed exception is logged: