-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Bugfix: fixed scroll with knn query #126035
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Pinging @elastic/es-search-relevance (Team:Search Relevance) |
@weizijun Can you please provide the evidence that scroll doesn't work with knn (your provided test fails for different reasons). For me scroll works well and restricted by k docs: PUT index1
{
"mappings": {
"properties": {
"my_field": {
"type": "dense_vector",
"dims": 2,
"similarity": "l2_norm"
}
}
}
}
OST index1/_bulk
{"index": {"_index": "index1", "_id": "0"}}
{"my_field": [0.0, 0.10]}
{"index": {"_index": "index1", "_id": "1"}}
{"my_field": [0.1, 0.1]}
{"index": {"_index": "index1", "_id": "2"}}
{"my_field": [0.2, 0.2]}
{"index": {"_index": "index1", "_id": "3"}}
{"my_field": [0.3, 0.3]}
{"index": {"_index": "index1", "_id": "4"}}
{"my_field": [0.4, 0.4]}
{"index": {"_index": "index1", "_id": "5"}}
{"my_field": [0.5, 0.5]}
{"index": {"_index": "index1", "_id": "6"}}
{"my_field": [0.6, 0.6]}
{"index": {"_index": "index1", "_id": "7"}}
{"my_field": [0.7, 0.7]}
{"index": {"_index": "index1", "_id": "8"}}
{"my_field": [0.8, 0.8]}
{"index": {"_index": "index1", "_id": "9"}}
{"my_field": [0.9, 0.9]}
{"index": {"_index": "index1", "_id": "10"}}
{"my_field": [1.0, 1.0]}
{"index": {"_index": "index1", "_id": "11"}}
{"my_field": [1.1, 1.1]}
{"index": {"_index": "index1", "_id": "12"}}
{"my_field": [1.2, 1.2]}
{"index": {"_index": "index1", "_id": "13"}}
{"my_field": [1.3, 1.3]}
{"index": {"_index": "index1", "_id": "14"}}
{"my_field": [1.4, 1.4]}
{"index": {"_index": "index1", "_id": "15"}}
{"my_field": [1.5, 1.5]}
{"index": {"_index": "index1", "_id": "16"}}
{"my_field": [1.6, 1.6]}
{"index": {"_index": "index1", "_id": "17"}}
{"my_field": [1.7, 1.7]}
{"index": {"_index": "index1", "_id": "18"}}
{"my_field": [1.8, 1.8]}
{"index": {"_index": "index1", "_id": "19"}}
{"my_field": [1.9, 1.9]}
{"index": {"_index": "index1", "_id": "20"}}
{"my_field": [2.0, 2.0]}
{"index": {"_index": "index1", "_id": "21"}}
{"my_field": [2.1, 2.1]}
{"index": {"_index": "index1", "_id": "22"}}
{"my_field": [2.2, 2.2]}
{"index": {"_index": "index1", "_id": "23"}}
{"my_field": [2.3, 2.3]}
{"index": {"_index": "index1", "_id": "24"}}
{"my_field": [2.4, 2.4]}
POST index1/_search?scroll=1m //retrieve first 10 docs
{
"query": {
"knn": {
"field": "my_field",
"query_vector": [0.0, 0.0],
"k" : 25
}
}
}
POST _search/scroll. // retrieves following 10 docs
{
"scroll": "1m",
"scroll_id": "FGluY2x1ZGVfY29udGV4dF91dWlkDXF1ZXJ5QW5kRmV0Y2gBFjhhRTBDcDVsUV9HeElEdU1UdkJ1R1EAAAAAAWGnwxYxRTRoMzdndFRyNnd0TGU5anlyY0p3"
} |
In your case, create 2 datanodes, the shard of index1 is in node1, and call the api from node2. |
set k=10, you will find that the total_hits is 25, not 10. |
@weizijun Thanks for your PR. We discussed with the team, and decided that it is useful to have knn with scroll. The patch LGTM, I will add to the tests. |
@elasticmachine generate changelog |
@elasticsearchmachine test this please |
@elasticsearchmachine test this please |
@elasticsearchmachine run elasticsearch-ci/part-2 |
@elasticsearchmachine test this please |
💚 Backport successful
|
* Bugfix: fixed scroll with knn query (#126035) Although scrolling is not recommended for knn queries, it is effective. But I found a bug that when use scroll in the knn query, the But I found a bug that when using scroll in knn query, knn_score_doc will be lost in query phase, which means knn query does not work. In addition, the operations for directly querying the node where the shard is located and querying the node with transport are different. It can be reproduced on the local node. Because the query phase uses the previous ShardSearchRequest object stored before the dfs phase. But when it run in the local node, it don't do the encode and decode processso the operation is correct. I wrote an IT to reproduce it and fixed it by adding the new source to the LegacyReaderContext. * Fix compile issue --------- Co-authored-by: weizijun <[email protected]>
Thanks to @mayya-sharipova for helping improve this PR. |
Although scrolling is not recommended for knn queries, it is effective.
But I found a bug that when use scroll in the knn query, the But I found a bug that when using scroll in knn query, knn_score_doc will be lost in query phase, which means knn query does not work.
In addition, the operations for directly querying the node where the shard is located and querying the node with transport are different.
It can be reproduced on the local node.
Because the query phase uses the previous ShardSearchRequest object stored before the dfs phase.
But when it run in the local node, it don't do the encode and decode processso the operation is correct.
I wrote an IT to reproduce it and fixed it by adding the new source to the LegacyReaderContext.