[KNN] Adding default value for oversampling in 9.1.0 #1290
Open
Samiul-TheSoccerFan wants to merge 4 commits into `main` from `update_documentation_for_oversample`
Commits
865e5ee adding default value for oversampling in the documentation (Samiul-TheSoccerFan)
251291c updating the documentation for bbq and oversample (Samiul-TheSoccerFan)
85feae8 Update documentation wording and adding version (Samiul-TheSoccerFan)
5a34ea3 Merge branch 'main' into update_documentation_for_oversample (Samiul-TheSoccerFan)
updating the documentation for bbq and oversample
commit 251291c6c5fd8b78cee0ba66a66c26db440406dc
@@ -901,7 +901,7 @@ Approximate kNN search always uses the [`dfs_query_then_fetch`](https://www.elas

 When using [quantized vectors](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-quantization) for kNN search, you can optionally rescore results to balance performance and accuracy, by doing:

-* **Oversampling**: Retrieve more candidates per shard. Starting in `9.1.0`, the default value for oversample is `3.0`.
+* **Oversampling**: Retrieve more candidates per shard. The default is `3.0` in `bbq`.
 * **Rescoring**: Use the original vector values for re-calculating the score on the oversampled candidates.

 As the non-quantized, original vectors are used to calculate the final score on the top results, rescoring combines:

@@ -913,7 +913,7 @@ All forms of quantization will result in some accuracy loss and as the quantizat

 * `int8` requires minimal if any rescoring
 * `int4` requires some rescoring for higher accuracy and larger recall scenarios. Generally, oversampling by 1.5x-2x recovers most of the accuracy loss.
-* `bbq` requires rescoring except on exceptionally large indices or models specifically designed for quantization. We have found that between 3x-5x oversampling is generally sufficient. But for fewer dimensions or vectors that do not quantize well, higher oversampling may be required.
+* `bbq` requires rescoring except on exceptionally large indices or models specifically designed for quantization. We have found that between 3x-5x oversampling is generally sufficient. But for fewer dimensions or vectors that do not quantize well, higher oversampling may be required. As noted above, we default to an oversampling factor of `3.0`.
Suggested change
Sure, that works. Thank you :)

 You can use the `rescore_vector` [preview] option to automatically perform reranking. When a rescore `oversample` parameter is specified, the approximate kNN search will:

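To make the oversample-then-rescore idea in the diff above concrete, here is a minimal single-shard sketch in Python. It is an illustration under stated assumptions, not Elasticsearch's implementation: `sign_quantize` is a crude sign-only stand-in and is not the `bbq` algorithm, the similarity is a plain dot product, and all function names are hypothetical. `build_knn_body` shows how the `rescore_vector` option and its `oversample` parameter might be placed in a kNN search body; the other keys (`field`, `query_vector`, `k`, `num_candidates`) are assumed from the kNN search API, and the field name used below is made up.

```python
import math
import random


def sign_quantize(vec):
    """Crude stand-in for quantization: keep only the sign of each component."""
    return [1.0 if x >= 0 else -1.0 for x in vec]


def dot(a, b):
    """Dot product, used here as the similarity score."""
    return sum(x * y for x, y in zip(a, b))


def knn_with_rescore(query, vectors, k, oversample=3.0):
    """Approximate top k * oversample on quantized vectors, then exact rescoring."""
    n_candidates = min(len(vectors), math.ceil(k * oversample))
    quantized = [sign_quantize(v) for v in vectors]

    # Oversampling: rank by the cheap quantized score and keep extra candidates.
    approx_order = sorted(
        range(len(vectors)),
        key=lambda i: dot(query, quantized[i]),
        reverse=True,
    )
    candidates = approx_order[:n_candidates]

    # Rescoring: re-rank only those candidates with the original vectors.
    rescored = sorted(candidates, key=lambda i: dot(query, vectors[i]), reverse=True)
    return rescored[:k]


def build_knn_body(field, query_vector, k, num_candidates, oversample):
    """Assumed shape of a kNN search body that requests rescoring."""
    return {
        "knn": {
            "field": field,
            "query_vector": query_vector,
            "k": k,
            "num_candidates": num_candidates,
            "rescore_vector": {"oversample": oversample},
        }
    }


# Example usage with random data (values are illustrative only).
random.seed(0)
vectors = [[random.gauss(0, 1) for _ in range(8)] for _ in range(50)]
query = [random.gauss(0, 1) for _ in range(8)]
top_ids = knn_with_rescore(query, vectors, k=5, oversample=3.0)
body = build_knn_body("my_vector_field", query, k=5, num_candidates=50, oversample=3.0)
```

A larger `oversample` widens the candidate pool, so the exact rescoring pass has more chances to recover neighbours that the quantized ranking misplaced, at the cost of reading more original vectors.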