Fix bbq quantization algorithm bug for differently distributed components #126778

Merged
merged 5 commits into from
Apr 14, 2025

Conversation

benwtrent
Member

We had a silly bug in bbq vector quantization: the initial quantile optimization parameters were scaled incorrectly for the vector component distribution.

For distributions where this had a major impact, recall was abysmal and the quantization technique was effectively useless.

For modern, well-distributed components, this change is almost a no-op.
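
To make the failure mode concrete, here is a minimal sketch, not the actual patch. It assumes the initial quantization interval is seeded from the mean and standard deviation of a vector's components and then clamped to the observed minimum and maximum. The function names and the `grid_scale` value are hypothetical, and the second function is an invented example of a mis-scaled initialization, not a reproduction of the real bug.

```python
import math


def init_interval(vector, grid_scale=0.798):
    """Seed a quantization interval (lo, hi) from component statistics.

    Assumption (not taken from the PR): the interval is
    mean +/- grid_scale * std, clamped to the observed min and max.
    """
    n = len(vector)
    if n == 0:
        raise ValueError("vector must not be empty")
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations (Welford's algorithm)
    for i, x in enumerate(vector, start=1):
        delta = x - mean
        mean += delta / i
        m2 += delta * (x - mean)
    std = math.sqrt(m2 / n)  # population standard deviation
    lo = max(min(vector), mean - grid_scale * std)
    hi = min(max(vector), mean + grid_scale * std)
    return lo, hi


def init_interval_zero_mean(vector, grid_scale=0.798):
    """Hypothetical mis-scaled variant: uses the RMS and ignores the mean."""
    if not vector:
        raise ValueError("vector must not be empty")
    rms = math.sqrt(sum(x * x for x in vector) / len(vector))
    lo = max(min(vector), -grid_scale * rms)
    hi = min(max(vector), grid_scale * rms)
    return lo, hi
```

For components centered near zero the two functions give nearly the same interval, which matches the "almost a no-op" remark. For a shifted vector such as `[10.0, 11.0, 12.0, 13.0]`, the first returns an interval inside the data range, while the zero-mean variant returns a lower bound above its upper bound, so any optimization that starts from it begins in a degenerate state.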

@benwtrent benwtrent added >bug auto-backport Automatically create backport pull requests when merged :Search Relevance/Vectors Vector search v8.19.0 v9.0.1 v9.1.0 labels Apr 14, 2025
@elasticsearchmachine elasticsearchmachine added the Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch label Apr 14, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@elasticsearchmachine
Collaborator

Hi @benwtrent, I've created a changelog YAML for you.

Contributor

@john-wagster john-wagster left a comment

I believe this is the same fix you made in apache/lucene#14374, but applied to ES, correct?

lgtm no concerns

@benwtrent benwtrent added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Apr 14, 2025
@elasticsearchmachine
Collaborator

💚 Backport successful

Status Branch Result
8.x
9.0

benwtrent added a commit to benwtrent/elasticsearch that referenced this pull request Apr 14, 2025
…ents (elastic#126778)

elasticsearchmachine pushed a commit that referenced this pull request Apr 14, 2025
…ents (#126778) (#126794)

elasticsearchmachine pushed a commit that referenced this pull request Apr 14, 2025
…ents (#126778) (#126793)

@weizijun
Contributor

weizijun commented May 6, 2025

Hi @benwtrent, will this change to the quantization algorithm affect existing bbq-quantized data? Do old vector segments need to be repaired?

@benwtrent
Member Author

> Do old vector segments need to be repaired?

For older segments that were getting good results, this change won't affect them.

For older segments that were getting very poor results, they will need to be reindexed to take advantage of the fix.
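
As a rough illustration of the reindexing step, the sketch below builds the request for Elasticsearch's `_reindex` API without sending it. The helper name and the index names are hypothetical. It assumes the cluster already runs a release containing the fix and that the destination index was created beforehand with the desired vector mapping, since `_reindex` does not copy mappings from the source.

```python
import json


def build_reindex_request(source_index, dest_index):
    """Return (method, path, body) for a basic _reindex call.

    Hypothetical helper: it only assembles the request and does not
    contact a cluster. The destination index is assumed to exist already.
    """
    if source_index == dest_index:
        raise ValueError("source and destination indices must differ")
    body = {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
    }
    return "POST", "/_reindex", json.dumps(body)


# Example with made-up index names.
method, path, body = build_reindex_request("vectors-old", "vectors-new")
```

The returned tuple can be handed to whatever HTTP client is already in use. Writing into a fresh index forces the vectors to be quantized again, which is what lets previously poor segments pick up the corrected behavior.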
