Introduce an int4 off-heap vector scorer #129824

Merged
merged 5 commits into from
Jun 23, 2025

Conversation

iverase
Contributor

@iverase iverase commented Jun 23, 2025

In our IVF implementation, we currently score centroids quantized to int4 on heap. We know that scoring vectors directly from the index can yield good speedups. Therefore, this commit introduces a new class that scores int4-quantized vectors using off-heap data structures.

This indeed yields a nice speedup, roughly 7% to 14% higher throughput across the benchmarked dimensions:

Benchmark                                             (dims)   Mode  Cnt   Score   Error   Units
Int4ScorerBenchmark.scoreFromArray                       384  thrpt    5  15.384 ± 0.445  ops/ms
Int4ScorerBenchmark.scoreFromArray                       702  thrpt    5   8.908 ± 0.697  ops/ms
Int4ScorerBenchmark.scoreFromArray                      1024  thrpt    5   7.605 ± 0.149  ops/ms
Int4ScorerBenchmark.scoreFromMemorySegmentOnlyVector     384  thrpt    5  16.463 ± 0.456  ops/ms
Int4ScorerBenchmark.scoreFromMemorySegmentOnlyVector     702  thrpt    5   9.854 ± 0.276  ops/ms
Int4ScorerBenchmark.scoreFromMemorySegmentOnlyVector    1024  thrpt    5   8.701 ± 0.865  ops/ms
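
The difference between the two benchmarked paths can be illustrated with a minimal sketch. It is not the code from this PR: the class name `Int4DotSketch`, the nibble layout (two 4-bit values per byte, even index in the high nibble), and the use of a direct `ByteBuffer` as a stand-in for the off-heap storage are all assumptions made for illustration. The benchmark names suggest the real scorer reads from a `MemorySegment`.

```java
import java.nio.ByteBuffer;

final class Int4DotSketch {

    // Hypothetical layout: two 4-bit values per byte, even index in the high nibble.
    static byte[] pack(int[] values) {
        byte[] packed = new byte[(values.length + 1) / 2];
        for (int i = 0; i < values.length; i++) {
            int nibble = values[i] & 0x0F;
            packed[i / 2] |= (byte) ((i & 1) == 0 ? nibble << 4 : nibble);
        }
        return packed;
    }

    // On-heap path: the packed vector has already been copied into a byte[].
    static int dotFromArray(byte[] query, byte[] packed) {
        int sum = 0;
        for (int i = 0; i < query.length; i++) {
            sum += query[i] * unpack(packed[i / 2], i);
        }
        return sum;
    }

    // Off-heap path: read the packed bytes in place, starting at byte `offset`,
    // without first copying them into a heap array.
    static int dotFromBuffer(byte[] query, ByteBuffer vectors, int offset) {
        int sum = 0;
        for (int i = 0; i < query.length; i++) {
            sum += query[i] * unpack(vectors.get(offset + i / 2), i);
        }
        return sum;
    }

    // Extract the 4-bit value stored for the given element index.
    private static int unpack(byte b, int index) {
        return (index & 1) == 0 ? (b >>> 4) & 0x0F : b & 0x0F;
    }

    public static void main(String[] args) {
        byte[] query = {5, 6, 7, 8};
        byte[] packed = pack(new int[] {1, 2, 3, 4});

        // Direct buffer holding two packed vectors; ours is the second one.
        ByteBuffer offHeap = ByteBuffer.allocateDirect(2 * packed.length);
        for (int i = 0; i < packed.length; i++) {
            offHeap.put(packed.length + i, packed[i]);
        }

        System.out.println(dotFromArray(query, packed));
        System.out.println(dotFromBuffer(query, offHeap, packed.length));
    }
}
```

Both methods compute the same integer dot product. The off-heap variant only avoids the intermediate copy, which is where a gain of the kind reported above would plausibly come from.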

@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@elasticsearchmachine elasticsearchmachine added the Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch label Jun 23, 2025
Member

@benwtrent benwtrent left a comment

This is a good step one towards bulk scoring!

I wonder if we could be even faster with "two pass" approximations, e.g. a dot product against the centroids using only their higher bits, refined with the lower bits only if the result is within some acceptable error threshold. This would complicate the centroid query logic, as we would need to make multiple passes over the centroids. But since we score all of them right now, it seems pretty simple to do.
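
One possible reading of the "two pass" idea, purely as an illustration and not something this PR implements: split each 4-bit value into its two high bits and two low bits, score every centroid with the high bits first, and compute the low-bit term only for centroids whose upper bound can still beat the best coarse score. The class `TwoPassSketch`, the 2+2 bit split, the unpacked `int[]` inputs, and the "find the single best centroid" goal are all assumptions. The sketch also assumes non-negative query values.

```java
final class TwoPassSketch {

    // First pass: dot product using only the two high bits of each 4-bit value.
    // Because value = 4 * high + low, this is a lower bound on the exact score.
    static int coarseDot(int[] query, int[] centroid) {
        int sum = 0;
        for (int i = 0; i < query.length; i++) {
            sum += query[i] * (centroid[i] >>> 2);
        }
        return 4 * sum;
    }

    // Second pass: contribution of the two low bits.
    static int lowDot(int[] query, int[] centroid) {
        int sum = 0;
        for (int i = 0; i < query.length; i++) {
            sum += query[i] * (centroid[i] & 3);
        }
        return sum;
    }

    // Returns the index of the centroid with the largest exact dot product,
    // refining only the candidates that the coarse pass cannot rule out.
    static int bestCentroid(int[] query, int[][] centroids) {
        // Largest possible low-bit contribution: every low part equals 3.
        int slack = 0;
        for (int q : query) {
            slack += 3 * q;
        }

        int[] coarse = new int[centroids.length];
        int bestLowerBound = Integer.MIN_VALUE;
        for (int c = 0; c < centroids.length; c++) {
            coarse[c] = coarseDot(query, centroids[c]);
            bestLowerBound = Math.max(bestLowerBound, coarse[c]);
        }

        int best = -1;
        int bestScore = Integer.MIN_VALUE;
        for (int c = 0; c < centroids.length; c++) {
            if (coarse[c] + slack < bestLowerBound) {
                continue; // even the upper bound cannot win, skip refinement
            }
            int exact = coarse[c] + lowDot(query, centroids[c]);
            if (exact > bestScore) {
                bestScore = exact;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] query = {1, 1, 1, 1};
        int[][] centroids = {{15, 15, 15, 15}, {0, 0, 0, 0}, {1, 2, 3, 4}};
        System.out.println(bestCentroid(query, centroids));
    }
}
```

In this form the pruning is exact, since a skipped centroid's upper bound is below a score some other centroid is guaranteed to reach. A looser error threshold, as the comment suggests, would trade accuracy for fewer refinements.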

@iverase iverase added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Jun 23, 2025
@iverase iverase enabled auto-merge (squash) June 23, 2025 16:43
@iverase iverase disabled auto-merge June 23, 2025 16:44
@iverase iverase merged commit ffea6ca into elastic:main Jun 23, 2025
31 of 32 checks passed
@iverase iverase deleted the ES91Int4VectorsScorer branch June 23, 2025 16:44
mridula-s109 pushed a commit to mridula-s109/elasticsearch that referenced this pull request Jun 25, 2025
* Introduce an int4 off-heap vector scorer

* iter

* Update server/src/main/java/org/elasticsearch/index/codec/vectors/DefaultIVFVectorsReader.java

Co-authored-by: Benjamin Trent <[email protected]>

---------

Co-authored-by: Benjamin Trent <[email protected]>
Labels
auto-merge-without-approval, >non-issue, :Search Relevance/Vectors, Team:Search Relevance, v9.1.0
3 participants