Skip to content

Leverage One Layer of Hierarchical KMeans Structure on Read #129950

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 52 commits into
base: main
Choose a base branch
from

Conversation

john-wagster
Copy link
Contributor

Leverages the hkmeans algorithms structure to create an additional layer of high level centroids (parent partitions) that get queries prior to the full list of centroids themselves.

@benwtrent
Copy link
Member

interesting assert tripped in these PR tests: #130472 (comment) (assuming its been fixed)

@john-wagster
Copy link
Contributor Author

interesting assert tripped in these PR tests: #130472 (comment) (assuming its been fixed)

I get the feeling I know which assert it was and I do believe it's been fixed. I had no idea draft PR's could mute tests on main though. That's a little surprising to me. I'll have to try to be more careful about what I push up in terms of asserts in the future.

@benwtrent
Copy link
Member

I'll have to try to be more careful about what I push up in terms of asserts in the future.

@john-wagster no, I wouldn't. Your PR is your PR. Add all the assertions you want. I think the automation is completely mistaken in taking your PR into account when it comes to a test failure.

@john-wagster john-wagster marked this pull request as ready for review July 14, 2025 11:31
@elasticsearchmachine elasticsearchmachine added the needs:triage Requires assignment of a team area label label Jul 14, 2025
int centroidOrdinal = centroidQueue.pop();
scorer.resetPostingsScorer(centroidOrdinal, centroidQueryScorer.centroid(centroidOrdinal));
if (priorVisitedCentroids.getAndSet(centroidOrdinal)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be wrong, the centroid ordinals are always unique so we might be visiting the same centroid more than once.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The centroids can be duplicated across parent partitions because of the SOAR logic now. So this bitset is meant to deduplicate when a new parent partition is included and it may cause us to re-explore centroids unnecessarily. Are you observing that this logic doesn't work? I can double check it.

@@ -325,9 +342,12 @@ private void mergeOneFieldIVF(FieldInfo fieldInfo, MergeState mergeState) throws

CentroidSupplier centroidSupplier = createCentroidSupplier(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took me a while to understand this, I think we should not reuse the written centroids now that we have a complex structure but write them in the temp file in order.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've been confused about this code too. I'm not sure I fully understood why we have the temp file. I think it's been like this for awhile? It looks like it just gets written and then immediately copied into ivfCentroids. I think that hasn't changed. Do you happen to know if there is some reason we were doing this previously?

And are you suggesting that above we should be passing in both centroidTemp and ivfCentroids in calculateAndWriteCentroids and then writing the in order centroids to centroidTemp. If so I like this idea and am happy to refactor to do so. If I'm misunderstanding please let me know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs:triage Requires assignment of a team area label v9.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants