Avoid using direct I/O during vector merges #127406

Merged on Apr 28, 2025 (9 commits)

jimczi (Contributor) commented Apr 25, 2025

The vector file isn't reopened during merge operations, so relying on IOContext to disable direct I/O during merges is ineffective. This change updates the strategy to explicitly open the vector file twice when direct I/O is enabled: once for default reads and once for direct I/O. We then switch to the appropriate index input in getMergeInstance, following the same approach used by other formats.

Direct I/O support for vectors has not been released yet, so this PR is marked as non-issue for now.
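
The strategy described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: `ReadMode`, `VectorInput`, and `SketchVectorsReader` are hypothetical stand-ins, and none of them are Lucene or Elasticsearch types. Only the name `getMergeInstance` comes from the description.

```java
// Hypothetical sketch: none of these types are Lucene or Elasticsearch classes.
enum ReadMode { DEFAULT, DIRECT_IO }

/** Stand-in for an open handle on the vector file. */
final class VectorInput {
    final String fileName;
    final ReadMode mode;

    VectorInput(String fileName, ReadMode mode) {
        this.fileName = fileName;
        this.mode = mode;
    }
}

/** Reader that keeps a default handle for merges next to the handle used for searches. */
final class SketchVectorsReader {
    private final VectorInput activeInput; // handle this instance reads from
    private final VectorInput mergeInput;  // always opened with default reads

    private SketchVectorsReader(VectorInput activeInput, VectorInput mergeInput) {
        this.activeInput = activeInput;
        this.mergeInput = mergeInput;
    }

    static SketchVectorsReader open(String fileName, boolean useDirectIO) {
        VectorInput defaultInput = new VectorInput(fileName, ReadMode.DEFAULT);
        // The file is opened a second time only when direct I/O is enabled.
        VectorInput searchInput = useDirectIO
            ? new VectorInput(fileName, ReadMode.DIRECT_IO)
            : defaultInput;
        return new SketchVectorsReader(searchInput, defaultInput);
    }

    /** Merges read through the default handle, whatever the search path uses. */
    SketchVectorsReader getMergeInstance() {
        return new SketchVectorsReader(mergeInput, mergeInput);
    }

    ReadMode activeMode() {
        return activeInput.mode;
    }
}
```

In this sketch, a reader opened with `useDirectIO` set to `true` reports `DIRECT_IO` from `activeMode()`, while the instance returned by `getMergeInstance()` reports `DEFAULT`, because the choice is made when the file is opened rather than through a per-read context.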

jimczi added 2 commits April 25, 2025 17:35
elasticsearchmachine (Collaborator) commented

Pinging @elastic/es-search-relevance (Team:Search Relevance)

elasticsearchmachine added the Team:Search Relevance label Apr 25, 2025
ChrisHegarty (Contributor) left a comment

Thanks Jim, LGTM

thecoop (Member) commented Apr 28, 2025

Ideally this wouldn't modify DirectIOLucene99FlatVectorsReader, as that will go away in time. Could we instead have a wrapper that creates the right sort of reader from getMergeInstance, and delegates for the other methods?
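
One possible shape for the wrapper suggested above, as a rough sketch only: `SketchFlatReader`, `FixedModeReader`, and `MergeAwareReader` are hypothetical names invented for illustration, and the real reader interface in Lucene has more methods than shown here.

```java
// Hypothetical sketch: these types do not exist in Lucene or Elasticsearch.
interface SketchFlatReader {
    String describe();
    SketchFlatReader getMergeInstance();
}

/** Stand-in for an unmodified reader bound to one way of reading the file. */
final class FixedModeReader implements SketchFlatReader {
    private final String mode;

    FixedModeReader(String mode) {
        this.mode = mode;
    }

    @Override
    public String describe() {
        return mode;
    }

    @Override
    public SketchFlatReader getMergeInstance() {
        return this;
    }
}

/** Wrapper that delegates normal calls to one reader and merges to another. */
final class MergeAwareReader implements SketchFlatReader {
    private final SketchFlatReader searchReader;
    private final SketchFlatReader mergeReader;

    MergeAwareReader(SketchFlatReader searchReader, SketchFlatReader mergeReader) {
        this.searchReader = searchReader;
        this.mergeReader = mergeReader;
    }

    @Override
    public String describe() {
        // Every method other than getMergeInstance delegates to the search reader.
        return searchReader.describe();
    }

    @Override
    public SketchFlatReader getMergeInstance() {
        // Merges get the reader built on default reads.
        return mergeReader.getMergeInstance();
    }
}
```

With this layout the direct I/O reader itself stays untouched, and the wrapper is the only place that knows about the two read paths.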

thecoop requested a review from ChrisHegarty April 28, 2025 12:59
jimczi merged commit 45d321d into elastic:main Apr 28, 2025
17 checks passed
jimczi deleted the bbq_direct_io_merge_instance branch April 28, 2025 14:24
Labels: >non-issue, :Search Relevance/Vectors (Vector search), Team:Search Relevance (Meta label for the Search Relevance team in Elasticsearch), v9.1.0
4 participants