Copy Lucene99FlatVectorsReader allowing direct IO to be specified directly #125921

thecoop · 2025-03-31T08:36:24Z

Copy and modify Lucene99FlatVectorsReader allowing direct IO to be used for raw vector data.

This improves search performance on the so_vector rally track by 10-15%, but performance on a few tests has dropped by about 10% (knn-search-10-50-match-all, knn-search-10-50-css, knn-search-10-50-match-all-force-merge).
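For readers unfamiliar with direct IO on the JVM, here is a minimal sketch using only JDK APIs (`ExtendedOpenOption.DIRECT`, `FileStore.getBlockSize`, `ByteBuffer.alignedSlice`). It is not the code from this PR and not the `DirectIODirectory` implementation; the names `DirectReadSketch` and `readFirstBlock` are hypothetical. It assumes a single block-aligned read is enough for illustration, and it falls back to an ordinary read when the file system does not support direct IO.

```java
import com.sun.nio.file.ExtendedOpenOption;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class; not part of Elasticsearch or Lucene.
final class DirectReadSketch {

    /**
     * Reads up to one file-system block from the start of the file.
     * Tries direct IO first (bypassing the OS page cache) and falls back
     * to a normal read if direct IO is not available.
     */
    static byte[] readFirstBlock(Path file) throws IOException {
        try {
            // Direct IO requires block-aligned positions, sizes and buffer addresses.
            int blockSize = Math.toIntExact(Files.getFileStore(file).getBlockSize());
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ, ExtendedOpenOption.DIRECT)) {
                // Over-allocate, then slice so the buffer address is block-aligned.
                ByteBuffer buf = ByteBuffer.allocateDirect(blockSize * 2).alignedSlice(blockSize);
                buf.limit(blockSize);
                return readOnce(ch, buf);
            }
        } catch (IOException | UnsupportedOperationException e) {
            // Some file systems (or platforms) reject direct IO; use a regular read instead.
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
                return readOnce(ch, ByteBuffer.allocate(4096));
            }
        }
    }

    // Performs one positional read at offset 0 and copies out the bytes that were read.
    private static byte[] readOnce(FileChannel ch, ByteBuffer buf) throws IOException {
        int n = Math.max(ch.read(buf, 0), 0); // read returns -1 for an empty file
        buf.flip();
        byte[] out = new byte[n];
        buf.get(out);
        return out;
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = readFirstBlock(Path.of(args[0]));
        System.out.println("read " + bytes.length + " bytes");
    }
}
```

The fallback path matters in practice: direct IO support depends on the operating system and file system, so a reader cannot assume it is always available.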

server/src/main/java/org/elasticsearch/index/codec/vectors/es818/ES818FlatVectorsFormat.java

server/src/main/java/org/elasticsearch/index/store/FsDirectoryFactory.java

server/src/main/java/org/elasticsearch/index/codec/vectors/es818/DirectIODirectory.java

server/src/main/java/org/elasticsearch/index/store/FsDirectoryFactory.java

elasticsearchmachine · 2025-04-08T12:27:01Z

Pinging @elastic/es-search-relevance (Team:Search Relevance)

ChrisHegarty

LGTM. @jimczi @benwtrent any concerns?

server/src/main/java/org/elasticsearch/index/codec/vectors/es818/ES818FlatVectorsFormat.java

server/src/main/java/org/elasticsearch/index/codec/vectors/es818/ES818FlatVectorsFormat.java

server/src/main/java/org/elasticsearch/index/codec/vectors/es818/ES818FlatVectorsReader.java

server/src/main/java/org/elasticsearch/index/store/FsDirectoryFactory.java

ChrisHegarty · 2025-04-16T08:18:35Z

@thecoop Using direct I/O for BBQ rescoring by default is good 👍. However, some environments with sufficient RAM available could be somewhat negatively affected by this change; in those cases the float32 vectors would eventually become fully paged into memory. Let's add a system property that allows reverting to the previous behaviour. That way, users in that narrow set of environments can restore it.
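A minimal sketch of such an opt-out, assuming a boolean system property that defaults to enabled. The property name `example.vector.rescoring.directio` and the class `DirectIoToggle` are hypothetical and are not taken from this PR.

```java
// Hypothetical illustration class; names are assumptions, not the PR's code.
final class DirectIoToggle {

    // Hypothetical property name, chosen for illustration only.
    static final String PROPERTY = "example.vector.rescoring.directio";

    /**
     * Interprets a raw property value. Direct IO stays on when the property
     * is unset or blank; otherwise the value is parsed as a boolean, so any
     * value other than "true" (case-insensitive) turns direct IO off.
     */
    static boolean parse(String rawValue) {
        if (rawValue == null || rawValue.isBlank()) {
            return true; // default: direct IO enabled
        }
        return Boolean.parseBoolean(rawValue.trim());
    }

    /** Reads the JVM system property and applies the default. */
    static boolean enabled() {
        return parse(System.getProperty(PROPERTY));
    }

    public static void main(String[] args) {
        System.out.println("direct IO enabled: " + enabled());
    }
}
```

With this shape, starting the JVM with `-Dexample.vector.rescoring.directio=false` would restore the previous behaviour, and leaving the property unset keeps direct IO on.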

thecoop · 2025-04-16T08:42:36Z

Is there a way to detect if we do actually have enough memory to keep all the floats paged in? Or will that come later with the memory usage monitoring?

jimczi

LGTM

thecoop requested a review from ChrisHegarty March 31, 2025 08:36

elasticsearchmachine added the v9.1.0 label Mar 31, 2025

Copy Lucene99FlatVectorsReader allowing direct IO to be specified directly

a9be5ea

thecoop force-pushed the modified-flat-vector-direct-io branch from e39b14a to a9be5ea Compare March 31, 2025 09:04

Merge branch 'main' into modified-flat-vector-direct-io

359fe69

ChrisHegarty reviewed Mar 31, 2025

server/src/main/java/org/elasticsearch/index/codec/vectors/es818/ES818FlatVectorsFormat.java (outdated, resolved)

ChrisHegarty reviewed Mar 31, 2025

server/src/main/java/org/elasticsearch/index/store/FsDirectoryFactory.java (outdated, resolved)

thecoop added 4 commits March 31, 2025 13:13

Use DirectIODirectory implementation

991dd6d

Add entitlement

a2ed4c8

Sometimes DirectIO is not available

08d2a76

Allow for subpaths

9294c04

ChrisHegarty reviewed Apr 4, 2025

server/src/main/java/org/elasticsearch/index/codec/vectors/es818/DirectIODirectory.java (outdated, resolved)

server/src/main/java/org/elasticsearch/index/store/FsDirectoryFactory.java (outdated, resolved)

Specify buffer size

e2f3693

thecoop marked this pull request as ready for review April 8, 2025 12:26

thecoop requested a review from a team as a code owner April 8, 2025 12:26

Merge branch 'main' into modified-flat-vector-direct-io

446caf8

thecoop added the :Search Relevance/Vectors and >refactoring labels Apr 8, 2025

elasticsearchmachine added the Team:Search Relevance label Apr 8, 2025

thecoop added 3 commits April 14, 2025 12:06

Only use direct IO in searches

b0325ce

Merge branch 'main' into modified-flat-vector-direct-io

dd3300f

Update copyright year and comments

00682ca

ChrisHegarty approved these changes Apr 15, 2025

server/src/main/java/org/elasticsearch/index/codec/vectors/es818/ES818FlatVectorsFormat.java (outdated, resolved)

thecoop added 2 commits April 15, 2025 13:40

Update comment

b51548b

Merge remote-tracking branch 'upstream/main' into modified-flat-vector-direct-io

30092c8

jimczi reviewed Apr 15, 2025

Merge branch 'main' into modified-flat-vector-direct-io

9c86a05

thecoop added 7 commits April 16, 2025 10:38

Use a system property to turn off direct IO if needed

22f740d

Merge branch 'main' into modified-flat-vector-direct-io

e258015

Merge branch 'main' into modified-flat-vector-direct-io

2b8d4fd

Tweak the names

c7149ff

Add comment

94dbac0

Merge remote-tracking branch 'upstream/main' into modified-flat-vector-direct-io

521245e

Merge branch 'main' into modified-flat-vector-direct-io

b8b714a

thecoop requested a review from jimczi April 23, 2025 15:46

Merge branch 'main' into modified-flat-vector-direct-io

93a9bdb

jimczi approved these changes Apr 23, 2025

thecoop added 2 commits April 24, 2025 08:33

Merge branch 'main' into modified-flat-vector-direct-io

27c96b3

Add conditions for new reader

3f90ea5

thecoop merged commit c5ada66 into elastic:main Apr 24, 2025
17 checks passed

thecoop deleted the modified-flat-vector-direct-io branch April 24, 2025 10:00
