Reduce final clustering pass sample size #130451
Merged: elasticsearchmachine merged 2 commits into elastic:main from benwtrent:ivf/reduce-final-iteration-sample-size on Jul 2, 2025
Conversation
Pinging @elastic/es-search-relevance (Team:Search Relevance)
iverase approved these changes on Jul 2, 2025
I observed the same. Reducing the sample size by half in the last clustering pass feels like a good trade-off.
tteofili approved these changes on Jul 2, 2025
mridula-s109 pushed a commit to mridula-s109/elasticsearch that referenced this pull request on Jul 3, 2025
Figuring out the right balance between index throughput and search speed is tricky. Initially I was digging into reducing the "neighborhood" size for the fix up. That did speed things up, but it harmed recall a bit too much in my tests. While I do think there is ground to be covered there, I pivoted to reducing the sample size, since we now have true random sampling (instead of taking the first N docs).

In the extreme case, this improves force-merge time by 25% with zero change in recall. On the lower end, it only improves by about 8%.

I really do think there is ground to be recovered in the "fix up" phase, but this is a nice improvement :).

```
index_name                          index_type num_docs index_time(ms) force_merge_time(ms) num_segments
----------------------------------- ---------- -------- -------------- -------------------- ------------
corpus-quora-E5-small.fvec.flat     ivf          500000          17443                18422            0
cohere-wikipedia-docs-768d.vec      ivf         2000000         156320               193383            0
corpus-dbpedia-entity-arctic-0.fvec ivf         1000000          92902                82131            0

index_name                          index_type n_probe latency(ms) net_cpu_time(ms) avg_cpu_count     QPS recall  visited
----------------------------------- ---------- ------- ----------- ---------------- ------------- ------- ------ --------
corpus-quora-E5-small.fvec.flat     ivf             10        0.95             0.00          0.00 1052.63   0.83  5713.06
corpus-quora-E5-small.fvec.flat     ivf             20        0.69             0.00          0.00 1449.28   0.89 10620.80
corpus-quora-E5-small.fvec.flat     ivf             30        0.81             0.00          0.00 1234.57   0.92 15498.94
corpus-quora-E5-small.fvec.flat     ivf             40        0.94             0.00          0.00 1063.83   0.93 20088.68
corpus-quora-E5-small.fvec.flat     ivf             50        1.11             0.00          0.00  900.90   0.94 24801.41
cohere-wikipedia-docs-768d.vec      ivf             10        1.20             0.00          0.00  833.33   0.66  2824.19
cohere-wikipedia-docs-768d.vec      ivf             20        1.33             0.00          0.00  751.88   0.74  4875.23
cohere-wikipedia-docs-768d.vec      ivf             30        1.44             0.00          0.00  694.44   0.79  6974.69
cohere-wikipedia-docs-768d.vec      ivf             40        1.56             0.00          0.00  641.03   0.81  9147.20
cohere-wikipedia-docs-768d.vec      ivf             50        1.66             0.00          0.00  602.41   0.83 11478.62
cohere-wikipedia-docs-768d.vec      ivf             60        1.80             0.00          0.00  555.56   0.85 13863.93
cohere-wikipedia-docs-768d.vec      ivf             70        1.96             0.00          0.00  510.20   0.87 16301.12
cohere-wikipedia-docs-768d.vec      ivf             80        2.05             0.00          0.00  487.80   0.88 18761.24
cohere-wikipedia-docs-768d.vec      ivf             90        2.18             0.00          0.00  458.72   0.89 21185.38
cohere-wikipedia-docs-768d.vec      ivf            100        2.27             0.00          0.00  440.53   0.90 23648.77
corpus-dbpedia-entity-arctic-0.fvec ivf             10        0.79             0.00          0.00 1265.82   0.52  3654.77
corpus-dbpedia-entity-arctic-0.fvec ivf             20        0.97             0.00          0.00 1030.93   0.61  7170.57
corpus-dbpedia-entity-arctic-0.fvec ivf             30        1.13             0.00          0.00  884.96   0.67 10761.73
corpus-dbpedia-entity-arctic-0.fvec ivf             40        1.27             0.00          0.00  787.40   0.70 14550.00
corpus-dbpedia-entity-arctic-0.fvec ivf             50        1.42             0.00          0.00  704.23   0.72 18149.22
corpus-dbpedia-entity-arctic-0.fvec ivf             60        1.61             0.00          0.00  621.12   0.74 21971.72
corpus-dbpedia-entity-arctic-0.fvec ivf             70        1.74             0.00          0.00  574.71   0.76 25612.96
corpus-dbpedia-entity-arctic-0.fvec ivf             80        1.94             0.00          0.00  515.46   0.77 29311.67
corpus-dbpedia-entity-arctic-0.fvec ivf             90        2.05             0.00          0.00  487.80   0.78 33034.66
corpus-dbpedia-entity-arctic-0.fvec ivf            100        2.23             0.00          0.00  448.43   0.80 36743.77
```
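As a rough illustration of the idea (not the actual Elasticsearch implementation; all class, method, and parameter names below are hypothetical), the sketch draws a true random sample of vector ordinals with a partial Fisher-Yates shuffle and uses half of that budget for the final clustering pass.

```java
import java.util.Random;

/**
 * Rough sketch of the sampling idea in this PR, not the actual Elasticsearch code.
 * Because the sample is now drawn uniformly at random (instead of taking the first
 * N docs), the final clustering pass can use a smaller (here: halved) sample budget.
 * All names and parameter values below are hypothetical.
 */
public class FinalPassSampling {

    /** Pick {@code sampleSize} distinct vector ordinals uniformly at random (partial Fisher-Yates). */
    static int[] randomSample(int totalVectors, int sampleSize, Random random) {
        int[] ids = new int[totalVectors];
        for (int i = 0; i < totalVectors; i++) {
            ids[i] = i;
        }
        for (int i = 0; i < sampleSize; i++) {
            int j = i + random.nextInt(totalVectors - i);
            int tmp = ids[i];
            ids[i] = ids[j];
            ids[j] = tmp;
        }
        int[] sample = new int[sampleSize];
        System.arraycopy(ids, 0, sample, 0, sampleSize);
        return sample;
    }

    public static void main(String[] args) {
        int totalVectors = 1_000_000;
        int numCentroids = 1_000;
        int samplePerCentroid = 256;   // hypothetical per-centroid sample budget

        Random random = new Random(42);

        // Earlier k-means passes: full sample budget.
        int fullSampleSize = Math.min(totalVectors, numCentroids * samplePerCentroid);
        int[] earlySample = randomSample(totalVectors, fullSampleSize, random);

        // Final pass: half the budget; an unbiased sample keeps recall intact at lower cost.
        int finalSampleSize = fullSampleSize / 2;
        int[] finalSample = randomSample(totalVectors, finalSampleSize, random);

        System.out.println("early pass sample size: " + earlySample.length);
        System.out.println("final pass sample size: " + finalSample.length);
    }
}
```

Since every vector is equally likely to be chosen, shrinking the sample changes only its size, not its distribution, which is consistent with the unchanged recall reported in the benchmarks above.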
Labels
auto-merge-without-approval (Automatically merge pull request when CI checks pass; NB doesn't wait for reviews!)
>non-issue
:Search Relevance/Vectors (Vector search)
Team:Search Relevance (Meta label for the Search Relevance team in Elasticsearch)
v9.2.0