Open
Description
For HNSW, building the bit-set up-front for filtering is pretty much required as vector ordinals will never be iterated in doc-id order.
For IVF, we do iterate each posting list in doc-id order. This means we can simply apply the filter as each posting list is scored.
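A minimal sketch of the IVF case, assuming each posting list is a doc-id-sorted sequence of `(doc_id, vector)` pairs and the filter can be exposed as an iterator over accepted doc ids in increasing order. The names `score_filtered`, `search`, and `make_filter_iter` are hypothetical and not taken from any existing code; squared L2 stands in for whatever similarity is actually used.

```python
import heapq
from typing import Callable, Iterator, List, Sequence, Tuple

Posting = Sequence[Tuple[int, Sequence[float]]]  # sorted by doc id


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def score_filtered(
    posting: Posting, filter_iter: Iterator[int], query: Sequence[float]
) -> List[Tuple[float, int]]:
    """Score only the docs accepted by the filter, advancing both sides in doc-id order."""
    results = []
    accepted = next(filter_iter, None)
    for doc_id, vec in posting:
        # Advance the filter until it reaches or passes the current doc id.
        while accepted is not None and accepted < doc_id:
            accepted = next(filter_iter, None)
        if accepted is None:
            break  # filter exhausted, nothing else in this posting list can match
        if accepted == doc_id:
            results.append((squared_l2(vec, query), doc_id))
    return results


def search(
    posting_lists: Sequence[Posting],
    make_filter_iter: Callable[[], Iterator[int]],  # hypothetical: fresh iterator per list
    query: Sequence[float],
    k: int,
) -> List[Tuple[float, int]]:
    candidates: List[Tuple[float, int]] = []
    for posting in posting_lists:
        candidates.extend(score_filtered(posting, make_filter_iter(), query))
    return heapq.nsmallest(k, candidates)
```

No bit set over all docs is built here: the filter is only advanced as far as the visited posting lists require.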
I think to test this, we should:
- have completely random filters
- positively correlated filters (e.g. the docs that pass the filter are also the nearest to the query)
- negatively correlated filters (e.g. the docs that pass the filter are the furthest from the query)
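The three filter shapes above could be generated roughly like this. It is only a sketch: `make_filters` and `selectivity` are hypothetical names, the correlation is taken to the extreme (exactly the nearest or furthest docs by brute-force distance), and a real benchmark would probably want partially correlated filters as well.

```python
import random
from typing import Dict, List, Sequence


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def make_filters(
    vectors: Sequence[Sequence[float]],
    query: Sequence[float],
    selectivity: float,  # fraction of docs that pass the filter
    seed: int = 0,
) -> Dict[str, List[int]]:
    """Return sorted lists of accepted doc ids for each filter shape."""
    n = len(vectors)
    keep = max(1, int(n * selectivity))
    rng = random.Random(seed)
    # Brute-force ordering of doc ids from nearest to furthest.
    by_distance = sorted(range(n), key=lambda d: squared_l2(vectors[d], query))
    return {
        "random": sorted(rng.sample(range(n), keep)),
        "positive": sorted(by_distance[:keep]),   # accepted docs are the nearest
        "negative": sorted(by_distance[-keep:]),  # accepted docs are the furthest
    }
```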
It can be costly to iterate ALL docs to gather a filtered bit set and then apply it, so this optimization could be quite good for filtered vector search.
//cc @jimczi