Investigate lazy filter iteration for IVF #129652

Open
@benwtrent

Description

For HNSW, building the filter bit set up front is pretty much required, as vector ordinals are never iterated in doc-id order.

For IVF, we do iterate each posting list in doc-id order, meaning we can simply apply the filter as each posting list is scored.
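To make the idea concrete, here is a minimal sketch of lazy filtering over a single posting list. It is not the Elasticsearch or Lucene implementation: `score_posting_list_lazily`, the `(doc_id, vector)` posting layout, and the use of squared L2 distance are all assumptions made for illustration. The filter is modelled as a sorted iterable of accepted doc IDs that is advanced in step with the posting list, so no bit set over all docs is built.

```python
import heapq
from typing import Iterable, List, Sequence, Tuple

# Hypothetical layout: one posting is (doc_id, vector), sorted by doc_id.
Posting = Tuple[int, Sequence[float]]


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def score_posting_list_lazily(
    postings: Iterable[Posting],
    accepted_doc_ids: Iterable[int],
    query: Sequence[float],
    k: int,
) -> List[Tuple[float, int]]:
    """Return up to k accepted docs as (distance, doc_id), nearest first.

    Both inputs must be sorted by doc ID. The filter is only advanced as
    far as the posting list requires, and vectors of rejected docs are
    never scored.
    """
    accepted = iter(accepted_doc_ids)
    current = next(accepted, None)
    heap: List[Tuple[float, int]] = []  # max-heap on distance via negation

    for doc_id, vector in postings:
        # Advance the filter to the first accepted doc ID >= doc_id.
        while current is not None and current < doc_id:
            current = next(accepted, None)
        if current is None:
            break  # filter exhausted, nothing later can match
        if current != doc_id:
            continue  # doc is filtered out, skip scoring entirely

        dist = squared_l2(query, vector)
        if len(heap) < k:
            heapq.heappush(heap, (-dist, doc_id))
        elif dist < -heap[0][0]:
            heapq.heapreplace(heap, (-dist, doc_id))

    return sorted((-neg_dist, doc_id) for neg_dist, doc_id in heap)
```

Note that this sketch consumes the filter once per call. Since an IVF search visits several posting lists, each of which restarts from a low doc ID, a real implementation would need a filter iterator per posting list or one that can be reset, and that cost is part of what this issue would have to measure.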

I think to test this, we should have:

  • completely random filters
  • positively correlated filters (e.g. filtered docs are also the nearest...)
  • negatively correlated filters (e.g. filtered docs are the furthest)
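One possible way to build these three filter shapes for a benchmark is sketched below. The function name `make_filter`, the `selectivity` parameter, and the choice to use fully correlated filters (exactly the nearest or furthest docs for a given query) are assumptions for illustration; a real test would probably also want partially correlated filters.

```python
import random
from typing import List, Sequence


def make_filter(
    distances: Sequence[float],
    selectivity: float,
    mode: str,
    seed: int = 0,
) -> List[int]:
    """Return a sorted list of accepted doc IDs.

    distances[d] is the distance from the query to doc d (hypothetical
    input, e.g. precomputed by brute force). selectivity is the fraction
    of docs that pass the filter.
    """
    n = len(distances)
    count = max(1, int(n * selectivity))

    if mode == "random":
        # Uncorrelated with the query: a uniform sample of doc IDs.
        chosen = random.Random(seed).sample(range(n), count)
    elif mode == "positive":
        # Filtered docs are also the nearest ones.
        chosen = sorted(range(n), key=lambda d: distances[d])[:count]
    elif mode == "negative":
        # Filtered docs are the furthest ones.
        chosen = sorted(range(n), key=lambda d: distances[d], reverse=True)[:count]
    else:
        raise ValueError(f"unknown mode: {mode}")

    # Filters are consumed in doc-id order, so return them sorted.
    return sorted(chosen)
```

Running the same query set against all three modes at a few selectivity levels should show where lazy filtering helps and where it does not.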

Iterating ALL docs to gather a filtered bit set and then applying it can be costly, so this optimization could be quite good for filtered vector search.

//cc @jimczi
